Full Code of dandrino/terrain-erosion-3-ways for AI

master 58cdebf74bf9 cached

16 files

58.7 KB

15.6k tokens

41 symbols

1 requests

Download .txt

Repository: dandrino/terrain-erosion-3-ways
Branch: master
Commit: 58cdebf74bf9
Files: 16
Total size: 58.7 KB

Directory structure:
gitextract_5jiu_71x/

├── .gitignore
├── LICENSE.txt
├── README.md
├── domain_warping.py
├── download_ned_zips.py
├── extract_height_arrays.py
├── generate_ml_output.py
├── generate_training_images.py
├── make_grayscale_image.py
├── make_hillshaded_image.py
├── plain_old_fbm.py
├── requirements-pip3.txt
├── ridge_noise.py
├── river_network.py
├── simulation.py
└── util.py

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Python-related files
__pycache__/
*.py[cod]
*$py.class

# Training data files.
zip_files/
array_files/
training_images/
*.csv
*.pkl

# Generated terrain files.
ml_outputs/
sim_snaps/
*.np[yz]

# Misc files.
.DS_Store
*.swp


================================================
FILE: LICENSE.txt
================================================
MIT License

Copyright (c) 2018 Daniel Andrino

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.


================================================
FILE: README.md
================================================
# Three Ways of Generating Terrain with Erosion Features

## Background

Terrain generation has long been a popular topic in the procedural generation community, with applications in video games and movies. Some games use procedural terrain to generate novel environments on the fly for the player to explore. Others use procedural terrain as a tool for artists to use when crafting a believable world. 

The most common way of representing array is a 2D grid of height values. This type of terrain doesn't allow for overhangs and caves, but at large scales those features are not very apparent. The most popular terrain generation algorithms focus on adding together different layers of [coherent noise](http://libnoise.sourceforge.net/coherentnoise/index.html), which can be thought of as smoothed random noise. Several popular choices for coherent noise are:

* [**Perlin noise**](https://en.wikipedia.org/wiki/Perlin_noise) - A form of [gradient noise](https://en.wikipedia.org/wiki/Gradient_noise) on a rectangular lattice.
* [**Simplex noise**](https://en.wikipedia.org/wiki/Simplex_noise) - Like Perlin noise, but on a simplex lattice.
* [**Value noise**](https://en.wikipedia.org/wiki/Value_noise) - Basically just white noise that's been upscaled and interpolated.

If you take several layers of coherent noise, each at different levels of detail and with different amplitudes, you get a rough pattern frequently (and mostly inaccurately) called [**fBm**](https://en.wikipedia.org/wiki/Brownian_surface) (fractional Brownian motion). [This page](https://www.redblobgames.com/maps/terrain-from-noise/) provides a good overview for how this process works. 

In addition, there are other methods of generating fBm more directly, including:

* [**Diamond-square**](https://en.wikipedia.org/wiki/Diamond-square_algorithm) - A fast, but artifact-prone divide-and-conquer approach.
* **Power-law noise** - Created by filtering white noise in the frequency domain with a power-law function.

What you get from regular fBm terrain is something like this:

<p align="center">
  <img src="images/fbm_grayscale.png" width=40%>
  <img src="images/fbm_hillshaded.png" width=40%>
  <br>
  <em> Typical fBm-based terrain. Left is height as grayscale, right is with coloration and hillshading. </em>
</p>


This gives reasonable looking terrain at a quick glance. It generates distinguishable mountains and valleys, and has a general roughness one expects from rocky terrain.

However, it is also fairly boring. The fractal nature of fBm means everything more or less looks the same. Once you've seen one patch of land, you've basically seen it all. 

One method of adding a more organic look to terrain is to perform [domain warping](http://www.iquilezles.org/www/articles/warp/warp.htm), which is where you take regular fBm noise but offset each point by another fBm noise map. What you get is terrain that looks warped and twisted, somewhat resembling terrain that has been deformed by tectonic movement. The game No Man's Sky uses domain warping for its custom noise function called [uber noise](https://youtu.be/SePDzis8HqY?t=1547).

<p align="center">
  <img src="images/domain_warping_grayscale.png" width=40%>
  <img src="images/domain_warping_hillshaded.png" width=40%>
  <br><em> fBm with domain warping </em>
</p>

Another way of spicing up fBm is to modify each coherent noise layer before adding them together. For instance, if you take the absolute value of each coherent noise layer, and invert the final result you can get a mountain ridge effect: 

<p align="center">
  <img src="images/ridge_grayscale.png" width=40%>
  <img src="images/ridge_hillshaded.png" width=40%>
  <br><em> Modified fBm to create mountain ridges. </em>
</p>

These all look iteratively more convincing. However, if you look at actual elevation maps, you will notice that these look nothing like real life terrain:

<p align="center">
  <img src="images/real1_grayscale.png" width=40%>
  <img src="images/real1_hillshaded.png" width=40%>
  <img src="images/real2_grayscale.png" width=40%>
  <img src="images/real2_hillshaded.png" width=40%>

  <br><em> Elevation maps from somewhere in the continental United States (credit to the USGS). The right images uses the same coloration as above, for consistency. </em>
</p>

The fractal shapes you see in real life terrain are driven by **erosion**: the set of processes that describe terrain displacement over time. There are several types of erosion, but the one that most significantly causes those fractal shapes you see is **hydraulic erosion**, which is basically the process of terrain displacement via water. As water flows across terrain, it takes sediment with it and deposits it downhill. This has the effect of carving out mountains and creating smooth valleys. The fractal pattern emerges from smaller streams merging into larger streams and rivers as they flow downhill.

Unfortunately, more involved techniques are required to generate terrain with convincing erosion patterns. The following three sections will go over three distinct methods of generating eroded terrain. Each method has their pros and cons, so take that into consideration if you want to include them in your terrain project.


## Simulation

If real life erosion is driven by physical processes, couldn't we just simulate those processes to generate terrain with erosion? Then answer is, yes! The mechanics of hydraulic erosion, in particular, are well known and are fairly easily to simulate.

The basic idea of hydraulic erosion is that water dissolves terrain into sediment, which is then transported downhill and deposited. Programmatically, this means tracking the following quantities:

* **Terrain height** - The rock layer that we're interested in.
* **Water level** - How much water is at each grid point.
* **Sediment level** - The amount of sediment suspended in water.

When simulating, we make small changes to these quantities repeatedly until the erosion features emerge in our terrain.

To start off, we initiate the water and sediment levels to zero. The initial terrain height is seeded to some prior height map, frequently just regular fBm.

Each simulation iteration involves the following steps:

1. **Increment the water level** (as in via precipitation). For this I used a simple uniform random distribution, although some approaches use individual water "droplets".
1. **Compute the terrain gradient.** This is used to determine where water and sediment will flow, as well as the velocity of water at each point.
1. **Determine the sediment capacity** for each point. This is affected by the terrain slope, water velocity, and water volume. 
1. **Erode or deposit sediment**. If the sediment level is above the capacity, then sediment is deposited to terrain. Otherwise, terrain is eroded into sediment.
1. **Displace water and sediment downhill.** 
1. **Evaporate** some fraction of the water away.


Apply this process for long enough and you may get something like this:

<p align="center">
  <img src="images/simulation_grayscale.png" width=40%>
  <img src="images/simulation_hillshaded.png"" width=40%>
  <br><em>Terrain from simulated erosion. See <a href="https://drive.google.com/file/d/1iz3xl71qOVcPaSMZ95JyfXIU9exDy8TV/view?usp=sharing">here</a> for a time lapse.</em>
</p>

The results are fairly convincing. The tendril-like shape of ridges and cuts you see in real-life terrain are readily apparent. What also jumps out are the large, flat valleys that are the result of sediment deposition over time. If this simulation were left to continue indefinitely, eventually all mountains would be eroded into these flat sedimentary valleys.

Because of results like you see above, this method of generating terrain can be seen in professional terrain-authoring tools. The code for the terrain above is largely a vectorized implementation of the code found on [this page](http://ranmantaru.com/blog/2011/10/08/water-erosion-on-heightmap-terrain/). For a more theoretical approach, check out this [paper](https://hal.inria.fr/inria-00402079/document).


### Pros

* Lot of real-life terrain features simply emerge from running these rules, including stream downcutting, smooth valleys, and differential erosion.
* Instead of using global parameter values, different regions can be parameterized differently to develop distinct terrain features (e.g. deserts can evolve differently than forests).
* Fairly easy to parallelize given how straightforward vectorization is.

### Cons

* Parameter hell. There are around 10 constants that need to be set, in addition to other factors like the precipitation pattern and the initial terrain shape. Small changes to any of these can produce completely different results, so it can be difficult to find the ideal combination of parameters that produces good results.
* Fairly inefficient. Given an NxN grid, in order for changes on one side of the map to affect the opposite size you need O(N) iterations, which puts the overall runtime at O(N<sup>3</sup>). This means that doubling the grid dimension can result in 8x execution time. This performance cost further exacerbates the cost of parameter tweaking.
* Difficult to utilize to produce novel terrain. The results of simulation all look like reasonable approximations of real life terrain, however extending this to new types of terrain requires an understanding of the physical processes that would give way to that terrain, which can be prohibitively difficult. 


## Machine Learning

Machine learning is frequently uses as a panacea for all sorts of problems, and terrain generation is no exception. Machine learning can be effective so long as you have lots of compute power and a large, diverse dataset. Fortunately, compute power is easy to acquire, and lots of terrain elevation data is readily available to download.
 
The most suitable machine learning approach is to use a **Generative Adversarial Network (GAN)**. GANs are able to produce fairly convincing novel instances of a distribution described by training data. It works via two neural networks: one that produces new instances of the distribution (called the "generator"), and another whose job is to determine whether a provided terrain sample is real (i.e. from the training set), or fake (i.e. via the generator). For some more technical background, check out [these Stanford lectures](https://www.youtube.com/playlist?list=PL3FW7Lu3i5JvHM8ljYj-zLfQRF3EO8sYv).

Creating the right network and tuning all the different hyperparameters can be difficult and requires a lot of expertise to get right. Instead of creating the network from scratch, I will be building off of the work done for *Progressive Growing of GANs for Improved Quality, Stability, and Variation* by Karras, et al. ([paper](https://arxiv.org/pdf/1710.10196.pdf), [code](https://github.com/tkarras/progressive_growing_of_gans)). The basic approach of this paper is to train the network on lower resolution versions of the training samples while adding new layers for progressively higher resolutions. This makes the network converge quicker for high resolution images than it would if training from full resolution images to begin with.

### Training

Like with almost all machine learning projects, most effort is spent in data gathering, cleaning, validation, and training. 

The first step is to get real life terrain height data. For this demonstration, I used the [National Elevation Dataset (NED)](https://lta.cr.usgs.gov/NED) by the USGS. The dataset I used consists of ~1000 1x1 degree height maps with resolutions of 3600x3600 (i.e. pixel size of 1 arcsecond<sup>2</sup>).

From these height maps I will take 512x512 samples for use in training. In the source height arrays, each pixel is a square arcsecond, which means that each sample as-is will appear horizontally stretched, since a square arcsecond is spatially narrower than it is tall. After compensating for this, I also apply several heuristics to filter out what are likely sample unsuitable for training:

* Only accept samples who minimum and maximum elevation span a certain threshold. This approach prefers samples that are more "mountainous", and will therefore produce more noticeable erosion effects.
* Ignore samples if a certain percentage of grid points are within a certain margin of the sample's minimum elevation. This filters out samples that are largely flat, or ones that consist mostly of water.
* Ignore samples whose [Shannon entropy](https://en.wikipedia.org/wiki/Entropy_(information_theory)) is below a certain threshold. This helps filter out samples that have been corrupted (perhaps due to different libraries used to encode and decode the height data).

In addition, if we assume that terrain features do not have a directional preference, we can rotate each sample by 90° increments as well as flipping it to increase the dataset size by 8x. In the end, this nets us around 180,000 training samples.

These training samples are then used to train the GAN. Even using progressively grown GANs, this will still take quite a while to complete (expect around a week even with a beefy Nvidia Tesla GPU).

[Here](https://drive.google.com/file/d/1zdlgpkQu2zqWKJr23di73-lc3hJBAfqW/view?usp=sharing) is a timelapse video several terrain samples throughout the training process.


### Results

Once the network is trained, all we need to do is feed it a new random latent vector into the generator to create new terrain samples:

<p align="center">
  <img src="images/ml_generated_1_grayscale.png" width=40%>
  <img src="images/ml_generated_1_hillshaded.png" width=40%>
  <br>
  <img src="images/ml_generated_2_grayscale.png" width=40%>
  <img src="images/ml_generated_2_hillshaded.png" width=40%>
  <br>
  <img src="images/ml_generated_3_grayscale.png" width=40%>
  <img src="images/ml_generated_3_hillshaded.png" width=40%>
  <br>
  <img src="images/ml_generated_4_grayscale.png" width=40%>
  <img src="images/ml_generated_4_hillshaded.png" width=40%>
  <br>
  
  
  <em>ML-generated terrain.</em>
</p>


### Pros

* Generated terrain is basically indistinguishable from real-world elevation data. It captures not just erosion effects, but many other natural phenomena that shape terrain in nature.
* Generation is fairly efficient. Once you have a trained network, creating new terrain samples is fairly fast.

### Cons

* Training is *very* expensive (both in time and money). Lot of effort is required to acquire, clean, validate, and finally train the network. It took about 8 days to train the network used in the above examples.
* Very little control over the final product. The quality of generated terrain is basically driven by the training samples. Not only do you need a large number of training samples to generate good terrain, you also need good heuristics to make sure that each training sample is suitable. Because training takes so long, it isn't really practical to iterate on these heuristics to generate good results.
* Difficult to scale to higher resolutions. GANs are generally good a low resolution images. It gets much more expensive, both in terms of compute and memory costs, to scale up to higher resolution height maps.


## River Networks

In most procedural erosion techniques, terrain is carved out first and river placement happens afterward. An alternative method is to work backward: first generate where rivers and streams will be located, and from there determine how the terrain would be shaped to match the rivers. This eases the burden of creating river-friendly terrain by simply defining where the rivers are up front and working the terrain around them.

### Creating the River Network

Every stream eventually terminates somewhere, most frequently the ocean (they occasionally drain into inland bodies of water, but we will be ignoring those; these drainage basins are called [endorheic basins](https://en.wikipedia.org/wiki/Endorheic_basin)). Given that we need some ocean to drain into, this terrain will be generated as an island,

First we start off with what regions will be land or water. Using some simple fBm filtering, we get something like this:

<p align="center">
  <img src="images/land_mask.png" width=40%>
  <br><em>Land mask. Black is ocean, and white is land.</em>
</p>

The next step is to define the nodes on which the river network will be generated. A straightforward approach is to assign a node to each (x, y) coordinate of the image, however this has a tendency to create horizontal and vertical artifacts in the final product. Instead will we create out nodes by sampling some random points across the grid using [Poisson disc sampling](https://www.jasondavies.com/poisson-disc/). After that we use [Delaunay triangulation](https://en.wikipedia.org/wiki/Delaunay_triangulation) to connect the nodes.

<p align="center">
  <img src="images/poisson_disc_sampling.png" width=40%>
  <img src="images/delaunay_triangulation.png" width=40%>
  <br><em>Left are points via Poisson disc sampling. Right is their Delaunay triangulation.<br>
  The point spacing in these images is larger than what is used to generate the final terrain.
  </em>
</p>

Next, we generate the generic shape the terrain will have (which will later be "carved out" via erosion). Because endorheic basins are being avoided in this demo, this terrain is generated such that each point has a downhill path to the ocean (i.e. no landlocked valleys). Here is an example of such a shape:

<p align="center">
  <img src="images/initial_shape.png" width=40%>
  <br><em>Initial shape our terrain will take.</em>
</p>

The next step is to generate the river network. The general approach is to generate rivers starting from the mouth (i.e. where they terminate in the ocean) and growing the graph upstream one edge at a time until no more valid edges are left. A valid edge is one that:

* Moves uphill. Since we are growing the river graphs upstream, the end effect is only downhill-flowing rivers.
* Does not reconnect with an existing river graph. This results in rivers that only merge as they flow downhill, but never split.

Furthermore, we also prioritize which edge to add by how much it aligns with the previous edge in the graph. Without this, rivers will twist and turn in ways that don't appear natural. Furthermore, amount of "directional inertia" for each edge can be configured to get more twisty or straight rivers.

<p align="center">
  <img src="images/river_network_low_inertia.png" width=40%>
  <img src="images/river_network_high_inertia.png" width=40%>
  <br><em>River networks. Left and right have low and high directional inertia, respectively.</em>
</p>

After this, the water volume for each node in the river graph is calculated. This is basically done by giving each node a base water volume and adding the sum of all upstream nodes' volumes.

<p align="center">
  <img src="images/river_network_with_volume.png" width=40%>
  <br><em>River network with water volume.</em>
</p>


### Generating the Terrain

The next step is to generate the terrain height to match the river network. Each node of the river network graph will be assigned a height that will then be rendered via triangulation to get the final  height map as a 2D grid.

The graph traversal move uphill, starting from the water level. Each time an edge is traversed, the height of the next node will be proportional to the height difference in the initial terrain height generated earlier, scaled inversely by the volume of water along that edge. Furthermore, we will cap the height delta between any two nodes to give a thermal-erosion-like effect.

Traversing only the river network edges will produce discontinuities in the generated height, since no two distinct river "trees" can communicate with each other. When traversing, we will have to also allow traversing edges that span different river trees. For these edges, we simply assume the edge's water volume to be zero.

In the end, you get something like this:

<p align="center">
  <img src="images/river_network_grayscale.png" width=40%>
  <img src="images/river_network_hillshaded.png" width=40%>
  <br><em>Final terrain height map from river networks.</em>
</p>

If you're interested in an approach that blends river networks and simulation, check out [this paper](https://hal.inria.fr/hal-01262376/document).

### Pros

* Creates very convincing erosion-like ridges and cuts. The shape of the river network can easily be seen in the generated height map.
* Easy to add rivers if desired given the already-generated river network.
* Fairly efficient. Given an NxN height map, this algorithm takes O(N<sup>2</sup>log N) time.

### Cons

* This algorithm is good at carving out mountains, but needs work to generate other erosion effects like sediment deposition and differential erosion.
* Some of the algorithms used in this approach are a bit more difficult to parallelize (e.g. best first search).


## Running the Code

All the examples were generated with Python 3.6.0 using Numpy. I've gotten this code to work on OSX and Linux, but I haven't tried with Windows.

Most of the height maps above are generated by running a single python script, with the exception of machine learning which is a bit more involved (described farther down).

Here is a breakdown of all the simple terrain-generating scripts. All outputs are 512x512 grids.

| File | Output | Description
|:--- | :--- | :---
| `plain_old_fbm.py` | `fbm.npy` | Regular fBm
| `domain_warping.py` | `domain_warping.npy` | fBm with domain warping
| `ridge_noise.py` | `ridge.npy` | The noise with ridge-like effects seen above.
| `simulation.py` | `simulation.npy` | Eroded terrain via simulation.
| `river_network.py` | `river_network.npz` | Eroded terrain using river networks. The NPZ file also contains the height map

To generate the images used in this demo, use the `make_grayscale_image.py` and `make_hillshaded_image.py` scripts. Example: `python3 make_hillshaded_image.py input.npy output.png`


### Machine Learning

The machine learning examples are all heavily dependent on the [Progressive Growing of GANs](https://github.com/tkarras/progressive_growing_of_gans) project, so make sure to clone that repository. That project uses Tensorflow, and requires that you run on a machine with a GPU. If you have a GPU but Tensorflow doesn't see it, you probably have driver issues.

#### Creating the Training Data 

If you wish to train a custom network, you can use whatever source of data you want. For the above examples, I used the USGS.

The first step is to get the list of URLs pointing to the elevation data:

1. Go to the USGS [download application](https://viewer.nationalmap.gov/basic/)
1. Select the area from which you want to get elevation data.
1. On the left under **Data**, select **Elevation Product (3DEP)**, then **1 arc-second DEM**. You can choose other resolutions, but I found 1 arcsecond to be adequate.
1. Under **File Format**, make sure to select **IMG**.
1. Click on the **Find Products** button.
1. Click **Save as CSV**. If you wish to use your own download manager, also click **Save as Text**.

The next step is to download the actual elevation data. You can either use the `python3 download_ned_zips.py <downloaded CSV file>` which will download the files in the `zip_files/` directory. The USGS site gives this [guide](https://viewer.nationalmap.gov/uget-instructions/) to downloading the files via uGet.

The next step is to convert the elevation data from IMG files in a ZIP archive to Numpy array files. You can do this by calling `python3 extract_height_arrays.py <downloaded CSV file>`. This will write the Numpy arrays to the `array_files/` directory.

After this, run `python3 generate_training_images.py`, which will go through each array in the `array_files/` directory, and create 512x512 training sample images from it (written to the `training_samples/` directory). This script performs the validation and filtering described above. It also takes a long time to run, so brew a pot of coffee before you kick it off.

The next steps will require that you cloned the `progressive_growing_of_gans` project. First, you need to generate the training data in the `tfrecords` format. This can be done by calling:

`progressive_growing_of_gans/: python3 dataset_tool.py /path/to/erosion_3_ways datasets/terrain`

I chose `terrain` as the output directory, but you can use whatever you want (just make sure it's in the `datasets/` directory.

Almost there! The next step is to edit `config.py` and add the following line to the dataset section:

`desc += '-terrain'; dataset = EasyDict(tfrecord_dir='terrain')`

Make sure to uncomment/delete the "celebahq" line. 

Now, you can finally run `python3 train.py`. Even with a good graphics card, this will take days to run. For further training customizations, check out [this section](https://github.com/tkarras/progressive_growing_of_gans#preparing-datasets-for-training).

When you're done, the `results/` directory will contain all sorts of training outputs, including progress images, Tensorboard logs, and (most importantly) the PKL files containing the network weights.

#### Generating Terrain Samples

To generate samples, run the following script:

```python3 generate_ml_output.py path/to/progressive_growing_of_gans network_weights.pkl 10```

The arguments are:

1. The path to the cloned `progressive_growing_of_gans` repository.
1. The network weights file (the one used for this demo can be found [here](https://drive.google.com/file/d/1czHFcF2ZG_lki7TAQyYCoqtsVcJmdCUN/view?usp=sharing)).
1. The number of terrain samples to generate (optional, defaults to 20)

The outputs are written to the `ml_outputs` directory.


================================================
FILE: domain_warping.py
================================================
#!/usr/bin/python3

# A simple domain warping example.

import numpy as np
import sys
import util


def main(argv):
  shape = (512,) * 2

  values = util.fbm(shape, -2, lower=2.0)
  offsets = 150 * (util.fbm(shape, -2, lower=1.5) +
                   1j * util.fbm(shape, -2, lower=1.5))
  result = util.sample(values, offsets)
  np.save('domain_warping', result)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: download_ned_zips.py
================================================
#!/usr/bin/python3

# Script to hep in downloading files from USGS. The input file is a CSV
# of 3DEP IMG resources downloaded from https://viewer.nationalmap.gov/basic/
# This script is really a poor man's download manager; use any solution you 
# prefer.

import os
import shutil
import sys
import tempfile
import time
import urllib
import util


# Gets list of src ids of previously downloaded files.
def get_previously_downloaded_ids(dir_path):
  return set((os.path.splitext(file_path)[0]
              for file_path in os.listdir(dir_path)))


# Download the file for `src_id` from `url` to `output_dir`. Uses `tmp_dir` an
# intermediate so that aborted downloads are not included in
# get_previously_downloaded_ids above.
def download_file(src_id, url, output_dir, tmp_dir):
  output_file = src_id + '.zip'
  tmp_path = os.path.join(tmp_dir, output_file)

  # Download and save to temp dir
  try:
    urllib.urlretrieve(url, tmp_path)
  except IOError as e:
    print(e)
    return False

  # Move file in temp dir to final output dir
  shutil.move(tmp_path, output_dir)

  return True


def main(argv):
  my_dir = os.path.dirname(argv[0])
  output_dir = os.path.join(my_dir, 'zip_files')
  tmp_dir = '/tmp'

  if len(argv) != 2:
    print('Usage: %s <ned_file.csv>' % (argv[0]))
    sys.exit(-1)

  csv_path = argv[1]

  try: os.mkdir(output_dir)
  except: pass

  downloaded_ids = get_previously_downloaded_ids(output_dir)
  entries = util.read_csv(csv_path)

  for index, entry in enumerate(entries):
    src_id = entry['sourceId']
    download_url = entry['downloadURL']
    pretty_size = entry['prettyFileSize']
    data_format = entry['format']

    # Don't download the same file more than once.
    if src_id in downloaded_ids:
      print('Skipping %s' % src_id)
      continue

    print('(%d / %d) Processing %s of size %s from %s'
          % (index + 1, len(entries), src_id, pretty_size, download_url))

    # Simple data format sanity check.
    if entry['format'] != 'IMG':
      print('Unknown format %s, ignoring...' % (data_format,))
      continue

    # Download file.
    if not download_file(src_id, download_url, output_dir, tmp_dir):
      print('Failed to download from %s' % download_url)
      continue


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: extract_height_arrays.py
================================================
#!/usr/bin/python3

# Extracts the underlying heightmap in each file in zip_files/ and writes the 
# numpy array to array_files/

import csv
import json
import numpy as np
import os
from osgeo import gdal
import re
import shutil
import sys
import tempfile
import util
import zipfile


# Extracts the IMG file from the ZIP archive and returns a Numpy array, or None # if reading or parsing failed.
def get_img_array_from_zip(zip_file, img_name):
  with tempfile.NamedTemporaryFile() as temp_file:
    # Copy to temp file.
    with zip_file.open(img_name) as img_file:
        shutil.copyfileobj(img_file, temp_file)

    # Extract as numpy array.
    geo = gdal.Open(temp_file.name)
    return geo.ReadAsArray() if geo is not None else None


def main(argv):
  my_dir = os.path.dirname(argv[0])
  input_dir = os.path.join(my_dir, 'zip_files')
  output_dir = os.path.join(my_dir, 'array_files')

  if len(argv) != 2:
    print('Usage: %s <ned_file.csv>' % (argv[0]))
    sys.exit(-1)

  csv_path = argv[1]

  # Make the output directory if it doesn't exist yet.
  try: os.mkdir(output_dir)
  except: pass

  entries = util.read_csv(csv_path)
  for index, entry in enumerate(entries):
    src_id = entry['sourceId']
    print('(%d / %d) Processing %s' % (index + 1, len(entries), src_id))
    zip_path = os.path.join(input_dir, src_id + '.zip')

    try:
      # Go though each zip file.
      with zipfile.ZipFile(zip_path, mode='r') as zf:
        ext_names = [name for name in zf.namelist()
                     if os.path.splitext(name)[1] == '.img']
        # Check if EXT files.
        if len(ext_names) == 0:
          print('No IMG files found for %s' % (src_id))
          continue;

        # Warn if there is more than one IMG file
        if len(ext_names) > 1:
          print('More than one IMG file found for %s: %s' % (src_id, ext_names))

        # Get the bounding box. The string manipulation is required given that 
        # the provided dict is not proper JSON
        bounding_box_raw = entry['boundingBox']
        bounding_box_json = re.sub(r'([a-zA-Z]+):', r'"\1":', bounding_box_raw)
        bounding_box = json.loads(bounding_box_json)

        # Create numpy array from IMG file and write it to output
        array = get_img_array_from_zip(zf, ext_names[0])
        if array is not None:
          output_path = os.path.join(output_dir, src_id + '.npz')
          np.savez(output_path, height=array, **bounding_box)
        else:
          print('Failed to load array for %s' % src_id)
        

    except (zipfile.BadZipfile, IOError) as e:
      # Invalid or missing ZIP file.
      print(e)
      continue
   

if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: generate_ml_output.py
================================================
#!/usrb/bin/python3

# Prouces terrain samples from the trained generator network.

import numpy as np
import os
import pickle
import sys
import tensorflow as tf


def main(argv):
  # First argument is the path to the progressive_growing_of_gans clone. This
  # is needed to for proper loading of the weights via pickle.
  # Second argument is the network weights pickle file.
  # Third argument is the number of output samples to generate. Defaults to 20
  if len(argv) < 3:
    print('Usage: %s path/to/progressive_growing_of_gans weights.pkl '
           '[number_of_samples]' % argv[0])
    sys.exit(-1)
  my_dir = os.path.dirname(argv[0])
  pgog_path = argv[1]
  weight_path = argv[2]
  num_samples = int(argv[3]) if len(argv) >= 4 else 20

  # Load the GAN tensors.
  tf.InteractiveSession()
  sys.path.append(pgog_path)
  with open(weight_path, 'rb') as f:
    G, D, Gs = pickle.load(f)

  # Generate input vectors.
  latents = np.random.randn(num_samples, *Gs.input_shapes[0][1:])
  labels = np.zeros([latents.shape[0]] + Gs.input_shapes[1][1:])

  # Run generator to create samples.
  samples = Gs.run(latents, labels)

  # Make output directory
  output_dir = os.path.join(my_dir, 'ml_outputs')
  try: os.mkdir(output_dir)
  except: pass

  # Write outputs.
  for idx in range(samples.shape[0]):
    sample = (np.clip(np.squeeze((samples[idx, 0, :, :] + 1.0) / 2), 0.0, 1.0)
                 .astype('float64'))
    np.save(os.path.join(output_dir, '%d.npy' % idx), sample)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: generate_training_images.py
================================================
#!/usr/bin/python

# Reads the Numpy arrays in array_files/ and generates images for use in
# training. Please note that this script takes a long time to run.

import cv2
import numpy as np
import skimage.measure
import os
import sys
import util


# Filters and cleans the given sample. Uses rough heuristics to determine which
# samples are suitable for training via rough heuristics.
def clean_sample(sample):
  # Get rid of "out-of-bounds" magic values.
  sample[sample == np.finfo('float32').min] = 0.0

  # Ignore any samples with NaNs, for one reason or another.
  if np.isnan(sample).any(): return None

  # Only accept values that span a given range. This is to capture more
  # mountainous samples.
  if (sample.max() - sample.min()) < 40: return None
  
  # Filter out samples for which a significant portion is within a small 
  # threshold from the minimum value. This helps filter out samples that
  # contain a lot of water.
  near_min_fraction = (sample < (sample.min() + 8)).sum() / sample.size
  if near_min_fraction > 0.2: return None

  # Low entropy samples likely have some file corruption or some other artifact
  # that would make it unsuitable as a training sample.
  entropy = skimage.measure.shannon_entropy(sample)
  if entropy < 10.0: return None

  return util.normalize(sample)


# This function returns rotated and flipped variants of the provided array. This
# increases the number of training samples by a factor of 8.
def get_variants(a):
  for b in (a, a.T):  # Original and flipped.
    for k in range(0, 4):   # Rotated 90 degrees x 4
      yield np.rot90(b, k)


def main(argv):
  my_dir = os.path.dirname(argv[0])
  source_array_dir = os.path.join(my_dir, 'array_files')
  training_samples_dir = os.path.join(my_dir, 'training_samples')
  sample_dim = 512
  sample_shape = (sample_dim,) * 2
  sample_area = np.prod(sample_shape)

  # Create the training sample directory, if it doesn't already exist.
  try: os.mkdir(training_samples_dir)
  except: pass
  
  source_array_paths = [os.path.join(source_array_dir, path)
                        for path in os.listdir(source_array_dir)]

  training_id = 0
  for (index, source_array_path) in enumerate(source_array_paths):
    print('(%d / %d) Created %d samples so far'
          % (index + 1, len(source_array_paths), training_id))
    data = np.load(source_array_path)

    # Load heightmap and correct for latitude (to an approximation)
    source_array_raw = data['height']
    latitude_deg = (data['minY'] + data['maxY']) / 2
    latitude_correction = np.cos(np.radians(latitude_deg))
    source_array_shape = (
           int(np.round(source_array_raw.shape[0] * latitude_correction)),
           source_array_raw.shape[1])
    source_array = cv2.resize(source_array_raw, source_array_shape)

    # Determine the number of samples to use per source array.
    sampleable_area = np.subtract(source_array_shape, sample_shape).prod()
    samples_per_array = int(np.ceil(sampleable_area / sample_area))

    if len(source_array.shape) == 0:
      print('Invalid array at %s' % source_array_path)
      continue

    for _ in range(samples_per_array):
      # Select a sample from the source array.
      row = np.random.randint(source_array.shape[0] - sample_shape[0])
      col = np.random.randint(source_array.shape[1] - sample_shape[1])
      sample = source_array[row:(row + sample_shape[0]),
                            col:(col + sample_shape[1])]

      # Scale and clean the sample
      sample = clean_sample(sample)

      # Write the sample to a file
      if sample is not None:
        for variant in get_variants(sample):
            output_path = os.path.join(
                training_samples_dir, str(training_id) + '.png')
            util.save_as_png(variant, output_path)

            training_id += 1


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: make_grayscale_image.py
================================================
#!/usr/bin/python3

# Genreates a PNG containing the terrain height in grayscale.

import util
import sys


def main(argv):
  if len(argv) != 3:
    print('Usage: %s <input_array.np[yz]> <output_image.png>' % (argv[0],))
    sys.exit(-1)

  input_path = argv[1]
  output_path = argv[2]

  height, _ = util.load_from_file(input_path)
  util.save_as_png(height, output_path)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: make_hillshaded_image.py
================================================
#!/usr/bin/python3

# Genreates a PNG containing a hillshaded version of the terrain height.

import util
import sys


def main(argv):
  if len(argv) != 3:
    print('Usage: %s <input_array.np[yz]> <output_image.png>' % (argv[0],))
    sys.exit(-1)

  input_path = argv[1]
  output_path = argv[2]

  height, land_mask = util.load_from_file(input_path)
  util.save_as_png(util.hillshaded(height, land_mask=land_mask), output_path)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: plain_old_fbm.py
================================================
#!/usr/bin/python3

# A demo of just regular FBM noise

import numpy as np
import sys
import util


def main(argv):
  shape = (512,) * 2
  np.save('fbm', util.fbm(shape, -2, lower=2.0))


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: requirements-pip3.txt
================================================
# GDAL is a bit janky, so you may have to compile and install yourself.
# See https://trac.osgeo.org/gdal/wiki/DownloadSource
GDAL==2.3.2
matplotlib==3.0.0
numpy==1.15.2
opencv-python==3.4.3.18
Pillow==8.2.0
scikit-image==0.14.1
scipy==1.1.0
six==1.11.0
# You will also have to install CUDA and cuDNN drivers for tensorflow to work.
tensorboard==1.11.0
tensorflow==2.5.0
tensorflow-gpu==2.3.1
tensorflow-tensorboard==1.5.1
urllib3==1.26.5


================================================
FILE: ridge_noise.py
================================================
#!/usr/bin/python3

# A demo of ridge noise.

import numpy as np
import sys
import util

def noise_octave(shape, f):
  return util.fbm(shape, -1, lower=f, upper=(2 * f))

def main(argv):
  shape = (512,) * 2

  values = np.zeros(shape)
  for p in range(1, 10):
    a = 2 ** p
    values += np.abs(noise_octave(shape, a) - 0.5)/ a 
  result = (1.0 - util.normalize(values)) ** 2

  np.save('ridge', result)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: river_network.py
================================================
#!/usr/bin/python3

import collections
import heapq
import numpy as np
import matplotlib
import matplotlib.collections as mc
import matplotlib.pyplot as plt
import scipy as sp
import scipy.spatial
import skimage.measure
import sys
import util


# Returns the index of the smallest value of `a`
def min_index(a): return a.index(min(a))


# Returns an array with a bump centered in the middle of `shape`. `sigma`
# determines how wide the bump is.
def bump(shape, sigma):
  [y, x] = np.meshgrid(*map(np.arange, shape))
  r = np.hypot(x - shape[0] / 2, y - shape[1] / 2)
  c = min(shape) / 2
  return np.tanh(np.maximum(c - r, 0.0) / sigma)


# Returns a list of heights for each point in `points`.
def compute_height(points, neighbors, deltas, get_delta_fn=None):
  if get_delta_fn is None:
    get_delta_fn = lambda src, dst: deltas[dst]

  dim = len(points)
  result = [None] * dim
  seed_idx = min_index([sum(p) for p in points])
  q = [(0.0, seed_idx)]

  while len(q) > 0:
    (height, idx) = heapq.heappop(q)
    if result[idx] is not None: continue
    result[idx] = height
    for n in neighbors[idx]:
      if result[n] is not None: continue
      heapq.heappush(q, (get_delta_fn(idx, n) + height, n))
  return util.normalize(np.array(result))


# Same as above, but computes height taking into account river downcutting.
# `max_delta` determines the maximum difference in neighboring points (to
# give the effect of talus slippage). `river_downcutting_constant` affects how
# deeply rivers cut into terrain (higher means more downcutting).
def compute_final_height(points, neighbors, deltas, volume, upstream,
                         max_delta, river_downcutting_constant):
  dim = len(points)
  result = [None] * dim
  seed_idx = min_index([sum(p) for p in points])
  q = [(0.0, seed_idx)]

  def get_delta(src, dst):
    v = volume[dst] if (dst in upstream[src]) else 0.0
    downcut = 1.0 / (1.0 + v ** river_downcutting_constant) 
    return min(max_delta, deltas[dst] * downcut)

  return compute_height(points, neighbors, deltas, get_delta_fn=get_delta)


# Computes the river network that traverses the terrain.
#   Arguments:
#   * points: The (x,y) coordinates of each point
#   * neghbors: Set of each neighbor index for each point.
#   * heights: The height of each point.
#   * land: Indicates whether each point is on land or water.
#   * directional_interta: indicates how straight the rivers are
#       (0 = no directional inertia, 1 = total directional inertia).
#   * default_water_level: How much water is assigned by default to each point
#   * evaporation_rate: How much water is evaporated as it traverses from along
#       each river edge.
#  
#  Returns a 3-tuple of:
#  * List of indices of all points upstream from each point 
#  * List containing the index of the point downstream of each point.
#  * The water volume of each point.
def compute_river_network(points, neighbors, heights, land,
                          directional_inertia, default_water_level,
                          evaporation_rate):
  num_points = len(points)

  # The normalized vector between points i and j
  def unit_delta(i, j):
    delta = points[j] - points[i]
    return delta / np.linalg.norm(delta)

  # Initialize river priority queue with all edges between non-land points to
  # land points. Each entry is a tuple of (priority, (i, j, river direction))
  q = []
  roots = set()
  for i in range(num_points):
    if land[i]: continue
    is_root = True
    for j in neighbors[i]:
      if not land[j]: continue
      is_root = True
      heapq.heappush(q, (-1.0, (i, j, unit_delta(i, j))))
    if is_root: roots.add(i)

  # Compute the map of each node to its downstream node.
  downstream = [None] * num_points

  while len(q) > 0:
    (_, (i, j, direction)) = heapq.heappop(q)

    # Assign i as being downstream of j, assuming such a point doesn't
    # already exist.
    if downstream[j] is not None: continue
    downstream[j] = i

    # Go through each neighbor of upstream point j.
    for k in neighbors[j]:
      # Ignore neighbors that are lower than the current point, or who already 
      # have an assigned downstream point.
      if (heights[k] < heights[j] or downstream[k] is not None
          or not land[k]):
        continue

      # Edges that are aligned with the current direction vector are
      # prioritized.
      neighbor_direction = unit_delta(j, k)
      priority = -np.dot(direction, neighbor_direction)

      # Add new edge to queue.
      weighted_direction = util.lerp(neighbor_direction, direction,
                                     directional_inertia)
      heapq.heappush(q, (priority, (j, k, weighted_direction)))


  # Compute the mapping of each node to its upstream nodes.
  upstream = [set() for _ in range(num_points)]
  for i, j in enumerate(downstream):
    if j is not None: upstream[j].add(i)

  # Compute the water volume for each node.
  volume = [None] * num_points
  def compute_volume(i):
    if volume[i] is not None: return
    v = default_water_level
    for j in upstream[i]:
      compute_volume(j)
      v += volume[j]
    volume[i] = v * (1 - evaporation_rate)

  for i in range(0, num_points): compute_volume(i)

  return (upstream, downstream, volume)


# Renders `values` for each triangle in `tri` on an array the size of `shape`.
def render_triangulation(shape, tri, values):
  points = util.make_grid_points(shape)
  triangulation = matplotlib.tri.Triangulation(
      tri.points[:,0], tri.points[:,1], tri.simplices)
  interp = matplotlib.tri.LinearTriInterpolator(triangulation, values)
  return interp(points[:,0], points[:,1]).reshape(shape).filled(0.0)


# Removes any bodies of water completely enclosed by land.
def remove_lakes(mask):
  labels = skimage.measure.label(mask)
  new_mask = np.zeros_like(mask, dtype=bool)
  labels = skimage.measure.label(~mask, connectivity=1)
  new_mask[labels != labels[0, 0]] = True
  return new_mask


def main(argv):
  dim = 512
  shape = (dim,) * 2
  disc_radius = 1.0
  max_delta = 0.05
  river_downcutting_constant = 1.3
  directional_inertia = 0.4
  default_water_level = 1.0
  evaporation_rate = 0.2

  print ('Generating...')

  print('  ...initial terrain shape')
  land_mask = remove_lakes(
      (util.fbm(shape, -2, lower=2.0) + bump(shape, 0.2 * dim) - 1.1) > 0)
  coastal_dropoff = np.tanh(util.dist_to_mask(land_mask) / 80.0) * land_mask
  mountain_shapes = util.fbm(shape, -2, lower=2.0, upper=np.inf)
  initial_height = ( 
      (util.gaussian_blur(np.maximum(mountain_shapes - 0.40, 0.0), sigma=5.0) 
        + 0.1) * coastal_dropoff)
  deltas = util.normalize(np.abs(util.gaussian_gradient(initial_height))) 

  print('  ...sampling points')
  points = util.poisson_disc_sampling(shape, disc_radius)
  coords = np.floor(points).astype(int)


  print('  ...delaunay triangulation')
  tri = sp.spatial.Delaunay(points)
  (indices, indptr) = tri.vertex_neighbor_vertices
  neighbors = [indptr[indices[k]:indices[k + 1]] for k in range(len(points))]
  points_land = land_mask[coords[:, 0], coords[:, 1]]
  points_deltas = deltas[coords[:, 0], coords[:, 1]]

  print('  ...initial height map')
  points_height = compute_height(points, neighbors, points_deltas)

  print('  ...river network')
  (upstream, downstream, volume) = compute_river_network(
      points, neighbors, points_height, points_land,
      directional_inertia, default_water_level, evaporation_rate)

  print('  ...final terrain height')
  new_height = compute_final_height(
      points, neighbors, points_deltas, volume, upstream, 
      max_delta, river_downcutting_constant)
  terrain_height = render_triangulation(shape, tri, new_height)

  np.savez('river_network', height=terrain_height, land_mask=land_mask)


if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: simulation.py
================================================
#!/usr/bin/python3

# Semi-phisically-based hydraulic erosion simulation. Code is inspired by the 
# code found here:
#   http://ranmantaru.com/blog/2011/10/08/water-erosion-on-heightmap-terrain/
# With some theoretical inspiration from here:
#   https://hal.inria.fr/inria-00402079/document

import numpy as np
import scipy as sp
import matplotlib.pyplot as plt
import os
import sys
import util


# Smooths out slopes of `terrain` that are too steep. Rough approximation of the
# phenomenon described here: https://en.wikipedia.org/wiki/Angle_of_repose
def apply_slippage(terrain, repose_slope, cell_width):
  delta = util.simple_gradient(terrain) / cell_width
  smoothed = util.gaussian_blur(terrain, sigma=1.5)
  should_smooth = np.abs(delta) > repose_slope
  result = np.select([np.abs(delta) > repose_slope], [smoothed], terrain)
  return result


def main(argv):
  # Grid dimension constants
  full_width = 200
  dim = 512
  shape = [dim] * 2
  cell_width = full_width / dim
  cell_area = cell_width ** 2

  # Snapshotting parameters. Only needed for generating the simulation
  # timelapse.
  enable_snapshotting = False
  my_dir = os.path.dirname(argv[0])
  snapshot_dir = os.path.join(my_dir, 'sim_snaps')
  snapshot_file_template = 'sim-%05d.png'
  if enable_snapshotting:
    try: os.mkdir(snapshot_dir)
    except: pass

  # Water-related constants
  rain_rate = 0.0008 * cell_area
  evaporation_rate = 0.0005

  # Slope constants
  min_height_delta = 0.05
  repose_slope = 0.03
  gravity = 30.0
  gradient_sigma = 0.5

  # Sediment constants
  sediment_capacity_constant = 50.0
  dissolving_rate = 0.25
  deposition_rate = 0.001

  # The numer of iterations is proportional to the grid dimension. This is to 
  # allow changes on one side of the grid to affect the other side.
  iterations = int(1.4 * dim)

  # `terrain` represents the actual terrain height we're interested in
  terrain = util.fbm(shape, -2.0)

  # `sediment` is the amount of suspended "dirt" in the water. Terrain will be
  # transfered to/from sediment depending on a number of different factors.
  sediment = np.zeros_like(terrain)

  # The amount of water. Responsible for carrying sediment.
  water = np.zeros_like(terrain)

  # The water velocity.
  velocity = np.zeros_like(terrain)

  for i in range(0, iterations):
    print('%d / %d' % (i + 1, iterations))

    # Add precipitation. This is done by via simple uniform random distribution,
    # although other models use a raindrop model
    water += np.random.rand(*shape) * rain_rate

    # Compute the normalized gradient of the terrain height to determine where 
    # water and sediment will be moving.
    gradient = np.zeros_like(terrain, dtype='complex')
    gradient = util.simple_gradient(terrain)
    gradient = np.select([np.abs(gradient) < 1e-10],
                             [np.exp(2j * np.pi * np.random.rand(*shape))],
                             gradient)
    gradient /= np.abs(gradient)

    # Compute the difference between teh current height the height offset by
    # `gradient`.
    neighbor_height = util.sample(terrain, -gradient)
    height_delta = terrain - neighbor_height
    
    # The sediment capacity represents how much sediment can be suspended in
    # water. If the sediment exceeds the quantity, then it is deposited,
    # otherwise terrain is eroded.
    sediment_capacity = (
        (np.maximum(height_delta, min_height_delta) / cell_width) * velocity *
        water * sediment_capacity_constant)
    deposited_sediment = np.select(
        [
          height_delta < 0, 
          sediment > sediment_capacity,
        ], [
          np.minimum(height_delta, sediment),
          deposition_rate * (sediment - sediment_capacity),
        ],
        # If sediment <= sediment_capacity
        dissolving_rate * (sediment - sediment_capacity))

    # Don't erode more sediment than the current terrain height.
    deposited_sediment = np.maximum(-height_delta, deposited_sediment)

    # Update terrain and sediment quantities.
    sediment -= deposited_sediment
    terrain += deposited_sediment
    sediment = util.displace(sediment, gradient)
    water = util.displace(water, gradient)

    # Smooth out steep slopes.
    terrain = apply_slippage(terrain, repose_slope, cell_width)

    # Update velocity
    velocity = gravity * height_delta / cell_width
  
    # Apply evaporation
    water *= 1 - evaporation_rate

    # Snapshot, if applicable.
    if enable_snapshotting:
      output_path = os.path.join(snapshot_dir, snapshot_file_template % i)
      util.save_as_png(terrain, output_path)


  np.save('simulation', util.normalize(terrain))

  
if __name__ == '__main__':
  main(sys.argv)


================================================
FILE: util.py
================================================
# Various common functions.

from PIL import Image
import collections
import csv
from matplotlib.colors import LightSource, LinearSegmentedColormap
import matplotlib.pyplot as plt
import numpy as np
import scipy as sp
import scipy.spatial


# Open CSV file as a dict.
def read_csv(csv_path):
  with open(csv_path, 'r') as csv_file:
    return list(csv.DictReader(csv_file))


# Renormalizes the values of `x` to `bounds`
def normalize(x, bounds=(0, 1)):
  return np.interp(x, (x.min(), x.max()), bounds)


# Fourier-based power law noise with frequency bounds.
def fbm(shape, p, lower=-np.inf, upper=np.inf):
  freqs = tuple(np.fft.fftfreq(n, d=1.0 / n) for n in shape)
  freq_radial = np.hypot(*np.meshgrid(*freqs))
  envelope = (np.power(freq_radial, p, where=freq_radial!=0) *
              (freq_radial > lower) * (freq_radial < upper))
  envelope[0][0] = 0.0
  phase_noise = np.exp(2j * np.pi * np.random.rand(*shape))
  return normalize(np.real(np.fft.ifft2(np.fft.fft2(phase_noise) * envelope)))


# Returns each value of `a` with coordinates offset by `offset` (via complex 
# values). The values at the new coordiantes are the linear interpolation of
# neighboring values in `a`.
def sample(a, offset):
  shape = np.array(a.shape)
  delta = np.array((offset.real, offset.imag))
  coords = np.array(np.meshgrid(*map(range, shape))) - delta

  lower_coords = np.floor(coords).astype(int)
  upper_coords = lower_coords + 1
  coord_offsets = coords - lower_coords 
  lower_coords %= shape[:, np.newaxis, np.newaxis]
  upper_coords %= shape[:, np.newaxis, np.newaxis]

  result = lerp(lerp(a[lower_coords[1], lower_coords[0]],
                     a[lower_coords[1], upper_coords[0]],
                     coord_offsets[0]),
                lerp(a[upper_coords[1], lower_coords[0]],
                     a[upper_coords[1], upper_coords[0]],
                     coord_offsets[0]),
                coord_offsets[1])
  return result


# Takes each value of `a` and offsets them by `delta`. Treats each grid point
# like a unit square.
def displace(a, delta):
  fns = {
      -1: lambda x: -x,
      0: lambda x: 1 - np.abs(x),
      1: lambda x: x,
  }
  result = np.zeros_like(a)
  for dx in range(-1, 2):
    wx = np.maximum(fns[dx](delta.real), 0.0)
    for dy in range(-1, 2):
      wy = np.maximum(fns[dy](delta.imag), 0.0)
      result += np.roll(np.roll(wx * wy * a, dy, axis=0), dx, axis=1)

  return result


# Returns the gradient of the gaussian blur of `a` encoded as a complex number. 
def gaussian_gradient(a, sigma=1.0):
  [fy, fx] = np.meshgrid(*(np.fft.fftfreq(n, 1.0 / n) for n in a.shape))
  sigma2 = sigma**2
  g = lambda x: ((2 * np.pi * sigma2) ** -0.5) * np.exp(-0.5 * (x / sigma)**2)
  dg = lambda x: g(x) * (x / sigma2)

  fa = np.fft.fft2(a)
  dy = np.fft.ifft2(np.fft.fft2(dg(fy) * g(fx)) * fa).real
  dx = np.fft.ifft2(np.fft.fft2(g(fy) * dg(fx)) * fa).real
  return 1j * dx + dy


# Simple gradient by taking the diff of each cell's horizontal and vertical
# neighbors.
def simple_gradient(a):
  dx = 0.5 * (np.roll(a, 1, axis=0) - np.roll(a, -1, axis=0))
  dy = 0.5 * (np.roll(a, 1, axis=1) - np.roll(a, -1, axis=1))
  return 1j * dx + dy


# Loads the terrain height array (and optionally the land mask from the given 
# file.
def load_from_file(path):
  result = np.load(path)
  if type(result) == np.lib.npyio.NpzFile:
    return (result['height'], result['land_mask'])
  else:
    return (result, None)


# Saves the array as a PNG image. Assumes all input values are [0, 1]
def save_as_png(a, path):
  image = Image.fromarray(np.round(a * 255).astype('uint8'))
  image.save(path)


# Creates a hillshaded RGB array of heightmap `a`.
_TERRAIN_CMAP = LinearSegmentedColormap.from_list('my_terrain', [
    (0.00, (0.15, 0.3, 0.15)),
    (0.25, (0.3, 0.45, 0.3)),
    (0.50, (0.5, 0.5, 0.35)),
    (0.80, (0.4, 0.36, 0.33)),
    (1.00, (1.0, 1.0, 1.0)),
])
def hillshaded(a, land_mask=None, angle=270):
  if land_mask is None: land_mask = np.ones_like(a)
  ls = LightSource(azdeg=angle, altdeg=30)
  land = ls.shade(a, cmap=_TERRAIN_CMAP, vert_exag=10.0,
                  blend_mode='overlay')[:, :, :3]
  water = np.tile((0.25, 0.35, 0.55), a.shape + (1,))
  return lerp(water, land, land_mask[:, :, np.newaxis])


# Linear interpolation of `x` to `y` with respect to `a`
def lerp(x, y, a): return (1.0 - a) * x + a * y


# Returns a list of grid coordinates for every (x, y) position bounded by
# `shape`
def make_grid_points(shape):
  [Y, X] = np.meshgrid(np.arange(shape[0]), np.arange(shape[1])) 
  grid_points = np.column_stack([X.flatten(), Y.flatten()])
  return grid_points


# Returns a list of points sampled within the bounds of `shape` and with a
# minimum spacing of `radius`.
# NOTE: This function is fairly slow, given that it is implemented with almost
# no array operations.
def poisson_disc_sampling(shape, radius, retries=16):
  grid = {}
  points = []

  # The bounds of `shape` are divided into a grid of cells, each of which can
  # contain a maximum of one point.
  cell_size = radius / np.sqrt(2)
  cells = np.ceil(np.divide(shape, cell_size)).astype(int)
  offsets = [(0, 0), (0, -1), (0, 1), (-1, 0), (1, 0), (-1, -1), (-1, 1),
             (1, -1), (1, 1), (-2, 0), (2, 0), (0, -2), (0, 2)]
  to_cell = lambda p: (p / cell_size).astype('int')

  # Returns true if there is a point within `radius` of `p`.
  def has_neighbors_in_radius(p):
    cell = to_cell(p)
    for offset in offsets:
      cell_neighbor = (cell[0] + offset[0], cell[1] + offset[1])
      if cell_neighbor in grid:
        p2 = grid[cell_neighbor]
        diff = np.subtract(p2, p)
        if np.dot(diff, diff) <= radius * radius:
          return True
    return False      

  # Adds point `p` to the cell grid.
  def add_point(p):
    grid[tuple(to_cell(p))] = p
    q.append(p)
    points.append(p)

  q = collections.deque()
  first = shape * np.random.rand(2)
  add_point(first)
  while len(q) > 0:
    point = q.pop()

    # Make `retries` attemps to find a point within [radius, 2 * radius] from
    # `point`.
    for _ in range(retries):
      diff = 2 * radius * (2 * np.random.rand(2) - 1)
      r2 = np.dot(diff, diff)
      new_point = diff + point
      if (new_point[0] >= 0 and new_point[0] < shape[0] and
          new_point[1] >= 0 and new_point[1] < shape[1] and 
          not has_neighbors_in_radius(new_point) and
          r2 > radius * radius and r2 < 4 * radius * radius):
        add_point(new_point)
  num_points = len(points)

  # Return points list as a numpy array.
  return np.concatenate(points).reshape((num_points, 2))


# Returns an array in which all True values of `mask` contain the distance to
# the nearest False value.
def dist_to_mask(mask):
  border_mask = (np.maximum.reduce([
      np.roll(mask, 1, axis=0), np.roll(mask, -1, axis=0),
      np.roll(mask, -1, axis=1), np.roll(mask, 1, axis=1)]) * (1 - mask))
  border_points = np.column_stack(np.where(border_mask > 0))

  kdtree = sp.spatial.cKDTree(border_points)
  grid_points = make_grid_points(mask.shape)

  return kdtree.query(grid_points)[0].reshape(mask.shape)


# Generates worley noise with points separated by `spacing`.
def worley(shape, spacing):
  points = poisson_disc_sampling(shape, spacing)
  coords = np.floor(points).astype(int)
  mask = np.zeros(shape, dtype=bool)
  mask[coords[:, 0], coords[:, 1]] = True
  return normalize(dist_to_mask(mask))


# Peforms a gaussian blur of `a`.
def gaussian_blur(a, sigma=1.0):
  freqs = tuple(np.fft.fftfreq(n, d=1.0 / n) for n in a.shape)
  freq_radial = np.hypot(*np.meshgrid(*freqs))
  sigma2 = sigma**2
  g = lambda x: ((2 * np.pi * sigma2) ** -0.5) * np.exp(-0.5 * (x / sigma)**2)
  kernel = g(freq_radial)
  kernel /= kernel.sum()
  return np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(kernel)).real

Download .txt

gitextract_5jiu_71x/

├── .gitignore
├── LICENSE.txt
├── README.md
├── domain_warping.py
├── download_ned_zips.py
├── extract_height_arrays.py
├── generate_ml_output.py
├── generate_training_images.py
├── make_grayscale_image.py
├── make_hillshaded_image.py
├── plain_old_fbm.py
├── requirements-pip3.txt
├── ridge_noise.py
├── river_network.py
├── simulation.py
└── util.py

Download .txt

SYMBOL INDEX (41 symbols across 12 files)

FILE: domain_warping.py
  function main (line 10) | def main(argv):

FILE: download_ned_zips.py
  function get_previously_downloaded_ids (line 18) | def get_previously_downloaded_ids(dir_path):
  function download_file (line 26) | def download_file(src_id, url, output_dir, tmp_dir):
  function main (line 43) | def main(argv):

FILE: extract_height_arrays.py
  function get_img_array_from_zip (line 20) | def get_img_array_from_zip(zip_file, img_name):
  function main (line 31) | def main(argv):

FILE: generate_ml_output.py
  function main (line 12) | def main(argv):

FILE: generate_training_images.py
  function clean_sample (line 16) | def clean_sample(sample):
  function get_variants (line 43) | def get_variants(a):
  function main (line 49) | def main(argv):

FILE: make_grayscale_image.py
  function main (line 9) | def main(argv):

FILE: make_hillshaded_image.py
  function main (line 9) | def main(argv):

FILE: plain_old_fbm.py
  function main (line 10) | def main(argv):

FILE: ridge_noise.py
  function noise_octave (line 9) | def noise_octave(shape, f):
  function main (line 12) | def main(argv):

FILE: river_network.py
  function min_index (line 17) | def min_index(a): return a.index(min(a))
  function bump (line 22) | def bump(shape, sigma):
  function compute_height (line 30) | def compute_height(points, neighbors, deltas, get_delta_fn=None):
  function compute_final_height (line 53) | def compute_final_height(points, neighbors, deltas, volume, upstream,
  function compute_river_network (line 84) | def compute_river_network(points, neighbors, heights, land,
  function render_triangulation (line 158) | def render_triangulation(shape, tri, values):
  function remove_lakes (line 167) | def remove_lakes(mask):
  function main (line 175) | def main(argv):

FILE: simulation.py
  function apply_slippage (line 19) | def apply_slippage(terrain, repose_slope, cell_width):
  function main (line 27) | def main(argv):

FILE: util.py
  function read_csv (line 14) | def read_csv(csv_path):
  function normalize (line 20) | def normalize(x, bounds=(0, 1)):
  function fbm (line 25) | def fbm(shape, p, lower=-np.inf, upper=np.inf):
  function sample (line 38) | def sample(a, offset):
  function displace (line 61) | def displace(a, delta):
  function gaussian_gradient (line 78) | def gaussian_gradient(a, sigma=1.0):
  function simple_gradient (line 92) | def simple_gradient(a):
  function load_from_file (line 100) | def load_from_file(path):
  function save_as_png (line 109) | def save_as_png(a, path):
  function hillshaded (line 122) | def hillshaded(a, land_mask=None, angle=270):
  function lerp (line 132) | def lerp(x, y, a): return (1.0 - a) * x + a * y
  function make_grid_points (line 137) | def make_grid_points(shape):
  function poisson_disc_sampling (line 147) | def poisson_disc_sampling(shape, radius, retries=16):
  function dist_to_mask (line 202) | def dist_to_mask(mask):
  function worley (line 215) | def worley(shape, spacing):
  function gaussian_blur (line 224) | def gaussian_blur(a, sigma=1.0):

Download .json

Condensed preview — 16 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (62K chars).

[
  {
    "path": ".gitignore",
    "chars": 225,
    "preview": "# Python-related files\n__pycache__/\n*.py[cod]\n*$py.class\n\n# Training data files.\nzip_files/\narray_files/\ntraining_images"
  },
  {
    "path": "LICENSE.txt",
    "chars": 1071,
    "preview": "MIT License\n\nCopyright (c) 2018 Daniel Andrino\n\nPermission is hereby granted, free of charge, to any person obtaining a "
  },
  {
    "path": "README.md",
    "chars": 25731,
    "preview": "# Three Ways of Generating Terrain with Erosion Features\n\n## Background\n\nTerrain generation has long been a popular topi"
  },
  {
    "path": "domain_warping.py",
    "chars": 410,
    "preview": "#!/usr/bin/python3\n\n# A simple domain warping example.\n\nimport numpy as np\nimport sys\nimport util\n\n\ndef main(argv):\n  sh"
  },
  {
    "path": "download_ned_zips.py",
    "chars": 2284,
    "preview": "#!/usr/bin/python3\n\n# Script to hep in downloading files from USGS. The input file is a CSV\n# of 3DEP IMG resources down"
  },
  {
    "path": "extract_height_arrays.py",
    "chars": 2687,
    "preview": "#!/usr/bin/python3\n\n# Extracts the underlying heightmap in each file in zip_files/ and writes the \n# numpy array to arra"
  },
  {
    "path": "generate_ml_output.py",
    "chars": 1530,
    "preview": "#!/usrb/bin/python3\n\n# Prouces terrain samples from the trained generator network.\n\nimport numpy as np\nimport os\nimport "
  },
  {
    "path": "generate_training_images.py",
    "chars": 3857,
    "preview": "#!/usr/bin/python\n\n# Reads the Numpy arrays in array_files/ and generates images for use in\n# training. Please note that"
  },
  {
    "path": "make_grayscale_image.py",
    "chars": 419,
    "preview": "#!/usr/bin/python3\n\n# Genreates a PNG containing the terrain height in grayscale.\n\nimport util\nimport sys\n\n\ndef main(arg"
  },
  {
    "path": "make_hillshaded_image.py",
    "chars": 476,
    "preview": "#!/usr/bin/python3\n\n# Genreates a PNG containing a hillshaded version of the terrain height.\n\nimport util\nimport sys\n\n\nd"
  },
  {
    "path": "plain_old_fbm.py",
    "chars": 232,
    "preview": "#!/usr/bin/python3\n\n# A demo of just regular FBM noise\n\nimport numpy as np\nimport sys\nimport util\n\n\ndef main(argv):\n  sh"
  },
  {
    "path": "requirements-pip3.txt",
    "chars": 439,
    "preview": "# GDAL is a bit janky, so you may have to compile and install yourself.\n# See https://trac.osgeo.org/gdal/wiki/DownloadS"
  },
  {
    "path": "ridge_noise.py",
    "chars": 452,
    "preview": "#!/usr/bin/python3\n\n# A demo of ridge noise.\n\nimport numpy as np\nimport sys\nimport util\n\ndef noise_octave(shape, f):\n  r"
  },
  {
    "path": "river_network.py",
    "chars": 7782,
    "preview": "#!/usr/bin/python3\n\nimport collections\nimport heapq\nimport numpy as np\nimport matplotlib\nimport matplotlib.collections a"
  },
  {
    "path": "simulation.py",
    "chars": 4687,
    "preview": "#!/usr/bin/python3\n\n# Semi-phisically-based hydraulic erosion simulation. Code is inspired by the \n# code found here:\n# "
  },
  {
    "path": "util.py",
    "chars": 7791,
    "preview": "# Various common functions.\n\nfrom PIL import Image\nimport collections\nimport csv\nfrom matplotlib.colors import LightSour"
  }
]

About this extraction

This page contains the full source code of the dandrino/terrain-erosion-3-ways GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 16 files (58.7 KB), approximately 15.6k tokens, and a symbol index with 41 extracted functions, classes, methods, constants, and types. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo