Full Code of mapbox/dynamodb-replicator for AI

Repository: mapbox/dynamodb-replicator
Branch: master
Commit: 06835850d08e
Files: 38
Total size: 114.8 KB

Directory structure:
gitextract_geww9p6m/

├── .eslintrc
├── .gitignore
├── CODEOWNERS
├── DESIGN.md
├── LICENSE.txt
├── README.md
├── backup.js
├── bin/
│   ├── backup-table.js
│   ├── diff-record.js
│   ├── diff-tables.js
│   ├── incremental-backfill.js
│   ├── incremental-backup-record.js
│   ├── incremental-diff-record.js
│   ├── incremental-record-history.js
│   ├── incremental-snapshot.js
│   └── replicate-record.js
├── cloudformation/
│   └── travis.template
├── diff.js
├── fastlog.js
├── index.js
├── package.json
├── parse-location.js
├── s3-backfill.js
├── s3-snapshot.js
└── test/
    ├── backup.test.js
    ├── diff-record.test.js
    ├── diff.test.js
    ├── fixtures/
    │   ├── events/
    │   │   ├── adjust-many.json
    │   │   ├── insert-buffer.json
    │   │   ├── insert-modify-delete.json
    │   │   ├── insert-modify.json
    │   │   └── insert.json
    │   ├── records.js
    │   └── table.js
    ├── incremental.test.js
    ├── index.test.js
    ├── live-test.backup-table.js
    └── table.json

================================================
FILE CONTENTS
================================================

================================================
FILE: .eslintrc
================================================
{
    "rules": {
        "indent": [2, 4],
        "quotes": [2, "single"],
        "no-console": [0]
    },
    "env": {
        "node": true
    },
    "globals": {
        "process": true,
        "module": true,
        "require": true
    },
    "extends": "eslint:recommended"
}


================================================
FILE: .gitignore
================================================
node_modules
.DS_Store


================================================
FILE: CODEOWNERS
================================================
# global owners
*       @mapbox/cloud-platform


================================================
FILE: DESIGN.md
================================================
# Design

## Replication

This replication system is built such that there is a **primary** table and a **replica** table. All writes are performed against the primary table, and changes made to the primary table are pushed to the replica table. Reads can be performed against either the primary or replica table.

[Dyno](https://github.com/mapbox/dyno), the client that we use for interactions with DynamoDB, can be [configured to read from one table and write to another](https://github.com/mapbox/dyno#multi--kinesisconfig). The primary table has a DynamoDB stream associated with it, while the replica table does not need one.

Replication is performed via an [AWS Lambda function](https://github.com/mapbox/dynamodb-replicator/blob/master/index.js) which reads from the primary table's DynamoDB stream and duplicates changes onto the replica table.

```
 us-east-1                         eu-west-1
-----------                       -----------
|   api   |                       |   api   |
-----------                       -----------
  ↑     |                           |     ↑
  |     |                           |     |
  |     ↓                           |     |
  |   writes ←----------------------+     |
  |   |                                   |
reads |                                 reads
  |   |  +~~~~ DynamoDB stream ~~~~+      |
  |   |  |                         |      |
  |   |  |                       writes   |
  |   |  |                         |      |
  |   ↓  ↑                         ↓      |
-----------                       -----------
|         |                       |         |
| primary |                       | replica |
|  table  |                       |  table  |
|         |                       |         |
-----------                       -----------
```
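
As a rough illustration of the replication step, the sketch below maps the records of a DynamoDB stream event to the writes that would be applied to the replica table. It is not the code in `index.js`: the function name `toReplicaWrites` and the shape of its return value are assumptions made for this example, and it assumes the stream is configured to include each item's new image.

```javascript
// Hypothetical helper, not part of dynamodb-replicator: turn a DynamoDB
// stream event into a list of writes to apply to the replica table.
function toReplicaWrites(event) {
    return event.Records.map(function(record) {
        if (record.eventName === 'REMOVE') {
            // The item was deleted from the primary: delete it by key.
            return { type: 'delete', key: record.dynamodb.Keys };
        }
        // INSERT and MODIFY events both carry the item's new state.
        return { type: 'put', item: record.dynamodb.NewImage };
    });
}

var writes = toReplicaWrites({
    Records: [
        {
            eventName: 'INSERT',
            dynamodb: {
                Keys: { id: { S: 'abc' } },
                NewImage: { id: { S: 'abc' }, value: { N: '1' } }
            }
        },
        {
            eventName: 'REMOVE',
            dynamodb: { Keys: { id: { S: 'xyz' } } }
        }
    ]
});

console.log(JSON.stringify(writes, null, 2));
```

Treating `INSERT` and `MODIFY` identically keeps the replay idempotent: writing the full new image twice leaves the replica in the same state.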

## Consistency and repair

The [diff-tables script provided by dynamodb-replicator](https://github.com/mapbox/dynamodb-replicator/blob/master/bin/diff-tables.js) scans the primary table, and performs record-by-record checks that the replica table's data is up-to-date. If it encounters any discrepancies, the data in the replica table is updated to match the primary table. This script can also be used to backfill a brand new replica table.

```
        primary                                     replica
------------------------                    ------------------------
| hash | range | value |                    | hash | range | value |
------------------------ --+                ------------------------
|  10  |   1   |   1   |   |  get set   --→ |  10  |   1   |   1   | ✔
------------------------   | of primary     ------------------------
|  10  |   2   |   7   |   |  records   --→ |  10  |   2   |   7   | ✔
------------------------   |                ------------------------
|  11  |   1   |   3   |   | check each --→ |  11  |   1   |   3   | ✔
------------------------   | individual     ------------------------
|  12  |   1   |   4   |   | in replica --→ |  12  |   1   |   8   | ✘ Repair this!
------------------------ --+                ------------------------

                            ...repeat...

------------------------ --+                ------------------------
|  13  |   1   |   3   |   |  get set   --→ |  13  |   1   |   3   | ✔
------------------------   | of primary     ------------------------
|  13  |   2   |   5   |   |  records   --→ |  13  |   2   |   7   | ✘ Repair this!
------------------------   |                ------------------------
|  14  |   1   |   4   |   | check each --→ |  14  |   1   |   4   | ✔
------------------------   | individual     ------------------------
|  15  |   1   |   6   |   | in replica --→ |  15  |   1   |   6   | ✔
------------------------ --+                ------------------------
```
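
The record-by-record check in the diagram can be sketched in memory as follows. This is only an illustration under simplifying assumptions: `findDiscrepancies` is a hypothetical name, the replica is modelled as a plain object keyed by `hash/range`, and no DynamoDB calls are made. The real script reads from both tables.

```javascript
var util = require('util');

// Hypothetical in-memory version of the consistency check. `replica` is a
// lookup from "hash/range" to the replica's copy of each record.
function findDiscrepancies(primaryRecords, replica) {
    return primaryRecords.filter(function(record) {
        var copy = replica[record.hash + '/' + record.range];
        // A missing or different copy means the replica needs repair.
        return !util.isDeepStrictEqual(copy, record);
    });
}

var primaryRecords = [
    { hash: 10, range: 1, value: 1 },
    { hash: 10, range: 2, value: 7 },
    { hash: 11, range: 1, value: 3 },
    { hash: 12, range: 1, value: 4 }
];

var replica = {
    '10/1': { hash: 10, range: 1, value: 1 },
    '10/2': { hash: 10, range: 2, value: 7 },
    '11/1': { hash: 11, range: 1, value: 3 },
    '12/1': { hash: 12, range: 1, value: 8 }
};

// Lists the primary's version of every record that must be repaired.
console.log(findDiscrepancies(primaryRecords, replica));
```

With the first batch from the diagram, only the record with hash `12` is reported, which matches the row marked "Repair this!".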

This gives us a system where changes to the primary table are rapidly implemented in the replica tables via DynamoDB stream + Lambda **replication**. The **consistency check** system gives us additional certainty that replication is doing its job, and provides a system by which we can recover a database in one region by reading the data out of a table in another region.

## Backup and Restore

Backups of DynamoDB tables are managed by the lambda [dynamodb-backups](https://github.com/mapbox/dynamodb-backups).


================================================
FILE: LICENSE.txt
================================================

ISC License

Copyright (c) 2017, Mapbox

Permission to use, copy, modify, and/or distribute this software for any
purpose with or without fee is hereby granted, provided that the above
copyright notice and this permission notice appear in all copies.

THE SOFTWARE IS PROVIDED "AS IS" AND THE AUTHOR DISCLAIMS ALL WARRANTIES
WITH REGARD TO THIS SOFTWARE INCLUDING ALL IMPLIED WARRANTIES OF
MERCHANTABILITY AND FITNESS. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR
ANY SPECIAL, DIRECT, INDIRECT, OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT OF
OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


================================================
FILE: README.md
================================================
# dynamodb-replicator

[dynamodb-replicator](https://github.com/mapbox/dynamodb-replicator) offers several different mechanisms to manage redundancy and recoverability on [DynamoDB](http://aws.amazon.com/documentation/dynamodb) tables.

- A **replicator** function that processes events from a DynamoDB stream, replaying changes made to the primary table onto a replica table. The function is designed to be run as an [AWS Lambda function](http://aws.amazon.com/documentation/lambda/).
- An **incremental backup** function that processes events from a DynamoDB stream, replaying them as writes to individual objects on S3. The function is designed to be run as an [AWS Lambda function](http://aws.amazon.com/documentation/lambda/).
- A **consistency check** script that scans the primary table and checks that each individual record in the replica table is up-to-date. The goal is to double-check that the replicator is performing as it should, and that the two tables are completely consistent.
- A **table dump** script that scans a single table, and writes the data to a file on S3, providing a snapshot of the table's state.
- A **snapshot** script that scans an S3 folder where incremental backups have been made, and writes the aggregate to a file on S3, providing a snapshot of the backup's state.

## Design

Managing table redundancy and backups involves many moving parts. Please read [DESIGN.md](https://github.com/mapbox/dynamodb-replicator/blob/master/DESIGN.md) for an in-depth explanation.

## Utility scripts

[dynamodb-replicator](https://github.com/mapbox/dynamodb-replicator) provides several CLI tools to help manage your DynamoDB table.

### diff-record

Given two tables and an item's key, this script looks up the record in both tables and checks for consistency.

```
$ npm install -g dynamodb-replicator
$ diff-record --help

Usage: diff-record <primary region/table> <replica region/table> <key>

# Check for discrepancies between an item in two tables
$ diff-record us-east-1/primary eu-west-1/replica '{"id":"abc"}'
```
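
The `region/table` arguments are split into a client configuration. The sketch below mirrors the logic in `bin/diff-record.js`, including its special case for a region named `local`; the function name `parseTableArg` is an assumption made for this example.

```javascript
// Sketch of how a `region/table` argument becomes a client configuration.
// `parseTableArg` is a name chosen for this example only.
function parseTableArg(arg) {
    var parts = arg.split('/');
    var config = { region: parts[0], table: parts[1] };

    if (config.region === 'local') {
        // Point at a local DynamoDB-compatible endpoint with placeholder
        // credentials, as the script does for testing.
        config.accessKeyId = 'fake';
        config.secretAccessKey = 'fake';
        config.endpoint = 'http://localhost:4567';
    }

    return config;
}

console.log(parseTableArg('us-east-1/primary'));
console.log(parseTableArg('local/primary'));
```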

### diff-tables

Given two tables and a set of options, performs a complete consistency check on the two, optionally repairing records in the replica table that differ from the primary.

```
$ npm install -g dynamodb-replicator
$ diff-tables --help

Usage: diff-tables primary-region/primary-table replica-region/replica-table

Options:
  --repair     perform actions to fix discrepancies in the replica table
  --segment    segment identifier (0-based)
  --segments   total number of segments
  --backfill   only scan primary table and write to replica

# Log information about discrepancies between the two tables
$ diff-tables us-east-1/primary eu-west-2/replica

# Repair the replica to match the primary
$ diff-tables us-east-1/primary eu-west-2/replica --repair

# Only backfill the replica. Useful for starting a new replica
$ diff-tables us-east-1/primary eu-west-2/new-replica --backfill --repair

# Perform one segment of a parallel scan
$ diff-tables us-east-1/primary eu-west-2/replica --repair --segment 0 --segments 10
```

### replicate-record

Given two tables and an item's key, this script ensures that the replica record is synchronized with its current state in the primary table.

```
$ npm install -g dynamodb-replicator
$ replicate-record --help

Usage: replicate-record <primary tableinfo> <replica tableinfo> <recordkey>
 - primary tableinfo: the primary table to replicate from, specified as `region/tablename`
 - replica tableinfo: the replica table to replicate to, specified as `region/tablename`
 - recordkey: the key for the record specified as a JSON object

# Copy the state of a record from the primary to the replica table
$ replicate-record us-east-1/primary eu-west-1/replica '{"id":"abc"}'
```

### backup-table

Scans a table and dumps the entire set of records as a line-delimited JSON file on S3.

```
$ npm install -g dynamodb-replicator
$ backup-table --help

Usage: backup-table region/table s3url

Options:
  --jobid      assign a jobid to this backup
  --segment    segment identifier (0-based)
  --segments   total number of segments
  --metric     cloudwatch metric namespace. Will provide dimension TableName = the name of the backed-up table.

# Writes a backup file to s3://my-bucket/some-prefix/<random string>/0
$ backup-table us-east-1/primary s3://my-bucket/some-prefix

# Specifying a jobid guarantees the S3 location
# Writes a backup file to s3://my-bucket/some-prefix/my-job-id/0
$ backup-table us-east-1/primary s3://my-bucket/some-prefix --jobid my-job-id

# Perform one segment of a parallel backup
# Writes a backup file to s3://my-bucket/some-prefix/my-job-id/4
$ backup-table us-east-1/primary s3://my-bucket/some-prefix --jobid my-job-id --segment 4 --segments 10
```
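
The S3 locations shown above follow from how `backup.js` joins the prefix, job id and segment index. A minimal sketch of that rule, with `backupKey` as a hypothetical helper name:

```javascript
// Sketch of the S3 key layout used for table backups: prefix/jobid/segment.
// `backupKey` is a name chosen for this example only.
function backupKey(prefix, jobid, segment) {
    // Without a numeric segment, the backup is written as segment 0.
    var index = !isNaN(parseInt(segment, 10)) ? segment.toString() : 0;
    return [prefix, jobid, index].join('/');
}

console.log(backupKey('some-prefix', 'my-job-id'));    // some-prefix/my-job-id/0
console.log(backupKey('some-prefix', 'my-job-id', 4)); // some-prefix/my-job-id/4
```

Because every segment of a parallel backup shares the same job id, all of its files land under one S3 folder.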

### incremental-backfill

Scans a table and dumps each individual record as an object to a folder on S3.

```
$ npm install -g dynamodb-replicator
$ incremental-backfill --help

Usage: incremental-backfill region/table s3url

# Write each item in the table to S3. `s3url` should provide any desired bucket/prefix.
# The name of the table will be appended to the s3 prefix that you provide.
$ incremental-backfill us-east-1/primary s3://dynamodb-backups/incremental
```

### incremental-snapshot

Reads each item in an S3 folder representing an incremental table backup, and writes an aggregate line-delimited JSON file to S3.

```
$ npm install -g dynamodb-replicator
$ incremental-snapshot --help

Usage: incremental-snapshot <source> <dest>

Options:
  --metric     cloudwatch metric region/namespace/tablename. Will provide dimension TableName = the tablename.

# Aggregate all the items in an S3 folder into a single snapshot file
$ incremental-snapshot s3://dynamodb-backups/incremental/primary s3://dynamodb-backups/snapshots/primary
```

### incremental-diff-record

Checks for consistency between a DynamoDB record and its backed-up version on S3.

```
$ npm install -g dynamodb-replicator
$ incremental-diff-record --help

Usage: incremental-diff-record <tableinfo> <s3url> <recordkey>
 - tableinfo: the table where the record lives, specified as `region/tablename`
 - s3url: s3 folder where the incremental backups live
 - recordkey: the key for the record specified as a JSON object

# Check that a record is up-to-date in the incremental backup
$ incremental-diff-record us-east-1/primary s3://dynamodb-backups/incremental '{"id":"abc"}'
```
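
To find a record's backup, the script derives an S3 key from the table name and a SHA-256 hash of the record's key. The sketch below shows the idea under one important assumption: `JSON.stringify` stands in for `Dyno.serialize`, so the hashes it produces will not match those of real backups. The name `incrementalKey` is also hypothetical.

```javascript
var crypto = require('crypto');

// Sketch of how a record key maps to an S3 object key.
// `incrementalKey` is a name chosen for this example only.
function incrementalKey(prefix, table, key) {
    // Sort attribute names so the same key always hashes identically,
    // regardless of the order in which its attributes were typed.
    var sorted = Object.keys(key).sort().reduce(function(obj, attr) {
        obj[attr] = key[attr];
        return obj;
    }, {});

    // Assumption: JSON.stringify replaces Dyno.serialize in this sketch.
    var hash = crypto.createHash('sha256')
        .update(JSON.stringify(sorted))
        .digest('hex');

    return [prefix, table, hash].join('/');
}

console.log(incrementalKey('incremental', 'primary', { id: 'abc' }));
```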

### incremental-backup-record

Copies a DynamoDB record's present state to an incremental backup folder on S3.

```
$ npm install -g dynamodb-replicator
$ incremental-backup-record --help

Usage: incremental-backup-record <tableinfo> <s3url> <recordkey>
 - tableinfo: the table to back up from, specified as `region/tablename`
 - s3url: s3 folder into which the record should be backed up
 - recordkey: the key for the record specified as a JSON object

# Backup a single record to S3
$ incremental-backup-record us-east-1/primary s3://dynamodb-backups/incremental '{"id":"abc"}'
```

### incremental-record-history

Prints each version of a record that is available in an incremental backup folder on S3.

```
$ incremental-record-history --help

Usage: incremental-record-history <tableinfo> <s3url> <recordkey>
 - tableinfo: the table where the record lives, specified as `region/tablename`
 - s3url: s3 folder where the incremental backups live. Table name will be appended
 - recordkey: the key for the record specified as a JSON object

# Read the history of a single record
$ incremental-record-history us-east-1/my-table s3://dynamodb-backups/incremental '{"id":"abc"}'
```


================================================
FILE: backup.js
================================================
var AWS = require('aws-sdk');
var Dyno = require('@mapbox/dyno');
var stream = require('stream');
var zlib = require('zlib');

module.exports = function(config, done) {
    var primary = Dyno(config);
    var s3 = new AWS.S3();

    var log = config.log || console.log;

    var scanOpts = Object.prototype.hasOwnProperty.call(config, 'segment') && config.segments ?
        { Segment: config.segment, TotalSegments: config.segments } : undefined;

    // The backup settings are required: the S3 key below is built from them.
    if (!config.backup || !config.backup.bucket || !config.backup.prefix || !config.backup.jobid)
        return done(new Error('Must provide a bucket, prefix and jobid for backups'));

    var index = !isNaN(parseInt(config.segment)) ? config.segment.toString() : 0;
    var key = [config.backup.prefix, config.backup.jobid, index].join('/');
    var count = 0;
    var size = 0;

    var stringify = new stream.Transform({ objectMode: true });
    stringify._transform = function(record, enc, callback) {
        var line = Dyno.serialize(record);

        setImmediate(function() {
            stringify.push(line + '\n');
            count++;
            callback();
        });
    };

    var data = primary.scanStream(scanOpts)
        .on('error', next)
        .pipe(stringify)
        .on('error', next)
        .pipe(zlib.createGzip());

    log('[segment %s] Starting backup job %s of %s', index, config.backup.jobid, config.region + '/' + config.table);

    s3.upload({
        Bucket: config.backup.bucket,
        Key: key,
        Body: data
    }, function(err) {
        if (err) return next(err);
        log('[segment %s] Uploaded dynamo backup to s3://%s/%s', index, config.backup.bucket, key);
        log('[segment %s] Wrote %s items to backup', index, count);
        next();
    }).on('httpUploadProgress', function(progress) {
        log('[segment %s] Uploaded %s bytes', index, progress.loaded);
        size = progress.total;
    });

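    // Guard against calling back twice, e.g. when a stream error is
    // followed by an upload error.
    var finished = false;
    function next(err) {
        if (finished) return;
        finished = true;
        if (err) return done(err);
        done(null, { size: size, count: count });
    }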
};


================================================
FILE: bin/backup-table.js
================================================
#!/usr/bin/env node

var backup = require('../backup');
var fastlog = require('../fastlog');
var args = require('minimist')(process.argv.slice(2));
var crypto = require('crypto');
var s3urls = require('s3urls');
var AWS = require('aws-sdk');

function usage() {
    console.error('');
    console.error('Usage: backup-table region/table s3url');
    console.error('');
    console.error('Options:');
    console.error('  --jobid      assign a jobid to this backup');
    console.error('  --segment    segment identifier (0-based)');
    console.error('  --segments   total number of segments');
    console.error('  --metric     cloudwatch metric namespace. Will provide dimension TableName = the name of the backed-up table.');
}

if (args.help) {
    usage();
    process.exit(0);
}

var table = args._[0];

if (!table) {
    console.error('Must provide table information');
    usage();
    process.exit(1);
}

table = table.split('/');

var s3url = args._[1];

if (!s3url) {
    console.error('Must provide an s3url');
    usage();
    process.exit(1);
}

s3url = s3urls.fromUrl(s3url);

var jobid = args.jobid || crypto.randomBytes(8).toString('hex');
var format = '[${timestamp}] [${level}] [${category}] [' + jobid + ']';
var log = fastlog('backup-table', 'info', format);

var config = {
    region: table[0],
    table: table[1],
    segment: args.segment,
    segments: args.segments,
    log: log.info,
    backup: {
        bucket: s3url.Bucket,
        prefix: s3url.Key,
        jobid: jobid
    }
};

backup(config, function(err, details) {
    if (err) log.error(err);

    if (args.metric) {
        var cw = new AWS.CloudWatch({ region: config.region });
        var params = {
            Namespace: args.metric,
            MetricData: []
        };

        if (err) {
            params.MetricData.push({
                MetricName: 'BackupErrors',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: config.table
                    }
                ],
                Value: 1
            });
        }

        if (details) {
            params.MetricData.push({
                MetricName: 'BackupSize',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: config.table
                    }
                ],
                Value: details.size,
                Unit: 'Bytes'
            }, {
                MetricName: 'BackupRecordCount',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: config.table
                    }
                ],
                Value: details.count,
                Unit: 'Count'
            });
        }

        cw.putMetricData(params, function(err) {
            if (err) log.error(err);
        });
    }
});


================================================
FILE: bin/diff-record.js
================================================
#!/usr/bin/env node

var Dyno = require('@mapbox/dyno');
var args = require('minimist')(process.argv.slice(2));
var assert = require('assert');

function usage() {
    console.error('');
    console.error('Usage: diff-record <primary region/table> <replica region/table> <key>');
}

if (args.help) {
    usage();
    process.exit(0);
}

args.primary = args._[0];
if (!args.primary) {
    console.error('You must specify the primary region/table');
    usage();
    process.exit(1);
}

args.replica = args._[1];
if (!args.replica) {
    console.error('You must specify the replica region/table');
    usage();
    process.exit(1);
}

var key = args._[2];
if (!key) {
    console.error('You must specify the key for the record to check');
    usage();
    process.exit(1);
}

// Converts incoming strings in wire or dyno format into dyno format
try {
    var obj = Dyno.deserialize(key);
    for (var k in obj) if (!obj[k]) throw new Error();
    key = obj;

}
catch (err) { key = JSON.parse(key); }

var primaryConfig = {
    table: args.primary.split('/')[1],
    region: args.primary.split('/')[0]
};

if (primaryConfig.region === 'local') {
    primaryConfig.accessKeyId = 'fake';
    primaryConfig.secretAccessKey = 'fake';
    primaryConfig.endpoint = 'http://localhost:4567';
}

var primary = Dyno(primaryConfig);

var replicaConfig = {
    table: args.replica.split('/')[1],
    region: args.replica.split('/')[0]
};

if (replicaConfig.region === 'local') {
    replicaConfig.accessKeyId = 'fake';
    replicaConfig.secretAccessKey = 'fake';
    replicaConfig.endpoint = 'http://localhost:4567';
}

var replica = Dyno(replicaConfig);

primary.getItem({ Key: key }, function(err, data) {
    if (err) throw err;
    var primaryRecord = data.Item;

    replica.getItem({ Key: key }, function(err, data) {
        if (err) throw err;
        var replicaRecord = data.Item;

        console.log('Primary record');
        console.log('--------------');
        console.log(primaryRecord);
        console.log('');

        console.log('Replica record');
        console.log('--------------');
        console.log(replicaRecord);
        console.log('');

        try {
            assert.deepEqual(replicaRecord, primaryRecord);
            console.log('----------------------------');
            console.log('✔ The records are equivalent');
            console.log('----------------------------');
        }
        catch (err) {
            console.log('--------------------------------');
            console.log('✘ The records are not equivalent');
            console.log('--------------------------------');
        }
    });
});


================================================
FILE: bin/diff-tables.js
================================================
#!/usr/bin/env node

var diff = require('../diff');
var fastlog = require('../fastlog');
var args = require('minimist')(process.argv.slice(2));
var crypto = require('crypto');
var parse_location = require('../parse-location');

function usage() {
    console.error('');
    console.error('Usage: diff-tables primary-region/primary-table replica-region/replica-table');
    console.error('');
    console.error('Options:');
    console.error('  --repair     perform actions to fix discrepancies in the replica table');
    console.error('  --segment    segment identifier (0-based)');
    console.error('  --segments   total number of segments');
    console.error('  --backfill   only scan primary table and write to replica');
}

if (args.help) {
    usage();
    process.exit(0);
}

var primary = args._[0];
var replica = args._[1];

if (!primary) {
    console.error('Must provide primary table information');
    usage();
    process.exit(1);
}

if (!replica) {
    console.error('Must provide replica table information');
    usage();
    process.exit(1);
}

primary = primary.split('/');
replica = replica.split('/');

var jobid = crypto.randomBytes(8).toString('hex');
var format = '[${timestamp}] [${level}] [${category}] [' + jobid + ']';
var log = fastlog('diff-tables', 'info', format);

var locations = parse_location.parse(primary, replica);
primary = locations[0];
replica = locations[1];

var config = {
    primary: primary,
    replica: replica,
    repair: !!args.repair,
    segment: args.segment,
    segments: args.segments,
    backfill: args.backfill,
    log: log.info
};

diff(config, function(err) {
    if (err) {
        log.error(err);
        process.exit(1);
    }
});


================================================
FILE: bin/incremental-backfill.js
================================================
#!/usr/bin/env node

var args = require('minimist')(process.argv.slice(2));
var s3urls = require('s3urls');
var backfill = require('../s3-backfill');

function usage() {
    console.error('');
    console.error('Usage: incremental-backfill region/table s3url');
}

if (args.help) {
    usage();
    process.exit(0);
}

var table = args._[0];

if (!table) {
    console.error('Must provide table information');
    usage();
    process.exit(1);
}

table = table.split('/');

var s3url = args._[1];

if (!s3url) {
    console.error('Must provide an s3url');
    usage();
    process.exit(1);
}

s3url = s3urls.fromUrl(s3url);

var config = {
    region: table[0],
    table: table[1],
    backup: {
        bucket: s3url.Bucket,
        prefix: s3url.Key
    }
};

backfill(config, function(err) {
    if (err) {
        console.error(err);
        process.exit(1);
    }
});


================================================
FILE: bin/incremental-backup-record.js
================================================
#!/usr/bin/env node

var minimist = require('minimist');
var s3urls = require('s3urls');
var Dyno = require('@mapbox/dyno');
var backup = require('..').backup;

var args = minimist(process.argv.slice(2));

function usage() {
    console.error('');
    console.error('Usage: incremental-backup-record <tableinfo> <s3url> <recordkey>');
    console.error(' - tableinfo: the table to back up from, specified as `region/tablename`');
    console.error(' - s3url: s3 folder into which the record should be backed up');
    console.error(' - recordkey: the key for the record specified as a JSON object');
}

if (args.help) {
    usage();
    process.exit(0);
}

var table = args._[0];

if (!table) {
    console.error('Must provide table information');
    usage();
    process.exit(1);
}

var region = table.split('/')[0];
table = table.split('/')[1];

var s3url = args._[1];

if (!s3url) {
    console.error('Must provide an s3url');
    usage();
    process.exit(1);
}

s3url = s3urls.fromUrl(s3url);
process.env.BackupBucket = s3url.Bucket;
process.env.BackupPrefix = s3url.Key;

var key = args._[2];

if (!key) {
    console.error('Must provide a record key');
    usage();
    process.exit(1);
}

// Converts incoming strings in wire or dyno format into dyno format
try {
    var obj = Dyno.deserialize(key);
    for (var k in obj) if (!obj[k]) throw new Error();
    key = obj;

}
catch (err) { key = JSON.parse(key); }

var dyno = Dyno({
    region: region,
    table: table
});

dyno.getItem({ Key: key }, function(err, data) {
    if (err) throw err;
    data = data.Item;

    var event = {
        Records: [
            {
                dynamodb: {
                    Keys: JSON.parse(Dyno.serialize(key)),
                    NewImage: data ? JSON.parse(Dyno.serialize(data)) : undefined
                },
                eventSourceARN: '/' + table,
                eventName: data ? 'INSERT' : 'REMOVE'
            }
        ]
    };

    backup(event, function(err) {
        if (err) throw err;
    });
});


================================================
FILE: bin/incremental-diff-record.js
================================================
#!/usr/bin/env node

var minimist = require('minimist');
var s3urls = require('s3urls');
var Dyno = require('@mapbox/dyno');
var crypto = require('crypto');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var assert = require('assert');

var args = minimist(process.argv.slice(2));

function usage() {
    console.error('');
    console.error('Usage: incremental-diff-record <tableinfo> <s3url> <recordkey>');
    console.error(' - tableinfo: the table where the record lives, specified as `region/tablename`');
    console.error(' - s3url: s3 folder where the incremental backups live');
    console.error(' - recordkey: the key for the record specified as a JSON object');
}

if (args.help) {
    usage();
    process.exit(0);
}

var table = args._[0];

if (!table) {
    console.error('Must provide table information');
    usage();
    process.exit(1);
}

var region = table.split('/')[0];
table = table.split('/')[1];

var s3url = args._[1];

if (!s3url) {
    console.error('Must provide an s3url');
    usage();
    process.exit(1);
}

s3url = s3urls.fromUrl(s3url);

var key = args._[2];

if (!key) {
    console.error('Must provide a record key');
    usage();
    process.exit(1);
}

// Sort the attributes in the provided key
key = JSON.parse(key);
key = JSON.stringify(Object.keys(key).sort().reduce(function(keyObj, attr) {
    keyObj[attr] = key[attr];
    return keyObj;
}, {}));

// Converts incoming strings in wire or dyno format into dyno format
try {
    var obj = Dyno.deserialize(key);
    for (var k in obj) if (!obj[k]) throw new Error();
    key = obj;

}
catch (err) { key = JSON.parse(key); }


s3url.Key = [
    s3url.Key,
    table,
    crypto.createHash('sha256')
        .update(Dyno.serialize(key))
        .digest('hex')
].join('/');

var dyno = Dyno({
    region: region,
    table: table
});

dyno.getItem({ Key: key }, function(err, data) {
    if (err) throw err;
    var dynamoRecord = data.Item;

    s3.getObject(s3url, function(err, data) {
        if (err && err.statusCode !== 404) throw err;
        var s3data = err ? undefined : Dyno.deserialize(data.Body.toString());

        console.log('DynamoDB record');
        console.log('--------------');
        console.log(dynamoRecord);
        console.log('');

        console.log('Incremental backup record (%s)', s3url.Key);
        console.log('--------------');
        console.log(s3data);
        console.log('');

        try {
            assert.deepEqual(s3data, dynamoRecord);
            console.log('----------------------------');
            console.log('✔ The records are equivalent');
            console.log('----------------------------');
        }
        catch (err) {
            console.log('--------------------------------');
            console.log('✘ The records are not equivalent');
            console.log('--------------------------------');
        }
    });
});


================================================
FILE: bin/incremental-record-history.js
================================================
#!/usr/bin/env node

var minimist = require('minimist');
var s3urls = require('s3urls');
var crypto = require('crypto');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var queue = require('queue-async');
var Dyno = require('@mapbox/dyno');

var args = minimist(process.argv.slice(2));

function usage() {
    console.error('');
    console.error('Usage: incremental-record-history <tableinfo> <s3url> <recordkey>');
    console.error(' - tableinfo: the table where the record lives, specified as `region/tablename`');
    console.error(' - s3url: s3 folder where the incremental backups live. Table name will be appended');
    console.error(' - recordkey: the key for the record specified as a JSON object');
}

if (args.help) {
    usage();
    process.exit(0);
}

var table = args._[0];

if (!table) {
    console.error('Must provide table information');
    usage();
    process.exit(1);
}

table = table.split('/')[1];

var s3url = args._[1];

if (!s3url) {
    console.error('Must provide an s3url');
    usage();
    process.exit(1);
}

s3url = s3urls.fromUrl(s3url);

var key = args._[2];

if (!key) {
    console.error('Must provide a record key');
    usage();
    process.exit(1);
}

// Sort the attributes in the provided key
key = JSON.parse(key);
key = JSON.stringify(Object.keys(key).sort().reduce(function(keyObj, attr) {
    keyObj[attr] = key[attr];
    return keyObj;
}, {}));

// Converts incoming strings in wire or dyno format into dyno format
try {
    var obj = Dyno.deserialize(key);
    for (var k in obj) if (!obj[k]) throw new Error();
    key = obj;

}
catch (err) { key = JSON.parse(key); }

s3url.Key = [
    s3url.Key,
    table,
    crypto.createHash('sha256')
        .update(Dyno.serialize(key))
        .digest('hex')
].join('/');

var q = queue(100);
q.awaitAll(function(err, results) {
    if (err) throw err;
    if (!results.length) return;
    results.sort(function(a, b) { return b.date - a.date }).forEach(function(version) {
        console.log('\nModified: %s', version.date.toISOString());
        console.log('----------------------------------');
        console.log((typeof version.data === 'string' ? version.data : JSON.stringify(version.data, null, 2)) + '\n');
    });
});

(function listVersions(nextVersion, nextKey) {
    var params = {
        Bucket: s3url.Bucket,
        Prefix: s3url.Key
    };

    if (nextVersion) params.VersionIdMarker = nextVersion;
    if (nextKey) params.KeyMarker = nextKey;

    s3.listObjectVersions(params, function(err, data) {
        if (err) throw err;

        data.Versions.forEach(function(version) {
            q.defer(function(next) {
                s3.getObject({
                    Bucket: s3url.Bucket,
                    Key: s3url.Key,
                    VersionId: version.VersionId
                }, function(err, data) {
                    if (err && err.name === 'InvalidObjectState') return next(null, {
                        date: new Date(version.LastModified),
                        data: 'Version archived: ' + version.VersionId
                    });
                    if (err) return next(err);
                    next(null, {
                        date: new Date(version.LastModified),
                        data: Dyno.deserialize(data.Body.toString())
                    });
                });
            });
        });

        data.DeleteMarkers.forEach(function(del) {
            q.defer(function(next) {
                next(null, {
                    date: new Date(del.LastModified),
                    data: 'Record was deleted'
                });
            });
        });

        if (data.IsTruncated && data.NextVersionIdMarker)
            return listVersions(data.NextVersionIdMarker, data.NextKeyMarker);
    });
})();


================================================
FILE: bin/incremental-snapshot.js
================================================
#!/usr/bin/env node

var AWS = require('aws-sdk');
var args = require('minimist')(process.argv.slice(2));
var s3urls = require('s3urls');
var fastlog = require('../fastlog');
var snapshot = require('../s3-snapshot');
var fs = require('fs');
var path = require('path');

function usage() {
    console.error('');
    console.error('Usage: incremental-snapshot <source> <dest>');
    console.error('');
    console.error('Options:');
    console.error('  --logger     filename where detailed log messages should be sent');
    console.error('  --retries    number of times that failed S3 requests should be retried');
    console.error('  --metric     cloudwatch metric region/namespace/tablename. Will provide dimension TableName = the tablename.');
}

if (args.help) {
    usage();
    process.exit(0);
}

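// Validate the raw arguments before parsing them as S3 URLs, so that a
// missing argument prints the usage text instead of failing during parsing.
if (!args._[0] || !args._[1]) {
    console.error('Must provide source and destination S3 locations');
    usage();
    process.exit(1);
}

var source = s3urls.fromUrl(args._[0]);
var dest = s3urls.fromUrl(args._[1]);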

var log = fastlog('incremental-snapshot', 'info');

var config = {
    log: log.info,
    source: {
        bucket: source.Bucket,
        prefix: source.Key
    },
    destination: {
        bucket: dest.Bucket,
        key: dest.Key
    }
};

if (args.logger)
    config.logger = fs.createWriteStream(path.resolve(args.logger), { flags: 'a' });
if (args.retries)
    config.maxRetries = args.retries;

snapshot(config, function(err, details) {
    if (err) log.error(err);

    if (args.metric) {
        var region = args.metric.split('/')[0];
        var namespace = args.metric.split('/')[1];
        var table = args.metric.split('/')[2];

        var cw = new AWS.CloudWatch({ region: region });

        var params = {
            Namespace: namespace,
            MetricData: []
        };

        if (err) {
            params.MetricData.push({
                MetricName: 'BackupErrors',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: table
                    }
                ],
                Value: 1
            });
        }

        if (details) {
            params.MetricData.push({
                MetricName: 'BackupSize',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: table
                    }
                ],
                Value: details.size,
                Unit: 'Bytes'
            }, {
                MetricName: 'BackupRecordCount',
                Dimensions: [
                    {
                        Name: 'TableName',
                        Value: table
                    }
                ],
                Value: details.count,
                Unit: 'Count'
            });
        }

        cw.putMetricData(params, function(err) {
            if (err) return log.error(err);
            if (!details) return log.info('Snapshot failed, wrote error metric to %s', args.metric);
            log.info('Wrote %s size / %s count metrics to %s', details.size, details.count, args.metric);
        });
    }
});


================================================
FILE: bin/replicate-record.js
================================================
#!/usr/bin/env node

var minimist = require('minimist');
var Dyno = require('@mapbox/dyno');

var args = minimist(process.argv.slice(2));

function usage() {
    console.error('');
    console.error('Usage: replicate-record <primary tableinfo> <replica tableinfo> <recordkey>');
    console.error(' - primary tableinfo: the primary table to replicate from, specified as `region/tablename`');
    console.error(' - replica tableinfo: the replica table to replicate to, specified as `region/tablename`');
    console.error(' - recordkey: the key for the record specified as a JSON object');
}

if (args.help) {
    usage();
    process.exit(0);
}

var primary = args._[0];

if (!primary) {
    console.error('Must provide primary table information');
    usage();
    process.exit(1);
}

var primaryDyno = Dyno({
    table: primary.split('/')[1],
    region: primary.split('/')[0]
});

var replica = args._[1];

if (!replica) {
    console.error('Must provide replica table information');
    usage();
    process.exit(1);
}

var replicaDyno = Dyno({
    table: replica.split('/')[1],
    region: replica.split('/')[0]
});

var key = args._[2];

if (!key) {
    console.error('Must provide a record key');
    usage();
    process.exit(1);
}

// Converts incoming strings in wire or dyno format into dyno format
try {
    var obj = Dyno.deserialize(key);
    for (var k in obj) if (!obj[k]) throw new Error();
    key = obj;

}
catch (err) { key = JSON.parse(key); }

primaryDyno.getItem({ Key: key, ConsistentRead: true }, function(err, data) {
    if (err) throw err;
    var item = data.Item;

    if (!item) return replicaDyno.deleteItem({ Key: key }, function(err) {
        if (err) throw err;
    });

    replicaDyno.putItem({ Item: item }, function(err) {
        if (err) throw err;
    });
});


================================================
FILE: cloudformation/travis.template
================================================
{
    "AWSTemplateFormatVersion": "2010-09-09",
    "Description": "Travis user for testing dynamodb-replicator",
    "Resources": {
        "dynamodbReplicatorUser": {
            "Type": "AWS::IAM::User",
            "Properties": {
                "Path": "/",
                "Policies": [
                    {
                        "PolicyName": "dynamodb-replicator",
                        "PolicyDocument": {
                            "Statement": [
                                {
                                    "Action": [
                                        "s3:ListBucket"
                                    ],
                                    "Effect": "Allow",
                                    "Resource": ["arn:aws:s3:::mapbox"],
                                    "Condition":{"StringLike":{"s3:prefix":["dynamodb-replicator/test*"]}}
                                }, {
                                    "Resource": [
                                        "arn:aws:s3:::mapbox/dynamodb-replicator/test*"
                                    ],
                                    "Action": [
                                        "s3:PutObject",
                                        "s3:GetObject",
                                        "s3:DeleteObject"
                                    ],
                                    "Effect": "Allow"
                                },
                                {
                                    "Action": [
                                        "dynamodb:CreateTable",
                                        "dynamodb:DeleteTable",
                                        "dynamodb:DescribeTable",
                                        "dynamodb:PutItem",
                                        "dynamodb:BatchWriteItem",
                                        "dynamodb:Scan"
                                    ],
                                    "Effect": "Allow",
                                    "Resource": {
                                        "Fn::Join": [
                                            "",
                                            [
                                                "arn:aws:dynamodb:us-east-1:",
                                                {
                                                    "Ref": "AWS::AccountId"
                                                },
                                                ":table/test-dynamodb-replicator-*"
                                            ]
                                        ]
                                    }
                                },
                                {
                                    "Action": [
                                        "cloudwatch:PutMetricData",
                                        "cloudwatch:GetMetricStatistics"
                                    ],
                                    "Effect": "Allow",
                                    "Resource": "*"
                                }
                            ]
                        }
                    }
                ]
            }
        },
        "dynamodbReplicatorKey": {
            "Type": "AWS::IAM::AccessKey",
            "Properties": {
                "UserName": {
                    "Ref": "dynamodbReplicatorUser"
                }
            }
        }
    },
    "Outputs": {
        "BuildAccessKeyId": {
            "Value": {
                "Ref": "dynamodbReplicatorKey"
            }
        },
        "BuildSecretAccessKey": {
            "Value": {
                "Fn::GetAtt": [
                    "dynamodbReplicatorKey",
                    "SecretAccessKey"
                ]
            }
        }
    }
}


================================================
FILE: diff.js
================================================
var _ = require('underscore');
var queue = require('queue-async');
var Dyno = require('@mapbox/dyno');
var stream = require('stream');
var assert = require('assert');

module.exports = function(config, done) {
    var primary = Dyno(config.primary);
    var replica = Dyno(config.replica);
    primary.tableName = config.primary.table;
    replica.tableName = config.replica.table;
    primary.name = 'primary';
    replica.name = 'replica';

    var log = config.log || console.log;

    var scanOpts = Object.prototype.hasOwnProperty.call(config, 'segment') && config.segments ?
        { Segment: config.segment, TotalSegments: config.segments } : undefined;

    var discrepancies = 0;
    var itemsScanned = 0;
    var itemsCompared = 0;
    var start = Date.now();

    function report() {
        var elapsed = (Date.now() - start) / 1000;
        var scanRate = Math.min(itemsScanned, (itemsScanned / elapsed).toFixed(2));
        var compareRate = Math.min(itemsCompared, (itemsCompared / elapsed).toFixed(2));
        log('[progress] Scan rate: %s items @ %s items/s | Compare rate: %s items/s', itemsScanned, scanRate, compareRate);
    }

    var reporter = setInterval(report, 60000).unref();

    function Aggregate() {
        var aggregation = new stream.Transform({ objectMode: true });
        aggregation.records = [];

        aggregation._transform = function(record, enc, callback) {
            aggregation.records.push(record);

            if (aggregation.records.length < 25) return callback();

            aggregation.push(aggregation.records);
            aggregation.records = [];
            callback();
        };

        aggregation._flush = function(callback) {
            if (aggregation.records.length) aggregation.push(aggregation.records);
            callback();
        };

        return aggregation;
    }

    function Compare(readFrom, compareTo, keySchema, deleteMissing) {
        var noItem = deleteMissing ? 'extraneous' : 'missing';
        var comparison = new stream.Transform({ objectMode: true });
        comparison.discrepancies = 0;

        comparison._transform = function(records, enc, callback) {
            var params = { RequestItems: {} };
            params.RequestItems[readFrom.tableName] = { Keys: [] };
            itemsScanned += records.length;

            var recordKeys = records.reduce(function(recordKeys, record) {
                var key = keySchema.reduce(function(key, attribute) {
                    key[attribute] = record[attribute];
                    return key;
                }, {});
                params.RequestItems[readFrom.tableName].Keys.push(key);
                recordKeys.push(key);
                return recordKeys;
            }, []);

            var indexedRecords = records.reduce(function(indexedRecords, record, i) {
                indexedRecords[JSON.stringify(recordKeys[i])] = record;
                return indexedRecords;
            }, {});

            if (config.backfill) {
                Object.keys(indexedRecords).forEach(function(key) {
                    var record = indexedRecords[key];
                    log('[backfill] %s', key);
                    comparison.discrepancies++;
                    itemsCompared++;
                    comparison.push({ put: record });
                });

                return callback();
            }

            var items = [];
            (function read(params) {
                readFrom.batchGetItem(params, function(err, data) {
                    if (err) return callback(err);

                    items = items.concat(data.Responses[readFrom.tableName]);

                    if (Object.keys(data.UnprocessedKeys).length)
                        return read({ RequestItems: data.UnprocessedKeys });

                    gotAll();
                })
            })(params);

            function gotAll() {
                var itemKeys = items.reduce(function(itemKeys, item) {
                    itemKeys.push(keySchema.reduce(function(key, attribute) {
                        key[attribute] = item[attribute];
                        return key;
                    }, {}));
                    return itemKeys;
                }, []);

                var indexedItems = items.reduce(function(indexedItems, item, i) {
                    indexedItems[JSON.stringify(itemKeys[i])] = item;
                    return indexedItems;
                }, {});

                var q = queue();

                // Find missing records -- scan gave us a key but the batch read did not find a match
                recordKeys.forEach(function(key) {
                    var item = indexedItems[JSON.stringify(key)];

                    if (!item) {
                        q.defer(function(next) {
                            compareTo.getItem({ Key: key, ConsistentRead: true }, function(err, data) {
                                itemsCompared++;
                                if (err) return next(err);
                                var record = data.Item;

                                if (record) {
                                    comparison.discrepancies++;
                                    log('[%s] %j', noItem, key);
                                    if (!config.repair) return next();
                                    if (deleteMissing) comparison.push({ remove: key });
                                    else comparison.push({ put: record });
                                }

                                next();
                            });
                        });
                    }
                });

                // Find differing records -- iterate through each item that we did find in the batch read
                _(indexedItems).each(function(item, key) {
                    itemsCompared++;
                    var record = indexedRecords[key];
                    var recordString = Dyno.serialize(record);
                    var itemString = Dyno.serialize(item);

                    try { assert.deepEqual(JSON.parse(recordString), JSON.parse(itemString)); }
                    catch (notEqual) {
                        q.defer(function(next) {
                            compareTo.getItem({ Key: JSON.parse(key), ConsistentRead: true }, function(err, data) {
                                if (err) return next(err);
                                var record = data.Item;

                                var recordString = Dyno.serialize(record);

                                try { assert.deepEqual(JSON.parse(recordString), JSON.parse(itemString)); }
                                catch (notEqual) {
                                    comparison.discrepancies++;
                                    log('[different] %s', key);
                                    if (!config.repair) return next();
                                    comparison.push({ put: record });
                                }

                                next();
                            });
                        });
                    }
                });

                q.awaitAll(function(err) { callback(err); });
            }
        };

        return comparison;
    }

    function Write() {
        var writer = new stream.Writable({ objectMode: true, highWaterMark: 40 });
        writer.params = { RequestItems: {} };
        writer.params.RequestItems[replica.tableName] = [];
        writer.pending = false;

        writer._write = function(item, enc, callback) {
            if (!item.put && !item.remove) {
                return callback(new Error('Invalid item sent to writer: ' + JSON.stringify(item)));
            }

            var buffer = writer.params.RequestItems[replica.tableName];
            buffer.push(item.put ? { PutRequest: { Item: item.put } } : { DeleteRequest: { Key: item.remove } });
            if (buffer.length < 25) return callback();

            (function write(params) {
                writer.pending = true;
                replica.batchWriteItem(params, function(err, data) {
                    writer.pending = false;
                    if (err) return callback(err);

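                    // BatchWriteItem reports unwritten requests as UnprocessedItems
                    // (UnprocessedKeys belongs to BatchGetItem responses)
                    if (data.UnprocessedItems && Object.keys(data.UnprocessedItems).length)
                        return write({ RequestItems: data.UnprocessedItems });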

                    writer.params.RequestItems[replica.tableName] = [];
                    callback();
                });
            })(writer.params);
        };

        var streamEnd = writer.end.bind(writer);
        writer.end = function() {
            if (writer.pending) return setImmediate(writer.end);

            if (!writer.params.RequestItems[replica.tableName].length)
                return streamEnd();

            (function write(params) {
                replica.batchWriteItem(params, function(err, data) {
                    if (err) return streamEnd(err);

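                    // BatchWriteItem reports unwritten requests as UnprocessedItems
                    // (UnprocessedKeys belongs to BatchGetItem responses)
                    if (data.UnprocessedItems && Object.keys(data.UnprocessedItems).length)
                        return write({ RequestItems: data.UnprocessedItems });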

                    streamEnd();
                });
            })(writer.params);
        };

        return writer;
    }

    primary.describeTable(function(err, description) {
        if (err) return done(err);
        var keySchema = _(description.Table.KeySchema).pluck('AttributeName');
        scanPrimary(keySchema);
    });

    function scanPrimary(keySchema) {
        var aggregate = Aggregate();
        var compare = Compare(replica, primary, keySchema, false);
        var write = Write();

        log('Scanning primary table and comparing to replica');

        primary.scanStream(scanOpts)
            .on('error', finish)
            .pipe(aggregate)
            .on('error', finish)
            .pipe(compare)
            .on('error', finish)
            .pipe(write)
            .on('error', finish)
            .on('finish', function() {
                discrepancies += compare.discrepancies;
                log('[discrepancies] %s', compare.discrepancies);
                if (!config.backfill) return scanReplica(keySchema);
                finish();
            });
    }

    function scanReplica(keySchema) {
        var aggregate = Aggregate();
        var compare = Compare(primary, replica, keySchema, true);
        var write = Write();

        log('Scanning replica table and comparing to primary');

        replica.scanStream(scanOpts)
            .on('error', finish)
            .pipe(aggregate)
            .on('error', finish)
            .pipe(compare)
            .on('error', finish)
            .pipe(write)
            .on('error', finish)
            .on('finish', function() {
                discrepancies += compare.discrepancies;
                log('[discrepancies] %s', compare.discrepancies);
                finish();
            });
    }

    function finish(err) {
        clearInterval(reporter);
        report();
        done(err, discrepancies);
    }
};


================================================
FILE: fastlog.js
================================================
var _ = require('underscore');
var util = require('util');

module.exports = function(category, level, template) {
    category = category || 'default';
    template = template || '[${timestamp}] [${level}] [${category}]';
    level = level || process.env.FASTLOG_LEVEL || 'info';
    var levels = ['debug', 'info', 'warn', 'error', 'fatal'];
    return _(levels).reduce(function(logger, l) {
        logger[l] = function() {
            if (levels.indexOf(l) < levels.indexOf(level)) return;
            var prefix = template
                .replace(/\${ ?timestamp ?}/g, new Date().toUTCString())
                .replace(/\${ ?level ?}/g, l)
                .replace(/\${ ?category ?}/g, category);
            var msg;
            if (arguments[0] instanceof Error) {
                var err = arguments[0];
                // Error objects passed directly.
                var lines = [err.toString()];
                if (err.stack) {
                    var stack = err.stack.split('\n');
                    lines = lines.concat(stack.slice(1, stack.length));
                }
                _(err).each(function(val, key) {
                    if (_(val).isString() || _(val).isNumber()) lines.push('    ' + key + ': ' + val);
                });

                msg = lines.join('\n');
            } else {
                // Normal string messages.
                msg = util.format.apply(this, arguments);
            }

            console.log('%s %s', prefix, msg);
            return util.format('%s %s', prefix, msg);
        };
        return logger;
    }, {});
};
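
// Usage sketch (illustrative only, not part of the original module). It assumes
// this file is required from the repository root as './fastlog'; the category
// name 'my-category' is a made-up example value.
//
//     var log = require('./fastlog')('my-category', 'info');
//     log.debug('skipped: debug is below the configured info level');
//     log.info('Processed %s records', 42);
//     log.error(new Error('boom')); // message, stack lines, string/number props
//
// With the default template, each emitted line has the form
// `[<UTC timestamp>] [<level>] [<category>] <message>`. A call at or above the
// configured level also returns that formatted string; a call below it
// returns undefined.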

================================================
FILE: index.js
================================================
var AWS = require('aws-sdk');
var Dyno = require('@mapbox/dyno');
var queue = require('queue-async');
var crypto = require('crypto');
var https = require('https');

module.exports.replicate = replicate;
module.exports.backup = incrementalBackup;
module.exports.snapshot = require('./s3-snapshot');
module.exports.agent = new https.Agent({
    keepAlive: true,
    maxSockets: Math.ceil(require('os').cpus().length * 16),
    keepAliveMsecs: 60000
});

function replicate(event, context, callback) {
    var replicaConfig = {
        accessKeyId: process.env.ReplicaAccessKeyId || undefined,
        secretAccessKey: process.env.ReplicaSecretAccessKey || undefined,
        table: process.env.ReplicaTable,
        region: process.env.ReplicaRegion,
        maxRetries: 1000,
        httpOptions: {
            timeout: 2000,
            agent: module.exports.agent
        }
    };

    if (process.env.ReplicaEndpoint) replicaConfig.endpoint = process.env.ReplicaEndpoint;
    var replica = new Dyno(replicaConfig);

    var keyAttrs = Object.keys(event.Records[0].dynamodb.Keys);

    var filterer;
    if (process.env.TurnoverRole && process.env.TurnoverAt) {
        // Filterer function should return true if the record SHOULD be processed
        filterer = function(record) {
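            // ApproximateCreationDateTime is in epoch seconds; appending '000'
            // converts it to milliseconds for comparison with TurnoverAt
            var created = Number(record.dynamodb.ApproximateCreationDateTime + '000');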
            var turnoverAt = Number(process.env.TurnoverAt);
            if (process.env.TurnoverRole === 'BEFORE') return created < turnoverAt;
            else if (process.env.TurnoverRole === 'AFTER') return created >= turnoverAt;
            else return true;
        };
    }

    var count = 0;
    var allRecords = event.Records.reduce(function(allRecords, change) {
        if (filterer && !filterer(change)) return allRecords;
        var id = JSON.stringify(change.dynamodb.Keys);
        allRecords[id] = allRecords[id] || [];
        allRecords[id].push(change);
        count++;
        return allRecords;
    }, {});

    if (count === 0) {
        console.log('No records replicated');
        return callback();
    }

    var params = { RequestItems: {} };
    params.RequestItems[process.env.ReplicaTable] = Object.keys(allRecords).map(function(key) {
        var change = allRecords[key].pop();
        if (change.eventName === 'INSERT' || change.eventName === 'MODIFY') {
            return {
                PutRequest: { Item: Dyno.deserialize(JSON.stringify(change.dynamodb.NewImage)) }
            };
        } else if (change.eventName === 'REMOVE') {
            return {
                DeleteRequest: { Key: Dyno.deserialize(JSON.stringify(change.dynamodb.Keys)) }
            }
        }
    });

    (function batchWrite(requestSet, attempts) {
        requestSet.forEach(function(req) {
            if (req) req.on('retry', function(res) {
                if (!res.error || !res.httpResponse || !res.httpResponse.headers) return;
                if (res.error.name === 'TimeoutError') res.error.retryable = true;
                console.log(
                    '[failed-request] %s | request-id: %s | crc32: %s | items: %j',
                    res.error.message,
                    res.httpResponse.headers['x-amzn-requestid'],
                    res.httpResponse.headers['x-amz-crc32'],
                    req.params.RequestItems[process.env.ReplicaTable].map(function(req) {
                        if (req.DeleteRequest) return req.DeleteRequest.Key;
                        if (req.PutRequest) return keyAttrs.reduce(function(key, k) {
                            key[k] = req.PutRequest.Item[k];
                            return key;
                        }, {});
                    })
                );
            });
        });

        requestSet.sendAll(100, function(errs, responses, unprocessed) {
            attempts++;

            if (errs) {
                var messages = errs
                    .filter(function(err) { return !!err; })
                    .map(function(err) { return err.message; })
                    .join(' | ');
                console.log('[error] %s', messages);
                return callback(errs);
            }

            if (unprocessed) {
                console.log('[retry] attempt %s contained unprocessed items', attempts);
                return setTimeout(batchWrite, Math.pow(2, attempts), unprocessed, attempts);
            }

            console.log('Replicated ' + count + ' records');
            callback();
        });
    })(replica.batchWriteItemRequests(params), 0);
}

function incrementalBackup(event, context, callback) {
    var params = {
        maxRetries: 1000,
        httpOptions: {
            timeout: 1000,
            agent: module.exports.agent
        }
    };

    if (process.env.BackupRegion) params.region = process.env.BackupRegion;

    var s3 = new AWS.S3(params);

    var filterer;
    if (process.env.TurnoverRole && process.env.TurnoverAt) {
        // Filterer function should return true if the record SHOULD be processed
        filterer = function(record) {
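            // ApproximateCreationDateTime is in epoch seconds; appending '000'
            // converts it to milliseconds for comparison with TurnoverAt
            var created = Number(record.dynamodb.ApproximateCreationDateTime + '000');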
            var turnoverAt = Number(process.env.TurnoverAt);
            if (process.env.TurnoverRole === 'BEFORE') return created < turnoverAt;
            else if (process.env.TurnoverRole === 'AFTER') return created >= turnoverAt;
            else return true;
        };
    }

    var count = 0;
    var allRecords = event.Records.reduce(function(allRecords, action) {
        if (filterer && !filterer(action)) return allRecords;

        var id = JSON.stringify(action.dynamodb.Keys);

        allRecords[id] = allRecords[id] || [];
        allRecords[id].push(action);
        count++;
        return allRecords;
    }, {});

    if (count === 0) {
        console.log('No records backed up');
        return callback();
    }

    var q = queue();

    Object.keys(allRecords).forEach(function(key) {
        q.defer(backupRecord, allRecords[key]);
    });

    q.awaitAll(function(err) {
        if (err) throw err;
        callback();
    });

    function backupRecord(changes, callback) {
        var q = queue(1);

        changes.forEach(function(change) {
            q.defer(function(next) {
                var id = crypto.createHash('sha256')
                    .update(JSON.stringify(change.dynamodb.Keys))
                    .digest('hex');

                var table = change.eventSourceARN.split('/')[1];

                var params = {
                    Bucket: process.env.BackupBucket,
                    Key: [process.env.BackupPrefix, table, id].join('/')
                };

                var req = change.eventName === 'REMOVE' ? 'deleteObject' : 'putObject';
                if (req === 'putObject') params.Body = JSON.stringify(change.dynamodb.NewImage);

                s3[req](params, function(err) {
                    if (err) console.log(
                        '[error] %s | %s s3://%s/%s | %s',
                        JSON.stringify(change.dynamodb.Keys),
                        req, params.Bucket, params.Key,
                        err.message
                    );
                    next(err);
                }).on('retry', function(res) {
                    if (!res.error || !res.httpResponse || !res.httpResponse.headers) return;
                    if (res.error.name === 'TimeoutError') res.error.retryable = true;
                    console.log(
                        '[failed-request] request-id: %s | id-2: %s | %s s3://%s/%s | %s',
                        res.httpResponse.headers['x-amz-request-id'],
                        res.httpResponse.headers['x-amz-id-2'],
                        req, params.Bucket, params.Key,
                        res.error
                    );
                });
            });
        });

        q.awaitAll(function(err) {
            if (err) return callback(err);
            console.log('Backed up ' + count + ' records');
            callback();
        });
    }
}


================================================
FILE: package.json
================================================
{
  "name": "@mapbox/dynamodb-replicator",
  "version": "10.1.1",
  "description": "",
  "main": "index.js",
  "scripts": {
    "pretest": "eslint *.js bin/*.js test/*.js",
    "test": "tape test/*.test.js"
  },
  "bin": {
    "diff-tables": "bin/diff-tables.js",
    "diff-record": "bin/diff-record.js",
    "replicate-record": "bin/replicate-record.js",
    "backup-table": "bin/backup-table.js",
    "incremental-backfill": "bin/incremental-backfill.js",
    "incremental-snapshot": "bin/incremental-snapshot.js",
    "incremental-backup-record": "bin/incremental-backup-record.js",
    "incremental-record-history": "bin/incremental-record-history.js",
    "incremental-diff-record": "bin/incremental-diff-record.js"
  },
  "repository": {
    "type": "git",
    "url": "https://github.com/mapbox/dynamodb-replicator.git"
  },
  "author": "Mapbox",
  "license": "ISC",
  "bugs": {
    "url": "https://github.com/mapbox/dynamodb-replicator/issues"
  },
  "homepage": "https://github.com/mapbox/dynamodb-replicator",
  "dependencies": {
    "@mapbox/dyno": "^1.6.2",
    "agentkeepalive": "^4.5.0",
    "aws-sdk": "^2.1476.0",
    "minimist": "^1.2.8",
    "queue-async": "1.0.7",
    "@mapbox/s3scan": "^1.1.0",
    "s3urls": "^1.5.2",
    "underscore": "^1.13.8"
  },
  "devDependencies": {
    "@mapbox/dynamodb-test": "^0.6.2",
    "eslint": "^8.51.0",
    "tape": "^5.7.1"
  },
  "overrides": {
    "minimatch": ">=3.1.3"
  }
}


================================================
FILE: parse-location.js
================================================
// Regex to find IP address and port
// ex.: 127.0.0.1:80 or localhost:80
var regex = new RegExp('(\\b(?:[0-9]{1,3}\\.){3}[0-9]{1,3}\\b:[0-9]+)|(localhost:[0-9]+)', 'i');

module.exports = {
    // Each argument is a [location, table] pair, where location is either an
    // AWS region name or a local host:port endpoint
    parse: function(primary, replica) {
        if (regex.test(primary[0])) {
            primary = { region: 'local', endpoint: 'http://' + primary[0], table: primary[1] };
        } else {
            primary = { region: primary[0], table: primary[1] };
        }

        if (regex.test(replica[0])) {
            replica = { region: 'local', endpoint: 'http://' + replica[0], table: replica[1] };
        } else {
            replica = { region: replica[0], table: replica[1] };
        }
        return [primary, replica];
    }
};

================================================
FILE: s3-backfill.js
================================================
var Dyno = require('@mapbox/dyno');
var AWS = require('aws-sdk');
var stream = require('stream');
var queue = require('queue-async');
var crypto = require('crypto');
var https = require('https');

module.exports = backfill;

module.exports.agent = new https.Agent({
    keepAlive: true,
    maxSockets: Math.ceil(require('os').cpus().length * 16),
    keepAliveMsecs: 60000
});

function backfill(config, done) {
    var s3 = new AWS.S3({
        maxRetries: 1000,
        httpOptions: {
            timeout: 1000,
            agent: module.exports.agent
        }
    });

    var primary = Dyno(config);

    // The writer below always reads config.backup, so fail early if it is absent or incomplete
    if (!config.backup || !config.backup.bucket || !config.backup.prefix)
        return done(new Error('Must provide a bucket and prefix for incremental backups'));

    primary.describeTable(function(err, data) {
        if (err) return done(err);

        var keys = data.Table.KeySchema.map(function(schema) {
            return schema.AttributeName;
        });

        var count = 0;
        var starttime = Date.now();

        var writer = new stream.Writable({ objectMode: true, highWaterMark: 1000 });

        writer.queue = queue();
        writer.queue.awaitAll(function(err) { if (err) done(err); });
        writer.pending = 0;

        writer._write = function(record, enc, callback) {
            if (writer.pending > 1000)
                return setImmediate(writer._write.bind(writer), record, enc, callback);

            var key = keys.reduce(function(key, k) {
                key[k] = record[k];
                return key;
            }, {});

            var id = crypto.createHash('sha256')
                .update(Dyno.serialize(key))
                .digest('hex');

            var params = {
                Bucket: config.backup.bucket,
                Key: [config.backup.prefix, config.table, id].join('/'),
                Body: Dyno.serialize(record)
            };

            writer.drained = false;
            writer.pending++;
            writer.queue.defer(function(next) {
                s3.putObject(params, function(err) {
                    count++;
                    process.stdout.write('\r\x1b[K' + count + ' - ' + (count / ((Date.now() - starttime) / 1000)).toFixed(2) + '/s');
                    writer.pending--;
                    if (err) writer.emit('error', err);
                    next();
                });
            });
            callback();
        };

        writer.once('error', done);

        var end = writer.end.bind(writer);
        writer.end = function() {
            writer.queue.awaitAll(end);
        };

        primary.scanStream()
            .on('error', next)
            .pipe(writer)
            .on('error', next)
            .on('finish', next);

        function next(err) {
            if (err) return done(err);
            done(null, { count: count });
        }
    });
}


================================================
FILE: s3-snapshot.js
================================================
var AWS = require('aws-sdk');
var s3scan = require('@mapbox/s3scan');
var zlib = require('zlib');
var stream = require('stream');
var AgentKeepAlive = require('agentkeepalive');

module.exports = function(config, done) {
    var log = config.log || console.log;

    if (!config.source || !config.source.bucket || !config.source.prefix)
        return done(new Error('Must provide source bucket and prefix to snapshot'));

    if (!config.destination || !config.destination.bucket || !config.destination.key)
        return done(new Error('Must provide destination bucket and key where the snapshot will be put'));

    var s3Options = {
        httpOptions: {
            timeout: 1000,
            agent: new AgentKeepAlive.HttpsAgent({
                keepAlive: true,
                maxSockets: 256,
                freeSocketTimeout: 60000
            })
        }
    };
    if (config.maxRetries) s3Options.maxRetries = config.maxRetries;
    if (config.logger) s3Options.logger = config.logger;

    var s3 = new AWS.S3(s3Options);

    var size = 0;
    var uri = ['s3:/', config.source.bucket, config.source.prefix].join('/');
    var partsLoaded = -1;

    var objStream = s3scan.Scan(uri, { s3: s3 })
        .on('error', function(err) { done(err); });
    var gzip = zlib.createGzip()
        .on('error', function(err) { done(err); });

    var stringify = new stream.Transform({ writableObjectMode: true });
    stringify._transform = function(data, enc, callback) {
        if (!data || !data.Body ) return callback();
        callback(null, data.Body.toString() + '\n');
    };

    var upload = s3.upload({
        Bucket: config.destination.bucket,
        Key: config.destination.key,
        Body: gzip
    }).on('httpUploadProgress', function(details) {
        if (details.part !== partsLoaded) {
            log(
                'Starting upload of part #%s, %s bytes uploaded, %s items uploaded @ %s items/s',
                details.part - 1, size,
                objStream.got, objStream.rate()
            );
        }

        partsLoaded = details.part;
        size = details.loaded;
    }).on('error', function(err) { done(err); });

    log(
        'Starting snapshot from s3://%s/%s to s3://%s/%s',
        config.source.bucket, config.source.prefix,
        config.destination.bucket, config.destination.key
    );

    objStream.pipe(stringify).pipe(gzip);

    upload.send(function(err) {
        if (err) return done(err);

        log('Uploaded snapshot to s3://%s/%s', config.destination.bucket, config.destination.key);
        log('Wrote %s items and %s bytes to snapshot', objStream.got, size);
        done(null, { size: size, count: objStream.got });
    });
};


================================================
FILE: test/backup.test.js
================================================
var test = require('tape');
var dynamodb = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'));
var backup = require('../backup');
var _ = require('underscore');
var crypto = require('crypto');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var queue = require('queue-async');
var zlib = require('zlib');

var primaryItems = [
    {hash: 'hash1', range: 'range1', other:1},
    {hash: 'hash1', range: 'range2', other:2},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

var records = _.range(1000).map(function() {
    return {
        hash: crypto.randomBytes(8).toString('hex'),
        range: crypto.randomBytes(8).toString('hex'),
        other: crypto.randomBytes(8)
    };
});

dynamodb.start();

dynamodb.test('backup: one segment', primaryItems, function(assert) {
    var config = {
        backup: {
            bucket: 'mapbox',
            prefix: 'dynamodb-replicator/test',
            jobid: crypto.randomBytes(4).toString('hex')
        },
        table: dynamodb.tableName,
        region: 'us-east-1',
        accessKeyId: 'fake',
        secretAccessKey: 'fake',
        endpoint: 'http://localhost:4567'
    };

    backup(config, function(err, details) {
        assert.ifError(err, 'backup completed');
        if (err) return assert.end();

        assert.equal(details.count, 3, 'reported 3 records');
        assert.equal(details.size, 101, 'reported 101 bytes');

        s3.getObject({
            Bucket: 'mapbox',
            Key: [config.backup.prefix, config.backup.jobid, '0'].join('/')
        }, function(err, data) {
            assert.ifError(err, 'retrieved backup from S3');
            if (err) return assert.end();

            assert.ok(data.Body, 'file has content');

            zlib.gunzip(data.Body, function(err, data) {
                assert.ifError(err, 'gzipped backup');
                data = data.toString().trim().split('\n');
                assert.deepEqual(data, [
                    '{"hash":{"S":"hash1"},"range":{"S":"range1"},"other":{"N":"1"}}',
                    '{"hash":{"S":"hash1"},"range":{"S":"range2"},"other":{"N":"2"}}',
                    '{"hash":{"S":"hash1"},"range":{"S":"range4"},"other":{"B":"aGVsbG8gd29ybGQ="}}'
                ], 'expected data backed up to S3');

                assert.end();
            });
        });
    });
});

dynamodb.test('backup: parallel', records, function(assert) {
    var config = {
        backup: {
            bucket: 'mapbox',
            prefix: 'dynamodb-replicator/test',
            jobid: crypto.randomBytes(4).toString('hex')
        },
        table: dynamodb.tableName,
        region: 'us-east-1',
        accessKeyId: 'fake',
        secretAccessKey: 'fake',
        endpoint: 'http://localhost:4567',
        segments: 2
    };

    var firstConfig = _({ segment: 0 }).extend(config);
    var secondConfig = _({ segment: 1 }).extend(config);
    var firstKey = [config.backup.prefix, config.backup.jobid, firstConfig.segment].join('/');
    var secondKey = [config.backup.prefix, config.backup.jobid, secondConfig.segment].join('/');

    queue(1)
        .defer(backup, firstConfig)
        .defer(backup, secondConfig)
        .defer(s3.getObject.bind(s3), { Bucket: 'mapbox', Key: firstKey })
        .defer(s3.getObject.bind(s3), { Bucket: 'mapbox', Key: secondKey })
        .awaitAll(function(err, results) {
            assert.ifError(err, 'all requests completed');
            if (err) return assert.end();

            assert.equal(results[0].count + results[1].count, 1000, 'reported 1000 records');

            var s3results = results.slice(2);
            zlib.gunzip(s3results[0].Body, function(err, first) {
                assert.ifError(err, 'gzipped backup');
                zlib.gunzip(s3results[1].Body, function(err, second) {
                    assert.ifError(err, 'gzipped backup');
                    first = first.toString().trim().split('\n');
                    second = second.toString().trim().split('\n');
                    assert.equal(first.length + second.length, 1000, 'backed up all records');
                    assert.end();
                });
            });
        });
});

dynamodb.delete();
dynamodb.close();


================================================
FILE: test/diff-record.test.js
================================================
var test = require('tape');
var primary = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'));
var replica = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'));
var exec = require('child_process').exec;
var queue = require('queue-async');
var diffRecord = require('path').resolve(__dirname, '..', 'bin', 'diff-record.js');

var primaryItems = [
    {hash: 'hash1', range: 'range1', other:1},
    {hash: 'hash1', range: 'range2', other:2},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

var replicaItems = [
    {hash: 'hash1', range: 'range2', other:10000},
    {hash: 'hash1', range: 'range3', other:3},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

primary.start();
primary.load(primaryItems);
replica.start();
replica.load(replicaItems);

test('diff-record', function(assert) {
    queue()
        .defer(function(next) {
            var cmd = [
                diffRecord,
                'local/' + primary.tableName,
                'local/' + replica.tableName,
                '\'{"hash":"hash1","range":"range2"}\''
            ].join(' ');
            exec(cmd, function(err, stdout) {
                assert.ifError(err, '[different] does not error');
                assert.ok(/✘/.test(stdout), '[different] reports difference');
                next();
            });
        })
        .defer(function(next) {
            var cmd = [
                diffRecord,
                'local/' + primary.tableName,
                'local/' + replica.tableName,
                '\'{"hash":"hash1","range":"range4"}\''
            ].join(' ');
            exec(cmd, function(err, stdout) {
                assert.ifError(err, '[equivalent] does not error');
                assert.ok(/✔/.test(stdout), '[equivalent] reports equivalence');
                next();
            });
        })
        .awaitAll(function() {
            assert.end();
        });
});

primary.delete();
replica.delete();
primary.close();


================================================
FILE: test/diff.test.js
================================================
var test = require('tape');
var primary = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'));
var replica = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'));
var diff = require('../diff');
var util = require('util');
var _ = require('underscore');
var crypto = require('crypto');
var parse_location = require('../parse-location');

var primaryItems = [
    {hash: 'hash1', range: 'range1', other:1},
    {hash: 'hash1', range: 'range2', other:2},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

var replicaItems = [
    {hash: 'hash1', range: 'range2', other:10000},
    {hash: 'hash1', range: 'range3', other:3},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

var records = _.range(1000).map(function() {
    return {
        hash: crypto.randomBytes(8).toString('hex'),
        range: crypto.randomBytes(8).toString('hex'),
        other: crypto.randomBytes(8)
    };
});

var config = {
    primary: {
        table: primary.tableName,
        region: 'fake',
        endpoint: 'http://localhost:4567'
    },
    replica: {
        table: replica.tableName,
        region: 'fake',
        endpoint: 'http://localhost:4567'
    },
    log: function() {
        config.log.messages.push(util.format.apply(this, arguments));
    }
};

config.log.messages = [];

primary.start();
replica.start();

primary.load(primaryItems);
replica.load(replicaItems);

test('diff: without repairs', function(assert) {
    config.repair = false;
    config.log.messages = [];

    diff(config, function(err, discrepancies) {
        assert.ifError(err, 'diff tables');
        if (err) return assert.end(err);

        assert.equal(discrepancies, 4, 'four discrepancies');

        assert.equal(_.difference(config.log.messages, [
            'Scanning primary table and comparing to replica',
            '[missing] {"hash":"hash1","range":"range1"}',
            '[different] {"hash":"hash1","range":"range2"}',
            '[discrepancies] 2',
            'Scanning replica table and comparing to primary',
            '[extraneous] {"hash":"hash1","range":"range3"}',
            '[different] {"hash":"hash1","range":"range2"}',
            '[discrepancies] 2',
            '[progress] Scan rate: 6 items @ 6 items/s | Compare rate: 6 items/s'
        ]).length, 0, 'expected log messages');

        config.log.messages = [];
        diff(config, function(err, discrepancies) {
            assert.ifError(err, 'diff tables');
            if (err) return assert.end();

            assert.equal(discrepancies, 4, 'four discrepancies on second comparison');

            assert.equal(_.difference(config.log.messages, [
                'Scanning primary table and comparing to replica',
                '[missing] {"hash":"hash1","range":"range1"}',
                '[different] {"hash":"hash1","range":"range2"}',
                '[discrepancies] 2',
                'Scanning replica table and comparing to primary',
                '[extraneous] {"hash":"hash1","range":"range3"}',
                '[different] {"hash":"hash1","range":"range2"}',
                '[discrepancies] 2',
                '[progress] Scan rate: 6 items @ 6 items/s | Compare rate: 6 items/s'
            ]).length, 0, 'expected log messages');

            assert.end();
        });
    });

});

test('diff: with repairs', function(assert) {
    config.repair = true;
    config.log.messages = [];

    diff(config, function(err, discrepancies) {
        assert.ifError(err, 'diff tables');
        if (err) return assert.end(err);

        assert.equal(discrepancies, 3, 'three discrepancies');

        assert.equal(_.difference(config.log.messages, [
            'Scanning primary table and comparing to replica',
            '[missing] {"hash":"hash1","range":"range1"}',
            '[different] {"hash":"hash1","range":"range2"}',
            '[discrepancies] 2',
            'Scanning replica table and comparing to primary',
            '[extraneous] {"hash":"hash1","range":"range3"}',
            '[discrepancies] 1',
            '[progress] Scan rate: 7 items @ 7 items/s | Compare rate: 7 items/s'
        ]).length, 0, 'expected log messages');

        config.repair = false;
        config.log.messages = [];
        diff(config, function(err, discrepancies) {
            assert.ifError(err, 'diff tables');
            if (err) return assert.end();

            assert.equal(discrepancies, 0, 'no discrepancies on second comparison');

            assert.equal(_.difference(config.log.messages, [
                'Scanning primary table and comparing to replica',
                '[discrepancies] 0',
                'Scanning replica table and comparing to primary',
                '[discrepancies] 0',
                '[progress] Scan rate: 6 items @ 6 items/s | Compare rate: 6 items/s'
            ]).length, 0, 'expected log messages');

            assert.end();
        });
    });
});

primary.empty();
replica.empty();
primary.load(primaryItems);
replica.load(replicaItems);

test('diff: backfill', function(assert) {
    config.repair = true;
    config.backfill = true;
    config.log.messages = [];

    diff(config, function(err, discrepancies) {
        assert.ifError(err, 'diff tables');
        if (err) return assert.end();

        assert.equal(discrepancies, 3, 'three records backfilled');
        assert.deepEqual(config.log.messages, [
            'Scanning primary table and comparing to replica',
            '[backfill] {"hash":"hash1","range":"range1"}',
            '[backfill] {"hash":"hash1","range":"range2"}',
            '[backfill] {"hash":"hash1","range":"range4"}',
            '[discrepancies] 3',
            '[progress] Scan rate: 3 items @ 3 items/s | Compare rate: 3 items/s'
        ]);

        config.repair = false;
        config.backfill = false;
        config.log.messages = [];
        diff(config, function(err, discrepancies) {
            assert.ifError(err, 'diff tables');
            if (err) return assert.end();

            assert.equal(discrepancies, 1, 'only an extraneous discrepancy on second comparison');
            assert.deepEqual(config.log.messages, [
                'Scanning primary table and comparing to replica',
                '[discrepancies] 0',
                'Scanning replica table and comparing to primary',
                '[extraneous] {"hash":"hash1","range":"range3"}',
                '[discrepancies] 1',
                '[progress] Scan rate: 7 items @ 7 items/s | Compare rate: 7 items/s'
            ]);

            assert.end();
        });
    });
});

primary.empty();
replica.empty();
primary.load(records);

test('diff: repair/backfill large number of discrepancies', function(assert) {
    config.repair = true;
    config.backfill = true;
    config.log.messages = [];

    diff(config, function(err, discrepancies) {
        assert.ifError(err, 'diff tables');
        if (err) return assert.end();
        assert.equal(discrepancies, 1000, '1000 records backfilled');

        config.repair = false;
        config.backfill = false;
        config.log.messages = [];
        diff(config, function(err, discrepancies) {
            assert.ifError(err, 'diff tables');
            if (err) return assert.end();

            assert.equal(discrepancies, 0, '0 discrepancies post-backfill');
            assert.end();
        });
    });
});

primary.empty();
replica.empty();
primary.load(records);

test('diff: parallel', function(assert) {
    config.repair = false;
    config.backfill = false;
    config.segment = 0;
    config.segments = 10;
    config.log.messages = [];

    diff(config, function(err, discrepancies) {
        assert.ifError(err, 'diff tables');
        if (err) return assert.end();

        assert.ok(discrepancies < 1000, 'scanned partial table');
        assert.end();
    });
});

primary.delete();
replica.delete();
primary.close();

test('diff: parsing locations', function(assert) {
    // Testing with local region and endpoint URL
    primary = '127.0.0.1:8000/table1'.split('/');
    replica = 'localhost:8000/table2'.split('/');
    var locations = parse_location.parse(primary, replica);
    primary = locations[0];
    replica = locations[1];
    assert.ok(primary.endpoint === 'http://127.0.0.1:8000' && primary.region === 'local',
        'got region and endpoint from local ip');
    assert.ok(replica.endpoint === 'http://localhost:8000' && replica.region === 'local',
        'got region and endpoint from localhost');

    // Testing with valid AWS region (Using Beijing region)
    primary = 'cn-north-1/table1'.split('/');
    replica = 'cn-north-1/table2'.split('/');
    locations = parse_location.parse(primary, replica);
    primary = locations[0];
    replica = locations[1];
    assert.ok(primary.endpoint === undefined && primary.region === 'cn-north-1' && primary.table === 'table1',
        'got endpoint, region and table from AWS region');
    assert.ok(replica.endpoint === undefined && replica.region === 'cn-north-1' && replica.table === 'table2',
        'got endpoint, region and table from AWS region');
    assert.end();
});

================================================
FILE: test/fixtures/events/adjust-many.json
================================================
{
    "Records":[
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"MODIFY",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "11"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":59,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"222",
                "OldImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"2",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-3"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-3"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-2"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-2"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"MODIFY",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "22"
                    },
                    "id": {
                        "S": "record-2"
                    }
                },
                "SizeBytes":59,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"222",
                "OldImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-2"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-2"
                    }
                }
            },
            "eventID":"2",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"REMOVE",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "SizeBytes":38,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"333",
                "OldImage":{
                    "range": {
                        "N": "11"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"3",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"MODIFY",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "33"
                    },
                    "id": {
                        "S": "record-3"
                    }
                },
                "SizeBytes":59,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"222",
                "OldImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-3"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-3"
                    }
                }
            },
            "eventID":"2",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        }
    ]
}


================================================
FILE: test/fixtures/events/insert-buffer.json
================================================
{
    "Records":[
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    },
                    "val": {
                        "B": "aGVsbG8="
                    },
                    "map": {
                        "M": {
                            "prop": {
                                "B": "aGVsbG8="
                            }
                        }
                    },
                    "list": {
                        "L": [
                            {
                                "S": "string"
                            },
                            {
                                "B": "aGVsbG8="
                            }
                        ]
                    },
                    "bufferSet": {
                        "BS": [
                            "aGVsbG8="
                        ]
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        }
    ]
}


================================================
FILE: test/fixtures/events/insert-modify-delete.json
================================================
{
    "Records":[
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"MODIFY",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "2"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":59,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"222",
                "OldImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"2",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"REMOVE",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "SizeBytes":38,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"333",
                "OldImage":{
                    "range": {
                        "N": "2"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"3",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        }
    ]
}


================================================
FILE: test/fixtures/events/insert-modify.json
================================================
{
    "Records":[
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        },
        {
            "eventName":"MODIFY",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "2"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":59,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"222",
                "OldImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"2",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        }
    ]
}


================================================
FILE: test/fixtures/events/insert.json
================================================
{
    "Records":[
        {
            "eventName":"INSERT",
            "eventVersion":"1.0",
            "eventSource":"aws:dynamodb",
            "dynamodb": {
                "NewImage":{
                    "range": {
                        "N": "1"
                    },
                    "id": {
                        "S": "record-1"
                    }
                },
                "SizeBytes":26,
                "StreamViewType":"NEW_AND_OLD_IMAGES",
                "SequenceNumber":"111",
                "Keys":{
                    "id": {
                        "S": "record-1"
                    }
                }
            },
            "eventID":"1",
            "eventSourceARN":"arn:aws:dynamodb:us-east-1:123456789012:table/fake",
            "awsRegion":"us-east-1"
        }
    ]
}


================================================
FILE: test/fixtures/records.js
================================================
var crypto = require('crypto');
var _ = require('underscore');

module.exports = function(num) {
    return _.range(0, num).map(function() {
        return {
            id: crypto.randomBytes(16).toString('hex'),
            data: crypto.randomBytes(256).toString('base64')
        };
    });
};


================================================
FILE: test/fixtures/table.js
================================================
module.exports = {
    AttributeDefinitions: [
        {AttributeName: 'id', AttributeType: 'S'}
    ],
    KeySchema: [
        {AttributeName: 'id', KeyType: 'HASH'}
    ],
    ProvisionedThroughput: {
        ReadCapacityUnits: 1,
        WriteCapacityUnits: 1
    }
};


================================================
FILE: test/incremental.test.js
================================================
var test = require('tape');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var crypto = require('crypto');
var tableDef = require('./fixtures/table');
var dynamodb = require('@mapbox/dynamodb-test')(test, 's3-backfill', tableDef);
var Dyno = require('@mapbox/dyno');
var queue = require('queue-async');
var zlib = require('zlib');
var path = require('path');
var os = require('os');
var fs = require('fs');

var backfill = require('../s3-backfill');
var snapshot = require('../s3-snapshot');

var bucket = 'mapbox';
var prefix = 'dynamodb-replicator/test/' + crypto.randomBytes(4).toString('hex');
var records = require('./fixtures/records')(50);

dynamodb.test('[s3-backfill]', records, function(assert) {

    var config = {
        table: dynamodb.tableName,
        region: 'fake',
        endpoint: 'http://localhost:4567',
        accessKeyId: 'fake',
        secretAccessKey: 'fake',
        backup: {
            bucket: bucket,
            prefix: prefix
        }
    };

    backfill(config, function(err) {
        console.log('\n');
        assert.ifError(err, 'success');

        s3.listObjects({
            Bucket: bucket,
            Prefix: prefix
        }, function(err, data) {
            if (err) throw err;
            checkS3(data.Contents.map(function(item) {
                return item.Key;
            }));
        });
    });

    function checkS3(keys) {
        assert.equal(keys.length, 50, 'all records written to S3');
        var q = queue(20);

        records.forEach(function(expected) {
            var key = crypto.createHash('sha256')
                .update(Dyno.serialize({ id: expected.id }))
                .digest('hex');

            key = [prefix, dynamodb.tableName, key].join('/');
            expected = Dyno.serialize(expected);

            assert.ok(keys.indexOf(key) > -1, 'expected item written for ' + key);
            q.defer(function(next) {
                s3.getObject({
                    Bucket: bucket,
                    Key: key
                }, function(err, data) {
                    if (err) throw err;
                    assert.equal(data.Body.toString(), expected, 'expected data for ' + key);
                    next();
                });
            });
        });

        q.awaitAll(function() {
            assert.end();
        });
    }
});

test('[s3-snapshot]', function(assert) {
    var snapshotKey = [prefix, dynamodb.tableName, 'snapshot'].join('/');
    var log = path.join(os.tmpdir(), crypto.randomBytes(16).toString('hex'));

    var config = {
        source: {
            bucket: bucket,
            prefix: [prefix, dynamodb.tableName].join('/')
        },
        destination: {
            bucket: bucket,
            key: snapshotKey
        },
        logger: fs.createWriteStream(log)
    };

    snapshot(config, function(err, details) {
        assert.ifError(err, 'success');
        assert.equal(details.count, 50, 'reported 50 items');
        assert.ok(details.size, 'reported size');

        var result = '';
        var gunzip = zlib.createGunzip()
            .on('readable', function() {
                var d = gunzip.read();
                if (d) result += d;
            })
            .on('end', function() {
                checkFile(result);
            });

        s3.getObject({
            Bucket: bucket,
            Key: snapshotKey
        }).createReadStream().pipe(gunzip);
    });

    function checkFile(found) {
        found = found.trim().split('\n').map(function(line) {
            return JSON.parse(line);
        });

        var expected = records.reduce(function(expected, item) {
            expected[item.id] = Dyno.serialize(item);
            return expected;
        }, {});

        assert.equal(found.length, 50, 'all objects snapshotted');

        found.forEach(function(item) {
            var id = item.id.S;
            var original = expected[id];
            assert.equal(JSON.stringify(item), original, 'expected item in snapshot ' + id);
        });
        checkLog();
    }

    function checkLog() {
        fs.readFile(log, 'utf8', function(err, data) {
            assert.ifError(err, 'log file was written');
            assert.ok(data.length, 'has logs in it');
            assert.end();
        });
    }
});

dynamodb.close();


================================================
FILE: test/index.test.js
================================================
var test = require('tape');
var tableDef = require('./fixtures/table');
var DynamoDB = require('@mapbox/dynamodb-test');
var replica = DynamoDB(test, 'mapbox-replicator', tableDef);
var Dyno = require('@mapbox/dyno');
var path = require('path');
var events = path.resolve(__dirname, 'fixtures', 'events');
var main = require('..');
var replicate = require('..').replicate;
var backup = require('..').backup;
var _ = require('underscore');
var crypto = require('crypto');
var AWS = require('aws-sdk');
var s3 = new AWS.S3();
var queue = require('queue-async');

replica.start();

var dyno = Dyno({
    table: replica.tableName,
    region: 'mock',
    accessKeyId: 'mock',
    secretAccessKey: 'mock',
    endpoint: 'http://localhost:4567'
});

process.env.ReplicaTable = replica.tableName;
process.env.ReplicaRegion = 'mock';
process.env.ReplicaEndpoint = 'http://localhost:4567';
process.env.AWS_ACCESS_KEY_ID = 'mock';
process.env.AWS_SECRET_ACCESS_KEY = 'mock';
process.env.BackupBucket = 'mapbox';

var httpsAgent;
test('[agent] use http agent for replication tests', function(assert) {
    httpsAgent = main.agent;
    main.agent = require('http').globalAgent;
    assert.end();
});

replica.test('[replicate] insert', function(assert) {
    var event = require(path.join(events, 'insert.json'));
    replicate(event, {}, function(err) {
        assert.ifError(err, 'success');
        dyno.scan(function(err, data) {
            if (err) throw err;
            assert.deepEqual(data, { Count: 1, Items: [{ id: 'record-1', range: 1 }], ScannedCount: 1 }, 'inserted desired record');
            assert.end();
        });
    });
});

replica.test('[replicate] insert & modify', function(assert) {
    var event = require(path.join(events, 'insert-modify.json'));
    replicate(event, {}, function(err) {
        assert.ifError(err, 'success');
        dyno.scan(function(err, data) {
            if (err) throw err;
            assert.deepEqual(data, { Count: 1, Items: [{ id: 'record-1', range: 2 }], ScannedCount: 1 }, 'inserted & modified desired record');
            assert.end();
        });
    });
});

replica.test('[replicate] insert, modify & delete', function(assert) {
    var event = require(path.join(events, 'insert-modify-delete.json'));
    replicate(event, {}, function(err) {
        assert.ifError(err, 'success');
        dyno.scan(function(err, data) {
            if (err) throw err;
            assert.deepEqual(data, { Count: 0, Items: [], ScannedCount: 0 }, 'inserted, modified, and deleted desired record');
            assert.end();
        });
    });
});

replica.test('[replicate] adjust many', function(assert) {
    var event = require(path.join(events, 'adjust-many.json'));
    replicate(event, {}, function(err) {
        assert.ifError(err, 'success');
        dyno.scan(function(err, data) {
            if (err) throw err;

            var expected = [
                { range: 22, id: 'record-2' },
                { range: 33, id: 'record-3' }
            ];

            data = data.Items.map(Dyno.serialize);
            expected = expected.map(Dyno.serialize);

            assert.equal(
                _.intersection(data, expected).length,
                expected.length,
                'adjusted many records correctly'
            );

            assert.end();
        });
    });
});

replica.test('[lambda] insert with buffers', function(assert) {
    var event = require(path.join(events, 'insert-buffer.json'));
    replicate(event, {}, function(err) {
        assert.ifError(err, 'success');
        dyno.scan(function(err, data) {
            if (err) throw err;

            var expected = {
                range: 1,
                id: 'record-1',
                val: Buffer.from('hello'),
                map: { prop: Buffer.from('hello') },
                list: ['string', Buffer.from('hello')],
                bufferSet: Dyno.createSet([Buffer.from('hello')], 'B')
            };

            data = data.Items[0];

            assert.equal(data.range, expected.range, 'expected range');
            assert.equal(data.id, expected.id, 'expected id');
            assert.deepEqual(data.val, expected.val, 'expected val');
            assert.deepEqual(data.map, expected.map, 'expected map');
            assert.deepEqual(data.list, expected.list, 'expected list');
            assert.deepEqual(data.bufferSet.contents, expected.bufferSet.contents, 'expected bufferSet.contents');
            assert.end();
        });
    });
});

test('[agent] return agent to normal', function(assert) {
    main.agent = httpsAgent;
    assert.end();
});

test('[incremental backup] configurable region', function(assert) {
    process.env.BackupRegion = 'fake';
    assert.plan(2);

    var S3 = AWS.S3;
    AWS.S3 = function(config) {
        assert.equal(config.region, 'fake', 'configured region on S3 client');
    };

    backup({ Records: [] }, {}, function(err) {
        assert.ifError(err, 'backup success');
        AWS.S3 = S3;
        delete process.env.BackupRegion;
    });
});

test('[incremental backup] insert', function(assert) {
    process.env.BackupPrefix = 'dynamodb-replicator/test/' + crypto.randomBytes(4).toString('hex');

    var event = require(path.join(events, 'insert.json'));
    var table = event.Records[0].eventSourceARN.split('/')[1];
    var id = crypto.createHash('sha256')
        .update(JSON.stringify(event.Records[0].dynamodb.Keys))
        .digest('hex');

    backup(event, {}, function(err) {
        assert.ifError(err, 'success');

        s3.getObject({
            Bucket: process.env.BackupBucket,
            Key: [process.env.BackupPrefix, table, id].join('/')
        }, function(err, data) {
            assert.ifError(err, 'no S3 error');
            assert.ok(data.Body, 'got S3 object');

            var found = JSON.parse(data.Body.toString());
            var expected = { range: { N:'1' }, id: { S: 'record-1' } };
            assert.deepEqual(found, expected, 'expected item put to S3');
            assert.end();
        });
    });
});

test('[incremental backup] insert & modify', function(assert) {
    process.env.BackupPrefix = 'dynamodb-replicator/test/' + crypto.randomBytes(4).toString('hex');

    var event = require(path.join(events, 'insert-modify.json'));
    var table = event.Records[0].eventSourceARN.split('/')[1];
    var id = crypto.createHash('sha256')
        .update(JSON.stringify(event.Records[0].dynamodb.Keys))
        .digest('hex');

    backup(event, {}, function(err) {
        assert.ifError(err, 'success');

        s3.getObject({
            Bucket: process.env.BackupBucket,
            Key: [process.env.BackupPrefix, table, id].join('/')
        }, function(err, data) {
            assert.ifError(err, 'no S3 error');
            assert.ok(data.Body, 'got S3 object');

            var found = JSON.parse(data.Body.toString());
            var expected = { range: { N:'2' }, id: { S: 'record-1' } };
            assert.deepEqual(found, expected, 'expected item modified on S3');
            assert.end();
        });
    });
});

test('[incremental backup] insert, modify & delete', function(assert) {
    process.env.BackupPrefix = 'dynamodb-replicator/test/' + crypto.randomBytes(4).toString('hex');

    var event = require(path.join(events, 'insert-modify-delete.json'));
    var table = event.Records[0].eventSourceARN.split('/')[1];
    var id = crypto.createHash('sha256')
        .update(JSON.stringify(event.Records[0].dynamodb.Keys))
        .digest('hex');

    backup(event, {}, function(err) {
        assert.ifError(err, 'success');

        s3.getObject({
            Bucket: process.env.BackupBucket,
            Key: [process.env.BackupPrefix, table, id].join('/')
        }, function(err) {
            assert.equal(err && err.code, 'NoSuchKey', 'object was deleted');
            assert.end();
        });
    });
});

test('[incremental backup] adjust many', function(assert) {
    process.env.BackupPrefix = 'dynamodb-replicator/test/' + crypto.randomBytes(4).toString('hex');

    var event = require(path.join(events, 'adjust-many.json'));
    var table = event.Records[0].eventSourceARN.split('/')[1];

    var expected = [
        { range: { N: '22' }, id: { S: 'record-2' } },
        { range: { N: '33' }, id: { S: 'record-3' } }
    ];

    backup(event, {}, function(err) {
        assert.ifError(err, 'success');
        var q = queue();

        expected.forEach(function(record) {
            q.defer(function(next) {
                var key = { id: record.id };
                var id = crypto.createHash('sha256')
                    .update(JSON.stringify(key))
                    .digest('hex');

                s3.getObject({
                    Bucket: process.env.BackupBucket,
                    Key: [process.env.BackupPrefix, table, id].join('/')
                }, function(err, data) {
                    assert.ifError(err, 'no S3 error for ' + JSON.stringify(key));
                    if (!data) return next();
                    assert.ok(data.Body, 'got S3 object for ' + JSON.stringify(key));

                    var found = JSON.parse(data.Body.toString());
                    assert.deepEqual(found, record, 'expected item modified on S3 for ' + JSON.stringify(key));
                    next();
                });
            });
        });

        q.defer(function(next) {
            var id = crypto.createHash('sha256')
                .update(JSON.stringify({ id: { S: 'record-1' } }))
                .digest('hex');

            s3.getObject({
                Bucket: process.env.BackupBucket,
                Key: [process.env.BackupPrefix, table, id].join('/')
            }, function(err) {
                assert.equal(err && err.code, 'NoSuchKey', 'object was deleted');
                next();
            });
        });

        q.awaitAll(function() {
            assert.end();
        });
    });
});

replica.close();


================================================
FILE: test/live-test.backup-table.js
================================================
var test = require('tape');
var dynamodb = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table.json'), 'us-east-1');
var exec = require('child_process').exec;
var path = require('path');
var crypto = require('crypto');
var AWS = require('aws-sdk');
var queue = require('queue-async');

var primaryItems = [
    {hash: 'hash1', range: 'range1', other:1},
    {hash: 'hash1', range: 'range2', other:2},
    {hash: 'hash1', range: 'range4', other: Buffer.from('hello world')}
];

var starttime = (new Date()).toISOString();

dynamodb.start();

dynamodb.test('backup-table shell script', primaryItems, function(assert) {
    var jobid = crypto.randomBytes(4).toString('hex');

    var cmd = [
        path.resolve(__dirname, '..', 'bin', 'backup-table.js'),
        'us-east-1/' + dynamodb.tableName,
        's3://mapbox/dynamodb-replicator/test',
        '--jobid', jobid,
        '--segment 0',
        '--segments 1',
        '--metric Mapbox'
    ].join(' ');

    exec(cmd, function(err) {
        assert.ifError(err, 'success');
        var s3 = new AWS.S3();
        var cw = new AWS.CloudWatch({ region: 'us-east-1' });

        console.log('Waiting 60s for CW to land...');

        setTimeout(function() {
            queue()
                .defer(function(next) {
                    s3.getObject({
                        Bucket: 'mapbox',
                        Key: 'dynamodb-replicator/test/' + jobid + '/0'
                    }, function(err, data) {
                        assert.ifError(err, 'S3 getObject success');
                        assert.ok(data.Body, 'Backup written to s3');
                        next();
                    });
                })
                .defer(function(next) {
                    cw.getMetricStatistics({
                        Namespace: 'Mapbox',
                        Dimensions: [
                            {
                                Name: 'TableName',
                                Value: dynamodb.tableName
                            }
                        ],
                        MetricName: 'BackupSize',
                        StartTime: starttime,
                        EndTime: (new Date()).toISOString(),
                        Period: 60,
                        Statistics: ['Sum']
                    }, function(err, data) {
                        assert.ifError(err, 'CW BackupSize success');
                        if (!data.Datapoints || !data.Datapoints.length) {
                            assert.fail('No CW data found');
                            return next();
                        }
                        assert.equal(data.Datapoints.length, 1, 'BackupSize put to CW');
                        assert.equal(data.Datapoints[0].Sum, 101, 'Correct BackupSize value on CW');
                        next();
                    });
                })
                .defer(function(next) {
                    cw.getMetricStatistics({
                        Namespace: 'Mapbox',
                        Dimensions: [
                            {
                                Name: 'TableName',
                                Value: dynamodb.tableName
                            }
                        ],
                        MetricName: 'BackupRecordCount',
                        StartTime: starttime,
                        EndTime: (new Date()).toISOString(),
                        Period: 60,
                        Statistics: ['Sum']
                    }, function(err, data) {
                        assert.ifError(err, 'CW BackupRecordCount success');
                        if (!data.Datapoints || !data.Datapoints.length) {
                            assert.fail('No CW data found');
                            return next();
                        }
                        assert.equal(data.Datapoints.length, 1, 'BackupRecordCount put to CW');
                        assert.equal(data.Datapoints[0].Sum, 3, 'Correct BackupRecordCount value on CW');
                        next();
                    });
                })
                .await(function() {
                    assert.end();
                });
        }, 60000);
    });
});

dynamodb.delete();


================================================
FILE: test/table.json
================================================
{
    "AttributeDefinitions": [
        {
            "AttributeName": "hash",
            "AttributeType": "S"
        },
        {
            "AttributeName": "range",
            "AttributeType": "S"
        }
    ],
    "KeySchema": [
        {
            "AttributeName": "hash",
            "KeyType": "HASH"
        },
        {
            "AttributeName": "range",
            "KeyType": "RANGE"
        }
    ],
    "ProvisionedThroughput": {
        "ReadCapacityUnits": 10,
        "WriteCapacityUnits": 10
    }
}
SYMBOL INDEX (23 symbols across 14 files)

FILE: backup.js
  function next (line 57) | function next(err) {

FILE: bin/backup-table.js
  function usage (line 10) | function usage() {

FILE: bin/diff-record.js
  function usage (line 7) | function usage() {

FILE: bin/diff-tables.js
  function usage (line 9) | function usage() {

FILE: bin/incremental-backfill.js
  function usage (line 7) | function usage() {

FILE: bin/incremental-backup-record.js
  function usage (line 10) | function usage() {

FILE: bin/incremental-diff-record.js
  function usage (line 13) | function usage() {

FILE: bin/incremental-record-history.js
  function usage (line 13) | function usage() {

FILE: bin/incremental-snapshot.js
  function usage (line 11) | function usage() {

FILE: bin/replicate-record.js
  function usage (line 8) | function usage() {

FILE: diff.js
  function report (line 25) | function report() {
  function Aggregate (line 34) | function Aggregate() {
  function Compare (line 56) | function Compare(readFrom, compareTo, keySchema, deleteMissing) {
  function Write (line 185) | function Write() {
  function scanPrimary (line 243) | function scanPrimary(keySchema) {
  function scanReplica (line 266) | function scanReplica(keySchema) {
  function finish (line 288) | function finish(err) {

FILE: index.js
  function replicate (line 16) | function replicate(event, context, callback) {
  function incrementalBackup (line 119) | function incrementalBackup(event, context, callback) {

FILE: s3-backfill.js
  function backfill (line 16) | function backfill(config, done) {

FILE: test/incremental.test.js
  function checkS3 (line 50) | function checkS3(keys) {
  function checkFile (line 118) | function checkFile(found) {
  function checkLog (line 138) | function checkLog() {
Condensed preview: 38 files, each showing path, character count, and a content snippet.
[
  {
    "path": ".eslintrc",
    "chars": 285,
    "preview": "{\n    \"rules\": {\n        \"indent\": [2, 4],\n        \"quotes\": [2, \"single\"],\n        \"no-console\": [0]\n    },\n    \"env\": "
  },
  {
    "path": ".gitignore",
    "chars": 23,
    "preview": "node_modules\n.DS_Store\n"
  },
  {
    "path": "CODEOWNERS",
    "chars": 47,
    "preview": "# global owners\n*       @mapbox/cloud-platform\n"
  },
  {
    "path": "DESIGN.md",
    "chars": 4298,
    "preview": "# Design\n\n## Replication\n\nThis replication system is built such that there is a **primary** table and a **replica** tabl"
  },
  {
    "path": "LICENSE.txt",
    "chars": 746,
    "preview": "\nISC License\n\nCopyright (c) 2017, Mapbox\n\nPermission to use, copy, modify, and/or distribute this software for any\npurpo"
  },
  {
    "path": "README.md",
    "chars": 7621,
    "preview": "# dynamodb-replicator\n\n[dynamodb-replicator](https://github.com/mapbox/dynamodb-replicator) offers several different mec"
  },
  {
    "path": "backup.js",
    "chars": 2046,
    "preview": "var AWS = require('aws-sdk');\nvar Dyno = require('@mapbox/dyno');\nvar stream = require('stream');\nvar zlib = require('zl"
  },
  {
    "path": "bin/backup-table.js",
    "chars": 2901,
    "preview": "#!/usr/bin/env node\n\nvar backup = require('../backup');\nvar fastlog = require('../fastlog');\nvar args = require('minimis"
  },
  {
    "path": "bin/diff-record.js",
    "chars": 2639,
    "preview": "#!/usr/bin/env node\n\nvar Dyno = require('@mapbox/dyno');\nvar args = require('minimist')(process.argv.slice(2));\nvar asse"
  },
  {
    "path": "bin/diff-tables.js",
    "chars": 1698,
    "preview": "#!/usr/bin/env node\n\nvar diff = require('../diff');\nvar fastlog = require('../fastlog');\nvar args = require('minimist')("
  },
  {
    "path": "bin/incremental-backfill.js",
    "chars": 874,
    "preview": "#!/usr/bin/env node\n\nvar args = require('minimist')(process.argv.slice(2));\nvar s3urls = require('s3urls');\nvar backfill"
  },
  {
    "path": "bin/incremental-backup-record.js",
    "chars": 2025,
    "preview": "#!/usr/bin/env node\n\nvar minimist = require('minimist');\nvar s3urls = require('s3urls');\nvar Dyno = require('@mapbox/dyn"
  },
  {
    "path": "bin/incremental-diff-record.js",
    "chars": 2895,
    "preview": "#!/usr/bin/env node\n\nvar minimist = require('minimist');\nvar s3urls = require('s3urls');\nvar Dyno = require('@mapbox/dyn"
  },
  {
    "path": "bin/incremental-record-history.js",
    "chars": 3784,
    "preview": "#!/usr/bin/env node\n\nvar minimist = require('minimist');\nvar s3urls = require('s3urls');\nvar crypto = require('crypto');"
  },
  {
    "path": "bin/incremental-snapshot.js",
    "chars": 3140,
    "preview": "#!/usr/bin/env node\n\nvar AWS = require('aws-sdk');\nvar args = require('minimist')(process.argv.slice(2));\nvar s3urls = r"
  },
  {
    "path": "bin/replicate-record.js",
    "chars": 1803,
    "preview": "#!/usr/bin/env node\n\nvar minimist = require('minimist');\nvar Dyno = require('@mapbox/dyno');\n\nvar args = minimist(proces"
  },
  {
    "path": "cloudformation/travis.template",
    "chars": 3813,
    "preview": "{\n    \"AWSTemplateFormatVersion\": \"2010-09-09\",\n    \"Description\": \"Travis user for testing dynamodb-replicator\",\n    \"R"
  },
  {
    "path": "diff.js",
    "chars": 11018,
    "preview": "var _ = require('underscore');\nvar queue = require('queue-async');\nvar Dyno = require('@mapbox/dyno');\nvar stream = requ"
  },
  {
    "path": "fastlog.js",
    "chars": 1586,
    "preview": "var _ = require('underscore');\nvar util = require('util');\n\nmodule.exports = function(category, level, template) {\n    c"
  },
  {
    "path": "index.js",
    "chars": 8039,
    "preview": "var AWS = require('aws-sdk');\nvar Dyno = require('@mapbox/dyno');\nvar queue = require('queue-async');\nvar crypto = requi"
  },
  {
    "path": "package.json",
    "chars": 1435,
    "preview": "{\n  \"name\": \"@mapbox/dynamodb-replicator\",\n  \"version\": \"10.1.1\",\n  \"description\": \"\",\n  \"main\": \"index.js\",\n  \"scripts\""
  },
  {
    "path": "parse-location.js",
    "chars": 716,
    "preview": "// Regex to find IP address and port\n// ex.: 127.0.0.1:80 or localhost:80\nvar regex = new RegExp('(\\\\b(?:[0-9]{1,3}\\\\.){"
  },
  {
    "path": "s3-backfill.js",
    "chars": 2894,
    "preview": "var Dyno = require('@mapbox/dyno');\nvar AWS = require('aws-sdk');\nvar stream = require('stream');\nvar queue = require('q"
  },
  {
    "path": "s3-snapshot.js",
    "chars": 2729,
    "preview": "var AWS = require('aws-sdk');\nvar s3scan = require('@mapbox/s3scan');\nvar zlib = require('zlib');\nvar stream = require('"
  },
  {
    "path": "test/backup.test.js",
    "chars": 4262,
    "preview": "var test = require('tape');\nvar dynamodb = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./tabl"
  },
  {
    "path": "test/diff-record.test.js",
    "chars": 2051,
    "preview": "var test = require('tape');\nvar primary = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table"
  },
  {
    "path": "test/diff.test.js",
    "chars": 9257,
    "preview": "var test = require('tape');\nvar primary = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./table"
  },
  {
    "path": "test/fixtures/events/adjust-many.json",
    "chars": 6325,
    "preview": "{\n    \"Records\":[\n        {\n            \"eventName\":\"INSERT\",\n            \"eventVersion\":\"1.0\",\n            \"eventSource"
  },
  {
    "path": "test/fixtures/events/insert-buffer.json",
    "chars": 1625,
    "preview": "{\n    \"Records\":[\n        {\n            \"eventName\":\"INSERT\",\n            \"eventVersion\":\"1.0\",\n            \"eventSource"
  },
  {
    "path": "test/fixtures/events/insert-modify-delete.json",
    "chars": 2659,
    "preview": "{\n    \"Records\":[\n        {\n            \"eventName\":\"INSERT\",\n            \"eventVersion\":\"1.0\",\n            \"eventSource"
  },
  {
    "path": "test/fixtures/events/insert-modify.json",
    "chars": 1856,
    "preview": "{\n    \"Records\":[\n        {\n            \"eventName\":\"INSERT\",\n            \"eventVersion\":\"1.0\",\n            \"eventSource"
  },
  {
    "path": "test/fixtures/events/insert.json",
    "chars": 828,
    "preview": "{\n    \"Records\":[\n        {\n            \"eventName\":\"INSERT\",\n            \"eventVersion\":\"1.0\",\n            \"eventSource"
  },
  {
    "path": "test/fixtures/records.js",
    "chars": 297,
    "preview": "var crypto = require('crypto');\nvar _ = require('underscore');\n\nmodule.exports = function(num) {\n    return _.range(0, n"
  },
  {
    "path": "test/fixtures/table.js",
    "chars": 273,
    "preview": "module.exports = {\n    AttributeDefinitions: [\n        {AttributeName: 'id', AttributeType: 'S'}\n    ],\n    KeySchema: ["
  },
  {
    "path": "test/incremental.test.js",
    "chars": 4305,
    "preview": "var test = require('tape');\nvar AWS = require('aws-sdk');\nvar s3 = new AWS.S3();\nvar crypto = require('crypto');\nvar tab"
  },
  {
    "path": "test/index.test.js",
    "chars": 9939,
    "preview": "var test = require('tape');\nvar tableDef = require('./fixtures/table');\nvar DynamoDB = require('@mapbox/dynamodb-test');"
  },
  {
    "path": "test/live-test.backup-table.js",
    "chars": 4246,
    "preview": "var test = require('tape');\nvar dynamodb = require('@mapbox/dynamodb-test')(test, 'dynamodb-replicator', require('./tabl"
  },
  {
    "path": "test/table.json",
    "chars": 529,
    "preview": "{\n    \"AttributeDefinitions\": [\n        {\n            \"AttributeName\": \"hash\",\n            \"AttributeType\": \"S\"\n        "
  }
]
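
Each entry in the condensed preview above has the same shape: a `path`, a `chars` count, and a `preview` snippet. As a minimal sketch of how such a list could be consumed, the hypothetical helper below (`summarizePreview` is an illustration, not part of dynamodb-replicator) totals character counts per top-level directory. The sample entries are copied from the preview, with the snippets shortened.

```javascript
// Hypothetical helper, not part of the repository: groups preview entries
// by top-level directory and sums their character counts.
function summarizePreview(entries) {
    var byDir = {};
    var total = 0;
    entries.forEach(function(entry) {
        var slash = entry.path.indexOf('/');
        // Files at the repository root are grouped under '.'
        var dir = slash === -1 ? '.' : entry.path.slice(0, slash);
        byDir[dir] = (byDir[dir] || 0) + entry.chars;
        total += entry.chars;
    });
    return { total: total, byDir: byDir };
}

// Three entries taken from the preview above (snippets shortened).
var sample = [
    { path: '.gitignore', chars: 23, preview: 'node_modules\n.DS_Store\n' },
    { path: 'bin/backup-table.js', chars: 2901, preview: '#!/usr/bin/env node' },
    { path: 'bin/diff-tables.js', chars: 1698, preview: '#!/usr/bin/env node' }
];

var summary = summarizePreview(sample);
console.log(summary.total, summary.byDir);
```

Running the same helper over the full 38-entry array would give a quick per-directory size breakdown without opening any individual file.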

About this extraction

This page contains the full source code of the mapbox/dynamodb-replicator GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 38 files (114.8 KB), approximately 26.9k tokens, and a symbol index with 23 extracted functions, classes, methods, constants, and types.

Extracted by GitExtract, a free GitHub repo to text converter for AI. Built by Nikandr Surkov.
