Repository: aws/sagemaker-tensorflow-serving-container
Branch: master
Commit: da192561ad81
Files: 130
Total size: 3.2 MB

Directory structure:
sagemaker-tensorflow-serving-container/

├── .github/
│   ├── ISSUE_TEMPLATE/
│   │   ├── bug_report.md
│   │   ├── config.yml
│   │   ├── documentation-request.md
│   │   └── feature_request.md
│   └── PULL_REQUEST_TEMPLATE.md
├── .gitignore
├── .jshintrc
├── .pylintrc
├── CHANGELOG.md
├── CODE_OF_CONDUCT.md
├── CONTRIBUTING.md
├── LICENSE
├── NOTICE
├── README.md
├── VERSION
├── buildspec.yml
├── docker/
│   ├── 1.11/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 1.12/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 1.13/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 1.14/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 1.15/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 2.0/
│   │   ├── Dockerfile.cpu
│   │   ├── Dockerfile.eia
│   │   └── Dockerfile.gpu
│   ├── 2.1/
│   │   ├── Dockerfile.cpu
│   │   └── Dockerfile.gpu
│   ├── __init__.py
│   └── build_artifacts/
│       ├── __init__.py
│       ├── deep_learning_container.py
│       ├── dockerd-entrypoint.py
│       └── sagemaker/
│           ├── __init__.py
│           ├── multi_model_utils.py
│           ├── nginx.conf.template
│           ├── python_service.py
│           ├── serve
│           ├── serve.py
│           ├── tensorflowServing.js
│           └── tfs_utils.py
├── scripts/
│   ├── build-all.sh
│   ├── build.sh
│   ├── curl.sh
│   ├── publish-all.sh
│   ├── publish.sh
│   ├── shared.sh
│   ├── start.sh
│   └── stop.sh
├── test/
│   ├── conftest.py
│   ├── data/
│   │   └── batch.csv
│   ├── integration/
│   │   ├── local/
│   │   │   ├── conftest.py
│   │   │   ├── multi_model_endpoint_test_utils.py
│   │   │   ├── test_container.py
│   │   │   ├── test_multi_model_endpoint.py
│   │   │   ├── test_multi_tfs.py
│   │   │   ├── test_nginx_config.py
│   │   │   ├── test_pre_post_processing.py
│   │   │   ├── test_pre_post_processing_mme.py
│   │   │   └── test_tfs_batching.py
│   │   └── sagemaker/
│   │       ├── conftest.py
│   │       ├── test_ei.py
│   │       ├── test_tfs.py
│   │       └── util.py
│   ├── perf/
│   │   ├── ab.sh
│   │   ├── create-endpoint.sh
│   │   ├── create-model.sh
│   │   ├── data_generator.py
│   │   ├── delete-endpoint.sh
│   │   ├── ec2-perftest.sh
│   │   └── perftest_endpoint.py
│   ├── resources/
│   │   ├── examples/
│   │   │   ├── test1/
│   │   │   │   └── inference.py
│   │   │   ├── test2/
│   │   │   │   └── inference.py
│   │   │   ├── test3/
│   │   │   │   ├── inference.py
│   │   │   │   └── requirements.txt
│   │   │   ├── test4/
│   │   │   │   ├── inference.py
│   │   │   │   └── lib/
│   │   │   │       └── dummy_module/
│   │   │   │           └── __init__.py
│   │   │   └── test5/
│   │   │       ├── inference.py
│   │   │       ├── lib/
│   │   │       │   └── dummy_module/
│   │   │       │       └── __init__.py
│   │   │       └── requirements.txt
│   │   ├── inputs/
│   │   │   ├── test-cifar.json
│   │   │   ├── test-gcloud.jsons
│   │   │   ├── test-generic.json
│   │   │   ├── test-large.csv
│   │   │   ├── test.csv
│   │   │   └── test.json
│   │   ├── mme/
│   │   │   ├── cifar/
│   │   │   │   └── 1540855709/
│   │   │   │       ├── saved_model.pb
│   │   │   │       └── variables/
│   │   │   │           ├── variables.data-00000-of-00001
│   │   │   │           └── variables.index
│   │   │   ├── half_plus_three/
│   │   │   │   ├── 00000123/
│   │   │   │   │   ├── assets/
│   │   │   │   │   │   └── foo.txt
│   │   │   │   │   ├── saved_model.pb
│   │   │   │   │   └── variables/
│   │   │   │   │       ├── variables.data-00000-of-00001
│   │   │   │   │       └── variables.index
│   │   │   │   └── 00000124/
│   │   │   │       ├── assets/
│   │   │   │       │   └── foo.txt
│   │   │   │       ├── saved_model.pb
│   │   │   │       └── variables/
│   │   │   │           ├── variables.data-00000-of-00001
│   │   │   │           └── variables.index
│   │   │   ├── half_plus_two/
│   │   │   │   └── 00000123/
│   │   │   │       ├── saved_model.pb
│   │   │   │       └── variables/
│   │   │   │           ├── variables.data-00000-of-00001
│   │   │   │           └── variables.index
│   │   │   └── invalid_version/
│   │   │       └── abcde/
│   │   │           └── dummy.txt
│   │   ├── mme_universal_script/
│   │   │   ├── code/
│   │   │   │   ├── inference.py
│   │   │   │   └── requirements.txt
│   │   │   └── half_plus_three/
│   │   │       └── model/
│   │   │           └── half_plus_three/
│   │   │               ├── 00000123/
│   │   │               │   ├── assets/
│   │   │               │   │   └── foo.txt
│   │   │               │   ├── saved_model.pb
│   │   │               │   └── variables/
│   │   │               │       ├── variables.data-00000-of-00001
│   │   │               │       └── variables.index
│   │   │               └── 00000124/
│   │   │                   ├── assets/
│   │   │                   │   └── foo.txt
│   │   │                   ├── saved_model.pb
│   │   │                   └── variables/
│   │   │                       ├── variables.data-00000-of-00001
│   │   │                       └── variables.index
│   │   └── models/
│   │       └── half_plus_three/
│   │           ├── .00000111/
│   │           │   └── .hidden_file
│   │           ├── 00000123/
│   │           │   ├── assets/
│   │           │   │   └── foo.txt
│   │           │   ├── saved_model.pb
│   │           │   └── variables/
│   │           │       ├── variables.data-00000-of-00001
│   │           │       └── variables.index
│   │           └── 00000124/
│   │               ├── assets/
│   │               │   └── foo.txt
│   │               ├── saved_model.pb
│   │               └── variables/
│   │                   ├── variables.data-00000-of-00001
│   │                   └── variables.index
│   └── unit/
│       ├── test_deep_learning_container.py
│       └── test_proxy_client.py
└── tox.ini

================================================
FILE CONTENTS
================================================

================================================
FILE: .github/ISSUE_TEMPLATE/bug_report.md
================================================
---
name: Bug report
about: File a report to help us reproduce and fix the problem
title: ''
labels: ''
assignees: ''

---

**Describe the bug**
A clear and concise description of what the bug is.

**To reproduce**
A clear, step-by-step set of instructions to reproduce the bug.

**Expected behavior**
A clear and concise description of what you expected to happen.

**Screenshots or logs**
If applicable, add screenshots or logs to help explain your problem.

**System information**
A description of your system. Please provide:
- **Toolkit version**:
- **Framework version**:
- **Python version**:
- **CPU or GPU**:
- **Custom Docker image (Y/N)**:

**Additional context**
Add any other context about the problem here.


================================================
FILE: .github/ISSUE_TEMPLATE/config.yml
================================================
blank_issues_enabled: false
contact_links:
  - name: Ask a question
    url: https://stackoverflow.com/questions/tagged/amazon-sagemaker
    about: Use Stack Overflow to ask and answer questions


================================================
FILE: .github/ISSUE_TEMPLATE/documentation-request.md
================================================
---
name: Documentation request
about: Request improved documentation
title: ''
labels: ''
assignees: ''

---

**What did you find confusing? Please describe.**
A clear and concise description of what you found confusing. Ex. I tried to [...] but I didn't understand how to [...]

**Describe how documentation can be improved**
A clear and concise description of where documentation was lacking and how it can be improved.

**Additional context**
Add any other context or screenshots about the documentation request here.


================================================
FILE: .github/ISSUE_TEMPLATE/feature_request.md
================================================
---
name: Feature request
about: Suggest new functionality for this toolkit
title: ''
labels: ''
assignees: ''

---

**Describe the feature you'd like**
A clear and concise description of the functionality you want.

**How would this feature be used? Please describe.**
A clear and concise description of the use case for this feature. Please provide an example, if possible.

**Describe alternatives you've considered**
A clear and concise description of any alternative solutions or features you've considered.

**Additional context**
Add any other context or screenshots about the feature request here.


================================================
FILE: .github/PULL_REQUEST_TEMPLATE.md
================================================
*Issue #, if available:*

*Description of changes:*


By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.


================================================
FILE: .gitignore
================================================
__pycache__
.tox/
log.txt
.idea/
node_modules/
package.json
package-lock.json


================================================
FILE: .jshintrc
================================================
{
  "asi": true,
  "esversion": 6
}


================================================
FILE: .pylintrc
================================================
[MASTER]

ignore=
    tensorflow_serving,
    tensorflow-2.1,
    tensorflow-2.2

[MESSAGES CONTROL]

disable=
    C, # convention
    R, # refactor
    too-many-arguments, # We should fix the offending ones soon.
    too-many-lines, # Some files are too big, we should fix this too
    too-few-public-methods,
    too-many-instance-attributes,
    too-many-locals,
    len-as-condition, # Nice to have in the future
    bad-indentation,
    line-too-long, # We let Flake8 take care of this
    logging-format-interpolation,
    useless-object-inheritance, # We still support python2 so inheriting from object is ok
    invalid-name,
    import-error,
    logging-not-lazy,
    fixme,
    no-self-use,
    attribute-defined-outside-init,
    protected-access,
    invalid-all-object,
    arguments-differ,
    abstract-method,
    signature-differs,
    raise-missing-from

[REPORTS]
# Set the output format. Available formats are text, parseable, colorized, msvs
# (visual studio) and html
output-format=colorized

# Tells whether to display a full report or only the messages
# CHANGE: No report.
reports=no

[FORMAT]
# Maximum number of characters on a single line.
max-line-length=100
# Maximum number of lines in a module
#max-module-lines=1000
# String used as indentation unit. This is usually " " (4 spaces) or "\t" (1 tab).
indent-string='  '

[BASIC]

# Required attributes for module, separated by a comma
#required-attributes=
# List of builtins function names that should not be used, separated by a comma.
# XXX: Should we ban map() & filter() for list comprehensions?
# exit & quit are for the interactive interpreter shell only.
# https://docs.python.org/3/library/constants.html#constants-added-by-the-site-module
bad-functions=
    apply,
    exit,
    input,
    quit,

[SIMILARITIES]
# Minimum lines number of a similarity.
min-similarity-lines=5
# Ignore comments when computing similarities.
ignore-comments=yes
# Ignore docstrings when computing similarities.
ignore-docstrings=yes

[VARIABLES]
# Tells whether we should check for unused import in __init__ files.
init-import=no
# A regular expression matching the beginning of the name of dummy variables
# (i.e. not used).
dummy-variables-rgx=_|unused_

# List of additional names supposed to be defined in builtins. Remember that
# you should avoid to define new builtins when possible.
#additional-builtins=

[LOGGING]
# Apply logging string format checks to calls on these modules.
logging-modules=
    logging

[TYPECHECK]
ignored-modules=
    distutils


================================================
FILE: CHANGELOG.md
================================================
# Changelog

## v1.8.4 (2021-06-30)

### Bug Fixes and Other Changes

 * modify the way port numbers are passed

## v1.8.3 (2021-04-26)

### Bug Fixes and Other Changes

 * Create test_multi_tfs.py

## v1.8.2 (2021-04-13)

### Bug Fixes and Other Changes

 * Update test_container.py

## v1.8.1 (2021-04-07)

### Bug Fixes and Other Changes

 * set OMP_NUM_THREADS default value to 1
 * Wait tfs before starting gunicorn

## v1.8.0 (2021-03-23)

### Features

 * expose tunable parameters to support multiple tfs
 * universal requirements.txt and inference.py

### Bug Fixes and Other Changes

 * To fix python sdk repo localmode tfs tests failing issue
 * Fix TFS Deploy codebuild issue
 * Remove the 1s lock for inference
 * install boto3
 * extending the serve.py with GUNICORN workers and threads
 * adding isnumeric filter to find_model_versions function.
 * return information of all models
 * Support multiple Accept types

### Documentation Changes

 * fix typo in README.md
 * updating readme with the right context object
 * fix broken link in README

## v1.7.0 (2020-07-29)

### Features

 * add model_version_policy to model config

## v1.6.25 (2020-07-23)

### Bug Fixes and Other Changes

 * change single quotes to double quotes

## v1.6.24 (2020-07-17)

### Bug Fixes and Other Changes

 * update MME Pre/Post-Processing model and script paths
 * increasing max_retry for model availability check
 * multi-model-endpoint support

### Documentation Changes

 * update README for multi-model endpoint

## v1.6.23.post0 (2020-06-25)

### Testing and Release Infrastructure

 * add issue templates

## v1.6.23 (2020-06-11)

### Bug Fixes and Other Changes

 * error code files for TFS2.2 (copied from pip install tf)

## v1.6.22.post0 (2020-05-13)

### Testing and Release Infrastructure

 * Test against py37 in buildspec

## v1.6.22 (2020-04-20)

### Bug Fixes and Other Changes

 * TF EIA 1.15 and 2.0

## v1.6.21 (2020-04-16)

### Bug Fixes and Other Changes

 * Replaced deprecated aws ecr get-login by get-login-password

## v1.6.20 (2020-04-03)

### Bug Fixes and Other Changes

 * Updating Pyyaml and Awscli version for tf1.15
 * upgrade Pillow version in test example

## v1.6.19 (2020-04-01)

### Bug Fixes and Other Changes

 * Allowing arguments for deep_learning_container.py for tf

## v1.6.18 (2020-03-26)

### Bug Fixes and Other Changes

 * Adding of deep_learning_container.py

## v1.6.17 (2020-02-18)

### Bug Fixes and Other Changes

 * update: Remove multi-model label from CPU containers

## v1.6.16 (2020-02-17)

### Bug Fixes and Other Changes

 * update r2.0.1 dockerfiles
 * add 2.1 dockerfiles and tensorflow artifacts
 * update for 1.15.2

## v1.6.15 (2020-02-04)

### Bug Fixes and Other Changes

 * return on_delete method when model successfully deleted
 * validate tensorflow model version number in model path

## v1.6.14.post0 (2020-01-20)

### Documentation changes

 * document that pre-/post-processing is not supported with multi-model

## v1.6.14 (2020-01-10)

### Bug fixes and other changes

 * Add __init__.py to fix unit test
 * update: Update buildspec for TF 1.15 and 2.0

## v1.6.13 (2020-01-08)

### Bug fixes and other changes

 * update copyright year in license header

## v1.6.12.post0 (2020-01-03)

### Documentation changes

 * update Readme with correct TF versions.

## v1.6.12 (2020-01-02)

### Bug fixes and other changes

 * update: Release TF 1.15 and TF 2.0 dockerfiles

## v1.6.11 (2019-12-17)

### Bug fixes and other changes

 * check container is ready in tests

## v1.6.10 (2019-12-13)

### Bug fixes and other changes

 * increase attempts to allow for large gpu images
 * ping check and model status check

## v1.6.9 (2019-11-25)

### Bug fixes and other changes

 * Update EI Dockerfile 1.14 with New Health check, and new binaries

## v1.6.8 (2019-10-25)

### Bug fixes and other changes

 * update publish-all.sh to match the versions in build-all.sh
 * upgrade pillow to 6.2.0 in requirements.txt test
 * use regional endpoint for STS in builds and tests
 * merge dockerfiles

## v1.6.7 (2019-10-22)

### Bug fixes and other changes

 * update instance type region availability

## v1.6.6 (2019-10-16)

### Bug fixes and other changes

 * mme improvements

## v1.6.5 (2019-09-30)

### Bug fixes and other changes

 * "change: merge asimov branch to master branch (#80)"
 * merge asimov branch to master branch

## v1.6.4 (2019-09-17)

### Bug fixes and other changes

 * TFS_SHORT_VERSION explicitly defined in dockerfile

## v1.6.3 (2019-09-13)

### Bug fixes and other changes

 * add Dockerfile for 1.14 in py3 with EIA

## v1.6.2 (2019-09-09)

### Bug fixes and other changes

 * require default model env var
 * default SAGEMAKER_TFS_DEFAULT_MODEL_NAME to string value
 * remove unused fixture

## v1.6.1 (2019-08-15)

### Bug fixes and other changes

 * Update no-p2 and no-p3 regions

## v1.6.0 (2019-08-09)

### Features

 * delete models
 * different invocation url
 * list models
 * add model

### Bug fixes and other changes

 * update request/response model and url key names
 * README changes

### Documentation changes

 * add batching to readme

## v1.5.1 (2019-06-21)

### Bug fixes and other changes

 * use nvidia-docker for local gpu tests

## v1.5.0 (2019-06-20)

### Features

 * add 1.13 EIA containers

## v1.4.6 (2019-06-18)

### Bug fixes and other changes

 * fix broken ei tests

## v1.4.5 (2019-06-17)

### Bug fixes and other changes

 * add batch transform integration test

## v1.4.4 (2019-06-12)

### Bug fixes and other changes

 * move SageMaker tests to release build

## v1.4.3 (2019-06-12)

### Bug fixes and other changes

 * use p2.xlarge by default in tests

## v1.4.2 (2019-06-12)

### Bug fixes and other changes

 * add Tensorflow 1.13

## v1.4.1 (2019-06-11)

### Bug fixes and other changes

 * make tox run any pytest tests

## v1.4.0 (2019-06-03)

### Features

 * support jsonlines output

### Documentation changes

 * update README.md for EI image

## v1.3.2 (2019-05-29)

### Bug fixes and other changes

 * change Dockerfile directory structure
 * allow local test against single container

## v1.3.1.post1 (2019-05-20)

### Documentation changes

 * update README.md
 * add pre/post-processing usage examples

## v1.3.1.post0 (2019-05-16)

### Documentation changes

 * add pre-post-processing documentation

## v1.3.1 (2019-05-08)

### Bug fixes and other changes

 * add build-all.sh, publish-all.sh scripts

## v1.3.0 (2019-05-06)

### Features

 * add tensorflow serving batching

### Bug fixes and other changes

 * install requirements.txt in writable dir

## v1.2.1 (2019-04-29)

### Bug fixes and other changes

 * make njs code handle missing custom attributes header

## v1.2.0 (2019-04-29)

### Features

 * add python service for pre/post-processing

## v1.1.9 (2019-04-09)

### Bug fixes and other changes

 * improve handling of ei binary during builds

## v1.1.8 (2019-04-08)

### Bug fixes and other changes

 * add data generator and perf tests
 * remove per-line parsing

## v1.1.7 (2019-04-05)

### Bug fixes and other changes

 * add additional csv test case

## v1.1.6 (2019-04-04)

### Bug fixes and other changes

 * handle zero values correctly

## v1.1.5 (2019-04-04)

### Bug fixes and other changes

 * update EI binary directory.

## v1.1.4 (2019-04-01)

### Changes

 * Support payloads with many csv rows and change CSV parsing behavior

## v1.1.3 (2019-03-29)

### Bug fixes

 * update EI binary location

## v1.1.2 (2019-03-14)

### Bug fixes

 * remove tfs deployment tests

## v1.1.1 (2019-03-13)

### Bug fixes

 * create bucket during test
 * fix argname in deployment test
 * fix repository name in buildspec
 * add deployment tests and run them concurrently
 * report error for missing ei version
 * remove extra comma in buildspec

### Other changes

 * add eia images to release build
 * update buildspec to output deployments.json
 * Modify EI image repository and tag to match Python SDK.
 * Change test directory to be consistent with PDT pipeline.
 * Add EI support to TFS container.
 * simplify tfs versioning
 * add buildspec.yml for codebuild
 * add tox, pylint, flake8, jshint


================================================
FILE: CODE_OF_CONDUCT.md
================================================
## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
opensource-codeofconduct@amazon.com with any additional questions or comments.


================================================
FILE: CONTRIBUTING.md
================================================
# Contributing Guidelines

Thank you for your interest in contributing to our project. Whether it's a bug report, new feature, correction, or additional 
documentation, we greatly value feedback and contributions from our community.

Please read through this document before submitting any issues or pull requests to ensure we have all the necessary 
information to effectively respond to your bug report or contribution.


## Reporting Bugs/Feature Requests

We welcome you to use the GitHub issue tracker to report bugs or suggest features.

When filing an issue, please check [existing open](https://github.com/aws/sagemaker-tfs-container/issues), or [recently closed](https://github.com/aws/sagemaker-tfs-container/issues?utf8=%E2%9C%93&q=is%3Aissue%20is%3Aclosed%20), issues to make sure somebody else hasn't already 
reported the issue. Please try to include as much information as you can. Details like these are incredibly useful:

* A reproducible test case or series of steps
* The version of our code being used
* Any modifications you've made relevant to the bug
* Anything unusual about your environment or deployment


## Contributing via Pull Requests
Contributions via pull requests are much appreciated. Before sending us a pull request, please ensure that:

1. You are working against the latest source on the *master* branch.
2. You check existing open, and recently merged, pull requests to make sure someone else hasn't addressed the problem already.
3. You open an issue to discuss any significant work - we would hate for your time to be wasted.

To send us a pull request, please:

1. Fork the repository.
2. Modify the source; please focus on the specific change you are contributing. If you also reformat all the code, it will be hard for us to focus on your change.
3. Ensure local tests pass.
4. Commit to your fork using clear commit messages.
5. Send us a pull request, answering any default questions in the pull request interface.
6. Pay attention to any automated CI failures reported in the pull request, and stay involved in the conversation.

GitHub provides additional documentation on [forking a repository](https://help.github.com/articles/fork-a-repo/) and
[creating a pull request](https://help.github.com/articles/creating-a-pull-request/).


## Finding contributions to work on
Looking at the existing issues is a great way to find something to contribute to. As our projects, by default, use the default GitHub issue labels (enhancement/bug/duplicate/help wanted/invalid/question/wontfix), looking at any ['help wanted'](https://github.com/aws/sagemaker-tfs-container/labels/help%20wanted) issues is a great place to start.


## Code of Conduct
This project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-conduct). 
For more information see the [Code of Conduct FAQ](https://aws.github.io/code-of-conduct-faq) or contact 
opensource-codeofconduct@amazon.com with any additional questions or comments.


## Security issue notifications
If you discover a potential security issue in this project we ask that you notify AWS/Amazon Security via our [vulnerability reporting page](http://aws.amazon.com/security/vulnerability-reporting/). Please do **not** create a public github issue.


## Licensing

See the [LICENSE](https://github.com/aws/sagemaker-tfs-container/blob/master/LICENSE) file for our project's licensing. We will ask you to confirm the licensing of your contribution.

We may ask you to sign a [Contributor License Agreement (CLA)](http://en.wikipedia.org/wiki/Contributor_License_Agreement) for larger changes.


================================================
FILE: LICENSE
================================================

                                 Apache License
                           Version 2.0, January 2004
                        http://www.apache.org/licenses/

   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

   1. Definitions.

      "License" shall mean the terms and conditions for use, reproduction,
      and distribution as defined by Sections 1 through 9 of this document.

      "Licensor" shall mean the copyright owner or entity authorized by
      the copyright owner that is granting the License.

      "Legal Entity" shall mean the union of the acting entity and all
      other entities that control, are controlled by, or are under common
      control with that entity. For the purposes of this definition,
      "control" means (i) the power, direct or indirect, to cause the
      direction or management of such entity, whether by contract or
      otherwise, or (ii) ownership of fifty percent (50%) or more of the
      outstanding shares, or (iii) beneficial ownership of such entity.

      "You" (or "Your") shall mean an individual or Legal Entity
      exercising permissions granted by this License.

      "Source" form shall mean the preferred form for making modifications,
      including but not limited to software source code, documentation
      source, and configuration files.

      "Object" form shall mean any form resulting from mechanical
      transformation or translation of a Source form, including but
      not limited to compiled object code, generated documentation,
      and conversions to other media types.

      "Work" shall mean the work of authorship, whether in Source or
      Object form, made available under the License, as indicated by a
      copyright notice that is included in or attached to the work
      (an example is provided in the Appendix below).

      "Derivative Works" shall mean any work, whether in Source or Object
      form, that is based on (or derived from) the Work and for which the
      editorial revisions, annotations, elaborations, or other modifications
      represent, as a whole, an original work of authorship. For the purposes
      of this License, Derivative Works shall not include works that remain
      separable from, or merely link (or bind by name) to the interfaces of,
      the Work and Derivative Works thereof.

      "Contribution" shall mean any work of authorship, including
      the original version of the Work and any modifications or additions
      to that Work or Derivative Works thereof, that is intentionally
      submitted to Licensor for inclusion in the Work by the copyright owner
      or by an individual or Legal Entity authorized to submit on behalf of
      the copyright owner. For the purposes of this definition, "submitted"
      means any form of electronic, verbal, or written communication sent
      to the Licensor or its representatives, including but not limited to
      communication on electronic mailing lists, source code control systems,
      and issue tracking systems that are managed by, or on behalf of, the
      Licensor for the purpose of discussing and improving the Work, but
      excluding communication that is conspicuously marked or otherwise
      designated in writing by the copyright owner as "Not a Contribution."

      "Contributor" shall mean Licensor and any individual or Legal Entity
      on behalf of whom a Contribution has been received by Licensor and
      subsequently incorporated within the Work.

   2. Grant of Copyright License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      copyright license to reproduce, prepare Derivative Works of,
      publicly display, publicly perform, sublicense, and distribute the
      Work and such Derivative Works in Source or Object form.

   3. Grant of Patent License. Subject to the terms and conditions of
      this License, each Contributor hereby grants to You a perpetual,
      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
      (except as stated in this section) patent license to make, have made,
      use, offer to sell, sell, import, and otherwise transfer the Work,
      where such license applies only to those patent claims licensable
      by such Contributor that are necessarily infringed by their
      Contribution(s) alone or by combination of their Contribution(s)
      with the Work to which such Contribution(s) was submitted. If You
      institute patent litigation against any entity (including a
      cross-claim or counterclaim in a lawsuit) alleging that the Work
      or a Contribution incorporated within the Work constitutes direct
      or contributory patent infringement, then any patent licenses
      granted to You under this License for that Work shall terminate
      as of the date such litigation is filed.

   4. Redistribution. You may reproduce and distribute copies of the
      Work or Derivative Works thereof in any medium, with or without
      modifications, and in Source or Object form, provided that You
      meet the following conditions:

      (a) You must give any other recipients of the Work or
          Derivative Works a copy of this License; and

      (b) You must cause any modified files to carry prominent notices
          stating that You changed the files; and

      (c) You must retain, in the Source form of any Derivative Works
          that You distribute, all copyright, patent, trademark, and
          attribution notices from the Source form of the Work,
          excluding those notices that do not pertain to any part of
          the Derivative Works; and

      (d) If the Work includes a "NOTICE" text file as part of its
          distribution, then any Derivative Works that You distribute must
          include a readable copy of the attribution notices contained
          within such NOTICE file, excluding those notices that do not
          pertain to any part of the Derivative Works, in at least one
          of the following places: within a NOTICE text file distributed
          as part of the Derivative Works; within the Source form or
          documentation, if provided along with the Derivative Works; or,
          within a display generated by the Derivative Works, if and
          wherever such third-party notices normally appear. The contents
          of the NOTICE file are for informational purposes only and
          do not modify the License. You may add Your own attribution
          notices within Derivative Works that You distribute, alongside
          or as an addendum to the NOTICE text from the Work, provided
          that such additional attribution notices cannot be construed
          as modifying the License.

      You may add Your own copyright statement to Your modifications and
      may provide additional or different license terms and conditions
      for use, reproduction, or distribution of Your modifications, or
      for any such Derivative Works as a whole, provided Your use,
      reproduction, and distribution of the Work otherwise complies with
      the conditions stated in this License.

   5. Submission of Contributions. Unless You explicitly state otherwise,
      any Contribution intentionally submitted for inclusion in the Work
      by You to the Licensor shall be under the terms and conditions of
      this License, without any additional terms or conditions.
      Notwithstanding the above, nothing herein shall supersede or modify
      the terms of any separate license agreement you may have executed
      with Licensor regarding such Contributions.

   6. Trademarks. This License does not grant permission to use the trade
      names, trademarks, service marks, or product names of the Licensor,
      except as required for reasonable and customary use in describing the
      origin of the Work and reproducing the content of the NOTICE file.

   7. Disclaimer of Warranty. Unless required by applicable law or
      agreed to in writing, Licensor provides the Work (and each
      Contributor provides its Contributions) on an "AS IS" BASIS,
      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
      implied, including, without limitation, any warranties or conditions
      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
      PARTICULAR PURPOSE. You are solely responsible for determining the
      appropriateness of using or redistributing the Work and assume any
      risks associated with Your exercise of permissions under this License.

   8. Limitation of Liability. In no event and under no legal theory,
      whether in tort (including negligence), contract, or otherwise,
      unless required by applicable law (such as deliberate and grossly
      negligent acts) or agreed to in writing, shall any Contributor be
      liable to You for damages, including any direct, indirect, special,
      incidental, or consequential damages of any character arising as a
      result of this License or out of the use or inability to use the
      Work (including but not limited to damages for loss of goodwill,
      work stoppage, computer failure or malfunction, or any and all
      other commercial damages or losses), even if such Contributor
      has been advised of the possibility of such damages.

   9. Accepting Warranty or Additional Liability. While redistributing
      the Work or Derivative Works thereof, You may choose to offer,
      and charge a fee for, acceptance of support, warranty, indemnity,
      or other liability obligations and/or rights consistent with this
      License. However, in accepting such obligations, You may act only
      on Your own behalf and on Your sole responsibility, not on behalf
      of any other Contributor, and only if You agree to indemnify,
      defend, and hold each Contributor harmless for any liability
      incurred by, or claims asserted against, such Contributor by reason
      of your accepting any such warranty or additional liability.

   END OF TERMS AND CONDITIONS

   APPENDIX: How to apply the Apache License to your work.

      To apply the Apache License to your work, attach the following
      boilerplate notice, with the fields enclosed by brackets "[]"
      replaced with your own identifying information. (Don't include
      the brackets!)  The text should be enclosed in the appropriate
      comment syntax for the file format. We also recommend that a
      file or class name and description of purpose be included on the
      same "printed page" as the copyright notice for easier
      identification within third-party archives.

   Copyright [yyyy] [name of copyright owner]

   Licensed under the Apache License, Version 2.0 (the "License");
   you may not use this file except in compliance with the License.
   You may obtain a copy of the License at

       http://www.apache.org/licenses/LICENSE-2.0

   Unless required by applicable law or agreed to in writing, software
   distributed under the License is distributed on an "AS IS" BASIS,
   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
   See the License for the specific language governing permissions and
   limitations under the License.


================================================
FILE: NOTICE
================================================
Sagemaker TensorFlow Serving Container
Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. 


================================================
FILE: README.md
================================================
# ![image](https://user-images.githubusercontent.com/56273942/202568467-0ee721bb-1424-4efd-88fc-31b4f2a59dc6.png) DEPRECATED

## Announcement:
As of September 13th, 2023, this repository is deprecated. The contents of this repository will remain available but we will no longer provide updates or accept new contributions and pull requests.

# <img alt="SageMaker" src="branding/icon/sagemaker-banner.png" height="100">

# SageMaker TensorFlow Serving Container

SageMaker TensorFlow Serving Container is an open source project that builds
docker images for running TensorFlow Serving on
[Amazon SageMaker](https://aws.amazon.com/documentation/sagemaker/).

Supported versions of TensorFlow: ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``, ``1.9.0``, ``1.10.0``, ``1.11.0``, ``1.12.0``, ``1.13.1``, ``1.14.0``, ``1.15.0``, ``2.0.0``.

Supported versions of TensorFlow for Elastic Inference: ``1.11.0``, ``1.12.0``, ``1.13.1``, ``1.14.0``.

ECR repositories for SageMaker built TensorFlow Serving Container:

- `'tensorflow-inference'` for any new version starting with ``1.13.0`` in the following AWS accounts:
  - `"871362719292"` in `"ap-east-1"`;
  - `"217643126080"` in `"me-south-1"`;
  - `"886529160074"` in `"us-iso-east-1"`;
  - `"763104351884"` in other SageMaker public regions.
- `'sagemaker-tensorflow-serving'` for ``1.4.1``, ``1.5.0``, ``1.6.0``, ``1.7.0``, ``1.8.0``, ``1.9.0``, ``1.10.0``, ``1.11.0``, ``1.12.0`` versions in the following AWS accounts:
  - `"057415533634"` in `"ap-east-1"`;
  - `"724002660598"` in `"me-south-1"`;
  - `"520713654638"` in other SageMaker public regions.

ECR repositories for SageMaker built TensorFlow Serving Container for Elastic Inference:

- `'tensorflow-inference-eia'` for any new version starting with ``1.14.0`` in the same AWS accounts as TensorFlow Serving Container for newer TensorFlow versions listed above;
- `'sagemaker-tensorflow-serving-eia'` for ``1.11.0``, ``1.12.0``, ``1.13.1`` versions in the same AWS accounts as TensorFlow Serving Container for older TensorFlow versions listed above.

This documentation covers building and testing these docker images.

For information about using TensorFlow Serving on SageMaker, see:
[Deploying to TensorFlow Serving Endpoints](https://sagemaker.readthedocs.io/en/stable/frameworks/tensorflow/deploying_tensorflow_serving.html)
in the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk) documentation.

For notebook examples, see: [Amazon SageMaker Examples](https://github.com/awslabs/amazon-sagemaker-examples).

## Table of Contents

1. [Getting Started](#getting-started)
2. [Building your image](#building-your-image)
3. [Running the tests](#running-the-tests)
4. [Pre/Post-Processing](#prepost-processing)
5. [Deploying a TensorFlow Serving Model](#deploying-a-tensorflow-serving-model)
6. [Enable Batching](#enabling-batching)
7. [Configurable SageMaker Environment Variables](#configurable-sagemaker-environment-variables)
8. [Deploying to Multi-Model Endpoint](#deploying-to-multi-model-endpoint)

## Getting Started

### Prerequisites

Make sure you have installed all of the following prerequisites on your
development machine:

- [Docker](https://www.docker.com/)
- [AWS CLI](https://aws.amazon.com/cli/)

For testing, you will also need:

- [Python 3.6](https://www.python.org/)
- [tox](https://tox.readthedocs.io/en/latest/)
- [npm](https://npmjs.org/)
- [jshint](https://jshint.com/about/)

To test GPU images locally, you will also need:

- [nvidia-docker](https://github.com/NVIDIA/nvidia-docker)

**Note:** Some of the build and tests scripts interact with resources in your AWS account. Be sure to
set your default AWS credentials and region using `aws configure` before using these scripts.

## Building your image

Amazon SageMaker uses Docker containers to run all training jobs and inference endpoints.

The Docker images are built from the Dockerfiles in
[docker/](https://github.com/aws/sagemaker-tensorflow-serving-container/tree/master/docker).

The Dockerfiles are grouped based on the version of TensorFlow Serving they support. Each supported
processor type (e.g. "cpu", "gpu", "eia") has a different Dockerfile in each group.

To build an image, run the `./scripts/build.sh` script:

```bash
./scripts/build.sh --version 1.13 --arch cpu
./scripts/build.sh --version 1.13 --arch gpu
./scripts/build.sh --version 1.13 --arch eia
```


If you are testing locally, building the image is enough. But if you want to use your updated image
in SageMaker, you need to publish it to an ECR repository in your account. The
`./scripts/publish.sh` script makes that easy:

```bash
./scripts/publish.sh --version 1.13 --arch cpu
./scripts/publish.sh --version 1.13 --arch gpu
./scripts/publish.sh --version 1.13 --arch eia
```

Note: this will publish to ECR in your default region. Use the `--region` argument to
specify a different region.
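
For example, to publish the CPU image to a specific region rather than your default one (the region value here is only illustrative):

```bash
./scripts/publish.sh --version 1.13 --arch cpu --region us-west-2
```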

### Running your image in local docker

You can also run your container locally in Docker to test different models and send
inference requests by hand. Standard `docker run` commands (or `nvidia-docker run` for
GPU images) will work for this, or you can use the provided `start.sh`
and `stop.sh` scripts:

```bash
./scripts/start.sh [--version x.xx] [--arch cpu|gpu|eia|...]
./scripts/stop.sh [--version x.xx] [--arch cpu|gpu|eia|...]
```
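
For example, to start and later stop the TensorFlow Serving 1.13 CPU image built above (a minimal sketch using the same version and arch values as the build examples):

```bash
./scripts/start.sh --version 1.13 --arch cpu
# ... send some test requests (see the curl examples below) ...
./scripts/stop.sh --version 1.13 --arch cpu
```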

When the container is running, you can send test requests to it using any HTTP client. Here's
an example using the `curl` command:

```bash
curl -X POST --data-binary @test/resources/inputs/test.json \
     -H 'Content-Type: application/json' \
     -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' \
     http://localhost:8080/invocations
```
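
Because the container also accepts `text/csv` input (see [Pre/Post-Processing](#prepost-processing) below), a similar request with CSV data might look like this sketch, which assumes the same `half_plus_three` example model and the sample `test.csv` input file from this repository:

```bash
curl -X POST --data-binary @test/resources/inputs/test.csv \
     -H 'Content-Type: text/csv' \
     -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' \
     http://localhost:8080/invocations
```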

Additional `curl` examples can be found in `./scripts/curl.sh`.

## Running the tests

The package includes automated tests and code checks. The tests use Docker to run the container
image locally, and do not access resources in AWS. You can run the tests and static code
checkers using `tox`:

```bash
tox
```

To run local tests against a single container or with other options, you can use the following command:

```bash
python -m pytest test/integration/local
    [--docker-name-base <docker_name_base>]
    [--framework-version <framework_version>]
    [--processor-type <processor_type>]
```
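
For example, a local run against a specific CPU image might look like the following; the option values shown are only illustrative, so substitute the name and version of the image you built:

```bash
python -m pytest test/integration/local \
    --docker-name-base sagemaker-tensorflow-serving \
    --framework-version 1.13.0 \
    --processor-type cpu
```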

To test against Elastic Inference with an accelerator, you will need an AWS account. Publish your built image to an ECR repository in your account, then run the following command:

    tox -e py36 -- test/integration/sagemaker/test_ei.py
        [--repo <ECR_repository_name>]
        [--instance-types <instance_type>,...]
        [--accelerator-type <accelerator_type>]
        [--versions <version>,...]

For example:

    tox -e py36 -- test/integration/sagemaker/test_ei.py \
        --repo sagemaker-tensorflow-serving-eia \
        --instance-types ml.m5.xlarge \
        --accelerator-type ml.eia1.medium \
        --versions 1.13.0


## Pre/Post-Processing

**NOTE: There is currently no support for pre-/post-processing with multi-model containers.**

SageMaker TensorFlow Serving Container supports the following Content-Types for requests:

* `application/json` (default)
* `text/csv`
* `application/jsonlines`

And the following content types for responses:

* `application/json` (default)
* `application/jsonlines`

The container will convert data in these formats to [TensorFlow Serving REST API](https://www.tensorflow.org/tfx/serving/api_rest) requests,
and will send these requests to the default serving signature of your SavedModel bundle.

You can also add customized Python code to process your input and output data. To use this feature, you need to:
1. Add a Python file named `inference.py` to the `code` directory inside your model archive.
2. In `inference.py`, implement either a pair of `input_handler` and `output_handler` functions or a single `handler` function. Note that if the `handler` function is implemented, `input_handler` and `output_handler` will be ignored.

To implement pre/post-processing handler(s), you will need to make use of the `Context` object created by the Python service. The `Context` is a `namedtuple` with the following attributes:
- `model_name (string)`: the name of the model you will use for inference, for example 'half_plus_three'
- `model_version (string)`: version of the model, for example '5'
- `method (string)`: inference method, for example, 'predict', 'classify' or 'regress', for more information on methods, please see [Classify and Regress API](https://www.tensorflow.org/tfx/serving/api_rest#classify_and_regress_api) and [Predict API](https://www.tensorflow.org/tfx/serving/api_rest#predict_api)
- `rest_uri (string)`: the TFS REST uri generated by the Python service, for example, 'http://localhost:8501/v1/models/half_plus_three:predict'
- `grpc_port (string)`: the GRPC port number generated by the Python service, for example, '9000'
- `custom_attributes (string)`: content of 'X-Amzn-SageMaker-Custom-Attributes' header from the original request, for example, 'tfs-model-name=half_plus_three,tfs-method=predict'
- `request_content_type (string)`: the original request content type, defaulted to 'application/json' if not provided
- `accept_header (string)`: the original request accept type, defaulted to 'application/json' if not provided
- `content_length (int)`: content length of the original request

Here's a code example implementing `input_handler` and `output_handler`. By providing these, the Python service will post the request to the TFS REST URI with the data pre-processed by `input_handler` and pass the response to `output_handler` for post-processing.

```python
import json

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == 'application/json':
        # pass through json (assumes it's correctly formed)
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        # very simple csv handler
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))


def output_handler(data, context):
    """Post-process TensorFlow Serving output before it is returned to the client.
    Args:
        data (obj): the TensorFlow serving response
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): data to return to client, response content type
    """
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))

    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type
```

Here's another code example implementing `input_handler` and `output_handler`, this time formatting image data into a TFS request for a model that expects image data as an encoded string rather than as a numeric tensor:

```python
import base64
import io
import json
import requests

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API

    Args:
        data (obj): the request data stream
        context (Context): an object containing request and configuration details

    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """

    if context.request_content_type == 'application/x-image':
        payload = data.read()
        encoded_image = base64.b64encode(payload).decode('utf-8')
        instance = [{"b64": encoded_image}]
        return json.dumps({"instances": instance})
    else:
        _return_error(415, 'Unsupported content type "{}"'.format(
            context.request_content_type or 'Unknown'))


def output_handler(response, context):
    """Post-process TensorFlow Serving output before it is returned to the client.

    Args:
        response (obj): the TensorFlow serving response
        context (Context): an object containing request and configuration details

    Returns:
        (bytes, string): data to return to client, response content type
    """
    if response.status_code != 200:
        _return_error(response.status_code, response.content.decode('utf-8'))
    response_content_type = context.accept_header
    prediction = response.content
    return prediction, response_content_type


def _return_error(code, message):
    raise ValueError('Error: {}, {}'.format(str(code), message))
```

The `input_handler` above creates requests that match the input of the following TensorFlow Serving SignatureDef, displayed
using the TensorFlow `saved_model_cli`:

```
signature_def['serving_default']:
  The given SavedModel SignatureDef contains the following input(s):
    inputs['image_bytes'] tensor_info:
        dtype: DT_STRING
        shape: (-1)
        name: input_tensor:0
  The given SavedModel SignatureDef contains the following output(s):
    outputs['classes'] tensor_info:
        dtype: DT_INT64
        shape: (-1)
        name: ArgMax:0
    outputs['probabilities'] tensor_info:
        dtype: DT_FLOAT
        shape: (-1, 1001)
        name: softmax_tensor:0
  Method name is: tensorflow/serving/predict
```
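
With a model and `inference.py` like the above, a client can exercise the image handler by posting raw image bytes with the `application/x-image` content type. Here's a hedged sketch against a locally running container; the file name and model name are placeholders:

```bash
curl -X POST --data-binary @test-image.jpg \
     -H 'Content-Type: application/x-image' \
     -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=image_classifier' \
     http://localhost:8080/invocations
```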


There are occasions when you might want complete control over the request handling. For example, making a TFS request (REST or gRPC) to one model and then making a request to a second model. In this case, you can implement the `handler` function instead of the `input_handler` and `output_handler` pair:

```python
import json
import requests


def handler(data, context):
    """Handle request.
    Args:
        data (obj): the request data
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): data to return to client, (optional) response content type
    """
    processed_input = _process_input(data, context)
    response = requests.post(context.rest_uri, data=processed_input)
    return _process_output(response, context)


def _process_input(data, context):
    if context.request_content_type == 'application/json':
        # pass through json (assumes it's correctly formed)
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        # very simple csv handler
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))


def _process_output(data, context):
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))

    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type
```

You can also bring in external dependencies to help with your data processing. There are two ways to do this:
1. If your model archive contains `code/requirements.txt`, the container will install the Python dependencies at runtime using `pip install -r`.
2. If you are working in a network-isolated environment, or if you don't want to install dependencies at runtime every time your Endpoint starts or your Batch Transform job runs, you can put pre-downloaded dependencies under the `code/lib` directory in your model archive; the container will then add these modules to the Python path. Note that if both `code/lib` and `code/requirements.txt` are present in the model archive, the `requirements.txt` will be ignored.

Your untarred model directory structure may look like this if you are using `requirements.txt`:

        model1
            |--[model_version_number]
                |--variables
                |--saved_model.pb
        model2
            |--[model_version_number]
                |--assets
                |--variables
                |--saved_model.pb
        code
            |--inference.py
            |--requirements.txt

Your untarred model directory structure may look like this if you have downloaded modules under `code/lib`:

        model1
            |--[model_version_number]
                |--variables
                |--saved_model.pb
        model2
            |--[model_version_number]
                |--assets
                |--variables
                |--saved_model.pb
        code
            |--lib
                |--external_module
            |--inference.py

## Deploying a TensorFlow Serving Model

To use your TensorFlow Serving model on SageMaker, you first need to create a SageMaker Model. After creating a SageMaker Model, you can use it to create [SageMaker Batch Transform Jobs](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-batch.html)
 for offline inference, or create [SageMaker Endpoints](https://docs.aws.amazon.com/sagemaker/latest/dg/how-it-works-hosting.html) for real-time inference.


### Creating a SageMaker Model

A SageMaker Model contains references to a `model.tar.gz` file in S3 containing serialized model data, and a Docker image used to serve predictions with that model.

You must package the contents of your model directory (including models, `inference.py`, and external modules) as a gzipped tar file named "model.tar.gz" and upload it to S3. If you're on a Unix-based operating system, you can create a "model.tar.gz" using the `tar` utility:

```
tar -czvf model.tar.gz 12345 code
```

where "12345" is the TensorFlow Serving model version directory that contains your SavedModel.
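
You can then upload the archive to S3 with the AWS CLI; the bucket and key below are placeholders matching the example S3 URI used in the next paragraph:

```bash
aws s3 cp model.tar.gz s3://your-bucket/your-models/model.tar.gz
```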

After uploading your `model.tar.gz` to an S3 URI, such as `s3://your-bucket/your-models/model.tar.gz`, create a [SageMaker Model](https://docs.aws.amazon.com/sagemaker/latest/dg/API_CreateModel.html) which will be used to generate inferences. Set `PrimaryContainer.ModelDataUrl` to the S3 URI where you uploaded the `model.tar.gz`, and set `PrimaryContainer.Image` to an image in one of the following formats. For TensorFlow versions up to 1.12.0 (the `sagemaker-tensorflow-serving` repository), use:

```
520713654638.dkr.ecr.{REGION}.amazonaws.com/sagemaker-tensorflow-serving:{SAGEMAKER_TENSORFLOW_SERVING_VERSION}-{cpu|gpu}
```

or, for TensorFlow versions 1.13.0 and newer (the `tensorflow-inference` repository):

```
763104351884.dkr.ecr.{REGION}.amazonaws.com/tensorflow-inference:{TENSORFLOW_INFERENCE_VERSION}-{cpu|gpu}
```

For those using Elastic Inference, set the image following one of these formats instead. For TensorFlow versions up to 1.13.1 (the `sagemaker-tensorflow-serving-eia` repository), use:

```
520713654638.dkr.ecr.{REGION}.amazonaws.com/sagemaker-tensorflow-serving-eia:{SAGEMAKER_TENSORFLOW_SERVING_EIA_VERSION}-cpu
```

or, for TensorFlow versions 1.14.0 and newer (the `tensorflow-inference-eia` repository):

```
763104351884.dkr.ecr.{REGION}.amazonaws.com/tensorflow-inference-eia:{TENSORFLOW_INFERENCE_EIA_VERSION}-cpu
```

Where `REGION` is your AWS region, such as "us-east-1" or "eu-west-1"; `SAGEMAKER_TENSORFLOW_SERVING_VERSION`, `SAGEMAKER_TENSORFLOW_SERVING_EIA_VERSION`, `TENSORFLOW_INFERENCE_VERSION`, and `TENSORFLOW_INFERENCE_EIA_VERSION` are one of the supported versions mentioned above; and the suffix is "gpu" for use on GPU-based instance types like `ml.p3.2xlarge`, or "cpu" for use on CPU-based instance types like `ml.c5.xlarge`.
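
For example, substituting a region and versions into the formats above produces image URIs like the following (the region and versions shown are only illustrative; pick values from the supported lists above):

```bash
# TensorFlow 1.12.0 on a GPU instance (sagemaker-tensorflow-serving repository)
IMAGE="520713654638.dkr.ecr.us-west-2.amazonaws.com/sagemaker-tensorflow-serving:1.12.0-gpu"

# TensorFlow 1.14.0 with Elastic Inference (tensorflow-inference-eia repository)
IMAGE="763104351884.dkr.ecr.us-west-2.amazonaws.com/tensorflow-inference-eia:1.14.0-cpu"
```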

The code examples below show how to create a SageMaker Model from a `model.tar.gz` containing a TensorFlow Serving model using the AWS CLI (though you can use any language supported by the [AWS SDK](https://aws.amazon.com/tools/)) and the [SageMaker Python SDK](https://github.com/aws/sagemaker-python-sdk).

#### AWS CLI
```bash
timestamp() {
  date +%Y-%m-%d-%H-%M-%S
}


MODEL_NAME="image-classification-tfs-$(timestamp)"
MODEL_DATA_URL="s3://my-sagemaker-bucket/model/model.tar.gz"

aws s3 cp model.tar.gz $MODEL_DATA_URL

REGION="us-west-2"
TFS_VERSION="1.12.0"
PROCESSOR_TYPE="gpu"
IMAGE="520713654638.dkr.ecr.$REGION.amazonaws.com/sagemaker-tensorflow-serving:$TFS_VERSION-$PROCESSOR_TYPE"

# See the following document for more on SageMaker Roles:
# https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html
ROLE_ARN="[SageMaker-compatible IAM Role ARN]"

aws sagemaker create-model \
    --model-name $MODEL_NAME \
    --primary-container Image=$IMAGE,ModelDataUrl=$MODEL_DATA_URL \
    --execution-role-arn $ROLE_ARN
```

#### SageMaker Python SDK

```python
import os
import sagemaker
from sagemaker.tensorflow.serving import Model

sagemaker_session = sagemaker.Session()
role = '<SageMaker-compatible IAM role ARN>'
bucket = 'my-sagemaker-bucket'
prefix = 'sagemaker/high-throughput-tfs-batch-transform'
s3_path = 's3://{}/{}'.format(bucket, prefix)

model_data = sagemaker_session.upload_data('model.tar.gz',
                                           bucket,
                                           os.path.join(prefix, 'model'))

# The "Model" object doesn't create a SageMaker Model until a Transform Job or Endpoint is created.
tensorflow_serving_model = Model(model_data=model_data,
                                 role=role,
                                 framework_version='1.13',
                                 sagemaker_session=sagemaker_session)

```

After creating a SageMaker Model, you can refer to the model name to create Transform Jobs and Endpoints. Code examples are given below.

### Creating a Batch Transform Job

A Batch Transform job runs an offline-inference job using your TensorFlow Serving model. Input data in S3 is converted to HTTP requests,
and responses are saved to an output bucket in S3.

#### AWS CLI
```bash
TRANSFORM_JOB_NAME="tfs-transform-job"
TRANSFORM_S3_INPUT="s3://my-sagemaker-input-bucket/sagemaker-transform-input-data/"
TRANSFORM_S3_OUTPUT="s3://my-sagemaker-output-bucket/sagemaker-transform-output-data/"

TRANSFORM_INPUT_DATA_SOURCE={S3DataSource={S3DataType="S3Prefix",S3Uri=$TRANSFORM_S3_INPUT}}
CONTENT_TYPE="application/x-image"

INSTANCE_TYPE="ml.p2.xlarge"
INSTANCE_COUNT=2

MAX_PAYLOAD_IN_MB=1
MAX_CONCURRENT_TRANSFORMS=16

aws sagemaker create-transform-job \
    --model-name $MODEL_NAME \
    --transform-input DataSource=$TRANSFORM_INPUT_DATA_SOURCE,ContentType=$CONTENT_TYPE \
    --transform-output S3OutputPath=$TRANSFORM_S3_OUTPUT \
    --transform-resources InstanceType=$INSTANCE_TYPE,InstanceCount=$INSTANCE_COUNT \
    --max-payload-in-mb $MAX_PAYLOAD_IN_MB \
    --max-concurrent-transforms $MAX_CONCURRENT_TRANSFORMS \
    --transform-job-name $TRANSFORM_JOB_NAME
```

#### SageMaker Python SDK

```python
output_path = 's3://my-sagemaker-output-bucket/sagemaker-transform-output-data/'
tensorflow_serving_transformer = tensorflow_serving_model.transformer(
                                     instance_count=2,
                                     instance_type='ml.p2.xlarge',
                                     max_concurrent_transforms=16,
                                     max_payload=1,
                                     output_path=output_path)

input_path = 's3://my-sagemaker-input-bucket/sagemaker-transform-input-data/'
tensorflow_serving_transformer.transform(input_path, content_type='application/x-image')
```

### Creating an Endpoint

A SageMaker Endpoint hosts your TensorFlow Serving model for real-time inference. The [InvokeEndpoint](https://docs.aws.amazon.com/sagemaker/latest/dg/API_runtime_InvokeEndpoint.html) API
is used to send data for predictions to your TensorFlow Serving model.

#### AWS CLI

```bash
ENDPOINT_CONFIG_NAME="my-endpoint-config"
VARIANT_NAME="TFS"
INITIAL_INSTANCE_COUNT=1
INSTANCE_TYPE="ml.p2.xlarge"
aws sagemaker create-endpoint-config \
    --endpoint-config-name $ENDPOINT_CONFIG_NAME \
    --production-variants VariantName=$VARIANT_NAME,ModelName=$MODEL_NAME,InitialInstanceCount=$INITIAL_INSTANCE_COUNT,InstanceType=$INSTANCE_TYPE

ENDPOINT_NAME="my-tfs-endpoint"
aws sagemaker create-endpoint \
    --endpoint-name $ENDPOINT_NAME \
    --endpoint-config-name $ENDPOINT_CONFIG_NAME

BODY="fileb://myfile.jpeg"
CONTENT_TYPE='application/x-image'
OUTFILE="response.json"
aws sagemaker-runtime invoke-endpoint \
    --endpoint-name $ENDPOINT_NAME \
    --content-type $CONTENT_TYPE \
    --body $BODY \
    $OUTFILE
```

#### SageMaker Python SDK

```python
predictor = tensorflow_serving_model.deploy(initial_instance_count=1,
                                            instance_type='ml.p2.xlarge')
prediction = predictor.predict(data)
```

## Enabling Batching

You can configure SageMaker TensorFlow Serving Container to batch multiple records together before
performing an inference. This uses [TensorFlow Serving's](https://github.com/tensorflow/serving/blob/master/tensorflow_serving/batching/README.md)
underlying batching feature.

You may be able to significantly improve throughput, especially on GPU instances, by
enabling and configuring batching. To get the best performance, it may be necessary to tune batching parameters,
especially the batch size and batch timeout, to your model, input data, and instance type.

You can set the following environment variables on a SageMaker Model or Transform Job to enable
and configure batching:

```bash
# Configures whether to enable record batching.
# Defaults to false.
SAGEMAKER_TFS_ENABLE_BATCHING="true"

# Configures the maximum number of records per batch.
# Corresponds to "max_batch_size" in TensorFlow Serving.
# Defaults to 8.
SAGEMAKER_TFS_MAX_BATCH_SIZE="32"

# Configures how long to wait for a full batch, in microseconds.
# Corresponds to "batch_timeout_micros" in TensorFlow Serving.
# Defaults to 1000 (1ms).
SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS="100000"

# Configures how many batches to process concurrently.
# Corresponds to "num_batch_threads" in TensorFlow Serving.
# Defaults to number of CPUs.
SAGEMAKER_TFS_NUM_BATCH_THREADS="16"

# Configures number of batches that can be enqueued.
# Corresponds to "max_enqueued_batches" in TensorFlow Serving.
# Defaults to the number of CPUs for real-time inference,
# or an arbitrarily large value for batch transform (where request latency is less of a concern).
SAGEMAKER_TFS_MAX_ENQUEUED_BATCHES="10000"
```
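
These are ordinary container environment variables, so one way to apply them, sketched below using the `$MODEL_NAME`, `$IMAGE`, `$MODEL_DATA_URL`, and `$ROLE_ARN` values from the AWS CLI example above, is through the `Environment` map of the model's primary container:

```bash
# Illustrative only: pass batching settings through the primary container's Environment map.
aws sagemaker create-model \
    --model-name $MODEL_NAME \
    --execution-role-arn $ROLE_ARN \
    --primary-container \
    "Image=$IMAGE,ModelDataUrl=$MODEL_DATA_URL,Environment={SAGEMAKER_TFS_ENABLE_BATCHING=true,SAGEMAKER_TFS_MAX_BATCH_SIZE=32,SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS=100000}"
```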

## Configurable SageMaker Environment Variables
The following environment variables can be set on a SageMaker Model or Transform Job if further configuration is required:

[Configures](https://docs.gunicorn.org/en/stable/settings.html#loglevel)
the logging level for Gunicorn.
```bash
# Defaults to "info"
SAGEMAKER_GUNICORN_LOGLEVEL="debug"
```
[Configures](https://docs.gunicorn.org/en/stable/settings.html#timeout)
how long a Gunicorn worker may be silent before it is killed and restarted.
```bash
# Defaults to 30.
SAGEMAKER_GUNICORN_TIMEOUT_SECONDS="60"
```
[Configures](http://nginx.org/en/docs/http/ngx_http_proxy_module.html#proxy_read_timeout)
the timeout for reading a response from the proxied server.
Note: if SAGEMAKER_GUNICORN_TIMEOUT_SECONDS is greater than this value,
SAGEMAKER_NGINX_PROXY_READ_TIMEOUT_SECONDS will be raised to match
SAGEMAKER_GUNICORN_TIMEOUT_SECONDS.
```bash
# Defaults to 60.
SAGEMAKER_NGINX_PROXY_READ_TIMEOUT_SECONDS="120"
```

## Deploying to Multi-Model Endpoint

SageMaker TensorFlow Serving Container (versions 1.15.0 and 2.1.0, CPU) supports Multi-Model Endpoints. With this feature, you can deploy different models (not just different versions of the same model) to a single endpoint.
To deploy a Multi-Model Endpoint with the TFS container, start the container with the environment variable ``SAGEMAKER_MULTI_MODEL=True``.
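
When experimenting locally, a minimal sketch might look like the following; the image tag and port mapping are illustrative, and on a real endpoint SageMaker starts the container for you:

```bash
# Illustrative local run in Multi-Model mode.
docker run --rm -p 8080:8080 \
    -e SAGEMAKER_MULTI_MODEL=True \
    sagemaker-tensorflow-serving:1.15.0-cpu serve
```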

### Multi-Model Interfaces
We provide the following interfaces for users to interact with a container running in Multi-Model mode:

    +---------------------+---------------------------------+---------------------------------------------+
    | Functionality       | Request                         | Response/Actions                            |
    +---------------------+---------------------------------+---------------------------------------------+
    | List A Single Model | GET /models/{model_name}        | Information about the specified model       |
    +---------------------+---------------------------------+---------------------------------------------+
    | List All Models     | GET /models                     | List of Information about all loaded models |
    +---------------------+---------------------------------+---------------------------------------------+
    |                     | POST /models                    | Load model with "model_name" from           |
    |                     | data = {                        | specified url                               |
    | Load A Model        |     "model_name": <model-name>, |                                             |
    |                     |     "url": <path to model data> |                                             |
    |                     | }                               |                                             |
    +---------------------+---------------------------------+---------------------------------------------+
    | Make Invocations    | POST /models/{model_name}/invoke| Return inference result from                |
    |                     | data = <invocation payload>     | the specified model                         |
    +---------------------+---------------------------------+---------------------------------------------+
    | Unload A Model      | DELETE /models/{model_name}     | Unload the specified model                  |
    +---------------------+---------------------------------+---------------------------------------------+
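
Assuming a container running locally on port 8080 (as in the sketch above), these interfaces could be exercised with `curl` roughly as follows; the model name and data path are placeholders:

```bash
# Placeholders throughout; adjust the model name and data path to your setup.

# Load a model
curl -X POST http://localhost:8080/models \
    -H 'Content-Type: application/json' \
    -d '{"model_name": "half_plus_three", "url": "/opt/ml/models/half_plus_three/model"}'

# List all loaded models, or describe a single model
curl http://localhost:8080/models
curl http://localhost:8080/models/half_plus_three

# Invoke the model
curl -X POST http://localhost:8080/models/half_plus_three/invoke \
    -H 'Content-Type: application/json' \
    -d '{"instances": [1.0, 2.0, 5.0]}'

# Unload the model
curl -X DELETE http://localhost:8080/models/half_plus_three
```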

### Maximum Number of Models
Also note that the environment variable ``SAGEMAKER_SAFE_PORT_RANGE`` limits the number of models that can be loaded to the endpoint at the same time.
Only 90% of the ports will be utilized, and each loaded model is allocated 2 ports (one for the REST API and one for gRPC).
For example, if ``SAGEMAKER_SAFE_PORT_RANGE`` spans 9000 to 9999 (1,000 ports), the maximum number of models that can be loaded to the endpoint at the same time is 450 (1000 * 0.9 / 2).
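
As a quick sanity check of that arithmetic (an illustrative snippet, not part of the container):

```bash
# 90% of the ports in the range, two ports per model.
LOW=9000; HIGH=9999
awk -v low="$LOW" -v high="$HIGH" 'BEGIN { printf "max models: %d\n", (high - low + 1) * 0.9 / 2 }'
# max models: 450
```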

### Using Multi-Model Endpoint with Pre/Post-Processing
Multi-Model Endpoints can be used together with Pre/Post-Processing. Each model will need its own ``inference.py``; otherwise, default handlers will be used. An example directory structure for a Multi-Model Endpoint with Pre/Post-Processing looks like this:

        /opt/ml/models/model1/model
            |--[model_version_number]
                |--variables
                |--saved_model.pb
        /opt/ml/models/model2/model
            |--[model_version_number]
                |--assets
                |--variables
                |--saved_model.pb
            code
                |--lib
                    |--external_module
                |--inference.py

## Contributing

Please read [CONTRIBUTING.md](https://github.com/aws/sagemaker-tensorflow-serving-container/blob/master/CONTRIBUTING.md)
for details on our code of conduct, and the process for submitting pull requests to us.

## License

This library is licensed under the Apache 2.0 License.


================================================
FILE: VERSION
================================================
1.8.5.dev0


================================================
FILE: buildspec.yml
================================================
version: 0.2

phases:
  pre_build:
    commands:
      - start-dockerd

      # fix permissions dropped by CodePipeline
      - chmod +x ./scripts/*.sh
      - chmod +x ./docker/build_artifacts/sagemaker/serve
  build:
    commands:
      # prepare the release (update versions, changelog etc.)
      - if is-release-build; then git-release --prepare; fi

      - tox -e jshint,flake8,pylint

      # build images
      - ./scripts/build-all.sh

      # run local tests
      - tox -e py37 -- test/integration/local --framework-version 2.1

      # push docker images to ECR
      - |
        if is-release-build; then
          ./scripts/publish-all.sh
        fi

      # run SageMaker tests
      - |
        if is-release-build; then
          tox -e py37 -- -n 8 test/integration/sagemaker/test_tfs.py --versions 2.1.0
        fi

      # write deployment details to file
      # todo sort out eia versioning
      # todo add non-eia tests
      - |
        if is-release-build; then
          echo '[{
          "repository": "sagemaker-tensorflow-serving",
          "tags": [{
            "source": "1.15.0-cpu",
            "dest": ["1.15.0-cpu", "1.15-cpu", "1.15.0-cpu-'${CODEBUILD_BUILD_ID#*:}'"]
          },{
            "source": "1.15.0-gpu",
            "dest": ["1.15.0-gpu", "1.15-gpu", "1.15.0-gpu-'${CODEBUILD_BUILD_ID#*:}'"]
          }],
          "test": [
            "tox -e py37 -- -n 8 test/integration/sagemaker/test_tfs.py::test_tfs_model --versions 1.15.0 --region {region} --registry {aws-id}"
          ]
        }, {
          "repository": "sagemaker-tensorflow-serving-eia",
          "tags": [{
            "source": "1.14-cpu",
            "dest": ["1.14.0-cpu", "1.14-cpu", "1.14.0-cpu-'${CODEBUILD_BUILD_ID#*:}'"]
          }],
          "test": [
            "tox -e py37 -- test/integration/sagemaker/test_ei.py -n 8 --versions 1.14 --region {region} --registry {aws-id}"
          ]
        }]' > deployments.json
        fi

      # publish the release to github
      - if is-release-build; then git-release --publish; fi

artifacts:
  files:
    - deployments.json
  name: ARTIFACT_1


================================================
FILE: docker/1.11/Dockerfile.cpu
================================================
ARG TFS_VERSION

FROM tensorflow/serving:${TFS_VERSION} as tfs
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

COPY --from=tfs /usr/bin/tensorflow_model_server /usr/bin/tensorflow_model_server

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.11.1

COPY ./ /

ARG TFS_SHORT_VERSION
ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.11/Dockerfile.eia
================================================
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG TFS_SHORT_VERSION

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.11.1

COPY ./ /

RUN mv amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server && \
    chmod +x /usr/bin/tensorflow_model_server

ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.11/Dockerfile.gpu
================================================
ARG TFS_VERSION

FROM tensorflow/serving:${TFS_VERSION}-gpu as tfs
FROM nvidia/cuda:9.0-base-ubuntu16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

COPY --from=tfs /usr/bin/tensorflow_model_server /usr/bin/tensorflow_model_server

# https://github.com/tensorflow/serving/blob/1.12.0/tensorflow_serving/tools/docker/Dockerfile.gpu
ENV NCCL_VERSION=2.2.13
ENV CUDNN_VERSION=7.2.1.38
ENV TF_TENSORRT_VERSION=4.1.2

RUN \
    apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        cuda-command-line-tools-9-0 \
        cuda-command-line-tools-9-0 \
        cuda-cublas-9-0 \
        cuda-cufft-9-0 \
        cuda-curand-9-0 \
        cuda-cusolver-9-0 \
        cuda-cusparse-9-0 \
        libcudnn7=${CUDNN_VERSION}-1+cuda9.0 \
        libnccl2=${NCCL_VERSION}-1+cuda9.0 \
        libgomp1 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda9.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
RUN apt-get update && \
    apt-get install --no-install-recommends \
        nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda9.0 && \
    apt-get update && \
    apt-get install --no-install-recommends \
        libnvinfer4=${TF_TENSORRT_VERSION}-1+cuda9.0 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* && \
    rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* && \
    rm /usr/lib/x86_64-linux-gnu/libnvparsers*

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.11.1

COPY ./ /

ARG TFS_SHORT_VERSION
ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.12/Dockerfile.cpu
================================================
ARG TFS_VERSION

FROM tensorflow/serving:${TFS_VERSION} as tfs
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

COPY --from=tfs /usr/bin/tensorflow_model_server /usr/bin/tensorflow_model_server

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.12.0

COPY ./ /

ARG TFS_SHORT_VERSION
ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.12/Dockerfile.eia
================================================
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG TFS_SHORT_VERSION

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.12.0

COPY ./ /

RUN mv amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server && \
    chmod +x /usr/bin/tensorflow_model_server

ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.12/Dockerfile.gpu
================================================
ARG TFS_VERSION

FROM tensorflow/serving:${TFS_VERSION}-gpu as tfs
FROM nvidia/cuda:9.0-base-ubuntu16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

COPY --from=tfs /usr/bin/tensorflow_model_server /usr/bin/tensorflow_model_server

# https://github.com/tensorflow/serving/blob/1.12.0/tensorflow_serving/tools/docker/Dockerfile.gpu
ENV NCCL_VERSION=2.2.13
ENV CUDNN_VERSION=7.2.1.38
ENV TF_TENSORRT_VERSION=4.1.2

RUN \
    apt-get update && apt-get install -y --no-install-recommends \
        ca-certificates \
        cuda-command-line-tools-9-0 \
        cuda-command-line-tools-9-0 \
        cuda-cublas-9-0 \
        cuda-cufft-9-0 \
        cuda-curand-9-0 \
        cuda-cusolver-9-0 \
        cuda-cusparse-9-0 \
        libcudnn7=${CUDNN_VERSION}-1+cuda9.0 \
        libnccl2=${NCCL_VERSION}-1+cuda9.0 \
        libgomp1 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/*

# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda9.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
RUN apt-get update && \
    apt-get install --no-install-recommends \
        nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda9.0 && \
    apt-get update && \
    apt-get install --no-install-recommends \
        libnvinfer4=${TF_TENSORRT_VERSION}-1+cuda9.0 && \
    apt-get clean && \
    rm -rf /var/lib/apt/lists/* && \
    rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* && \
    rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* && \
    rm /usr/lib/x86_64-linux-gnu/libnvparsers*

# nginx + njs
RUN \
    apt-get update && \
    apt-get -y install --no-install-recommends curl && \
    curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - && \
    echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list && \
    apt-get update && \
    apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools && \
    apt-get clean

# cython, falcon, gunicorn, tensorflow-serving
RUN \
    pip3 install --no-cache-dir cython falcon gunicorn gevent requests grpcio protobuf && \
    pip3 install --no-dependencies --no-cache-dir tensorflow-serving-api==1.12.0

COPY ./ /


ARG TFS_SHORT_VERSION
ENV SAGEMAKER_TFS_VERSION "${TFS_SHORT_VERSION}"
ENV PATH "$PATH:/sagemaker"


================================================
FILE: docker/1.13/Dockerfile.cpu
================================================
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=1.13

# See http://bugs.python.org/issue19846
ENV LANG C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl gnupg2 ca-certificates git wget vim build-essential zlib1g-dev \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    awscli==1.16.130 \
    cython==0.29.10 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.21.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.13.0

COPY ./ /

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

RUN curl https://s3-us-west-2.amazonaws.com/tensorflow-aws/1.13/Serving/CPU-WITH-MKL/libiomp5.so -o /usr/local/lib/libiomp5.so
RUN curl https://s3-us-west-2.amazonaws.com/tensorflow-aws/1.13/Serving/CPU-WITH-MKL/libmklml_intel.so -o /usr/local/lib/libmklml_intel.so

RUN curl https://s3-us-west-2.amazonaws.com/tensorflow-aws/1.13/Serving/CPU-WITH-MKL/tensorflow_model_server -o tensorflow_model_server \
 && chmod 555 tensorflow_model_server \
 && cp tensorflow_model_server /usr/bin/tensorflow_model_server \
 && rm -f tensorflow_model_server

# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.13/Dockerfile.eia
================================================
FROM ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PIP=pip3
ARG TFS_SHORT_VERSION

ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli==1.16.130 \
    cython==0.29.10 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.21.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.13.0

COPY ./ /

RUN mv amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server && \
    chmod +x /usr/bin/tensorflow_model_server


================================================
FILE: docker/1.13/Dockerfile.gpu
================================================
FROM nvidia/cuda:10.0-base-ubuntu16.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Add arguments to achieve the version, python and url
# PYTHON=python for 2.7, PYTHON=python3 for 3.5, 3.6 is not available directly on 16.04
ARG PYTHON=python3
#  PIP=pip for 2.7, PIP=pip3 for 3.5, 3.6 is not available directly on 16.04
ARG PIP=pip3
ARG PYTHON_VERSION=3.6.6
ARG TFS_SHORT_VERSION=1.13

# See http://bugs.python.org/issue19846
ENV LANG C.UTF-8
ENV NCCL_VERSION=2.4.7-1+cuda10.0
ENV CUDNN_VERSION=7.5.1.10-1+cuda10.0
ENV TF_TENSORRT_VERSION=5.0.2
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ca-certificates \
    cuda-command-line-tools-10-0 \
    cuda-cublas-10-0 \
    cuda-cufft-10-0 \
    cuda-curand-10-0 \
    cuda-cusolver-10-0 \
    cuda-cusparse-10-0 \
    libcudnn7=${CUDNN_VERSION} \
    libnccl2=${NCCL_VERSION} \
    libgomp1 \
    curl \
    git \
    wget \
    vim \
    #next two lines are needed to add python-3.6 should be removed from ubuntu-16.10
    build-essential \
    zlib1g-dev \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda10.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    nvinfer-runtime-trt-repo-ubuntu1604-${TF_TENSORRT_VERSION}-ga-cuda10.0 \
 && apt-get update \
 && apt-get install -y --no-install-recommends \
    libnvinfer5=${TF_TENSORRT_VERSION}-1+cuda10.0 \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* \
 && rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* \
 && rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* \
 && rm /usr/lib/x86_64-linux-gnu/libnvparsers* \
 && rm -rf /var/lib/apt/lists/*

RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz \
 && tar -xvf Python-$PYTHON_VERSION.tgz && cd Python-$PYTHON_VERSION \
 && ./configure && make && make install \
 && apt-get update && apt-get install -y --no-install-recommends \
    libreadline-gplv2-dev \
    libncursesw5-dev \
    libssl-dev \
    libsqlite3-dev \
    tk-dev libgdbm-dev \
    libc6-dev libbz2-dev \
 && rm -rf /var/lib/apt/lists/* \
 && make && make install \
 && rm -rf ../Python-$PYTHON_VERSION* \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN curl https://s3-us-west-2.amazonaws.com/tensorflow-aws/1.13/Serving/GPU/tensorflow_model_server --output tensorflow_model_server && \
chmod 555 tensorflow_model_server && cp tensorflow_model_server /usr/bin/tensorflow_model_server && \
rm -f tensorflow_model_server

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    boto3 \
    awscli==1.16.130 \
    cython==0.29.10 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.21.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api-gpu==1.13.0

COPY ./ /

# Expose gRPC and REST port
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.14/Dockerfile.cpu
================================================
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=1.14

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl gnupg2 ca-certificates git wget vim build-essential zlib1g-dev \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx nginx-module-njs python3 python3-pip python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli==1.16.196 \
    cython==0.29.12 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.14.0

COPY ./ /

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libiomp5.so -o /usr/local/lib/libiomp5.so
RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libmklml_intel.so -o /usr/local/lib/libmklml_intel.so

RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/1.14/Serving/CPU-WITH-MKL/tensorflow_model_server -o tensorflow_model_server && \
chmod 555 tensorflow_model_server && cp tensorflow_model_server /usr/bin/tensorflow_model_server && \
rm -f tensorflow_model_server

# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.14/Dockerfile.eia
================================================
FROM public.ecr.aws/e2s1w5p1/ubuntu:16.04
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG TFS_SHORT_VERSION=1.14
ARG S3_TF_VERSION=1-14-0
ARG S3_TF_EI_VERSION=1-4
ARG PYTHON=python3
ARG PYTHON_VERSION=3.6.6
ARG HEALTH_CHECK_VERSION=1.5.3

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV MODEL_BASE_PATH=/models
ENV MODEL_NAME=model
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    build-essential \
    ca-certificates \
    curl \
    git \
    gnupg2 \
    vim \
    wget \
    zlib1g-dev \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx wget nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz \
 && tar -xvf Python-$PYTHON_VERSION.tgz \
 && cd Python-$PYTHON_VERSION \
 && ./configure \
 && make \
 && make install \
 && apt-get update \
 && apt-get install -y --no-install-recommends \
    libbz2-dev \
    libc6-dev \
    libgdbm-dev \
    libncursesw5-dev \
    libreadline-gplv2-dev \
    libsqlite3-dev \
    libssl-dev \
    tk-dev \
 && rm -rf /var/lib/apt/lists/* \
 && make \
 && make install \
 && rm -rf ../Python-$PYTHON_VERSION* \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip \
 && ln -s $(which ${PYTHON}) /usr/local/bin/python

# Some TF tools expect a "python" binary
RUN pip install -U --no-cache-dir --upgrade \
    pip \
    setuptools

# cython, falcon, gunicorn, grpc
RUN pip install --no-cache-dir \
    cython==0.29.13 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.22.0 \
    docutils==0.14 \
    awscli==1.16.196 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && pip install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.14.0

COPY sagemaker /sagemaker

RUN wget https://amazonei-tools.s3.amazonaws.com/v${HEALTH_CHECK_VERSION}/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -O /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && tar -xvf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -C /opt/ \
 && rm -rf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && chmod a+x /opt/ei_tools/bin/health_check \
 && mkdir -p /opt/ei_health_check/bin \
 && ln -s /opt/ei_tools/bin/health_check /opt/ei_health_check/bin/health_check \
 && ln -s /opt/ei_tools/lib /opt/ei_health_check/lib

# Expose ports
EXPOSE 8500 8501

RUN wget https://amazonei-tensorflow.s3.amazonaws.com/tensorflow-serving/v1.14/ubuntu/archive/tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
            -O /tmp/tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
 && cd /tmp \
 && tar zxf tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
 && mv tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}/amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server \
 && chmod +x /usr/bin/tensorflow_model_server \
 && rm -rf tensorflow-serving-${S3_TF_VERSION}*

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.14/Dockerfile.gpu
================================================
FROM nvidia/cuda:10.0-base-ubuntu16.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Add arguments to achieve the version, python and url
#  PYTHON=python for 2.7
#  PYTHON=python3 for 3.5, 3.6 is not available directly on 16.04
ARG PYTHON=python3
#  PIP=pip for 2.7
#  PIP=pip3 for 3.5, 3.6 is not available directly on 16.04
ARG PIP=pip3
ARG PYTHON_VERSION=3.6.6
ARG TFS_SHORT_VERSION=1.14

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
ENV NCCL_VERSION=2.4.7-1+cuda10.0
ENV CUDNN_VERSION=7.5.1.10-1+cuda10.0
ENV TF_TENSORRT_VERSION=5.0.2
ENV PYTHONDONTWRITEBYTECODE=1
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ca-certificates \
    cuda-command-line-tools-10-0 \
    cuda-cublas-10-0 \
    cuda-cufft-10-0 \
    cuda-curand-10-0 \
    cuda-cusolver-10-0 \
    cuda-cusparse-10-0 \
    libcudnn7=${CUDNN_VERSION} \
    libnccl2=${NCCL_VERSION} \
    libgomp1 \
    curl \
    git \
    wget \
    vim \
    #next two lines are needed to add python-3.6 should be removed from ubuntu-16.10
    build-essential \
    zlib1g-dev \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1604-4.0.1-ga-cuda10.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
RUN apt-get update \
 && apt-get install -y --no-install-recommends nvinfer-runtime-trt-repo-ubuntu1604-${TF_TENSORRT_VERSION}-ga-cuda10.0 \
 && apt-get update \
 && apt-get install -y --no-install-recommends libnvinfer5=${TF_TENSORRT_VERSION}-1+cuda10.0 \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* \
 && rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* \
 && rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* \
 && rm /usr/lib/x86_64-linux-gnu/libnvparsers*

RUN wget https://www.python.org/ftp/python/$PYTHON_VERSION/Python-$PYTHON_VERSION.tgz \
 && tar -xvf Python-$PYTHON_VERSION.tgz \
 && cd Python-$PYTHON_VERSION \
 && ./configure \
 && make \
 && make install \
 && apt-get update \
 && apt-get install -y --no-install-recommends libreadline-gplv2-dev libncursesw5-dev libssl-dev libsqlite3-dev tk-dev libgdbm-dev libc6-dev libbz2-dev \
 && make \
 && make install \
 && rm -rf ../Python-$PYTHON_VERSION* \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/1.14/Serving/GPU/tensorflow_model_server --output tensorflow_model_server \
 && chmod 555 tensorflow_model_server && cp tensorflow_model_server /usr/bin/tensorflow_model_server \
 && rm -f tensorflow_model_server

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends curl gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ xenial nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends nginx nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    boto3 \
    awscli==1.16.196 \
    cython==0.29.12 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api-gpu==1.14.0

COPY ./ /

# Expose gRPC and REST port
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.15/Dockerfile.cpu
================================================
FROM public.ecr.aws/ubuntu/ubuntu:18.04

LABEL maintainer="Amazon AI"
# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
LABEL com.amazonaws.sagemaker.capabilities.multi-models=true

# Add arguments to achieve the version, python and url
ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=1.15.2
ARG TF_S3_URL=https://tensorflow-aws.s3-us-west-2.amazonaws.com
ARG TF_MODEL_SERVER_SOURCE=${TF_S3_URL}/${TFS_SHORT_VERSION}/Serving/CPU-WITH-MKL/tensorflow_model_server

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# To prevent user interaction when installing time zone data package
ENV DEBIAN_FRONTEND=noninteractive

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
    ca-certificates \
    git \
    wget \
    vim \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli \
    boto3 \
    pyYAML==5.3.1 \
    cython==0.29.12 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.15.0

COPY sagemaker /sagemaker

WORKDIR /

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

RUN curl ${TF_S3_URL}/MKL-Libraries/libiomp5.so -o /usr/local/lib/libiomp5.so \
 && curl ${TF_S3_URL}/MKL-Libraries/libmklml_intel.so -o /usr/local/lib/libmklml_intel.so

RUN curl ${TF_MODEL_SERVER_SOURCE} -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.15/Dockerfile.eia
================================================
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Add arguments to achieve the version, python and url
ARG PYTHON=python3
ARG PIP=pip3
ARG HEALTH_CHECK_VERSION=1.6.3
ARG S3_TF_EI_VERSION=1-5
ARG S3_TF_VERSION=1-15-2
#This is the serving version not TF version
ARG TFS_SHORT_VERSION=1-15-0


# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# To prevent user interaction when installing time zone data package
ENV DEBIAN_FRONTEND=noninteractive

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
    ca-certificates \
    git \
    wget \
    vim \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli==1.18.32 \
    cython==0.29.16 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.23.0 \
    grpcio==1.27.2 \
    protobuf==3.11.3 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==1.15.0

COPY sagemaker /sagemaker

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

# Get EI tools
RUN wget https://amazonei-tools.s3.amazonaws.com/v${HEALTH_CHECK_VERSION}/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -O /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && tar -xvf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -C /opt/ \
 && rm -rf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && chmod a+x /opt/ei_tools/bin/health_check \
 && mkdir -p /opt/ei_health_check/bin \
 && ln -s /opt/ei_tools/bin/health_check /opt/ei_health_check/bin/health_check \
 && ln -s /opt/ei_tools/lib /opt/ei_health_check/lib

RUN wget https://amazonei-tensorflow.s3.amazonaws.com/tensorflow-serving/v1.15/ubuntu/archive/tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
 -O /tmp/tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
 && cd /tmp \
 && tar zxf tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}.tar.gz \
 && mv tensorflow-serving-${S3_TF_VERSION}-ubuntu-ei-${S3_TF_EI_VERSION}/amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server \
 && chmod +x /usr/bin/tensorflow_model_server \
 && rm -rf tensorflow-serving-${S3_TF_VERSION}*


# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/1.15/Dockerfile.gpu
================================================
FROM nvidia/cuda:10.0-base-ubuntu18.04

LABEL maintainer="Amazon AI"
# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Add arguments to achieve the version, python and url
ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=1.15.2
ARG TF_MODEL_SERVER_SOURCE=https://tensorflow-aws.s3-us-west-2.amazonaws.com/${TFS_SHORT_VERSION}/Serving/GPU/tensorflow_model_server

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
ENV NCCL_VERSION=2.4.7-1+cuda10.0
ENV CUDNN_VERSION=7.5.1.10-1+cuda10.0
ENV TF_TENSORRT_VERSION=5.0.2
ENV TF_TENSORRT_LIB_VERSION=5.1.2
ENV PYTHONDONTWRITEBYTECODE=1
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# Prevent docker build from getting stopped by request for user interaction
ENV DEBIAN_FRONTEND=noninteractive

# https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212771
# Fix cuda repo's GPG key. Nvidia is no longer updating the machine-learning repo.
# Need to manually pull and install necessary debs to continue using these versions.
RUN rm /etc/apt/sources.list.d/cuda.list \
&& apt-key del 7fa2af80 \
&& apt-get update && apt-get install -y --no-install-recommends wget \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \
&& dpkg -i cuda-keyring_1.0-1_all.deb \
&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_${CUDNN_VERSION}_amd64.deb \
&& dpkg -i libcudnn7_${CUDNN_VERSION}_amd64.deb \
&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnccl2_${NCCL_VERSION}_amd64.deb \
&& dpkg -i libnccl2_${NCCL_VERSION}_amd64.deb \
&& rm *.deb

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ca-certificates \
    cuda-command-line-tools-10-0 \
    cuda-cublas-10-0 \
    cuda-cufft-10-0 \
    cuda-curand-10-0 \
    cuda-cusolver-10-0 \
    cuda-cusparse-10-0 \
    libgomp1 \
    curl \
    git \
    wget \
    vim \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    boto3 \
    awscli==1.18.34 \
    pyYAML==5.3.1 \
    cython==0.29.12 \
    falcon==2.0.0 \
    gunicorn==19.9.0 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.24.1 \
    protobuf==3.10.0 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api-gpu==1.15.0

# https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212771
# Fix cuda repo's GPG key. Nvidia is no longer updating the machine-learning repo.
# Need to manually pull and install necessary debs to continue using these versions.
RUN wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvinfer-runtime-trt-repo-ubuntu1804-${TF_TENSORRT_VERSION}-ga-cuda10.0_1-1_amd64.deb \
&& dpkg -i nvinfer-runtime-trt-repo-ubuntu1804-${TF_TENSORRT_VERSION}-ga-cuda10.0_1-1_amd64.deb \
&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer5_${TF_TENSORRT_LIB_VERSION}-1+cuda10.0_amd64.deb \
&& dpkg -i libnvinfer5_${TF_TENSORRT_LIB_VERSION}-1+cuda10.0_amd64.deb \
&& rm *.deb \
&& rm -rf /var/lib/apt/lists/* \
&& rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* \
&& rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* \
&& rm /usr/lib/x86_64-linux-gnu/libnvparsers*

COPY sagemaker /sagemaker

RUN curl ${TF_MODEL_SERVER_SOURCE} -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose gRPC and REST port
EXPOSE 8500 8501
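# For reference, a running server's REST endpoint can be exercised with something
# like the following (hypothetical model name "model" and input shape):
#   curl -d '{"instances": [[1.0, 2.0]]}' http://localhost:8501/v1/models/model:predict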

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh
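# For reference, the generated /usr/bin/tf_serving_entrypoint.sh looks roughly like:
#
#   #!/bin/bash
#   /usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 \
#       --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"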

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/2.0/Dockerfile.cpu
================================================
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=2.0.1
ARG TFS_URL=https://tensorflow-aws.s3-us-west-2.amazonaws.com/${TFS_SHORT_VERSION}/Serving/CPU-WITH-MKL/tensorflow_model_server

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
ENV DEBIAN_FRONTEND=noninteractive

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
    ca-certificates \
    git \
    wget \
    vim \
    build-essential \
    zlib1g-dev \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli==1.16.303 \
    cython==0.29.14 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.26.0 \
    protobuf==3.11.1 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==2.0

COPY ./sagemaker /sagemaker

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libiomp5.so -o /usr/local/lib/libiomp5.so
RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libmklml_intel.so -o /usr/local/lib/libmklml_intel.so

RUN curl $TFS_URL -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow-2.0.1/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/2.0/Dockerfile.eia
================================================
FROM ubuntu:18.04

LABEL maintainer="Amazon AI"
# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT
# https://docs.aws.amazon.com/sagemaker/latest/dg/inference-pipeline-real-time.html
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

# Add arguments to achieve the version, python and url
ARG PYTHON=python3
ARG PIP=pip3
ARG HEALTH_CHECK_VERSION=1.6.3
ARG S3_TF_EI_VERSION=1-5
ARG S3_TF_VERSION=2-0-0


# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${S3_TF_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# To prevent user interaction when installing time zone data package
ENV DEBIAN_FRONTEND=noninteractive

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
    ca-certificates \
    git \
    wget \
    vim \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli==1.18.32 \
    cython==0.29.16 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.23.0 \
    grpcio==1.27.2 \
    protobuf==3.11.3 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==2.0.0

COPY sagemaker /sagemaker

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python \
 && ln -s /usr/local/bin/pip3 /usr/bin/pip

# Get EI tools
RUN wget https://amazonei-tools.s3.amazonaws.com/v${HEALTH_CHECK_VERSION}/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -O /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && tar -xvf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz -C /opt/ \
 && rm -rf /opt/ei_tools_${HEALTH_CHECK_VERSION}.tar.gz \
 && chmod a+x /opt/ei_tools/bin/health_check \
 && mkdir -p /opt/ei_health_check/bin \
 && ln -s /opt/ei_tools/bin/health_check /opt/ei_health_check/bin/health_check \
 && ln -s /opt/ei_tools/lib /opt/ei_health_check/lib

RUN wget https://amazonei-tensorflow.s3.amazonaws.com/tensorflow-serving/v2.0/archive/tensorflow-serving-${S3_TF_VERSION}-ei-${S3_TF_EI_VERSION}.tar.gz \
 -O /tmp/tensorflow-serving-${S3_TF_VERSION}-ei-${S3_TF_EI_VERSION}.tar.gz \
 && cd /tmp \
 && tar zxf tensorflow-serving-${S3_TF_VERSION}-ei-${S3_TF_EI_VERSION}.tar.gz \
 && mv tensorflow-serving-${S3_TF_VERSION}-ei-${S3_TF_EI_VERSION}/amazonei_tensorflow_model_server /usr/bin/tensorflow_model_server \
 && chmod +x /usr/bin/tensorflow_model_server \
 && rm -rf tensorflow-serving-${S3_TF_VERSION}*


# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow-2.0/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/2.0/Dockerfile.gpu
================================================
FROM nvidia/cuda:10.0-base-ubuntu18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=2.0.1
ARG TFS_URL=https://tensorflow-aws.s3-us-west-2.amazonaws.com/${TFS_SHORT_VERSION}/Serving/GPU/tensorflow_model_server

ENV NCCL_VERSION=2.4.7-1+cuda10.0
ENV CUDNN_VERSION=7.5.1.10-1+cuda10.0
ENV TF_TENSORRT_VERSION=5.0.2

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# Prevent interactive prompts (e.g. from the tzdata package) during installs
ENV DEBIAN_FRONTEND=noninteractive

RUN apt-get update \
 && apt-get install -y --no-install-recommends \
    ca-certificates \
    cuda-command-line-tools-10-0 \
    cuda-cublas-10-0 \
    cuda-cufft-10-0 \
    cuda-curand-10-0 \
    cuda-cusolver-10-0 \
    cuda-cusparse-10-0 \
    libcudnn7=${CUDNN_VERSION} \
    libnccl2=${NCCL_VERSION} \
    libgomp1 \
    curl \
    git \
    wget \
    vim \
    build-essential \
    zlib1g-dev \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# The 'apt-get install' of nvinfer-runtime-trt-repo-ubuntu1804-4.0.1-ga-cuda10.0
# adds a new list which contains libnvinfer library, so it needs another
# 'apt-get update' to retrieve that list before it can actually install the
# library.
# We don't install libnvinfer-dev since we don't need to build against TensorRT,
# and libnvinfer4 doesn't contain libnvinfer.a static library.
RUN apt-get update \
 && apt-get install -y --no-install-recommends nvinfer-runtime-trt-repo-ubuntu1804-${TF_TENSORRT_VERSION}-ga-cuda10.0 \
 && apt-get update \
 && apt-get install -y --no-install-recommends libnvinfer5=${TF_TENSORRT_VERSION}-1+cuda10.0 \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/* \
 && rm /usr/lib/x86_64-linux-gnu/libnvinfer_plugin* \
 && rm /usr/lib/x86_64-linux-gnu/libnvcaffe_parser* \
 && rm /usr/lib/x86_64-linux-gnu/libnvparsers*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    boto3 \
    awscli==1.16.303 \
    cython==0.29.14 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.26.0 \
    protobuf==3.11.1 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api-gpu==2.0

COPY ./sagemaker /sagemaker

RUN curl $TFS_URL -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose gRPC and REST port
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow-2.0.1/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/2.1/Dockerfile.cpu
================================================
FROM public.ecr.aws/ubuntu/ubuntu:18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true
LABEL com.amazonaws.sagemaker.capabilities.multi-models=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=2.1
ARG TFS_URL=https://tensorflow-aws.s3-us-west-2.amazonaws.com/2.1/Serving/CPU-WITH-MKL/tensorflow_model_server

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV LD_LIBRARY_PATH='/usr/local/lib:$LD_LIBRARY_PATH'
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
ENV DEBIAN_FRONTEND=noninteractive

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
    ca-certificates \
    git \
    wget \
    vim \
    build-essential \
    zlib1g-dev \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade pip setuptools

# cython, falcon, gunicorn, grpc
RUN ${PIP} install --no-cache-dir \
    awscli \
    boto3 \
    cython==0.29.14 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.27.1 \
    protobuf==3.11.1 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api==2.1.0

COPY ./sagemaker /sagemaker

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libiomp5.so -o /usr/local/lib/libiomp5.so
RUN curl https://tensorflow-aws.s3-us-west-2.amazonaws.com/MKL-Libraries/libmklml_intel.so -o /usr/local/lib/libmklml_intel.so

RUN curl $TFS_URL -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose ports
# gRPC and REST
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow-2.1/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/2.1/Dockerfile.gpu
================================================
FROM nvidia/cuda:10.1-base-ubuntu18.04

LABEL maintainer="Amazon AI"
LABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true

ARG PYTHON=python3
ARG PIP=pip3
ARG TFS_SHORT_VERSION=2.1
ARG TFS_URL=https://tensorflow-aws.s3-us-west-2.amazonaws.com/2.1/Serving/GPU/tensorflow_model_server

ENV NCCL_VERSION=2.4.7-1+cuda10.1
ENV CUDNN_VERSION=7.6.2.24-1+cuda10.1
ENV TF_TENSORRT_VERSION=5.0.2
ENV TF_TENSORRT_LIB_VERSION=6.0.1

# See http://bugs.python.org/issue19846
ENV LANG=C.UTF-8
# Python won’t try to write .pyc or .pyo files on the import of source modules
ENV PYTHONDONTWRITEBYTECODE=1
ENV PYTHONUNBUFFERED=1
ENV SAGEMAKER_TFS_VERSION="${TFS_SHORT_VERSION}"
ENV PATH="$PATH:/sagemaker"
ENV MODEL_BASE_PATH=/models
# The only required piece is the model name in order to differentiate endpoints
ENV MODEL_NAME=model
# Prevent interactive prompts (e.g. from the tzdata package) during installs
ENV DEBIAN_FRONTEND=noninteractive

# https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212771
# Fix cuda repo's GPG key. Nvidia is no longer updating the machine-learning repo.
# Need to manually pull and install necessary debs to continue using these versions.
RUN rm /etc/apt/sources.list.d/cuda.list \
&& apt-key del 7fa2af80 \
&& apt-get update && apt-get install -y --no-install-recommends wget \
&& wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-keyring_1.0-1_all.deb \
&& dpkg -i cuda-keyring_1.0-1_all.deb \
&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libcudnn7_${CUDNN_VERSION}_amd64.deb \
&& dpkg -i libcudnn7_${CUDNN_VERSION}_amd64.deb \
&& wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnccl2_${NCCL_VERSION}_amd64.deb \
&& dpkg -i libnccl2_${NCCL_VERSION}_amd64.deb \
&& rm *.deb

# Allow unauthenticated packages and downgrades for the special libcublas library
RUN apt-get update \
 && apt-get install -y --no-install-recommends --allow-unauthenticated --allow-downgrades \
    ca-certificates \
    cuda-command-line-tools-10-1 \
    cuda-cufft-10-1 \
    cuda-curand-10-1 \
    cuda-cusolver-10-1 \
    cuda-cusparse-10-1 \
    #cuda-cublas-dev not available with 10-1, install libcublas instead
    libcublas10=10.1.0.105-1 \
    libcublas-dev=10.1.0.105-1 \
    libgomp1 \
    curl \
    git \
    wget \
    vim \
    build-essential \
    zlib1g-dev \
    python3 \
    python3-pip \
    python3-setuptools \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

RUN ${PIP} --no-cache-dir install --upgrade \
    pip \
    setuptools

# Some TF tools expect a "python" binary
RUN ln -s $(which ${PYTHON}) /usr/local/bin/python

# nginx + njs
RUN apt-get update \
 && apt-get -y install --no-install-recommends \
    curl \
    gnupg2 \
 && curl -s http://nginx.org/keys/nginx_signing.key | apt-key add - \
 && echo 'deb http://nginx.org/packages/ubuntu/ bionic nginx' >> /etc/apt/sources.list \
 && apt-get update \
 && apt-get -y install --no-install-recommends \
    nginx \
    nginx-module-njs \
 && apt-get clean \
 && rm -rf /var/lib/apt/lists/*

# https://forums.developer.nvidia.com/t/notice-cuda-linux-repository-key-rotation/212771
# Nvidia is no longer updating the machine-learning repo.
# Need to manually pull and install necessary debs to continue using these versions.
# nvinfer-runtime-trt-repo doesn't have a 1804-cuda10.1 version.
RUN wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvinfer-runtime-trt-repo-ubuntu1804-${TF_TENSORRT_VERSION}-ga-cuda10.0_1-1_amd64.deb \
 && dpkg -i nvinfer-runtime-trt-repo-ubuntu1804-${TF_TENSORRT_VERSION}-ga-cuda10.0_1-1_amd64.deb \
 && wget https://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/libnvinfer6_${TF_TENSORRT_LIB_VERSION}-1+cuda10.1_amd64.deb \
 && dpkg -i libnvinfer6_${TF_TENSORRT_LIB_VERSION}-1+cuda10.1_amd64.deb \
 && rm *.deb \
 && rm -rf /var/lib/apt/lists/*

# cython, falcon, gunicorn, grpc
RUN ${PIP} install -U --no-cache-dir \
    boto3 \
    awscli \
    cython==0.29.14 \
    falcon==2.0.0 \
    gunicorn==20.0.4 \
    gevent==1.4.0 \
    requests==2.22.0 \
    grpcio==1.27.1  \
    protobuf==3.11.1 \
# using --no-dependencies to avoid installing tensorflow binary
 && ${PIP} install --no-dependencies --no-cache-dir \
    tensorflow-serving-api-gpu==2.1.0

COPY ./sagemaker /sagemaker

RUN curl $TFS_URL -o /usr/bin/tensorflow_model_server \
 && chmod 555 /usr/bin/tensorflow_model_server

# Expose gRPC and REST port
EXPOSE 8500 8501

# Set where models should be stored in the container
RUN mkdir -p ${MODEL_BASE_PATH}

# Create a script that runs the model server so we can use environment variables
# while also passing in arguments from the docker command line
RUN echo '#!/bin/bash \n\n' > /usr/bin/tf_serving_entrypoint.sh \
 && echo '/usr/bin/tensorflow_model_server --port=8500 --rest_api_port=8501 --model_name=${MODEL_NAME} --model_base_path=${MODEL_BASE_PATH}/${MODEL_NAME} "$@"' >> /usr/bin/tf_serving_entrypoint.sh \
 && chmod +x /usr/bin/tf_serving_entrypoint.sh

ADD https://raw.githubusercontent.com/aws/aws-deep-learning-containers-utils/master/deep_learning_container.py /usr/local/bin/deep_learning_container.py

RUN chmod +x /usr/local/bin/deep_learning_container.py

RUN curl https://aws-dlc-licenses.s3.amazonaws.com/tensorflow-2.1/license.txt -o /license.txt

CMD ["/usr/bin/tf_serving_entrypoint.sh"]


================================================
FILE: docker/__init__.py
================================================


================================================
FILE: docker/build_artifacts/__init__.py
================================================


================================================
FILE: docker/build_artifacts/deep_learning_container.py
================================================
# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
import re
import json
import logging
import requests


def _validate_instance_id(instance_id):
    """
    Validate instance ID
    """
    instance_id_regex = r"^(i-\S{17})"
    compiled_regex = re.compile(instance_id_regex)
    match = compiled_regex.match(instance_id)

    if not match:
        return None

    return match.group(1)


def _retrieve_instance_id():
    """
    Retrieve instance ID from instance metadata service
    """
    instance_id = None
    url = "http://169.254.169.254/latest/meta-data/instance-id"
    response = requests_helper(url, timeout=0.1)

    if response is not None:
        instance_id = _validate_instance_id(response.text)

    return instance_id


def _retrieve_instance_region():
    """
    Retrieve instance region from instance metadata service
    """
    region = None
    valid_regions = [
        "ap-northeast-1",
        "ap-northeast-2",
        "ap-southeast-1",
        "ap-southeast-2",
        "ap-south-1",
        "ca-central-1",
        "eu-central-1",
        "eu-north-1",
        "eu-west-1",
        "eu-west-2",
        "eu-west-3",
        "sa-east-1",
        "us-east-1",
        "us-east-2",
        "us-west-1",
        "us-west-2",
    ]

    url = "http://169.254.169.254/latest/dynamic/instance-identity/document"
    response = requests_helper(url, timeout=0.1)

    if response is not None:
        response_json = json.loads(response.text)

        if response_json["region"] in valid_regions:
            region = response_json["region"]

    return region


def query_bucket():
    """
    GET request on an empty object from an Amazon S3 bucket
    """
    response = None
    instance_id = _retrieve_instance_id()
    region = _retrieve_instance_region()

    if instance_id is not None and region is not None:
        url = (
            "https://aws-deep-learning-containers-{0}.s3.{0}.amazonaws.com"
            "/dlc-containers.txt?x-instance-id={1}".format(region, instance_id)
        )
        response = requests_helper(url, timeout=0.2)

    logging.debug("Query bucket finished: {}".format(response))

    return response


def requests_helper(url, timeout):
    response = None
    try:
        response = requests.get(url, timeout=timeout)
    except requests.exceptions.RequestException as e:
        logging.error("Request exception: {}".format(e))

    return response


def main():
    """
    Invoke bucket query
    """
    # Logging is not needed for a normal run. Remove this line when debugging.
    logging.getLogger().disabled = True

    logging.basicConfig(level=logging.ERROR)
    query_bucket()


if __name__ == "__main__":
    main()


================================================
FILE: docker/build_artifacts/dockerd-entrypoint.py
================================================
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

import os.path
import subprocess
import shlex
import sys

# Run the deep learning container usage-tracking script when the SageMaker
# input config directory is not present.
if not os.path.exists("/opt/ml/input/config"):
    subprocess.call(["python", "/usr/local/bin/deep_learning_container.py", "&>/dev/null", "&"])

# Run the command passed to this entrypoint (for example, the serve script).
subprocess.check_call(shlex.split(" ".join(sys.argv[1:])))


================================================
FILE: docker/build_artifacts/sagemaker/__init__.py
================================================
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.


================================================
FILE: docker/build_artifacts/sagemaker/multi_model_utils.py
================================================
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
import fcntl
import signal
import time
from contextlib import contextmanager

MODEL_CONFIG_FILE = "/sagemaker/model-config.cfg"
DEFAULT_LOCK_FILE = "/sagemaker/lock-file.lock"


@contextmanager
def lock(path=DEFAULT_LOCK_FILE):
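    """Context manager that holds an exclusive fcntl lock on ``path`` for the with-block."""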
    f = open(path, "w", encoding="utf8")
    fd = f.fileno()
    fcntl.lockf(fd, fcntl.LOCK_EX)

    try:
        yield
    finally:
        time.sleep(1)
        fcntl.lockf(fd, fcntl.LOCK_UN)


@contextmanager
def timeout(seconds=60):
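    """Context manager that raises if the wrapped block runs longer than ``seconds`` (via SIGALRM)."""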
    def _raise_timeout_error(signum, frame):
        raise Exception(408, "Timed out after {} seconds".format(seconds))

    try:
        signal.signal(signal.SIGALRM, _raise_timeout_error)
        signal.alarm(seconds)
        yield
    finally:
        signal.alarm(0)


class MultiModelException(Exception):
    def __init__(self, code, msg):
        Exception.__init__(self, code, msg)
        self.code = code
        self.msg = msg


================================================
FILE: docker/build_artifacts/sagemaker/nginx.conf.template
================================================
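# nginx configuration template; the %NAME% placeholders (e.g. %NGINX_HTTP_PORT%,
# %TFS_UPSTREAM%) are substituted with concrete values when the container starts,
# before nginx is launched.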
load_module modules/ngx_http_js_module.so;

worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log  /dev/stderr %NGINX_LOG_LEVEL%;

worker_rlimit_nofile 4096;

events {
  worker_connections 2048;
}

http {
  include /etc/nginx/mime.types;
  default_type application/json;
  access_log /dev/stdout combined;
  js_import tensorflowServing.js;

  proxy_read_timeout %PROXY_READ_TIMEOUT%;  

  upstream tfs_upstream {
    %TFS_UPSTREAM%;
  }

  upstream gunicorn_upstream {
    server unix:/tmp/gunicorn.sock fail_timeout=1;
  }

  server {
    listen %NGINX_HTTP_PORT% deferred;
    client_max_body_size 0;
    client_body_buffer_size 100m;
    subrequest_output_buffer_size 100m;

    set $tfs_version %TFS_VERSION%;
    set $default_tfs_model %TFS_DEFAULT_MODEL_NAME%;

    location /tfs {
        rewrite ^/tfs/(.*) /$1  break;
        proxy_redirect off;
        proxy_pass_request_headers off;
        proxy_set_header Content-Type 'application/json';
        proxy_set_header Accept 'application/json';
        proxy_pass http://tfs_upstream;
    }

    location /ping {
        %FORWARD_PING_REQUESTS%;
    }

    location /invocations {
        %FORWARD_INVOCATION_REQUESTS%;
    }

    location /models {
        proxy_pass http://gunicorn_upstream/models;
    }

    location / {
        return 404 '{"error": "Not Found"}';
    }

    keepalive_timeout 3;
  }
}
  

================================================
FILE: docker/build_artifacts/sagemaker/python_service.py
================================================
# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.
import bisect
import importlib.util
import json
import logging
import os
import subprocess
import grpc

import falcon
import requests
import random

from multi_model_utils import lock, MultiModelException
import tfs_utils

SAGEMAKER_MULTI_MODEL_ENABLED = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower() == "true"
MODEL_DIR = "models" if SAGEMAKER_MULTI_MODEL_ENABLED else "model"
INFERENCE_SCRIPT_PATH = f"/opt/ml/{MODEL_DIR}/code/inference.py"

SAGEMAKER_BATCHING_ENABLED = os.environ.get("SAGEMAKER_TFS_ENABLE_BATCHING", "false").lower()
MODEL_CONFIG_FILE_PATH = "/sagemaker/model-config.cfg"
TFS_GRPC_PORTS = os.environ.get("TFS_GRPC_PORTS")
TFS_REST_PORTS = os.environ.get("TFS_REST_PORTS")
SAGEMAKER_TFS_PORT_RANGE = os.environ.get("SAGEMAKER_SAFE_PORT_RANGE")
TFS_INSTANCE_COUNT = int(os.environ.get("SAGEMAKER_TFS_INSTANCE_COUNT", "1"))

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

CUSTOM_ATTRIBUTES_HEADER = "X-Amzn-SageMaker-Custom-Attributes"


def default_handler(data, context):
    """A default inference request handler that directly send post request to TFS rest port with
    un-processed data and return un-processed response

    :param data: input data
    :param context: context instance that contains tfs_rest_uri
    :return: inference response from TFS model server
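    :note: context.rest_uri is expected to point at the local TFS REST predict endpoint
        (typically of the form http://localhost:<rest_port>/v1/models/<model>:predict)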
    """
    data = data.read().decode("utf-8")
    if not isinstance(data, str):
        data = json.loads(data)
    response = requests.post(context.rest_uri, data=data)
    return response.content, context.accept_header


class PythonServiceResource:
    def __init__(self):
        if SAGEMAKER_MULTI_MODEL_ENABLED:
            self._model_tfs_rest_port = {}
            self._model_tfs_grpc_port = {}
            self._model_tfs_pid = {}
            self._tfs_ports = self._parse_sagemaker_port_range_mme(SAGEMAKER_TFS_PORT_RANGE)
            # If Multi-Model mode is enabled, dependencies/handlers will be imported
            # during the _handle_load_model_post()
            self.model_handlers = {}
        else:
            self._tfs_grpc_ports = self._parse_concat_ports(TFS_GRPC_PORTS)
            self._tfs_rest_ports = self._parse_concat_ports(TFS_REST_PORTS)

            self._channels = {}
            for grpc_port in self._tfs_grpc_ports:
                # Initialize grpc channel here so gunicorn worker could have mapping
                # between each grpc port and channel
                self._setup_channel(grpc_port)

        if os.path.exists(INFERENCE_SCRIPT_PATH):
            # Single-Model Mode & Multi-Model Mode both use one inference.py
            self._handler, self._input_handler, self._output_handler = self._import_handlers()
            self._handlers = self._make_handler(
                self._handler, self._input_handler, self._output_handler
            )
        else:
            self._handlers = default_handler

        self._tfs_enable_batching = SAGEMAKER_BATCHING_ENABLED == "true"
        self._tfs_default_model_name = os.environ.get("TFS_DEFAULT_MODEL_NAME", "None")
        self._tfs_wait_time_seconds = int(os.environ.get("SAGEMAKER_TFS_WAIT_TIME_SECONDS", 300))

    def on_post(self, req, res, model_name=None):
        if model_name or "invocations" in req.uri:
            self._handle_invocation_post(req, res, model_name)
        else:
            data = json.loads(req.stream.read().decode("utf-8"))
            self._handle_load_model_post(res, data)

    def _parse_concat_ports(self, concat_ports):
        return concat_ports.split(",")

    def _pick_port(self, ports):
        return random.choice(ports)

    def _parse_sagemaker_port_range_mme(self, port_range):
        lower, upper = port_range.split("-")
        lower = int(lower)
        upper = lower + int((int(upper) - lower) * 0.9)  # only utilizing 90% of the ports
        rest_port = lower
        grpc_port = (lower + upper) // 2
        tfs_ports = {
            "rest_port": [port for port in range(rest_port, grpc_port)],
            "grpc_port": [port for port in range(grpc_port, upper)],
        }
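        # Worked example (hypothetical values): with SAGEMAKER_SAFE_PORT_RANGE="9000-9100",
        # lower=9000 and upper=9000 + int(100 * 0.9)=9090, so REST ports span
        # 9000-9044 and gRPC ports span 9045-9089.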
        return tfs_ports

    def _ports_available(self):
        with lock():
            rest_ports = self._tfs_ports["rest_port"]
            grpc_ports = self._tfs_ports["grpc_port"]
        return len(rest_ports) > 0 and len(grpc_ports) > 0

    def _handle_load_model_post(self, res, data):  # noqa: C901
        model_name = data["model_name"]
        base_path = data["url"]

        # model is already loaded
        if model_name in self._model_tfs_pid:
            res.status = falcon.HTTP_409
            res.body = json.dumps({"error": "Model {} is already loaded.".format(model_name)})
            return

        # check if there are available ports
        if not self._ports_available():
            res.status = falcon.HTTP_507
            res.body = json.dumps(
                {"error": "Memory exhausted: no available ports to load the model."}
            )
            return
        with lock():
            self._model_tfs_rest_port[model_name] = self._tfs_ports["rest_port"].pop()
            self._model_tfs_grpc_port[model_name] = self._tfs_ports["grpc_port"].pop()

        # validate model files are in the specified base_path
        if self.validate_model_dir(base_path):
            try:
                tfs_config = tfs_utils.create_tfs_config_individual_model(model_name, base_path)
                tfs_config_file = "/sagemaker/tfs-config/{}/model-config.cfg".format(model_name)
                log.info("tensorflow serving model config: \n%s\n", tfs_config)
                os.makedirs(os.path.dirname(tfs_config_file))
                with open(tfs_config_file, "w", encoding="utf8") as f:
                    f.write(tfs_config)

                batching_config_file = "/sagemaker/batching/{}/batching-config.cfg".format(
                    model_name
                )
                if self._tfs_enable_batching:
                    tfs_utils.create_batching_config(batching_config_file)

                cmd = tfs_utils.tfs_command(
                    self._model_tfs_grpc_port[model_name],
                    self._model_tfs_rest_port[model_name],
                    tfs_config_file,
                    self._tfs_enable_batching,
                    batching_config_file,
                )
                p = subprocess.Popen(cmd.split())

                tfs_utils.wait_for_model(
                    self._model_tfs_rest_port[model_name], model_name, self._tfs_wait_time_seconds
                )

                log.info("started tensorflow serving (pid: %d)", p.pid)
                # update model name <-> tfs pid map
                self._model_tfs_pid[model_name] = p

                res.status = falcon.HTTP_200
                res.body = json.dumps(
                    {
                        "success": "Successfully loaded model {}, "
                        "listening on rest port {} "
                        "and grpc port {}.".format(
                            model_name,
                            self._model_tfs_rest_port,
                            self._model_tfs_grpc_port,
                        )
                    }
                )
            except MultiModelException as multi_model_exception:
                self._cleanup_config_file(tfs_config_file)
                self._cleanup_config_file(batching_config_file)
                if multi_model_exception.code == 409:
                    res.status = falcon.HTTP_409
                    res.body = multi_model_exception.msg
                elif multi_model_exception.code == 408:
                    res.status = falcon.HTTP_408
                    res.body = multi_model_exception.msg
                else:
                    raise MultiModelException(falcon.HTTP_500, multi_model_exception.msg)
            except FileExistsError as e:
                res.status = falcon.HTTP_409
                res.body = json.dumps(
                    {"error": "Model {} is already loaded. {}".format(model_name, str(e))}
                )
            except OSError as os_error:
                self._cleanup_config_file(tfs_config_file)
                self._cleanup_config_file(batching_config_file)
                if os_error.errno == 12:
                    raise MultiModelException(
                        falcon.HTTP_507,
                        "Memory exhausted: " "not enough memory to start TFS instance",
                    )
                else:
                    raise MultiModelException(falcon.HTTP_500, os_error.strerror)
        else:
            res.status = falcon.HTTP_404
            res.body = json.dumps(
                {
                    "error": "Could not find valid base path {} for servable {}".format(
                        base_path, model_name
                    )
                }
            )

    def _cleanup_config_file(self, config_file):
        if os.path.exists(config_file):
            os.remove(config_file)

    def _handle_invocation_post(self, req, res, model_name=None):
        if SAGEMAKER_MULTI_MODEL_ENABLED:
            if model_name:
                if model_name not in self._model_tfs_rest_port:
                    res.status = falcon.HTTP_404
                    res.body = json.dumps(
                        {"error": "Model {} is not loaded yet.".format(model_name)}
                    )
                    return
                else:
                    log.info("model name: {}".format(model_name))
                    rest_port = self._model_tfs_rest_port[model_name]
                    log.info("rest port: {}".format(str(self._model_tfs_rest_port[model_name])))
                    grpc_port = self._model_tfs_grpc_port[model_name]
                    log.info("grpc port: {}".format(str(self._model_tfs_grpc_port[model_name])))
                    data, context = tfs_utils.parse_request(
                        req,
                        rest_port,
                        grpc_port,
                        self._tfs_default_model_name,
                        model_name=model_name,
                    )
            else:
                res.status = falcon.HTTP_400
                res.body = json.dumps({"error": "Invocation request does not contain model name."})
        else:
            # Randomly pick port used for routing incoming request.
            grpc_port = self._pick_port(self._tfs_grpc_ports)
            rest_port = self._pick_port(self._tfs_rest_ports)
            data, context = tfs_utils.parse_request(
                req,
                rest_port,
                grpc_port,
                self._tfs_default_model_name,
                channel=self._channels[grpc_port],
            )

        try:
            res.status = falcon.HTTP_200

            res.body, res.content_type = self._handlers(data, context)
        except Exception as e:  # pylint: disable=broad-except
            log.exception("exception handling request: {}".format(e))
            res.status = falcon.HTTP_500
            res.body = json.dumps({"error": str(e)}).encode("utf-8")  # pylint: disable=E1101

    def _setup_channel(self, grpc_port):
        if grpc_port not in self._channels:
            log.info("Creating grpc channel for port: %s", grpc_port)
            self._channels[grpc_port] = grpc.insecure_channel("localhost:{}".format(grpc_port))

    def _import_handlers(self):
        inference_script = INFERENCE_SCRIPT_PATH
        spec = importlib.util.spec_from_file_location("inference", inference_script)
        inference = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(inference)

        _custom_handler, _custom_input_handler, _custom_output_handler = None, None, None
        if hasattr(inference, "handler"):
            _custom_handler = inference.handler
        elif hasattr(inference, "input_handler") and hasattr(inference, "output_handler"):
            _custom_input_handler = inference.input_handler
            _custom_output_handler = inference.output_handler
        else:
            raise NotImplementedError("Handlers are not implemented correctly in user script.")

        return _custom_handler, _custom_input_handler, _custom_output_handler

    def _make_handler(self, custom_handler, custom_input_handler, custom_output_handler):
        if custom_handler:
            return custom_handler

        def handler(data, context):
            processed_input = custom_input_handler(data, context)
            response = requests.post(context.rest_uri, data=processed_input)
            return custom_output_handler(response, context)

        return handler

    def on_get(self, req, res, model_name=None):  # pylint: disable=W0613
        if model_name is None:
            models_info = {}
            uri = "http://localhost:{}/v1/models/{}"
            for model, port in self._model_tfs_rest_port.items():
                try:
                    info = json.loads(requests.get(uri.format(port, model)).content)
                    models_info[model] = info
                except ValueError as e:
                    log.exception("exception handling request: {}".format(e))
                    res.status = falcon.HTTP_500
                    res.body = json.dumps({"error": str(e)}).encode("utf-8")
                    return
            res.status = falcon.HTTP_200
            res.body = json.dumps(models_info)
        else:
            if model_name not in self._model_tfs_rest_port:
                res.status = falcon.HTTP_404
                res.body = json.dumps(
                    {"error": "Model {} is loaded yet.".format(model_name)}
                ).encode("utf-8")
            else:
                port = self._model_tfs_rest_port[model_name]
                uri = "http://localhost:{}/v1/models/{}".format(port, model_name)
                try:
                    info = json.loads(requests.get(uri).content)
                    res.status = falcon.HTTP_200
                    res.body = json.dumps({"model": info}).encode("utf-8")
                except ValueError as e:
                    log.exception("exception handling GET models request.")
                    res.status = falcon.HTTP_500
                    res.body = json.dumps({"error": str(e)}).encode("utf-8")

    def on_delete(self, req, res, model_name):  # pylint: disable=W0613
        if model_name not in self._model_tfs_pid:
            res.status = falcon.HTTP_404
            res.body = json.dumps({"error": "Model {} is not loaded yet".format(model_name)})
        else:
            try:
                self._model_tfs_pid[model_name].kill()
                os.remove("/sagemaker/tfs-config/{}/model-config.cfg".format(model_name))
                os.rmdir("/sagemaker/tfs-config/{}".format(model_name))
                release_rest_port = self._model_tfs_rest_port[model_name]
                release_grpc_port = self._model_tfs_grpc_port[model_name]
                with lock():
                    bisect.insort(self._tfs_ports["rest_port"], release_rest_port)
                    bisect.insort(self._tfs_ports["grpc_port"], release_grpc_port)
                del self._model_tfs_rest_port[model_name]
                del self._model_tfs_grpc_port[model_name]
                del self._model_tfs_pid[model_name]
                res.status = falcon.HTTP_200
                res.body = json.dumps(
                    {"success": "Successfully unloaded model {}.".format(model_name)}
                )
            except OSError as error:
                res.status = falcon.HTTP_500
                res.body = json.dumps({"error": str(error)}).encode("utf-8")

    def validate_model_dir(self, model_path):
        # model base path doesn't exist
        if not os.path.exists(model_path):
            return False
        versions = []
        for _, dirs, _ in os.walk(model_path):
            for dirname in dirs:
                if dirname.isdigit():
                    versions.append(dirname)
        return self.validate_model_versions(versions)

    def validate_model_versions(self, versions):
        if not versions:
            return False
        for v in versions:
            if v.isdigit():
                # TensorFlow Model Server will load successfully as long as at
                # least one numeric version directory is found, even if other
                # directories are not valid model versions.
                return True
        return False


class PingResource:
    def on_get(self, req, res):  # pylint: disable=W0613
        res.status = falcon.HTTP_200


class ServiceResources:
    def __init__(self):
        self._enable_model_manager = SAGEMAKER_MULTI_MODEL_ENABLED
        self._python_service_resource = PythonServiceResource()
        self._ping_resource = PingResource()

    def add_routes(self, application):
        application.add_route("/ping", self._ping_resource)
        application.add_route("/invocations", self._python_service_resource)

        if self._enable_model_manager:
            application.add_route("/models", self._python_service_resource)
            application.add_route("/models/{model_name}", self._python_service_resource)
            application.add_route("/models/{model_name}/invoke", self._python_service_resource)


app = falcon.API()
resources = ServiceResources()
resources.add_routes(app)


================================================
FILE: docker/build_artifacts/sagemaker/serve
================================================
#!/bin/bash

python3 /sagemaker/serve.py


================================================
FILE: docker/build_artifacts/sagemaker/serve.py
================================================
# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

import boto3
import logging
import os
import re
import signal
import subprocess
import tfs_utils

from contextlib import contextmanager

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

JS_PING = "js_content tensorflowServing.ping"
JS_INVOCATIONS = "js_content tensorflowServing.invocations"
GUNICORN_PING = "proxy_pass http://gunicorn_upstream/ping"
GUNICORN_INVOCATIONS = "proxy_pass http://gunicorn_upstream/invocations"
MULTI_MODEL = "s" if os.environ.get("SAGEMAKER_MULTI_MODEL", "False").lower() == "true" else ""
MODEL_DIR = f"model{MULTI_MODEL}"
CODE_DIR = "/opt/ml/{}/code".format(MODEL_DIR)
PYTHON_LIB_PATH = os.path.join(CODE_DIR, "lib")
REQUIREMENTS_PATH = os.path.join(CODE_DIR, "requirements.txt")
INFERENCE_PATH = os.path.join(CODE_DIR, "inference.py")


class ServiceManager(object):
    def __init__(self):
        self._state = "initializing"
        self._nginx = None
        self._tfs = []
        self._gunicorn = None
        self._gunicorn_command = None
        self._enable_python_service = False
        self._tfs_version = os.environ.get("SAGEMAKER_TFS_VERSION", "1.13")
        self._nginx_http_port = os.environ.get("SAGEMAKER_BIND_TO_PORT", "8080")
        self._nginx_loglevel = os.environ.get("SAGEMAKER_TFS_NGINX_LOGLEVEL", "error")
        self._tfs_default_model_name = os.environ.get("SAGEMAKER_TFS_DEFAULT_MODEL_NAME", "None")
        self._sagemaker_port_range = os.environ.get("SAGEMAKER_SAFE_PORT_RANGE", None)
        self._gunicorn_workers = os.environ.get("SAGEMAKER_GUNICORN_WORKERS", 1)
        self._gunicorn_threads = os.environ.get("SAGEMAKER_GUNICORN_THREADS", 1)
        self._gunicorn_loglevel = os.environ.get("SAGEMAKER_GUNICORN_LOGLEVEL", "info")
        self._tfs_config_path = "/sagemaker/model-config.cfg"
        self._tfs_batching_config_path = "/sagemaker/batching-config.cfg"

        _enable_batching = os.environ.get("SAGEMAKER_TFS_ENABLE_BATCHING", "false").lower()
        _enable_multi_model_endpoint = os.environ.get("SAGEMAKER_MULTI_MODEL", "false").lower()
        # Use this to specify memory that is needed to initialize CUDA/cuDNN and other GPU libraries
        self._tfs_gpu_margin = float(os.environ.get("SAGEMAKER_TFS_FRACTIONAL_GPU_MEM_MARGIN", 0.2))
        self._tfs_instance_count = int(os.environ.get("SAGEMAKER_TFS_INSTANCE_COUNT", 1))
        self._tfs_wait_time_seconds = int(os.environ.get("SAGEMAKER_TFS_WAIT_TIME_SECONDS", 300))
        self._tfs_inter_op_parallelism = os.environ.get("SAGEMAKER_TFS_INTER_OP_PARALLELISM", 0)
        self._tfs_intra_op_parallelism = os.environ.get("SAGEMAKER_TFS_INTRA_OP_PARALLELISM", 0)
        self._gunicorn_worker_class = os.environ.get("SAGEMAKER_GUNICORN_WORKER_CLASS", "gevent")
        self._gunicorn_timeout_seconds = int(
            os.environ.get("SAGEMAKER_GUNICORN_TIMEOUT_SECONDS", 30)
        )
        self._nginx_proxy_read_timeout_seconds = int(
            os.environ.get("SAGEMAKER_NGINX_PROXY_READ_TIMEOUT_SECONDS", 60))

        # Nginx proxy read timeout should not be less than the GUnicorn timeout. If it is, this
        # can result in upstream timeout errors.
        if self._gunicorn_timeout_seconds > self._nginx_proxy_read_timeout_seconds:
            log.info(
                "GUnicorn timeout was higher than Nginx proxy read timeout."
                " Setting Nginx proxy read timeout from {} seconds to {} seconds"
                " to match GUnicorn timeout.".format(
                    self._nginx_proxy_read_timeout_seconds, self._gunicorn_timeout_seconds
                )
            )
            self._nginx_proxy_read_timeout_seconds = self._gunicorn_timeout_seconds

        if os.environ.get("OMP_NUM_THREADS") is None:
            os.environ["OMP_NUM_THREADS"] = "1"

        if _enable_multi_model_endpoint not in ["true", "false"]:
            raise ValueError("SAGEMAKER_MULTI_MODEL must be 'true' or 'false'")
        self._tfs_enable_multi_model_endpoint = _enable_multi_model_endpoint == "true"

        self._need_python_service()
        log.info("PYTHON SERVICE: {}".format(str(self._enable_python_service)))

        if _enable_batching not in ["true", "false"]:
            raise ValueError("SAGEMAKER_TFS_ENABLE_BATCHING must be 'true' or 'false'")
        self._tfs_enable_batching = _enable_batching == "true"

        self._use_gunicorn = self._enable_python_service or self._tfs_enable_multi_model_endpoint

        if self._sagemaker_port_range is not None:
            parts = self._sagemaker_port_range.split("-")
            low = int(parts[0])
            hi = int(parts[1])
            self._tfs_grpc_ports = []
            self._tfs_rest_ports = []
            if low + 2 * self._tfs_instance_count > hi:
                raise ValueError(
                    "not enough ports available in SAGEMAKER_SAFE_PORT_RANGE ({})".format(
                        self._sagemaker_port_range
                    )
                )
            # select non-overlapping grpc and rest ports based on tfs instance count
            for i in range(self._tfs_instance_count):
                self._tfs_grpc_ports.append(str(low + 2 * i))
                self._tfs_rest_ports.append(str(low + 2 * i + 1))
            # concat selected ports respectively in order to pass them to python service
            self._tfs_grpc_concat_ports = self._concat_ports(self._tfs_grpc_ports)
            self._tfs_rest_concat_ports = self._concat_ports(self._tfs_rest_ports)
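            # Worked example (hypothetical values): with SAGEMAKER_SAFE_PORT_RANGE="9000-9010"
            # and SAGEMAKER_TFS_INSTANCE_COUNT=2, the gRPC ports are ["9000", "9002"], the
            # REST ports are ["9001", "9003"], concatenated as "9000,9002" and "9001,9003".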
        else:
            # just use the standard default ports
            self._tfs_grpc_ports = ["9000"]
            self._tfs_rest_ports = ["8501"]
            # provide single concat port here for default case
            self._tfs_grpc_concat_ports = "9000"
            self._tfs_rest_concat_ports = "8501"

        # set environment variable for python service
        os.environ["TFS_GRPC_PORTS"] = self._tfs_grpc_concat_ports
        os.environ["TFS_REST_PORTS"] = self._tfs_rest_concat_ports

    def _need_python_service(self):
        if os.path.exists(INFERENCE_PATH):
            self._enable_python_service = True
        if os.environ.get("SAGEMAKER_MULTI_MODEL_UNIVERSAL_BUCKET") and os.environ.get(
            "SAGEMAKER_MULTI_MODEL_UNIVERSAL_PREFIX"
        ):
            self._enable_python_service = True

    def _concat_ports(self, ports):
        str_ports = [str(port) for port in ports]
        concat_str_ports = ",".join(str_ports)
        return concat_str_ports

    def _create_tfs_config(self):
        models = tfs_utils.find_models()

        if not models:
            raise ValueError("no SavedModel bundles found!")

        if self._tfs_default_model_name == "None":
            default_model = os.path.basename(models[0])
            if default_model:
                self._tfs_default_model_name = default_model
                log.info("using default model name: {}".format(self._tfs_default_model_name))
            else:
                log.info("no default model detected")

        # config (may) include duplicate 'config' keys, so we can't just dump a dict
        config = "model_config_list: {\n"
        for m in models:
            config += "  config: {\n"
            config += "    name: '{}'\n".format(os.path.basename(m))
            config += "    base_path: '{}'\n".format(m)
            config += "    model_platform: 'tensorflow'\n"

            config += "    model_version_policy: {\n"
            config += "      specific: {\n"
            for version in tfs_utils.find_model_versions(m):
                config += "        versions: {}\n".format(version)
            config += "      }\n"
            config += "    }\n"

            config += "  }\n"
        config += "}\n"

        log.info("tensorflow serving model config: \n%s\n", config)

        with open(self._tfs_config_path, "w", encoding="utf8") as f:
            f.write(config)

    def _setup_gunicorn(self):
        python_path_content = []
        python_path_option = ""

        bucket = os.environ.get("SAGEMAKER_MULTI_MODEL_UNIVERSAL_BUCKET", None)
        prefix = os.environ.get("SAGEMAKER_MULTI_MODEL_UNIVERSAL_PREFIX", None)

        if not os.path.exists(CODE_DIR) and bucket and prefix:
            self._download_scripts(bucket, prefix)

        if self._enable_python_service:
            lib_path_exists = os.path.exists(PYTHON_LIB_PATH)
            requirements_exists = os.path.exists(REQUIREMENTS_PATH)
            python_path_content = ["/opt/ml/model/code"]
            python_path_option = "--pythonpath "

            if lib_path_exists:
                python_path_content.append(PYTHON_LIB_PATH)

            if requirements_exists:
                if lib_path_exists:
                    log.warning(
                        "loading modules in '{}', ignoring requirements.txt".format(PYTHON_LIB_PATH)
                    )
                else:
                    log.info("installing packages from requirements.txt...")
                    pip_install_cmd = "pip3 install -r {}".format(REQUIREMENTS_PATH)
                    try:
                        subprocess.check_call(pip_install_cmd.split())
                    except subprocess.CalledProcessError:
                        log.error("failed to install required packages, exiting.")
                        self._stop()
                        raise ChildProcessError("failed to install required packages.")

        gunicorn_command = (
            "gunicorn -b unix:/tmp/gunicorn.sock -k {} --chdir /sagemaker "
            "--workers {} --threads {} --log-level {} --timeout {} "
            "{}{} -e TFS_GRPC_PORTS={} -e TFS_REST_PORTS={} "
            "-e SAGEMAKER_MULTI_MODEL={} -e SAGEMAKER_SAFE_PORT_RANGE={} "
            "-e SAGEMAKER_TFS_WAIT_TIME_SECONDS={} "
            "python_service:app"
        ).format(
            self._gunicorn_worker_class,
            self._gunicorn_workers,
            self._gunicorn_threads,
            self._gunicorn_loglevel,
            self._gunicorn_timeout_seconds,
            python_path_option,
            ",".join(python_path_content),
            self._tfs_grpc_concat_ports,
            self._tfs_rest_concat_ports,
            self._tfs_enable_multi_model_endpoint,
            self._sagemaker_port_range,
            self._tfs_wait_time_seconds,
        )
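        # a representative resulting command (illustrative worker class, counts, and ports):
        #   gunicorn -b unix:/tmp/gunicorn.sock -k gevent --chdir /sagemaker --workers 4 --threads 8 \
        #     --log-level info --timeout 30 -e TFS_GRPC_PORTS=9000 -e TFS_REST_PORTS=8501 ... python_service:app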

        log.info("gunicorn command: {}".format(gunicorn_command))
        self._gunicorn_command = gunicorn_command

    def _download_scripts(self, bucket, prefix):
        log.info("checking boto session region ...")
        boto_session = boto3.session.Session()
        boto_region = boto_session.region_name
        if boto_region in ("us-iso-east-1", "us-gov-west-1"):
            raise ValueError("Universal scripts are not supported in us-iso-east-1 or us-gov-west-1")

        log.info("downloading universal scripts ...")
        client = boto3.client("s3")
        resource = boto3.resource("s3")
        # download files
        paginator = client.get_paginator("list_objects")
        for result in paginator.paginate(Bucket=bucket, Delimiter="/", Prefix=prefix):
            for file in result.get("Contents", []):
                destination = os.path.join(CODE_DIR, file.get("Key"))
                if not os.path.exists(os.path.dirname(destination)):
                    os.makedirs(os.path.dirname(destination))
                resource.meta.client.download_file(bucket, file.get("Key"), destination)

    def _create_nginx_tfs_upstream(self):
        indentation = "    "
        tfs_upstream = ""
        for port in self._tfs_rest_ports:
            tfs_upstream += "{}server localhost:{};\n".format(indentation, port)
        tfs_upstream = tfs_upstream[len(indentation) : -2]
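        # e.g. rest ports ["8501", "8503"] are rendered as the upstream entries
        # "server localhost:8501;" and "server localhost:8503" for the nginx template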

        return tfs_upstream

    def _create_nginx_config(self):
        template = self._read_nginx_template()
        pattern = re.compile(r"%(\w+)%")

        template_values = {
            "TFS_VERSION": self._tfs_version,
            "TFS_UPSTREAM": self._create_nginx_tfs_upstream(),
            "TFS_DEFAULT_MODEL_NAME": self._tfs_default_model_name,
            "NGINX_HTTP_PORT": self._nginx_http_port,
            "NGINX_LOG_LEVEL": self._nginx_loglevel,
            "FORWARD_PING_REQUESTS": GUNICORN_PING if self._use_gunicorn else JS_PING,
            "FORWARD_INVOCATION_REQUESTS": GUNICORN_INVOCATIONS
            if self._use_gunicorn
            else JS_INVOCATIONS,
            "PROXY_READ_TIMEOUT": str(self._nginx_proxy_read_timeout_seconds),
        }

        config = pattern.sub(lambda x: template_values[x.group(1)], template)
        log.info("nginx config: \n%s\n", config)

        with open("/sagemaker/nginx.conf", "w", encoding="utf8") as f:
            f.write(config)

    def _read_nginx_template(self):
        with open("/sagemaker/nginx.conf.template", "r", encoding="utf8") as f:
            template = f.read()
            if not template:
                raise ValueError("failed to read nginx.conf.template")

            return template

    def _enable_per_process_gpu_memory_fraction(self):
        nvidia_smi_exist = os.path.exists("/usr/bin/nvidia-smi")
        if self._tfs_instance_count > 1 and nvidia_smi_exist:
            return True

        return False

    def _get_number_of_gpu_on_host(self):
        nvidia_smi_exist = os.path.exists("/usr/bin/nvidia-smi")
        if nvidia_smi_exist:
            return len(subprocess.check_output(['nvidia-smi', '-L'])
                       .decode('utf-8').strip().split('\n'))

        return 0

    def _calculate_per_process_gpu_memory_fraction(self):
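        # e.g. a GPU memory margin of 0.2 shared across 2 TFS instances gives
        # round((1 - 0.2) / 2, 4) == 0.4 per process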
        return round((1 - self._tfs_gpu_margin) / float(self._tfs_instance_count), 4)

    def _start_tfs(self):
        self._log_version("tensorflow_model_server --version", "tensorflow version info:")

        for i in range(self._tfs_instance_count):
            p = self._start_single_tfs(i)
            self._tfs.append(p)

    def _start_gunicorn(self):
        self._log_version("gunicorn --version", "gunicorn version info:")
        env = os.environ.copy()
        env["TFS_DEFAULT_MODEL_NAME"] = self._tfs_default_model_name
        p = subprocess.Popen(self._gunicorn_command.split(), env=env)
        log.info("started gunicorn (pid: %d)", p.pid)
        self._gunicorn = p

    def _start_nginx(self):
        self._log_version("/usr/sbin/nginx -V", "nginx version info:")
        p = subprocess.Popen("/usr/sbin/nginx -c /sagemaker/nginx.conf".split())
        log.info("started nginx (pid: %d)", p.pid)
        self._nginx = p

    def _log_version(self, command, message):
        try:
            output = (
                subprocess.check_output(command.split(), stderr=subprocess.STDOUT)
                .decode("utf-8", "backslashreplace")
                .strip()
            )
            log.info("{}\n{}".format(message, output))
        except subprocess.CalledProcessError:
            log.warning("failed to run command: %s", command)

    def _stop(self, *args):  # pylint: disable=W0613
        self._state = "stopping"
        log.info("stopping services")
        try:
            os.kill(self._nginx.pid, signal.SIGQUIT)
        except OSError:
            pass
        try:
            if self._gunicorn:
                os.kill(self._gunicorn.pid, signal.SIGTERM)
        except OSError:
            pass
        try:
            for tfs in self._tfs:
                os.kill(tfs.pid, signal.SIGTERM)
        except OSError:
            pass

        self._state = "stopped"
        log.info("stopped")

    def _wait_for_gunicorn(self):
        while True:
            if os.path.exists("/tmp/gunicorn.sock"):
                log.info("gunicorn server is ready!")
                return

    def _wait_for_tfs(self):
        for i in range(self._tfs_instance_count):
            tfs_utils.wait_for_model(
                self._tfs_rest_ports[i], self._tfs_default_model_name, self._tfs_wait_time_seconds
            )

    @contextmanager
    def _timeout(self, seconds):
        def _raise_timeout_error(signum, frame):
            raise TimeoutError("time out after {} seconds".format(seconds))

        try:
            signal.signal(signal.SIGALRM, _raise_timeout_error)
            signal.alarm(seconds)
            yield
        finally:
            signal.alarm(0)

    def _is_tfs_process(self, pid):
        for p in self._tfs:
            if p.pid == pid:
                return True
        return False

    def _find_tfs_process(self, pid):
        for index, p in enumerate(self._tfs):
            if p.pid == pid:
                return index
        return None

    def _restart_single_tfs(self, pid):
        instance_id = self._find_tfs_process(pid)
        if instance_id is None:
            raise ValueError("Cannot find tfs with pid: {};".format(pid))
        p = self._start_single_tfs(instance_id)
        self._tfs[instance_id] = p

    def _start_single_tfs(self, instance_id):
        cmd = tfs_utils.tfs_command(
            self._tfs_grpc_ports[instance_id],
            self._tfs_rest_ports[instance_id],
            self._tfs_config_path,
            self._tfs_enable_batching,
            self._tfs_batching_config_path,
            tfs_intra_op_parallelism=self._tfs_intra_op_parallelism,
            tfs_inter_op_parallelism=self._tfs_inter_op_parallelism,
            tfs_enable_gpu_memory_fraction=self._enable_per_process_gpu_memory_fraction(),
            tfs_gpu_memory_fraction=self._calculate_per_process_gpu_memory_fraction(),
        )
        log.info("tensorflow serving command: {}".format(cmd))

        num_gpus = self._get_number_of_gpu_on_host()
        if num_gpus > 1:
            # utilizing multi-gpu
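            # round-robin assignment, e.g. 2 GPUs and 4 TFS instances map to CUDA devices 0, 1, 0, 1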
            worker_env = os.environ.copy()
            worker_env["CUDA_VISIBLE_DEVICES"] = str(instance_id % num_gpus)
            p = subprocess.Popen(cmd.split(), env=worker_env)
            log.info("started tensorflow serving (pid: {}) on GPU {}"
                     .format(p.pid, instance_id % num_gpus))
        else:
            # cpu and single gpu
            p = subprocess.Popen(cmd.split())
            log.info("started tensorflow serving (pid: {})".format(p.pid))

        return p

    def _monitor(self):
        while True:
            pid, status = os.wait()

            if self._state != "started":
                break

            if pid == self._nginx.pid:
                log.warning("unexpected nginx exit (status: {}). restarting.".format(status))
                self._start_nginx()

            elif self._is_tfs_process(pid):
                log.warning(
                    "unexpected tensorflow serving exit (status: {}). restarting.".format(status)
                )
                try:
                    self._restart_single_tfs(pid)
                except (ValueError, OSError) as error:
                    log.error("Failed to restart tensorflow serving. {}".format(error))

            elif self._gunicorn and pid == self._gunicorn.pid:
                log.warning("unexpected gunicorn exit (status: {}). restarting.".format(status))
                self._start_gunicorn()

    def start(self):
        log.info("starting services")
        self._state = "starting"
        signal.signal(signal.SIGTERM, self._stop)

        if self._tfs_enable_batching:
            log.info("batching is enabled")
            tfs_utils.create_batching_config(self._tfs_batching_config_path)

        if self._tfs_enable_multi_model_endpoint:
            log.info("multi-model endpoint is enabled, TFS model servers will be started later")
        else:
            self._create_tfs_config()
            self._start_tfs()
            self._wait_for_tfs()

        self._create_nginx_config()

        if self._use_gunicorn:
            self._setup_gunicorn()
            self._start_gunicorn()
            # make sure gunicorn is up
            with self._timeout(seconds=self._gunicorn_timeout_seconds):
                self._wait_for_gunicorn()

        self._start_nginx()
        self._state = "started"
        self._monitor()
        self._stop()


if __name__ == "__main__":
    ServiceManager().start()


================================================
FILE: docker/build_artifacts/sagemaker/tensorflowServing.js
================================================
var tfs_base_uri = '/tfs/v1/models/'
var custom_attributes_header = 'X-Amzn-SageMaker-Custom-Attributes'

function invocations(r) {
    var ct = r.headersIn['Content-Type']

    if ('application/json' == ct || 'application/jsonlines' == ct || 'application/jsons' == ct) {
        json_request(r)
    } else if ('text/csv' == ct) {
        csv_request(r)
    } else {
        return_error(r, 415, 'Unsupported Media Type: ' + (ct || 'Unknown'))
    }
}

function ping(r) {
    var uri = make_tfs_uri(r, false)

    function callback (reply) {
        if (reply.status == 200 && reply.responseBody.includes('"AVAILABLE"')) {
            r.return(200)
        } else {
            r.error('failed ping: ' + reply.responseBody)
            r.return(502)
        }
    }

    r.subrequest(uri, callback)
}

function ping_without_model(r) {
    // hack for TF 1.11 and MME
    // for TF 1.11, send an arbitrary fixed request to the default model.
    // if response is 400, the model is ok (but input was bad), so return 200
    // for MME, the default model name is None and does not exist
    // also return 200 in unlikely case our request was really valid

    var uri = make_tfs_uri(r, true)
    var options = {
        method: 'POST',
        body: '{"instances": "invalid"}'
    }

    function callback (reply) {
        if (reply.status == 200 || reply.status == 400 ||
        reply.responseBody.includes('Servable not found for request: Latest(None)')) {
            r.return(200)
        } else {
            r.error('failed ping: ' + reply.responseBody)
            r.return(502)
        }
    }

    r.subrequest(uri, options, callback)
}

function return_error(r, code, message) {
    if (message) {
        r.return(code, '{"error": "' + message + '"}')
    } else {
        r.return(code)
    }
}

function tfs_json_request(r, json) {
    var uri = make_tfs_uri(r, true)
    var options = {
        method: 'POST',
        body: json
    }

    var accept = r.headersIn.Accept
    function callback (reply) {
        var body = reply.responseBody
        if (reply.status == 400) {
            // "fix" broken json escaping in \'instances\' message
            body = body.replace("\\'instances\\'", "'instances'")
        }

        if (accept != undefined) {
            var content_types = accept.trim().replace(" ", "").split(",")
            if (content_types.includes('application/jsonlines') || content_types.includes('application/json')) {
                body = body.replace(/\n/g, '')
                r.headersOut['Content-Type'] = content_types[0]
            }
        }
        r.return(reply.status, body)
    }

    r.subrequest(uri, options, callback)

}

function make_tfs_uri(r, with_method) {
    var attributes = parse_custom_attributes(r)

    var uri = tfs_base_uri + attributes['tfs-model-name']
    if ('tfs-model-version' in attributes) {
        uri += '/versions/' + attributes['tfs-model-version']
    }

    if (with_method) {
        uri += ':' + (attributes['tfs-method'] || 'predict')
    }

    return uri
}

function parse_custom_attributes(r) {
    var attributes = {}
    var kv_pattern = /tfs-[a-z\-]+=[^,]+/g
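    // e.g. the header "tfs-model-name=half_plus_three,tfs-model-version=123"
    // parses to {'tfs-model-name': 'half_plus_three', 'tfs-model-version': '123'}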
    var header = r.headersIn[custom_attributes_header]
    if (header) {
        var matches = header.match(kv_pattern)
        if (matches) {
            for (var i = 0; i < matches.length; i++) {
                var kv = matches[i].split('=')
                if (kv.length === 2) {
                    attributes[kv[0]] = kv[1]
                }
            }
        }
    }

    // for MME invocations, tfs-model-name is in the uri, or use default_tfs_model
    if (!attributes['tfs-model-name']) {
        var uri_pattern = /\/models\/[^,]+\/invoke/g
        var model_name = r.uri.match(uri_pattern)
        if (model_name && model_name[0]) {
            model_name = r.uri.replace('/models/', '').replace('/invoke', '')
            attributes['tfs-model-name'] = model_name
        } else {
            attributes['tfs-model-name'] = r.variables.default_tfs_model
        }
    }

    return attributes
}

function json_request(r) {
    var data = r.requestBody

    if (is_tfs_json(data)) {
        tfs_json_request(r, data)
    } else if (is_json_lines(data)) {
        json_lines_request(r, data)
    } else {
        generic_json_request(r, data)
    }
}

function is_tfs_json(data) {
    return /"(instances|inputs|examples)"\s*:/.test(data)
}

function is_json_lines(data) {
    // objects separated only by (optional) whitespace means jsons/json-lines
    return /[}\]]\s*[\[{]/.test(data)
}

function generic_json_request(r, data) {
    if (! /^\s*\[\s*\[/.test(data)) {
        data = '[' + data + ']'
    }

    var json = '{"instances":' + data + '}'
    tfs_json_request(r, json)
}

function json_lines_request(r, data) {
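    // e.g. the two lines '{"x": 1}' and '{"x": 2}' become '{"instances":[{"x": 1},{"x": 2}]}',
    // while a single line is wrapped without the array brackets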
    var lines = data.trim().split(/\r?\n/)
    var builder = []
    builder.push('{"instances":')
    if (lines.length != 1) {
        builder.push('[')
    }

    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].trim()
        if (line) {
            var instance = (i == 0) ? '' : ','
            instance += line
            builder.push(instance)
        }
    }

    builder.push(lines.length == 1 ? '}' : ']}')
    tfs_json_request(r, builder.join(''))
}

function csv_request(r) {
    var data = r.requestBody
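    // e.g. the numeric rows "1.0,2.0,5.0\n4.0,5.0,6.0" become
    // {"instances":[[1.0,2.0,5.0],[4.0,5.0,6.0]]}; non-numeric fields are quoted (see needs_quotes)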
    // look for initial quote or numeric-only data in 1st field
    var needs_quotes = data.search(/^\s*("|[\d.Ee+\-]+.*)/) != 0
    var lines = data.trim().split(/\r?\n/)
    var builder = []
    builder.push('{"instances":[')

    for (var i = 0; i < lines.length; i++) {
        var line = lines[i].trim()
        if (line) {
            var line_builder = []
            // Only wrap line in brackets if there are multiple columns.
            // If there's only one column and it has a string with a comma,
            // the input will be wrapped in an extra set of brackets.
            var has_multiple_columns = line.search(',') != -1

            if (has_multiple_columns) {
                line_builder.push('[')
            }

            if (needs_quotes) {
                line_builder.push('"')
                line_builder.push(line.replace('"', '\\"').replace(',', '","'))
                line_builder.push('"')
            } else {
                line_builder.push(line)
            }

            if (has_multiple_columns) {
                line_builder.push(']')
            }

            var json_line = line_builder.join('')
            builder.push(json_line)

            if (i != lines.length - 1)
                builder.push(',')
        }
    }

    builder.push(']}')
    tfs_json_request(r, builder.join(''))
}

export default {invocations, ping, ping_without_model, return_error,
    tfs_json_request, make_tfs_uri, parse_custom_attributes,
    json_request, is_tfs_json, is_json_lines, generic_json_request,
    json_lines_request, csv_request};


================================================
FILE: docker/build_artifacts/sagemaker/tfs_utils.py
================================================
# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License"). You
# may not use this file except in compliance with the License. A copy of
# the License is located at
#
#     http://aws.amazon.com/apache2.0/
#
# or in the "license" file accompanying this file. This file is
# distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF
# ANY KIND, either express or implied. See the License for the specific
# language governing permissions and limitations under the License.

import logging
import multiprocessing
import os
import re
import requests
import time
import json

from multi_model_utils import timeout
from urllib3.util.retry import Retry
from urllib3.exceptions import NewConnectionError, MaxRetryError
from collections import namedtuple

logging.basicConfig(level=logging.INFO)
log = logging.getLogger(__name__)

DEFAULT_CONTENT_TYPE = "application/json"
DEFAULT_ACCEPT_HEADER = "application/json"
CUSTOM_ATTRIBUTES_HEADER = "X-Amzn-SageMaker-Custom-Attributes"

Context = namedtuple(
    "Context",
    "model_name, model_version, method, rest_uri, grpc_port, channel, "
    "custom_attributes, request_content_type, accept_header, content_length",
)


def parse_request(req, rest_port, grpc_port, default_model_name, model_name=None, channel=None):
    tfs_attributes = parse_tfs_custom_attributes(req)
    tfs_uri = make_tfs_uri(rest_port, tfs_attributes, default_model_name, model_name)

    if not model_name:
        model_name = tfs_attributes.get("tfs-model-name")

    context = Context(
        model_name,
        tfs_attributes.get("tfs-model-version"),
        tfs_attributes.get("tfs-method"),
        tfs_uri,
        grpc_port,
        channel,
        req.get_header(CUSTOM_ATTRIBUTES_HEADER),
        req.get_header("Content-Type") or DEFAULT_CONTENT_TYPE,
        req.get_header("Accept") or DEFAULT_ACCEPT_HEADER,
        req.content_length,
    )

    data = req.stream
    return data, context


def make_tfs_uri(port, attributes, default_model_name, model_name=None):
    log.info("sagemaker tfs attributes: \n{}".format(attributes))

    tfs_model_name = model_name or attributes.get("tfs-model-name", default_model_name)
    tfs_model_version = attributes.get("tfs-model-version")
    tfs_method = attributes.get("tfs-method", "predict")

    uri = "http://localhost:{}/v1/models/{}".format(port, tfs_model_name)
    if tfs_model_version:
        uri += "/versions/" + tfs_model_version
    uri += ":" + tfs_method
    return uri


def parse_tfs_custom_attributes(req):
    attributes = {}
    header = req.get_header(CUSTOM_ATTRIBUTES_HEADER)
    if header:
        matches = re.findall(r"(tfs-[a-z\-]+=[^,]+)", header)
        attributes = dict(attribute.split("=") for attribute in matches)
    return attributes


def create_tfs_config_individual_model(model_name, base_path):
    config = "model_config_list: {\n"
    config += "  config: {\n"
    config += "    name: '{}'\n".format(model_name)
    config += "    base_path: '{}'\n".format(base_path)
    config += "    model_platform: 'tensorflow'\n"

    config += "    model_version_policy: {\n"
    config += "      specific: {\n"
    for version in find_model_versions(base_path):
        config += "        versions: {}\n".format(version)
    config += "      }\n"
    config += "    }\n"

    config += "  }\n"
    config += "}\n"
    return config


def tfs_command(
    tfs_grpc_port,
    tfs_rest_port,
    tfs_config_path,
    tfs_enable_batching,
    tfs_batching_config_file,
    tfs_intra_op_parallelism=None,
    tfs_inter_op_parallelism=None,
    tfs_enable_gpu_memory_fraction=False,
    tfs_gpu_memory_fraction=None,
):
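    # a representative command (illustrative port numbers and config path):
    #   tensorflow_model_server --port=9000 --rest_api_port=8501 \
    #     --model_config_file=/sagemaker/model-config.cfg --max_num_load_retries=0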
    cmd = (
        "tensorflow_model_server "
        "--port={} "
        "--rest_api_port={} "
        "--model_config_file={} "
        "--max_num_load_retries=0 {} {} {} {}".format(
            tfs_grpc_port,
            tfs_rest_port,
            tfs_config_path,
            get_tfs_batching_args(tfs_enable_batching, tfs_batching_config_file),
            get_tensorflow_intra_op_parallelism_args(tfs_intra_op_parallelism),
            get_tensorflow_inter_op_parallelism_args(tfs_inter_op_parallelism),
            get_tfs_gpu_mem_args(tfs_enable_gpu_memory_fraction, tfs_gpu_memory_fraction),
        )
    )
    return cmd


def find_models():
    base_path = "/opt/ml/model"
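    # e.g. /opt/ml/model/half_plus_three/00000123/saved_model.pb registers the
    # model base path /opt/ml/model/half_plus_three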
    models = []
    for f in _find_saved_model_files(base_path):
        parts = f.split("/")
        if len(parts) >= 6 and re.match(r"^\d+$", parts[-2]):
            model_path = "/".join(parts[0:-2])
            if model_path not in models:
                models.append(model_path)
    return models


def find_model_versions(model_path):
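    # e.g. a version directory named "00000123" is reported to TFS as version 123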
    """Remove leading zeros from the version number, returns list of versions"""
    return [
        version[:-1].lstrip("0") + version[-1]
        for version in os.listdir(model_path)
        if version.isnumeric()
    ]


def _find_saved_model_files(path):
    for e in os.scandir(path):
        if e.is_dir():
            yield from _find_saved_model_files(os.path.join(path, e.name))
        else:
            if e.name == "saved_model.pb":
                yield os.path.join(path, e.name)


def get_tfs_batching_args(enable_batching, tfs_batching_config):
    if enable_batching:
        return "--enable_batching=true " "--batching_parameters_file={}".format(tfs_batching_config)
    else:
        return ""


def get_tensorflow_intra_op_parallelism_args(tfs_intra_op_parallelism):
    if tfs_intra_op_parallelism:
        return "--tensorflow_intra_op_parallelism={}".format(tfs_intra_op_parallelism)
    else:
        return ""


def get_tensorflow_inter_op_parallelism_args(tfs_inter_op_parallelism):
    if tfs_inter_op_parallelism:
        return "--tensorflow_inter_op_parallelism={}".format(tfs_inter_op_parallelism)
    else:
        return ""


def get_tfs_gpu_mem_args(enable_gpu_memory_fraction, gpu_memory_fraction):
    if enable_gpu_memory_fraction and gpu_memory_fraction:
        return "--per_process_gpu_memory_fraction={}".format(gpu_memory_fraction)
    else:
        return ""


def create_batching_config(batching_config_file):
    class _BatchingParameter:
        def __init__(self, key, env_var, value, defaulted_message):
            self.key = key
            self.env_var = env_var
            self.value = value
            self.defaulted_message = defaulted_message

    cpu_count = multiprocessing.cpu_count()
    batching_parameters = [
        _BatchingParameter(
            "max_batch_size",
            "SAGEMAKER_TFS_MAX_BATCH_SIZE",
            8,
            "max_batch_size defaulted to {}. Set {} to override default. "
            "Tuning this parameter may yield better performance.",
        ),
        _BatchingParameter(
            "batch_timeout_micros",
            "SAGEMAKER_TFS_BATCH_TIMEOUT_MICROS",
            1000,
            "batch_timeout_micros defaulted to {}. Set {} to override "
            "default. Tuning this parameter may yield better performance.",
        ),
        _BatchingParameter(
            "num_batch_threads",
            "SAGEMAKER_TFS_NUM_BATCH_THREADS",
            cpu_count,
            "num_batch_threads defaulted to {}, the number of CPUs. Set {} to override default.",
        ),
        _BatchingParameter(
            "max_enqueued_batches",
            "SAGEMAKER_TFS_MAX_ENQUEUED_BATCHES",
            # Batch limits the number of concurrent requests, which limits the number
            # of enqueued batches, so this can be set high for Batch
            100000000 if "SAGEMAKER_BATCH" in os.environ else cpu_count,
            "max_enqueued_batches defaulted to {}. Set {} to override default. "
            "Tuning this parameter may be necessary if out-of-memory errors occur.",
        ),
    ]

    warning_message = ""
    for batching_parameter in batching_parameters:
        if batching_parameter.env_var in os.environ:
            batching_parameter.value = os.environ[batching_parameter.env_var]
        else:
            warning_message += batching_parameter.defaulted_message.format(
                batching_parameter.value, batching_parameter.env_var
            )
            warning_message += "\n"
    if warning_message:
        log.warning(warning_message)

    config = ""
    for batching_parameter in batching_parameters:
        config += "%s { value: %s }\n" % (batching_parameter.key, batching_parameter.value)
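    # e.g. with the defaults above this emits lines such as "max_batch_size { value: 8 }"
    # and "batch_timeout_micros { value: 1000 }"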

    log.info("batching config: \n%s\n", config)
    with open(batching_config_file, "w", encoding="utf8") as f:
        f.write(config)


def wait_for_model(rest_port, model_name, timeout_seconds, wait_interval_seconds=5):
    tfs_url = "http://localhost:{}/v1/models/{}".format(rest_port, model_name)

    with timeout(timeout_seconds):
        while True:
            try:
                session = requests.Session()
                retries = Retry(total=9, backoff_factor=0.1)
                session.mount("http://", requests.adapters.HTTPAdapter(max_retries=retries))
                log.info("Trying to connect with model server: {}".format(tfs_url))
                response = session.get(tfs_url)
                log.info(response)
                if response.status_code == 200:
                    versions = json.loads(response.content)["model_version_status"]
                    if all(version["state"] == "AVAILABLE" for version in versions):
                        break
            except (
                ConnectionRefusedError,
                NewConnectionError,
                MaxRetryError,
                requests.exceptions.ConnectionError,
            ):
                log.warning("model: {} is not available yet ".format(tfs_url))
                time.sleep(wait_interval_seconds)

    log.info("model: {} is available now".format(tfs_url))


================================================
FILE: scripts/build-all.sh
================================================
#!/bin/bash
#
# Build all the docker images.

set -euo pipefail

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

${DIR}/build.sh --version 1.14.0 --arch eia
${DIR}/build.sh --version 1.15.0 --arch cpu
${DIR}/build.sh --version 1.15.0 --arch gpu
${DIR}/build.sh --version 2.1.0 --arch cpu
${DIR}/build.sh --version 2.1.0 --arch gpu


================================================
FILE: scripts/build.sh
================================================
#!/bin/bash
#
# Build the docker images.

set -euo pipefail

source scripts/shared.sh

parse_std_args "$@"

get_ei_executable

echo "pulling previous image for layer cache... "
aws ecr get-login-password --region ${aws_region} \
    | docker login \
        --password-stdin \
        --username AWS \
        "${aws_account}.dkr.ecr.${aws_region}.amazonaws.com/${repository}" &>/dev/null || echo 'warning: ecr login failed'
docker pull $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$full_version-$device &>/dev/null || echo 'warning: pull failed'
docker logout https://$aws_account.dkr.ecr.$aws_region.amazonaws.com &>/dev/null

echo "building image... "
cp -r docker/build_artifacts/* docker/$short_version/
docker build \
    --cache-from $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$full_version-$device \
    --build-arg TFS_VERSION=$full_version \
    --build-arg TFS_SHORT_VERSION=$short_version \
    -f docker/$short_version/Dockerfile.$arch \
    -t $repository:$full_version-$device \
    -t $repository:$short_version-$device \
    docker/$short_version/

remove_ei_executable


================================================
FILE: scripts/curl.sh
================================================
#!/bin/bash
#
# Some example curl requests to try on local docker containers.

curl -X POST --data-binary @test/resources/inputs/test.json -H 'Content-Type: application/json' -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' http://localhost:8080/invocations
curl -X POST --data-binary @test/resources/inputs/test-gcloud.jsons -H 'Content-Type: application/json' -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' http://localhost:8080/invocations
curl -X POST --data-binary @test/resources/inputs/test-generic.json -H 'Content-Type: application/json' -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' http://localhost:8080/invocations
curl -X POST --data-binary @test/resources/inputs/test.csv -H 'Content-Type: text/csv' -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=half_plus_three' http://localhost:8080/invocations
curl -X POST --data-binary @test/resources/inputs/test-cifar.json -H 'Content-Type: application/json' -H 'X-Amzn-SageMaker-Custom-Attributes: tfs-model-name=cifar' http://localhost:8080/invocations

================================================
FILE: scripts/publish-all.sh
================================================
#!/bin/bash
#
# Publish all images to your ECR account.

set -euo pipefail

DIR="$( cd "$( dirname "${BASH_SOURCE[0]}" )" >/dev/null 2>&1 && pwd )"

${DIR}/publish.sh --version 1.14.0 --arch eia
${DIR}/publish.sh --version 1.15.0 --arch cpu
${DIR}/publish.sh --version 1.15.0 --arch gpu
${DIR}/publish.sh --version 2.1.0 --arch cpu
${DIR}/publish.sh --version 2.1.0 --arch gpu


================================================
FILE: scripts/publish.sh
================================================
#!/bin/bash
#
# Publish images to your ECR account.

set -euo pipefail

source scripts/shared.sh

parse_std_args "$@"

aws ecr get-login-password --region ${aws_region} \
    | docker login \
        --password-stdin \
        --username AWS \
        "${aws_account}.dkr.ecr.${aws_region}.amazonaws.com/${repository}"
docker tag $repository:$full_version-$device $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$full_version-$device
docker tag $repository:$full_version-$device $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$short_version-$device
docker push $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$full_version-$device
docker push $aws_account.dkr.ecr.$aws_region.amazonaws.com/$repository:$short_version-$device
docker logout https://$aws_account.dkr.ecr.$aws_region.amazonaws.com


================================================
FILE: scripts/shared.sh
================================================
#!/bin/bash
#
# Utility functions for build/test scripts.

function error() {
    >&2 echo $1
    >&2 echo "usage: $0 [--version <major-version>] [--arch (cpu*|gpu|eia)] [--region <aws-region>]"
    exit 1
}

function get_default_region() {
    if [ -n "${AWS_DEFAULT_REGION:-}" ]; then
        echo "$AWS_DEFAULT_REGION"
    else
        aws configure get region
    fi
}

function get_full_version() {
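    # e.g. "1.15" -> "1.15.0"; an already-full version such as "1.15.0" passes through unchanged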
    echo $1 | sed 's#^\([0-9][0-9]*\.[0-9][0-9]*\)$#\1.0#'
}

function get_short_version() {
    echo $1 | sed 's#\([0-9][0-9]*\.[0-9][0-9]*\)\.[0-9][0-9]*#\1#'
}

function get_aws_account() {
    aws --region $AWS_DEFAULT_REGION sts --endpoint-url https://sts.$AWS_DEFAULT_REGION.amazonaws.com get-caller-identity --query 'Account' --output text
}

function get_ei_executable() {
    [[ $arch != 'eia' ]] && return

    if [[ -z $(aws s3 ls 's3://amazonei-tensorflow/tensorflow-serving/v'${short_version}'/ubuntu/latest/') ]]; then
        echo 'ERROR: cannot find this version in S3 bucket.'
        exit 1
    fi

    tmpdir=$(mktemp -d)
    tar_file=$(aws s3 ls "s3://amazonei-tensorflow/tensorflow-serving/v${short_version}/ubuntu/latest/" | awk '{print $4}')
    aws s3 cp "s3://amazonei-tensorflow/tensorflow-serving/v${short_version}/ubuntu/latest/${tar_file}" "$tmpdir/$tar_file"

    tar -C "$tmpdir" -xf "$tmpdir/$tar_file"

    find "$tmpdir" 
SYMBOL INDEX (272 symbols across 29 files)

FILE: docker/build_artifacts/deep_learning_container.py
  function _validate_instance_id (line 19) | def _validate_instance_id(instance_id):
  function _retrieve_instance_id (line 33) | def _retrieve_instance_id():
  function _retrieve_instance_region (line 47) | def _retrieve_instance_region():
  function query_bucket (line 83) | def query_bucket():
  function requests_helper (line 103) | def requests_helper(url, timeout):
  function main (line 113) | def main():

FILE: docker/build_artifacts/sagemaker/multi_model_utils.py
  function lock (line 23) | def lock(path=DEFAULT_LOCK_FILE):
  function timeout (line 36) | def timeout(seconds=60):
  class MultiModelException (line 48) | class MultiModelException(Exception):
    method __init__ (line 49) | def __init__(self, code, msg):

FILE: docker/build_artifacts/sagemaker/python_service.py
  function default_handler (line 45) | def default_handler(data, context):
  class PythonServiceResource (line 60) | class PythonServiceResource:
    method __init__ (line 61) | def __init__(self):
    method on_post (line 93) | def on_post(self, req, res, model_name=None):
    method _parse_concat_ports (line 100) | def _parse_concat_ports(self, concat_ports):
    method _pick_port (line 103) | def _pick_port(self, ports):
    method _parse_sagemaker_port_range_mme (line 106) | def _parse_sagemaker_port_range_mme(self, port_range):
    method _ports_available (line 118) | def _ports_available(self):
    method _handle_load_model_post (line 124) | def _handle_load_model_post(self, res, data):  # noqa: C901
    method _cleanup_config_file (line 224) | def _cleanup_config_file(self, config_file):
    method _handle_invocation_post (line 228) | def _handle_invocation_post(self, req, res, model_name=None):
    method _setup_channel (line 274) | def _setup_channel(self, grpc_port):
    method _import_handlers (line 279) | def _import_handlers(self):
    method _make_handler (line 296) | def _make_handler(self, custom_handler, custom_input_handler, custom_o...
    method on_get (line 307) | def on_get(self, req, res, model_name=None):  # pylint: disable=W0613
    method on_delete (line 339) | def on_delete(self, req, res, model_name):  # pylint: disable=W0613
    method validate_model_dir (line 364) | def validate_model_dir(self, model_path):
    method validate_model_versions (line 375) | def validate_model_versions(self, versions):
  class PingResource (line 387) | class PingResource:
    method on_get (line 388) | def on_get(self, req, res):  # pylint: disable=W0613
  class ServiceResources (line 392) | class ServiceResources:
    method __init__ (line 393) | def __init__(self):
    method add_routes (line 398) | def add_routes(self, application):

FILE: docker/build_artifacts/sagemaker/serve.py
  class ServiceManager (line 39) | class ServiceManager(object):
    method __init__ (line 40) | def __init__(self):
    method _need_python_service (line 136) | def _need_python_service(self):
    method _concat_ports (line 144) | def _concat_ports(self, ports):
    method _create_tfs_config (line 149) | def _create_tfs_config(self):
    method _setup_gunicorn (line 186) | def _setup_gunicorn(self):
    method _download_scripts (line 245) | def _download_scripts(self, bucket, prefix):
    method _create_nginx_tfs_upstream (line 264) | def _create_nginx_tfs_upstream(self):
    method _create_nginx_config (line 273) | def _create_nginx_config(self):
    method _read_nginx_template (line 296) | def _read_nginx_template(self):
    method _enable_per_process_gpu_memory_fraction (line 304) | def _enable_per_process_gpu_memory_fraction(self):
    method _get_number_of_gpu_on_host (line 311) | def _get_number_of_gpu_on_host(self):
    method _calculate_per_process_gpu_memory_fraction (line 319) | def _calculate_per_process_gpu_memory_fraction(self):
    method _start_tfs (line 322) | def _start_tfs(self):
    method _start_gunicorn (line 329) | def _start_gunicorn(self):
    method _start_nginx (line 337) | def _start_nginx(self):
    method _log_version (line 343) | def _log_version(self, command, message):
    method _stop (line 354) | def _stop(self, *args):  # pylint: disable=W0613
    method _wait_for_gunicorn (line 375) | def _wait_for_gunicorn(self):
    method _wait_for_tfs (line 381) | def _wait_for_tfs(self):
    method _timeout (line 388) | def _timeout(self, seconds):
    method _is_tfs_process (line 399) | def _is_tfs_process(self, pid):
    method _find_tfs_process (line 405) | def _find_tfs_process(self, pid):
    method _restart_single_tfs (line 411) | def _restart_single_tfs(self, pid):
    method _start_single_tfs (line 418) | def _start_single_tfs(self, instance_id):
    method _monitor (line 447) | def _monitor(self):
    method start (line 471) | def start(self):

FILE: docker/build_artifacts/sagemaker/tensorflowServing.js
  function invocations (line 4) | function invocations(r) {
  function ping (line 16) | function ping(r) {
  function ping_without_model (line 31) | function ping_without_model(r) {
  function return_error (line 57) | function return_error(r, code, message) {
  function tfs_json_request (line 65) | function tfs_json_request(r, json) {
  function make_tfs_uri (line 94) | function make_tfs_uri(r, with_method) {
  function parse_custom_attributes (line 109) | function parse_custom_attributes(r) {
  function json_request (line 140) | function json_request(r) {
  function is_tfs_json (line 152) | function is_tfs_json(data) {
  function is_json_lines (line 156) | function is_json_lines(data) {
  function generic_json_request (line 161) | function generic_json_request(r, data) {
  function json_lines_request (line 170) | function json_lines_request(r, data) {
  function csv_request (line 191) | function csv_request(r) {

FILE: docker/build_artifacts/sagemaker/tfs_utils.py
  function parse_request (line 41) | def parse_request(req, rest_port, grpc_port, default_model_name, model_n...
  function make_tfs_uri (line 65) | def make_tfs_uri(port, attributes, default_model_name, model_name=None):
  function parse_tfs_custom_attributes (line 79) | def parse_tfs_custom_attributes(req):
  function create_tfs_config_individual_model (line 88) | def create_tfs_config_individual_model(model_name, base_path):
  function tfs_command (line 107) | def tfs_command(
  function find_models (line 136) | def find_models():
  function find_model_versions (line 148) | def find_model_versions(model_path):
  function _find_saved_model_files (line 157) | def _find_saved_model_files(path):
  function get_tfs_batching_args (line 166) | def get_tfs_batching_args(enable_batching, tfs_batching_config):
  function get_tensorflow_intra_op_parallelism_args (line 173) | def get_tensorflow_intra_op_parallelism_args(tfs_intra_op_parallelism):
  function get_tensorflow_inter_op_parallelism_args (line 180) | def get_tensorflow_inter_op_parallelism_args(tfs_inter_op_parallelism):
  function get_tfs_gpu_mem_args (line 187) | def get_tfs_gpu_mem_args(enable_gpu_memory_fraction, gpu_memory_fraction):
  function create_batching_config (line 194) | def create_batching_config(batching_config_file):
  function wait_for_model (line 257) | def wait_for_model(rest_port, model_name, timeout_seconds, wait_interval...

FILE: test/integration/local/conftest.py
  function pytest_addoption (line 20) | def pytest_addoption(parser):
  function docker_base_name (line 28) | def docker_base_name(request):
  function framework_version (line 33) | def framework_version(request):
  function processor (line 38) | def processor(request):
  function runtime_config (line 43) | def runtime_config(request, processor):
  function tag (line 51) | def tag(request, framework_version, processor):
  function skip_by_device_type (line 59) | def skip_by_device_type(request, processor):

FILE: test/integration/local/multi_model_endpoint_test_utils.py
  function make_headers (line 21) | def make_headers(content_type="application/json", method="predict", vers...
  function make_invocation_request (line 32) | def make_invocation_request(data, model_name, content_type="application/...
  function make_list_model_request (line 38) | def make_list_model_request():
  function make_get_model_request (line 43) | def make_get_model_request(model_name):
  function make_load_model_request (line 48) | def make_load_model_request(data, content_type="application/json"):
  function make_unload_model_request (line 56) | def make_unload_model_request(model_name):

FILE: test/integration/local/test_container.py
  function volume (line 27) | def volume():
  function container (line 39) | def container(request, docker_base_name, tag, runtime_config):
  function make_request (line 76) | def make_request(data, content_type="application/json", method="predict"...
  function test_predict (line 89) | def test_predict():
  function test_predict_twice (line 98) | def test_predict_twice():
  function test_predict_specific_versions (line 109) | def test_predict_specific_versions():
  function test_predict_two_instances (line 121) | def test_predict_two_instances():
  function test_predict_instance_jsonlines_input_error (line 130) | def test_predict_instance_jsonlines_input_error():
  function test_predict_jsons_json_content_type (line 145) | def test_predict_jsons_json_content_type():
  function test_predict_jsonlines (line 151) | def test_predict_jsonlines():
  function test_predict_jsons (line 157) | def test_predict_jsons():
  function test_predict_jsons_2 (line 163) | def test_predict_jsons_2():
  function test_predict_generic_json (line 169) | def test_predict_generic_json():
  function test_predict_generic_json_two_instances (line 175) | def test_predict_generic_json_two_instances():
  function test_predict_csv (line 181) | def test_predict_csv():
  function test_predict_csv_with_zero (line 187) | def test_predict_csv_with_zero():
  function test_predict_csv_one_instance_three_values_with_zero (line 193) | def test_predict_csv_one_instance_three_values_with_zero():
  function test_predict_csv_one_instance_three_values (line 199) | def test_predict_csv_one_instance_three_values():
  function test_predict_csv_two_instances_three_values (line 205) | def test_predict_csv_two_instances_three_values():
  function test_predict_csv_three_instances (line 211) | def test_predict_csv_three_instances():
  function test_predict_csv_wide_categorical_input (line 217) | def test_predict_csv_wide_categorical_input():
  function test_regress (line 230) | def test_regress():
  function test_regress_one_instance (line 240) | def test_regress_one_instance():
  function test_predict_bad_input (line 253) | def test_predict_bad_input():
  function test_predict_bad_input_instances (line 258) | def test_predict_bad_input_instances():
  function test_predict_no_custom_attributes_header (line 264) | def test_predict_no_custom_attributes_header():
  function test_predict_with_jsonlines (line 278) | def test_predict_with_jsonlines():
  function test_predict_with_multiple_accept_types (line 292) | def test_predict_with_multiple_accept_types():

FILE: test/integration/local/test_multi_model_endpoint.py
  function volume (line 34) | def volume():
  function container (line 46) | def container(request, docker_base_name, tag, runtime_config):
  function test_ping (line 77) | def test_ping():
  function test_container_start_invocation_fail (line 83) | def test_container_start_invocation_fail():
  function test_list_models_empty (line 94) | def test_list_models_empty():
  function test_delete_unloaded_model (line 102) | def test_delete_unloaded_model():
  function test_delete_model (line 111) | def test_delete_model():
  function test_load_two_models (line 138) | def test_load_two_models():
  function test_load_one_model_two_times (line 180) | def test_load_one_model_two_times():
  function test_load_non_existing_model (line 196) | def test_load_non_existing_model():
  function test_bad_model_reqeust (line 209) | def test_bad_model_reqeust():
  function test_invalid_model_version (line 219) | def test_invalid_model_version():

FILE: test/integration/local/test_multi_tfs.py
  function volume (line 27) | def volume():
  function container (line 39) | def container(request, docker_base_name, tag, runtime_config):
  function make_request (line 78) | def make_request(data, content_type="application/json", method="predict"...
  function test_predict (line 91) | def test_predict():

FILE: test/integration/local/test_nginx_config.py
  function volume (line 21) | def volume():
  function test_run_nginx_with_default_parameters (line 33) | def test_run_nginx_with_default_parameters(docker_base_name, tag, runtim...
  function test_run_nginx_with_env_var_parameters (line 62) | def test_run_nginx_with_env_var_parameters(docker_base_name, tag, runtim...
  function test_run_nginx_with_higher_gunicorn_parameter (line 92) | def test_run_nginx_with_higher_gunicorn_parameter(docker_base_name, tag,...

FILE: test/integration/local/test_pre_post_processing.py
  function volume (line 30) | def volume(tmpdir_factory, request):
  function container (line 51) | def container(volume, docker_base_name, tag, runtime_config):
  function make_headers (line 80) | def make_headers(content_type, method, version=None):
  function test_predict_json (line 91) | def test_predict_json():
  function test_zero_content (line 98) | def test_zero_content():
  function test_large_input (line 106) | def test_large_input():
  function test_csv_input (line 117) | def test_csv_input():
  function test_predict_specific_versions (line 124) | def test_predict_specific_versions():
  function test_unsupported_content_type (line 132) | def test_unsupported_content_type():
  function test_ping_service (line 140) | def test_ping_service():

FILE: test/integration/local/test_pre_post_processing_mme.py
  function volume (line 34) | def volume():
  function container (line 46) | def container(docker_base_name, tag, runtime_config):
  function model (line 77) | def model():
  function test_ping_service (line 87) | def test_ping_service():
  function test_predict_json (line 93) | def test_predict_json(model):
  function test_zero_content (line 101) | def test_zero_content():
  function test_large_input (line 110) | def test_large_input():
  function test_csv_input (line 122) | def test_csv_input():
  function test_specific_versions (line 130) | def test_specific_versions():
  function test_unsupported_content_type (line 141) | def test_unsupported_content_type():

FILE: test/integration/local/test_tfs_batching.py
  function volume (line 21) | def volume():
  function test_run_tfs_with_batching_parameters (line 32) | def test_run_tfs_with_batching_parameters(docker_base_name, tag, runtime...

FILE: test/integration/sagemaker/conftest.py
  function pytest_addoption (line 48) | def pytest_addoption(parser):
  function pytest_configure (line 58) | def pytest_configure(config):
  function region (line 74) | def region(request):
  function registry (line 79) | def registry(request, region):
  function boto_session (line 92) | def boto_session(region):
  function sagemaker_client (line 97) | def sagemaker_client(boto_session):
  function sagemaker_runtime_client (line 102) | def sagemaker_runtime_client(boto_session):
  function unique_name_from_base (line 106) | def unique_name_from_base(base, max_length=63):
  function model_name (line 115) | def model_name():
  function skip_gpu_instance_restricted_regions (line 120) | def skip_gpu_instance_restricted_regions(region, instance_type):
  function skip_by_device_type (line 127) | def skip_by_device_type(request, instance_type):
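
This conftest wires up the boto3 clients the SageMaker integration tests depend on. A minimal sketch of what fixtures like boto_session, sagemaker_client, and sagemaker_runtime_client typically look like follows; the fixture names mirror the listing above, but the bodies are illustrative rather than the actual code.

    import boto3
    import pytest

    @pytest.fixture(scope="session")
    def region(request):
        # assumption: the region is supplied via a custom pytest command-line option
        return request.config.getoption("--region")

    @pytest.fixture(scope="session")
    def boto_session(region):
        return boto3.Session(region_name=region)

    @pytest.fixture(scope="session")
    def sagemaker_client(boto_session):
        return boto_session.client("sagemaker")

    @pytest.fixture(scope="session")
    def sagemaker_runtime_client(boto_session):
        return boto_session.client("sagemaker-runtime")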

FILE: test/integration/sagemaker/test_ei.py
  function version (line 25) | def version(request):
  function repo (line 30) | def repo(request):
  function tag (line 35) | def tag(request, version):
  function image_uri (line 40) | def image_uri(registry, region, repo, tag):
  function instance_type (line 45) | def instance_type(request, region):
  function accelerator_type (line 50) | def accelerator_type(request):
  function model_data (line 55) | def model_data(region):
  function input_data (line 61) | def input_data():
  function skip_if_no_accelerator (line 66) | def skip_if_no_accelerator(accelerator_type):
  function skip_if_non_supported_ei_region (line 72) | def skip_if_non_supported_ei_region(region):
  function test_invoke_endpoint (line 79) | def test_invoke_endpoint(boto_session, sagemaker_client, sagemaker_runti...

FILE: test/integration/sagemaker/test_tfs.py
  function version (line 24) | def version(request):
  function repo (line 29) | def repo(request):
  function tag (line 34) | def tag(request, version, instance_type):
  function image_uri (line 43) | def image_uri(registry, region, repo, tag):
  function instance_type (line 48) | def instance_type(request, region):
  function accelerator_type (line 53) | def accelerator_type():
  function tfs_model (line 58) | def tfs_model(region, boto_session):
  function python_model_with_requirements (line 65) | def python_model_with_requirements(region, boto_session):
  function python_model_with_lib (line 72) | def python_model_with_lib(region, boto_session):
  function test_tfs_model (line 78) | def test_tfs_model(boto_session, sagemaker_client,
  function test_batch_transform (line 87) | def test_batch_transform(region, boto_session, sagemaker_client,
  function test_python_model_with_requirements (line 102) | def test_python_model_with_requirements(boto_session, sagemaker_client,
  function test_python_model_with_lib (line 122) | def test_python_model_with_lib(boto_session, sagemaker_client,

FILE: test/integration/sagemaker/util.py
  function image_uri (line 26) | def image_uri(registry, region, repo, tag):
  function _execution_role (line 30) | def _execution_role(boto_session):
  function sagemaker_model (line 35) | def sagemaker_model(boto_session, sagemaker_client, image_uri, model_nam...
  function _production_variants (line 51) | def _production_variants(model_name, instance_type, accelerator_type):
  function _test_bucket (line 65) | def _test_bucket(region, boto_session):
  function find_or_put_model_data (line 75) | def find_or_put_model_data(region, boto_session, local_path):
  function sagemaker_endpoint (line 109) | def sagemaker_endpoint(sagemaker_client, model_name, instance_type, acce...
  function _create_transform_job_request (line 138) | def _create_transform_job_request(model_name, batch_output, batch_input,...
  function _read_batch_output (line 164) | def _read_batch_output(region, boto_session, bucket, model_name):
  function _wait_for_transform_job (line 171) | def _wait_for_transform_job(region, boto_session, sagemaker_client, mode...
  function run_batch_transform_job (line 194) | def run_batch_transform_job(region, boto_session, model_data, image_uri,
  function invoke_endpoint (line 217) | def invoke_endpoint(sagemaker_runtime_client, endpoint_name, input_data):
  function create_and_invoke_endpoint (line 226) | def create_and_invoke_endpoint(boto_session, sagemaker_client, sagemaker...
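
For the endpoint helpers listed above, the core call is the SageMaker runtime's InvokeEndpoint API. Illustrative only, this is how an invoke_endpoint-style helper might make that call and decode the result (the endpoint name in the usage comment is hypothetical):

    import json

    def invoke_endpoint(sagemaker_runtime_client, endpoint_name, input_data):
        response = sagemaker_runtime_client.invoke_endpoint(
            EndpointName=endpoint_name,
            ContentType="application/json",
            Body=json.dumps(input_data),
        )
        return json.loads(response["Body"].read().decode("utf-8"))

    # Example usage:
    # predictions = invoke_endpoint(client, "sagemaker-tfs-test-endpoint",
    #                               {"instances": [[1.0, 2.0, 5.0]]})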

FILE: test/perf/data_generator.py
  function generate_json (line 13) | def generate_json(shape, payload_size):
  function _generate_json_recursively (line 26) | def _generate_json_recursively(shape):
  function generate_jsonlines (line 35) | def generate_jsonlines(shape, payload_size):
  function _get_num_records_for_json_payload (line 45) | def _get_num_records_for_json_payload(payload_size, one_record_size):
  function generate_csv (line 49) | def generate_csv(shape, payload_size):
  function _random_input (line 66) | def _random_input(n):
  function _map_payload_size_given_unit (line 71) | def _map_payload_size_given_unit(payload_size, unit_of_payload):
  function generate_data (line 75) | def generate_data(content_type, shape, payload_size, unit_of_payload='B'):
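
generate_csv and _random_input above suggest a size-driven generator: emit random records until the payload reaches the requested size. A hedged sketch of that idea (the exact meaning of shape and the size unit are assumptions):

    import random

    def generate_csv(shape, payload_size):
        # shape: number of columns per record; payload_size: target size in bytes (assumed)
        rows, size = [], 0
        while size < payload_size:
            row = ",".join(f"{random.random():.4f}" for _ in range(shape))
            rows.append(row)
            size += len(row) + 1  # +1 for the trailing newline
        return "\n".join(rows)

    print(len(generate_csv(3, 1024)))  # roughly 1 KB of CSV data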

FILE: test/perf/perftest_endpoint.py
  class PerfTester (line 22) | class PerfTester(object):
    method __init__ (line 23) | def __init__(self):
    method test_worker (line 30) | def test_worker(self, id, args, count, test_data, error_counts):
    method test (line 46) | def test(self, args, count, test_data):
    method report (line 69) | def report(self, args):
    method parse_args (line 84) | def parse_args(self, args):
    method run (line 94) | def run(self, args):
  function _read_file (line 102) | def _read_file(path):
  function _random_payload (line 107) | def _random_payload(size_in_kb):
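
PerfTester.test_worker and PerfTester.test point at a worker-pool load test against a deployed endpoint. A simplified, illustrative version of that shape follows; the worker count, payload, and endpoint name are assumptions, not the script's actual defaults.

    import time
    from concurrent.futures import ThreadPoolExecutor

    import boto3

    def test_worker(endpoint_name, payload, count):
        client = boto3.client("sagemaker-runtime")
        latencies = []
        for _ in range(count):
            start = time.time()
            client.invoke_endpoint(EndpointName=endpoint_name,
                                   ContentType="application/json",
                                   Body=payload)
            latencies.append(time.time() - start)
        return latencies

    with ThreadPoolExecutor(max_workers=4) as pool:
        futures = [pool.submit(test_worker, "sagemaker-tfs-perf",
                               '{"instances": [[1.0, 2.0, 5.0]]}', 100)
                   for _ in range(4)]
        results = [t for f in futures for t in f.result()]
    print(f"p50 latency: {sorted(results)[len(results) // 2] * 1000:.1f} ms")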

FILE: test/resources/examples/test1/inference.py
  function input_handler (line 22) | def input_handler(data, context):
  function output_handler (line 47) | def output_handler(data, context):
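
test1 implements the container's pre/post-processing interface as a pair of handlers: input_handler transforms the incoming request into the body TensorFlow Serving expects, and output_handler shapes the TFS response for the caller. A condensed, hedged sketch (the context attribute names are assumptions based on the README; the real example handles more content types and error cases):

    def input_handler(data, context):
        # Convert the incoming request into the JSON body TensorFlow Serving expects.
        if context.request_content_type == "application/json":
            return data.read().decode("utf-8")
        raise ValueError(f"Unsupported content type: {context.request_content_type}")

    def output_handler(data, context):
        # Pass the TFS response through and report its content type back to the caller.
        if data.status_code != 200:
            raise ValueError(data.content.decode("utf-8"))
        return data.content, context.accept_header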

FILE: test/resources/examples/test2/inference.py
  function handler (line 24) | def handler(data, context):
  function _process_input (line 39) | def _process_input(data, context):
  function _process_output (line 55) | def _process_output(data, context):
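
test2 uses the alternative single-handler form: one handler(data, context) that chains input processing, the call to TensorFlow Serving's REST API, and output processing. A short, hedged sketch (context.rest_uri is an assumed attribute pointing at the local TFS REST endpoint):

    import requests

    def _process_input(data, context):
        # Same idea as the input_handler sketched under test1.
        return data.read().decode("utf-8")

    def _process_output(response, context):
        return response.content, context.accept_header

    def handler(data, context):
        processed_input = _process_input(data, context)
        response = requests.post(context.rest_uri, data=processed_input)
        return _process_output(response, context)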

FILE: test/resources/examples/test3/inference.py
  function handler (line 29) | def handler(data, context):
  function _process_input (line 47) | def _process_input(data, context):
  function _process_output (line 63) | def _process_output(data, context):

FILE: test/resources/examples/test4/inference.py
  function handler (line 27) | def handler(data, context):
  function _process_input (line 53) | def _process_input(data, context):
  function _process_output (line 69) | def _process_output(data, context):

FILE: test/resources/examples/test5/inference.py
  function handler (line 27) | def handler(data, context):
  function _process_input (line 53) | def _process_input(data, context):
  function _process_output (line 69) | def _process_output(data, context):

FILE: test/resources/mme_universal_script/code/inference.py
  function input_handler (line 24) | def input_handler(data, context):
  function output_handler (line 49) | def output_handler(data, context):

FILE: test/unit/test_deep_learning_container.py
  function fixture_valid_instance_id (line 23) | def fixture_valid_instance_id(requests_mock):
  function fixture_invalid_instance_id (line 29) | def fixture_invalid_instance_id(requests_mock):
  function fixture_none_instance_id (line 34) | def fixture_none_instance_id(requests_mock):
  function fixture_invalid_region (line 39) | def fixture_invalid_region(requests_mock):
  function fixture_valid_region (line 45) | def fixture_valid_region(requests_mock):
  function test_retrieve_instance_id (line 50) | def test_retrieve_instance_id(fixture_valid_instance_id):
  function test_retrieve_none_instance_id (line 55) | def test_retrieve_none_instance_id(fixture_none_instance_id):
  function test_retrieve_invalid_instance_id (line 60) | def test_retrieve_invalid_instance_id(fixture_invalid_instance_id):
  function test_retrieve_invalid_region (line 65) | def test_retrieve_invalid_region(fixture_invalid_region):
  function test_retrieve_valid_region (line 70) | def test_retrieve_valid_region(fixture_valid_region):
  function test_query_bucket (line 75) | def test_query_bucket(requests_mock, fixture_valid_region, fixture_valid...
  function test_query_bucket_region_none (line 85) | def test_query_bucket_region_none(fixture_invalid_region, fixture_valid_...
  function test_query_bucket_instance_id_none (line 92) | def test_query_bucket_instance_id_none(requests_mock, fixture_valid_regi...
  function test_query_bucket_instance_id_invalid (line 99) | def test_query_bucket_instance_id_invalid(requests_mock, fixture_valid_r...
  function test_HTTP_error_on_S3 (line 106) | def test_HTTP_error_on_S3(requests_mock, fixture_valid_region, fixture_v...
  function test_connection_error_on_S3 (line 122) | def test_connection_error_on_S3(requests_mock, fixture_valid_region, fix...
  function test_timeout_error_on_S3 (line 139) | def test_timeout_error_on_S3(requests_mock, fixture_valid_region, fixtur...
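
These unit tests rely on the requests_mock pytest fixture to stub the HTTP endpoints deep_learning_container.py queries. Illustrative only, this is how such a fixture can fake the EC2 instance metadata lookup (the exact URL and return value are assumptions):

    import pytest
    import requests

    INSTANCE_ID_URL = "http://169.254.169.254/latest/meta-data/instance-id"

    @pytest.fixture
    def fixture_valid_instance_id(requests_mock):
        # requests_mock intercepts requests made with the requests library.
        requests_mock.get(INSTANCE_ID_URL, text="i-0123456789abcdef0")

    def test_retrieve_instance_id(fixture_valid_instance_id):
        assert requests.get(INSTANCE_ID_URL).text.startswith("i-")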

FILE: test/unit/test_proxy_client.py
  function create_sagemaker_folder (line 9) | def create_sagemaker_folder(tmpdir):
  function test_grpc_add_model_no_config_file (line 16) | def test_grpc_add_model_no_config_file():
  function test_grpc_add_model_call (line 26) | def test_grpc_add_model_call(channel, ReloadConfigRequest):
  function test_grpc_delete_model_call (line 69) | def test_grpc_delete_model_call(channel, ReloadConfigRequest):
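
The proxy client tests mock the gRPC ReloadConfig call that TensorFlow Serving uses to add or remove models at runtime. A hedged sketch of that call follows; the channel address and model paths are assumptions, and the protobuf modules come from the tensorflow-serving-api package (the test file itself imports model_server_config_pb2).

    import grpc
    from tensorflow_serving.apis import model_management_pb2, model_service_pb2_grpc
    from tensorflow_serving.config import model_server_config_pb2

    def add_model(channel, name, base_path):
        config_list = model_server_config_pb2.ModelConfigList()
        entry = config_list.config.add()
        entry.name = name
        entry.base_path = base_path
        entry.model_platform = "tensorflow"

        request = model_management_pb2.ReloadConfigRequest()
        request.config.model_config_list.CopyFrom(config_list)

        stub = model_service_pb2_grpc.ModelServiceStub(channel)
        return stub.HandleReloadConfigRequest(request, timeout=30)

    channel = grpc.insecure_channel("localhost:9000")  # assumption: local TFS gRPC port
    # add_model(channel, "half_plus_three", "/opt/ml/models/half_plus_three")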

Condensed preview — 130 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (3,352K chars).
[
  {
    "path": ".github/ISSUE_TEMPLATE/bug_report.md",
    "chars": 721,
    "preview": "---\nname: Bug report\nabout: File a report to help us reproduce and fix the problem\ntitle: ''\nlabels: ''\nassignees: ''\n\n-"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/config.yml",
    "chars": 195,
    "preview": "blank_issues_enabled: false\ncontact_links:\n  - name: Ask a question\n    url: https://stackoverflow.com/questions/tagged/"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/documentation-request.md",
    "chars": 522,
    "preview": "---\nname: Documentation request\nabout: Request improved documentation\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**What di"
  },
  {
    "path": ".github/ISSUE_TEMPLATE/feature_request.md",
    "chars": 606,
    "preview": "---\nname: Feature request\nabout: Suggest new functionality for this toolkit\ntitle: ''\nlabels: ''\nassignees: ''\n\n---\n\n**D"
  },
  {
    "path": ".github/PULL_REQUEST_TEMPLATE.md",
    "chars": 169,
    "preview": "*Issue #, if available:*\n\n*Description of changes:*\n\n\nBy submitting this pull request, I confirm that my contribution is"
  },
  {
    "path": ".gitignore",
    "chars": 78,
    "preview": "__pycache__\n.tox/\nlog.txt\n.idea/\nnode_modules/\npackage.json\npackage-lock.json\n"
  },
  {
    "path": ".jshintrc",
    "chars": 36,
    "preview": "{\n  \"asi\": true,\n  \"esversion\": 6\n}\n"
  },
  {
    "path": ".pylintrc",
    "chars": 2533,
    "preview": "[MASTER]\n\nignore=\n    tensorflow_serving,\n    tensorflow-2.1,\n    tensorflow-2.2\n\n[MESSAGES CONTROL]\n\ndisable=\n    C, # "
  },
  {
    "path": "CHANGELOG.md",
    "chars": 8158,
    "preview": "# Changelog\n\n## v1.8.4 (2021-06-30)\n\n### Bug Fixes and Other Changes\n\n * modify the way port number passing\n\n## v1.8.3 ("
  },
  {
    "path": "CODE_OF_CONDUCT.md",
    "chars": 311,
    "preview": "## Code of Conduct\nThis project has adopted the [Amazon Open Source Code of Conduct](https://aws.github.io/code-of-condu"
  },
  {
    "path": "CONTRIBUTING.md",
    "chars": 3602,
    "preview": "# Contributing Guidelines\n\nThank you for your interest in contributing to our project. Whether it's a bug report, new fe"
  },
  {
    "path": "LICENSE",
    "chars": 11358,
    "preview": "\n                                 Apache License\n                           Version 2.0, January 2004\n                  "
  },
  {
    "path": "NOTICE",
    "chars": 112,
    "preview": "Sagemaker TensorFlow Serving Container\nCopyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved. \n"
  },
  {
    "path": "README.md",
    "chars": 31118,
    "preview": "# ![image](https://user-images.githubusercontent.com/56273942/202568467-0ee721bb-1424-4efd-88fc-31b4f2a59dc6.png) DEPREC"
  },
  {
    "path": "VERSION",
    "chars": 11,
    "preview": "1.8.5.dev0\n"
  },
  {
    "path": "buildspec.yml",
    "chars": 2130,
    "preview": "version: 0.2\n\nphases:\n  pre_build:\n    commands:\n      - start-dockerd\n\n      # fix permissions dropped by CodePipeline\n"
  },
  {
    "path": "docker/1.11/Dockerfile.cpu",
    "chars": 993,
    "preview": "ARG TFS_VERSION\n\nFROM tensorflow/serving:${TFS_VERSION} as tfs\nFROM ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilit"
  },
  {
    "path": "docker/1.11/Dockerfile.eia",
    "chars": 973,
    "preview": "FROM ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG TFS_SHORT_VERSION\n\n# nginx + "
  },
  {
    "path": "docker/1.11/Dockerfile.gpu",
    "chars": 2546,
    "preview": "ARG TFS_VERSION\n\nFROM tensorflow/serving:${TFS_VERSION}-gpu as tfs\nFROM nvidia/cuda:9.0-base-ubuntu16.04\nLABEL com.amazo"
  },
  {
    "path": "docker/1.12/Dockerfile.cpu",
    "chars": 993,
    "preview": "ARG TFS_VERSION\n\nFROM tensorflow/serving:${TFS_VERSION} as tfs\nFROM ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilit"
  },
  {
    "path": "docker/1.12/Dockerfile.eia",
    "chars": 973,
    "preview": "FROM ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG TFS_SHORT_VERSION\n\n# nginx + "
  },
  {
    "path": "docker/1.12/Dockerfile.gpu",
    "chars": 2547,
    "preview": "ARG TFS_VERSION\n\nFROM tensorflow/serving:${TFS_VERSION}-gpu as tfs\nFROM nvidia/cuda:9.0-base-ubuntu16.04\nLABEL com.amazo"
  },
  {
    "path": "docker/1.13/Dockerfile.cpu",
    "chars": 2913,
    "preview": "FROM ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG"
  },
  {
    "path": "docker/1.13/Dockerfile.eia",
    "chars": 1174,
    "preview": "FROM ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG PIP=pip3\nARG TFS_SHORT_VERSIO"
  },
  {
    "path": "docker/1.13/Dockerfile.gpu",
    "chars": 4820,
    "preview": "FROM nvidia/cuda:10.0-base-ubuntu16.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-b"
  },
  {
    "path": "docker/1.14/Dockerfile.cpu",
    "chars": 2836,
    "preview": "FROM ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG"
  },
  {
    "path": "docker/1.14/Dockerfile.eia",
    "chars": 3788,
    "preview": "FROM public.ecr.aws/e2s1w5p1/ubuntu:16.04\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG TFS_S"
  },
  {
    "path": "docker/1.14/Dockerfile.gpu",
    "chars": 4760,
    "preview": "FROM nvidia/cuda:10.0-base-ubuntu16.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-b"
  },
  {
    "path": "docker/1.15/Dockerfile.cpu",
    "chars": 3625,
    "preview": "FROM public.ecr.aws/ubuntu/ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\n# Specify LABEL for inference pipelines to use SAG"
  },
  {
    "path": "docker/1.15/Dockerfile.eia",
    "chars": 4083,
    "preview": "FROM ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\n# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT\n# "
  },
  {
    "path": "docker/1.15/Dockerfile.gpu",
    "chars": 5643,
    "preview": "FROM nvidia/cuda:10.0-base-ubuntu18.04\n\nLABEL maintainer=\"Amazon AI\"\n# Specify LABEL for inference pipelines to use SAGE"
  },
  {
    "path": "docker/2.0/Dockerfile.cpu",
    "chars": 3230,
    "preview": "FROM ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-bind-to-port=true\n\nARG"
  },
  {
    "path": "docker/2.0/Dockerfile.eia",
    "chars": 3972,
    "preview": "FROM ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\n# Specify LABEL for inference pipelines to use SAGEMAKER_BIND_TO_PORT\n# "
  },
  {
    "path": "docker/2.0/Dockerfile.gpu",
    "chars": 4354,
    "preview": "FROM nvidia/cuda:10.0-base-ubuntu18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-b"
  },
  {
    "path": "docker/2.1/Dockerfile.cpu",
    "chars": 3296,
    "preview": "FROM public.ecr.aws/ubuntu/ubuntu:18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-"
  },
  {
    "path": "docker/2.1/Dockerfile.gpu",
    "chars": 5459,
    "preview": "FROM nvidia/cuda:10.1-base-ubuntu18.04\n\nLABEL maintainer=\"Amazon AI\"\nLABEL com.amazonaws.sagemaker.capabilities.accept-b"
  },
  {
    "path": "docker/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "docker/build_artifacts/__init__.py",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "docker/build_artifacts/deep_learning_container.py",
    "chars": 3214,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/dockerd-entrypoint.py",
    "chars": 833,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/sagemaker/__init__.py",
    "chars": 570,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/sagemaker/multi_model_utils.py",
    "chars": 1473,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/sagemaker/nginx.conf.template",
    "chars": 1382,
    "preview": "load_module modules/ngx_http_js_module.so;\n\nworker_processes auto;\ndaemon off;\npid /tmp/nginx.pid;\nerror_log  /dev/stder"
  },
  {
    "path": "docker/build_artifacts/sagemaker/python_service.py",
    "chars": 17824,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/sagemaker/serve",
    "chars": 41,
    "preview": "#!/bin/bash\n\npython3 /sagemaker/serve.py\n"
  },
  {
    "path": "docker/build_artifacts/sagemaker/serve.py",
    "chars": 20894,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "docker/build_artifacts/sagemaker/tensorflowServing.js",
    "chars": 6898,
    "preview": "var tfs_base_uri = '/tfs/v1/models/'\nvar custom_attributes_header = 'X-Amzn-SageMaker-Custom-Attributes'\n\nfunction invoc"
  },
  {
    "path": "docker/build_artifacts/sagemaker/tfs_utils.py",
    "chars": 9881,
    "preview": "# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Version"
  },
  {
    "path": "scripts/build-all.sh",
    "chars": 356,
    "preview": "#!/bin/bash\n#\n# Build all the docker images.\n\nset -euo pipefail\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )\" >/dev/nul"
  },
  {
    "path": "scripts/build.sh",
    "chars": 1117,
    "preview": "#!/bin/bash\n#\n# Build the docker images.\n\nset -euo pipefail\n\nsource scripts/shared.sh\n\nparse_std_args \"$@\"\n\nget_ei_execu"
  },
  {
    "path": "scripts/curl.sh",
    "chars": 1091,
    "preview": "#!/bin/bash\n#\n# Some example curl requests to try on local docker containers.\n\ncurl -X POST --data-binary @test/resource"
  },
  {
    "path": "scripts/publish-all.sh",
    "chars": 377,
    "preview": "#!/bin/bash\n#\n# Publish all images to your ECR account.\n\nset -euo pipefail\n\nDIR=\"$( cd \"$( dirname \"${BASH_SOURCE[0]}\" )"
  },
  {
    "path": "scripts/publish.sh",
    "chars": 828,
    "preview": "#!/bin/bash\n#\n# Publish images to your ECR account.\n\nset -euo pipefail\n\nsource scripts/shared.sh\n\nparse_std_args \"$@\"\n\na"
  },
  {
    "path": "scripts/shared.sh",
    "chars": 2794,
    "preview": "#!/bin/bash\n#\n# Utility functions for build/test scripts.\n\nfunction error() {\n    >&2 echo $1\n    >&2 echo \"usage: $0 [-"
  },
  {
    "path": "scripts/start.sh",
    "chars": 552,
    "preview": "#!/bin/bash\n#\n# Start a local docker container.\n\nset -euo pipefail\n\nsource scripts/shared.sh\n\nparse_std_args \"$@\"\n\nif [ "
  },
  {
    "path": "scripts/stop.sh",
    "chars": 194,
    "preview": "#!/bin/bash\n#\n# Stop a local docker container.\n\nset -euo pipefail\n\nsource scripts/shared.sh\n\nparse_std_args \"$@\"\n\ndocker"
  },
  {
    "path": "test/conftest.py",
    "chars": 570,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/data/batch.csv",
    "chars": 140,
    "preview": "1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0, 5.0\n1.0, 2.0"
  },
  {
    "path": "test/integration/local/conftest.py",
    "chars": 2075,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/multi_model_endpoint_test_utils.py",
    "chars": 2186,
    "preview": "# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Version"
  },
  {
    "path": "test/integration/local/test_container.py",
    "chars": 9268,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/test_multi_model_endpoint.py",
    "chars": 7160,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/test_multi_tfs.py",
    "chars": 3317,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/test_nginx_config.py",
    "chars": 4530,
    "preview": "# Copyright 2019-2022 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/test_pre_post_processing.py",
    "chars": 4931,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/local/test_pre_post_processing_mme.py",
    "chars": 4740,
    "preview": "# Copyright 2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Version"
  },
  {
    "path": "test/integration/local/test_tfs_batching.py",
    "chars": 2672,
    "preview": "# Copyright 2019-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/sagemaker/conftest.py",
    "chars": 4033,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/sagemaker/test_ei.py",
    "chars": 2698,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/sagemaker/test_tfs.py",
    "chars": 5403,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/integration/sagemaker/util.py",
    "chars": 8609,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/perf/ab.sh",
    "chars": 1428,
    "preview": "#!/bin/bash\n\nab -k -n 10000 -c 16 -p test/resources/inputs/test.json -T 'application/json' http://localhost:8080/tfs/v1/"
  },
  {
    "path": "test/perf/create-endpoint.sh",
    "chars": 610,
    "preview": "#!/bin/bash\n\ninstance_type=\"${1:-c5.xlarge}\"\nif [[ \"$instance_type\" == p* ]]; then\n    arch='gpu'\nelse\n    arch='cpu'\nfi"
  },
  {
    "path": "test/perf/create-model.sh",
    "chars": 1310,
    "preview": "#!/bin/bash\n\nset -e\n\narch=${1:-'cpu'}\naws_region=$(aws configure get region)\naws_account=$(aws --region $aws_region sts "
  },
  {
    "path": "test/perf/data_generator.py",
    "chars": 5301,
    "preview": "import argparse\nimport math\nimport random\nimport sys\n\n_CONTENT_TYPE_CSV = 'text/csv'\n_CONTENT_TYPE_JSON = 'application/j"
  },
  {
    "path": "test/perf/delete-endpoint.sh",
    "chars": 197,
    "preview": "#!/bin/bash\n\nendpoint=${1-'sagemaker-tensorflow-serving-cpu-c5-xlarge'}\naws sagemaker delete-endpoint --endpoint-name $e"
  },
  {
    "path": "test/perf/ec2-perftest.sh",
    "chars": 4152,
    "preview": "#!/bin/bash\n\nfor i in $(seq 1 5); do python perftest_endpoint.py --count 5000 --warmup 100 --workers 4 --model sm-c5xl >"
  },
  {
    "path": "test/perf/perftest_endpoint.py",
    "chars": 5132,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test1/inference.py",
    "chars": 2292,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test2/inference.py",
    "chars": 2100,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test3/inference.py",
    "chars": 2305,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test3/requirements.txt",
    "chars": 14,
    "preview": "Pillow>=6.2.2\n"
  },
  {
    "path": "test/resources/examples/test4/inference.py",
    "chars": 2420,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test4/lib/dummy_module/__init__.py",
    "chars": 20,
    "preview": "__version__ = '0.1'\n"
  },
  {
    "path": "test/resources/examples/test5/inference.py",
    "chars": 2420,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/examples/test5/lib/dummy_module/__init__.py",
    "chars": 20,
    "preview": "__version__ = '0.1'\n"
  },
  {
    "path": "test/resources/examples/test5/requirements.txt",
    "chars": 13,
    "preview": "Pillow>=6.2.2"
  },
  {
    "path": "test/resources/inputs/test-cifar.json",
    "chars": 14404,
    "preview": "[[[[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,1.0,1.0],[1.0,"
  },
  {
    "path": "test/resources/inputs/test-gcloud.jsons",
    "chars": 210,
    "preview": "{\"x\": [1.0,2.0,5.0]}\n{\"x\": [1.0,2.0,5.0]}\n{\"x\": [1.0,2.0,5.0]}\n{\"x\": [1.0,2.0,5.0]}\n{\"x\": [1.0,2.0,5.0]}\n{\"x\": [1.0,2.0,"
  },
  {
    "path": "test/resources/inputs/test-generic.json",
    "chars": 13,
    "preview": "[1.0,2.0,5.0]"
  },
  {
    "path": "test/resources/inputs/test-large.csv",
    "chars": 3015743,
    "preview": "1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,1.0,2.0,5.0,"
  },
  {
    "path": "test/resources/inputs/test.csv",
    "chars": 120,
    "preview": "1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n1.0,2.0,5.0\n"
  },
  {
    "path": "test/resources/inputs/test.json",
    "chars": 213,
    "preview": "{\n  \"instances\": [\n    [1.0,2.0,5.0],\n    [1.0,2.0,5.0],\n    [1.0,2.0,5.0],\n    [1.0,2.0,5.0],\n    [1.0,2.0,5.0],\n    [1"
  },
  {
    "path": "test/resources/mme/half_plus_three/00000123/assets/foo.txt",
    "chars": 19,
    "preview": "asset-file-contents"
  },
  {
    "path": "test/resources/mme/half_plus_three/00000124/assets/foo.txt",
    "chars": 19,
    "preview": "asset-file-contents"
  },
  {
    "path": "test/resources/mme/invalid_version/abcde/dummy.txt",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "test/resources/mme_universal_script/code/inference.py",
    "chars": 2304,
    "preview": "# Copyright 2018-2020 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Ve"
  },
  {
    "path": "test/resources/mme_universal_script/code/requirements.txt",
    "chars": 13,
    "preview": "Pillow>=6.2.2"
  },
  {
    "path": "test/resources/mme_universal_script/half_plus_three/model/half_plus_three/00000123/assets/foo.txt",
    "chars": 19,
    "preview": "asset-file-contents"
  },
  {
    "path": "test/resources/mme_universal_script/half_plus_three/model/half_plus_three/00000124/assets/foo.txt",
    "chars": 20,
    "preview": "asset-file-contents\n"
  },
  {
    "path": "test/resources/models/half_plus_three/.00000111/.hidden_file",
    "chars": 0,
    "preview": ""
  },
  {
    "path": "test/resources/models/half_plus_three/00000123/assets/foo.txt",
    "chars": 19,
    "preview": "asset-file-contents"
  },
  {
    "path": "test/resources/models/half_plus_three/00000124/assets/foo.txt",
    "chars": 20,
    "preview": "asset-file-contents\n"
  },
  {
    "path": "test/unit/test_deep_learning_container.py",
    "chars": 6178,
    "preview": "# Copyright 2018 Amazon.com, Inc. or its affiliates. All Rights Reserved.\n#\n# Licensed under the Apache License, Version"
  },
  {
    "path": "test/unit/test_proxy_client.py",
    "chars": 4110,
    "preview": "import unittest.mock as mock\nimport pytest\nfrom tensorflow_serving.config import model_server_config_pb2\n\nfrom container"
  },
  {
    "path": "tox.ini",
    "chars": 2252,
    "preview": "# Tox (http://tox.testrun.org/) is a tool for running tests\n# in multiple virtualenvs. This configuration file will run "
  }
]

// ... and 24 more files (download for full content)

About this extraction

This document contains the full source code of the aws/sagemaker-tensorflow-serving-container GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 130 files (3.2 MB), approximately 839.3k tokens, and a symbol index with 272 extracted functions, classes, methods, constants, and types. It can be used with any AI tool that accepts text input, such as OpenClaw, Claude, ChatGPT, Cursor, or Windsurf.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.
