Repository: CVxTz/EEG_classification
Branch: master
Commit: f65060d6adb8
Files: 17
Total size: 73.1 KB

Directory structure:
gitextract_jfjg8v0s/
├── .gitignore
├── LICENSE
├── README.md
├── code/
│   ├── baseline.py
│   ├── cnn_crf_model.py
│   ├── cnn_crf_model_20_folds.py
│   ├── cnn_model.py
│   ├── eda.py
│   ├── lstm_model.py
│   ├── models.py
│   ├── run.sh
│   └── utils.py
├── deepsleepnet_data/
│   ├── dhedfreader.py
│   ├── download_physionet.sh
│   ├── prepare_physionet.py
│   └── readme.md
└── requirements.txt

================================================
FILE CONTENTS
================================================

================================================
FILE: .gitignore
================================================
# Byte-compiled / optimized / DLL files
__pycache__/
*.py[cod]
*$py.class

# C extensions
*.so

# Distribution / packaging
.Python
build/
develop-eggs/
dist/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST

# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec

# Installer logs
pip-log.txt
pip-delete-this-directory.txt

# Unit test / coverage reports
htmlcov/
.tox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/

# Translations
*.mo
*.pot

# Django stuff:
*.log
local_settings.py
db.sqlite3

# Flask stuff:
instance/
.webassets-cache

# Scrapy stuff:
.scrapy

# Sphinx documentation
docs/_build/

# PyBuilder
target/

# Jupyter Notebook
.ipynb_checkpoints

# pyenv
.python-version

# celery beat schedule file
celerybeat-schedule

# SageMath parsed files
*.sage.py

# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/

# Spyder project settings
.spyderproject
.spyproject

# Rope project settings
.ropeproject

# mkdocs documentation
/site

# mypy
.mypy_cache/

================================================
FILE: LICENSE
================================================
Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files.

"Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types.
"Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. 
You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. 
In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability.

While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "[]" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

================================================
FILE: README.md
================================================
[![DOI](https://zenodo.org/badge/DOI/10.5281/zenodo.4060151.svg)](https://doi.org/10.5281/zenodo.4060151)

# EEG_classification

Description of the approach : https://towardsdatascience.com/sleep-stage-classification-from-single-channel-eeg-using-convolutional-neural-networks-5c710d92d38e

Sleep Stage Classification from Single Channel EEG using Convolutional Neural Networks

*****

Photo by [Paul M](https://unsplash.com/photos/7i9yLoUgoP8?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText) on [Unsplash](https://unsplash.com/search/photos/owl?utm_source=unsplash&utm_medium=referral&utm_content=creditCopyText)

Quality sleep is an important part of a healthy lifestyle, as lack of it can cause a list of [issues](https://www.webmd.com/sleep-disorders/features/10-results-sleep-loss#1) like a higher risk of cancer and chronic fatigue. This means that having tools to automatically and easily monitor sleep can be a powerful way to help people sleep better.
Doctors use a recording of a signal called EEG, which measures the electrical activity of the brain using an electrode, to understand the sleep stages of a patient and make a diagnosis about the quality of their sleep. In this post we will train a neural network to do the sleep stage classification automatically from EEGs.

### **Data**

In our input we have a sequence of 30s epochs of EEG where each epoch has a label from [{“W”, “N1”, “N2”, “N3”, “REM”}](https://en.wikipedia.org/wiki/Sleep_cycle).

Fig 1 : EEG Epoch

Fig 2 : Sleep stages through the night

This post is based on publicly available EEG sleep data ([Sleep-EDF](https://www.physionet.org/physiobank/database/sleep-edfx/)) recorded from 20 subjects, 19 of whom have 2 full nights of sleep. We use the pre-processing scripts available in this [repo](https://github.com/akaraspt/deepsleepnet) and split the train/test sets so that no study subject appears in both at the same time.

The general objective is to go from a 1D sequence like in fig 1 and predict the output hypnogram like in fig 2.

### Model Description

Recent approaches [[1]](https://arxiv.org/pdf/1703.04046.pdf) use a sub-model that encodes each epoch into a 1D vector of fixed size, and then a second sequential sub-model that maps each epoch’s vector into a class from [{“W”, “N1”, “N2”, “N3”, “REM”}](https://en.wikipedia.org/wiki/Sleep_cycle).

Here we use a 1D CNN to encode each epoch and then another 1D CNN or LSTM that labels the sequence of epochs to create the final [hypnogram](https://en.wikipedia.org/wiki/Hypnogram). This allows the prediction for an epoch to take its context into account.

Sub-model 1 : Epoch encoder

Sub-model 2 : Sequential model for epoch classification

The full model takes as input the sequence of EEG epochs (30 seconds each), where sub-model 1 is applied to each epoch using the TimeDistributed layer of [Keras](https://keras.io/), which produces a sequence of vectors. The sequence of vectors is then fed into another sub-model, like an LSTM or a CNN, that produces the sequence of output labels.
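As a rough illustration, here is a minimal sketch of this two-level architecture in Keras. It is not the exact model from `code/models.py` (the filter counts here are placeholders); it only shows how the epoch encoder, TimeDistributed, and the sequence labeler fit together, with 3000 samples per 30s epoch and 5 output classes:

```
from keras import models, layers

n_classes = 5     # {"W", "N1", "N2", "N3", "REM"}
epoch_len = 3000  # one 30s epoch of single-channel EEG at 100 Hz

# Sub-model 1: encode one 30s epoch into a fixed-size vector.
epoch_in = layers.Input(shape=(epoch_len, 1))
h = layers.Convolution1D(32, kernel_size=5, activation="relu")(epoch_in)
h = layers.MaxPool1D(pool_size=2)(h)
h = layers.Convolution1D(64, kernel_size=3, activation="relu")(h)
h = layers.GlobalMaxPool1D()(h)
epoch_encoder = models.Model(epoch_in, h)

# Full model: TimeDistributed applies the encoder to every epoch of a
# variable-length night; a 1D CNN then labels the sequence of vectors.
seq_in = layers.Input(shape=(None, epoch_len, 1))
seq = layers.TimeDistributed(epoch_encoder)(seq_in)
seq = layers.Convolution1D(128, kernel_size=3, padding="same", activation="relu")(seq)
out = layers.Convolution1D(n_classes, kernel_size=3, padding="same", activation="softmax")(seq)

model = models.Model(seq_in, out)
model.compile("adam", "sparse_categorical_crossentropy", metrics=["acc"])
```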
We also use a linear-chain [CRF](https://en.wikipedia.org/wiki/Conditional_random_field) for one of the models and show that it can improve the performance.

### Training Procedure

The full model is trained end-to-end from scratch using the Adam optimizer with an initial learning rate of 1e-3 that is reduced each time the validation accuracy plateaus, using the ReduceLROnPlateau Keras callback.

Accuracy training curves

### Results

We compare 3 different models :

* CNN-CNN : This one uses a 1D CNN for the epoch encoding and then another 1D CNN for the sequence labeling.
* CNN-CNN-CRF : This model uses a 1D CNN for the epoch encoding and then a 1D CNN-CRF for the sequence labeling.
* CNN-LSTM : This one uses a 1D CNN for the epoch encoding and then an LSTM for the sequence labeling.

We evaluate each model on an independent test set and get the following results :

* CNN-CNN : F1 = 0.81, ACCURACY = 0.87
* CNN-CNN-CRF : F1 = 0.82, ACCURACY = 0.89
* CNN-LSTM : F1 = 0.71, ACCURACY = 0.76

The CNN-CNN-CRF outperforms the two other models because the CRF helps learn the transition probabilities between classes. The LSTM-based model does not work as well because it is more sensitive to hyper-parameters like the optimizer and the batch size, and requires extensive tuning to perform well.

Ground Truth Hypnogram

Predicted Hypnogram using CNN-CNN-CRF

Source code available here : [https://github.com/CVxTz/EEG_classification](https://github.com/CVxTz/EEG_classification)

I look forward to your suggestions and feedback.

[[1] DeepSleepNet: a Model for Automatic Sleep Stage Scoring based on Raw Single-Channel EEG](https://arxiv.org/pdf/1703.04046.pdf)

How to cite:

```
@software{mansar_youness_2020_4060151,
  author       = {Mansar Youness},
  title        = {CVxTz/EEG\_classification: v1.0},
  month        = sep,
  year         = 2020,
  publisher    = {Zenodo},
  version      = {v1.0},
  doi          = {10.5281/zenodo.4060151},
  url          = {https://doi.org/10.5281/zenodo.4060151}
}
```

================================================
FILE: code/baseline.py
================================================
import numpy as np
from glob import glob
import os
from sklearn.model_selection import train_test_split

base_path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy"

files = glob(os.path.join(base_path, "*.npz"))

train_val, test = train_test_split(files, test_size=0.15, random_state=1337)
train, val = train_test_split(train_val, test_size=0.1, random_state=1337)

train_dict = {k: np.load(k) for k in train}
test_dict = {k: np.load(k) for k in test}
val_dict = {k: np.load(k) for k in val}

================================================
FILE: code/cnn_crf_model.py
================================================
from models import get_model_cnn_crf
import numpy as np
from utils import gen, chunker, WINDOW_SIZE, rescale_array
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import f1_score, accuracy_score, classification_report
from glob import glob
import os
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import matplotlib.pyplot as plt

base_path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy"

files = sorted(glob(os.path.join(base_path, "*.npz")))
ids = sorted(list(set([x.split("/")[-1][:5] for x in files])))

# split by test subject
train_ids, test_ids = train_test_split(ids, test_size=0.15, random_state=1338)

train_val, test = [x for x in files if x.split("/")[-1][:5] in train_ids],\
                  [x for x in files if x.split("/")[-1][:5] in test_ids]
train, val = train_test_split(train_val, test_size=0.1, random_state=1337)

train_dict = {k: np.load(k) for k in train}
test_dict = {k: np.load(k) for k in test}
val_dict = {k: np.load(k) for k in val}

model = get_model_cnn_crf()

file_path = "cnn_crf_model.h5"

# model.load_weights(file_path)

checkpoint = ModelCheckpoint(file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
early = EarlyStopping(monitor="val_acc", mode="max", patience=20, verbose=1)
redonplat = ReduceLROnPlateau(monitor="val_acc", mode="max", patience=5, verbose=2)
callbacks_list = [checkpoint, early, redonplat]  # early

model.fit_generator(gen(train_dict, aug=False), validation_data=gen(val_dict), epochs=100, verbose=2,
                    steps_per_epoch=1000, validation_steps=300, callbacks=callbacks_list)

model.load_weights(file_path)

preds = []
gt = []

for record in tqdm(test_dict):
    all_rows = test_dict[record]['x']
    record_y_gt = []
    record_y_pred = []
    for batch_hyp in chunker(range(all_rows.shape[0])):

        X = all_rows[min(batch_hyp):max(batch_hyp)+1, ...]
        Y = test_dict[record]['y'][min(batch_hyp):max(batch_hyp)+1]

        X = np.expand_dims(X, 0)

        X = rescale_array(X)

        Y_pred = model.predict(X)
        Y_pred = Y_pred.argmax(axis=-1).ravel().tolist()

        gt += Y.ravel().tolist()
        preds += Y_pred

        record_y_gt += Y.ravel().tolist()
        record_y_pred += Y_pred

    # fig_1 = plt.figure(figsize=(12, 6))
    # plt.plot(record_y_gt)
    # plt.title("Sleep Stages")
    # plt.ylabel("Classes")
    # plt.xlabel("Time")
    # plt.show()
    #
    # fig_2 = plt.figure(figsize=(12, 6))
    # plt.plot(record_y_pred)
    # plt.title("Predicted Sleep Stages")
    # plt.ylabel("Classes")
    # plt.xlabel("Time")
    # plt.show()

f1 = f1_score(gt, preds, average="macro")

print("Seq Test f1 score : %s "% f1)

acc = accuracy_score(gt, preds)

print("Seq Test accuracy score : %s "% acc)

print(classification_report(gt, preds))

================================================
FILE: code/cnn_crf_model_20_folds.py
================================================
from models import get_model_cnn_crf
import numpy as np
from utils import gen, chunker, WINDOW_SIZE, rescale_array
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import f1_score, accuracy_score, classification_report
from glob import glob
import os
from sklearn.model_selection import train_test_split
from tqdm import tqdm
import matplotlib.pyplot as plt

base_path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy"

files = sorted(glob(os.path.join(base_path, "*.npz")))
ids = list(set([x.split("/")[-1][:5] for x in files]))

list_f1 = []
list_acc = []

preds = []
gt = []

# leave-one-subject-out: each subject id is held out as the test set once
for id in ids:
    test_ids = {id}
    train_ids = set([x.split("/")[-1][:5] for x in files]) - test_ids

    train_val, test = [x for x in files if x.split("/")[-1][:5] in train_ids],\
                      [x for x in files if x.split("/")[-1][:5] in test_ids]

    train, val = train_test_split(train_val, test_size=0.1, random_state=1337)

    train_dict = {k: np.load(k) for k in train}
    test_dict = {k: np.load(k) for k in test}
    val_dict = {k: np.load(k) for k in val}

    model = get_model_cnn_crf(lr=0.0001)

    file_path = "cnn_crf_model_20_folds.h5"

    # model.load_weights(file_path)

    checkpoint = ModelCheckpoint(file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
    early = EarlyStopping(monitor="val_acc", mode="max", patience=20, verbose=1)
    redonplat = ReduceLROnPlateau(monitor="val_acc", mode="max", patience=5, verbose=2)
    callbacks_list = [checkpoint, redonplat]  # early

    model.fit_generator(gen(train_dict, aug=False), validation_data=gen(val_dict), epochs=40, verbose=2,
                        steps_per_epoch=1000, validation_steps=300, callbacks=callbacks_list)
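
    # Reload the best checkpoint saved by ModelCheckpoint (highest val_acc)
    # before scoring the held-out subject.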
    model.load_weights(file_path)

    for record in tqdm(test_dict):
        all_rows = test_dict[record]['x']
        record_y_gt = []
        record_y_pred = []
        for batch_hyp in chunker(range(all_rows.shape[0])):

            X = all_rows[min(batch_hyp):max(batch_hyp)+1, ...]
            Y = test_dict[record]['y'][min(batch_hyp):max(batch_hyp)+1]

            X = np.expand_dims(X, 0)

            X = rescale_array(X)

            Y_pred = model.predict(X)
            Y_pred = Y_pred.argmax(axis=-1).ravel().tolist()

            gt += Y.ravel().tolist()
            preds += Y_pred

            record_y_gt += Y.ravel().tolist()
            record_y_pred += Y_pred

    f1 = f1_score(gt, preds, average="macro")
    acc = accuracy_score(gt, preds)

    print("acc %s, f1 %s"%(acc, f1))

================================================
FILE: code/cnn_model.py
================================================
from models import get_model_cnn
import numpy as np
from utils import gen, chunker, WINDOW_SIZE, rescale_array
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import f1_score, accuracy_score, classification_report
from glob import glob
import os
from sklearn.model_selection import train_test_split
from tqdm import tqdm

base_path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy"

files = sorted(glob(os.path.join(base_path, "*.npz")))
ids = sorted(list(set([x.split("/")[-1][:5] for x in files])))

# split by test subject
train_ids, test_ids = train_test_split(ids, test_size=0.15, random_state=1338)

train_val, test = [x for x in files if x.split("/")[-1][:5] in train_ids],\
                  [x for x in files if x.split("/")[-1][:5] in test_ids]

train, val = train_test_split(train_val, test_size=0.1, random_state=1337)

train_dict = {k: np.load(k) for k in train}
test_dict = {k: np.load(k) for k in test}
val_dict = {k: np.load(k) for k in val}

model = get_model_cnn()

file_path = "cnn_model.h5"

# model.load_weights(file_path)

checkpoint = ModelCheckpoint(file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
early = EarlyStopping(monitor="val_acc", mode="max", patience=20, verbose=1)
redonplat = ReduceLROnPlateau(monitor="val_acc", mode="max", patience=5, verbose=2)
callbacks_list = [checkpoint, early, redonplat]  # early

model.fit_generator(gen(train_dict, aug=False), validation_data=gen(val_dict), epochs=100, verbose=2,
                    steps_per_epoch=1000, validation_steps=300, callbacks=callbacks_list)

model.load_weights(file_path)

preds = []
gt = []

for record in tqdm(test_dict):
    all_rows = test_dict[record]['x']
    for batch_hyp in chunker(range(all_rows.shape[0])):

        X = all_rows[min(batch_hyp):max(batch_hyp)+1, ...]
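        # Labels for the same window of epochs as X.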
        Y = test_dict[record]['y'][min(batch_hyp):max(batch_hyp)+1]

        X = np.expand_dims(X, 0)

        X = rescale_array(X)

        Y_pred = model.predict(X)
        Y_pred = Y_pred.argmax(axis=-1).ravel().tolist()

        gt += Y.ravel().tolist()
        preds += Y_pred

f1 = f1_score(gt, preds, average="macro")

print("Seq Test f1 score : %s "% f1)

acc = accuracy_score(gt, preds)

print("Seq Test accuracy score : %s "% acc)

print(classification_report(gt, preds))

================================================
FILE: code/eda.py
================================================
import os
import h5py
import numpy as np
import matplotlib.pyplot as plt
import datetime as dt
import collections
import librosa

path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy/SC4061E0.npz"

data = np.load(path)

x = data['x']
y = data['y']

fig_1 = plt.figure(figsize=(12, 6))
plt.plot(x[100, ...].ravel())
plt.title("EEG Epoch")
plt.ylabel("Amplitude")
plt.xlabel("Time")
plt.show()

fig_2 = plt.figure(figsize=(12, 6))
plt.plot(y.ravel())
plt.title("Sleep Stages")
plt.ylabel("Classes")
plt.xlabel("Time")
plt.show()

================================================
FILE: code/lstm_model.py
================================================
from models import get_model_lstm
import numpy as np
from utils import gen, chunker, WINDOW_SIZE, rescale_array
from keras.callbacks import ModelCheckpoint, EarlyStopping, ReduceLROnPlateau
from sklearn.metrics import f1_score, accuracy_score, classification_report
from glob import glob
import os
from sklearn.model_selection import train_test_split
from tqdm import tqdm

base_path = "/media/ml/data_ml/EEG/deepsleepnet/data_npy"

files = sorted(glob(os.path.join(base_path, "*.npz")))
ids = sorted(list(set([x.split("/")[-1][:5] for x in files])))

# split by test subject
train_ids, test_ids = train_test_split(ids, test_size=0.15, random_state=1338)

train_val, test = [x for x in files if x.split("/")[-1][:5] in train_ids],\
                  [x for x in files if x.split("/")[-1][:5] in test_ids]

train, val = train_test_split(train_val, test_size=0.1, random_state=1337)

train_dict = {k: np.load(k) for k in train}
test_dict = {k: np.load(k) for k in test}
val_dict = {k: np.load(k) for k in val}

model = get_model_lstm()

file_path = "lstm_model.h5"

# model.load_weights(file_path)

checkpoint = ModelCheckpoint(file_path, monitor='val_acc', verbose=1, save_best_only=True, mode='max')
early = EarlyStopping(monitor="val_acc", mode="max", patience=20, verbose=1)
redonplat = ReduceLROnPlateau(monitor="val_acc", mode="max", patience=5, verbose=2)
callbacks_list = [checkpoint, early, redonplat]  # early

model.fit_generator(gen(train_dict, aug=False), validation_data=gen(val_dict), epochs=100, verbose=2,
                    steps_per_epoch=1000, validation_steps=300, callbacks=callbacks_list)

model.load_weights(file_path)

preds = []
gt = []

for record in tqdm(test_dict):
    all_rows = test_dict[record]['x']
    for batch_hyp in chunker(range(all_rows.shape[0])):

        X = all_rows[min(batch_hyp):max(batch_hyp)+1, ...]
        Y = test_dict[record]['y'][min(batch_hyp):max(batch_hyp)+1]

        X = np.expand_dims(X, 0)

        X = rescale_array(X)

        Y_pred = model.predict(X)
        Y_pred = Y_pred.argmax(axis=-1).ravel().tolist()

        gt += Y.ravel().tolist()
        preds += Y_pred

f1 = f1_score(gt, preds, average="macro")

print("Seq Test f1 score : %s "% f1)

acc = accuracy_score(gt, preds)

print("Seq Test accuracy score : %s "% acc)

print(classification_report(gt, preds))

================================================
FILE: code/models.py
================================================
from keras import optimizers, losses, activations, models
from keras.layers import Dense, Input, Dropout, Convolution1D, MaxPool1D, GlobalMaxPool1D, GlobalAveragePooling1D, \
    concatenate, SpatialDropout1D, TimeDistributed, Bidirectional, LSTM
from keras_contrib.layers import CRF
from utils import WINDOW_SIZE


def get_model():
    nclass = 5

    inp = Input(shape=(3000, 1))
    img_1 = Convolution1D(16, kernel_size=5, activation=activations.relu, padding="valid")(inp)
    img_1 = Convolution1D(16, kernel_size=5, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(256, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(256, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = GlobalMaxPool1D()(img_1)
    img_1 = Dropout(rate=0.01)(img_1)

    dense_1 = Dropout(rate=0.01)(Dense(64, activation=activations.relu, name="dense_1")(img_1))
    dense_1 = Dropout(rate=0.05)(Dense(64, activation=activations.relu, name="dense_2")(dense_1))
    dense_1 = Dense(nclass, activation=activations.softmax, name="dense_3")(dense_1)

    model = models.Model(inputs=inp, outputs=dense_1)
    opt = optimizers.Adam(0.001)

    model.compile(optimizer=opt, loss=losses.sparse_categorical_crossentropy, metrics=['acc'])
    model.summary()
    return model


def get_base_model():
    inp = Input(shape=(3000, 1))
    img_1 = Convolution1D(16, kernel_size=5, activation=activations.relu, padding="valid")(inp)
    img_1 = Convolution1D(16, kernel_size=5, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(32, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = MaxPool1D(pool_size=2)(img_1)
    img_1 = SpatialDropout1D(rate=0.01)(img_1)
    img_1 = Convolution1D(256, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = Convolution1D(256, kernel_size=3, activation=activations.relu, padding="valid")(img_1)
    img_1 = GlobalMaxPool1D()(img_1)
    img_1 = Dropout(rate=0.01)(img_1)
    dense_1 = Dropout(0.01)(Dense(64, activation=activations.relu, name="dense_1")(img_1))

    base_model = models.Model(inputs=inp, outputs=dense_1)
    opt = optimizers.Adam(0.001)

    base_model.compile(optimizer=opt, loss=losses.sparse_categorical_crossentropy, metrics=['acc'])
    # model.summary()
    return base_model


def get_model_cnn():
    nclass = 5

    seq_input = Input(shape=(None, 3000, 1))
    base_model = get_base_model()
    # for layer in base_model.layers:
    #     layer.trainable = False
    encoded_sequence = TimeDistributed(base_model)(seq_input)
    encoded_sequence = SpatialDropout1D(rate=0.01)(Convolution1D(128, kernel_size=3, activation="relu", padding="same")(encoded_sequence))
    encoded_sequence = Dropout(rate=0.05)(Convolution1D(128, kernel_size=3, activation="relu", padding="same")(encoded_sequence))
    # out = TimeDistributed(Dense(nclass, activation="softmax"))(encoded_sequence)
    out = Convolution1D(nclass, kernel_size=3, activation="softmax", padding="same")(encoded_sequence)

    model = models.Model(seq_input, out)
    model.compile(optimizers.Adam(0.001), losses.sparse_categorical_crossentropy, metrics=['acc'])
    model.summary()
    return model


def get_model_lstm():
    nclass = 5

    seq_input = Input(shape=(None, 3000, 1))
    base_model = get_base_model()
    for layer in base_model.layers:
        layer.trainable = False
    encoded_sequence = TimeDistributed(base_model)(seq_input)
    encoded_sequence = Bidirectional(LSTM(100, return_sequences=True))(encoded_sequence)
    encoded_sequence = Dropout(rate=0.5)(encoded_sequence)
    encoded_sequence = Bidirectional(LSTM(100, return_sequences=True))(encoded_sequence)
    # out = TimeDistributed(Dense(nclass, activation="softmax"))(encoded_sequence)
    out = Convolution1D(nclass, kernel_size=1, activation="softmax", padding="same")(encoded_sequence)

    model = models.Model(seq_input, out)
    model.compile(optimizers.Adam(0.001), losses.sparse_categorical_crossentropy, metrics=['acc'])
    model.summary()
    return model


def get_model_cnn_crf(lr=0.001):
    nclass = 5

    seq_input = Input(shape=(None, 3000, 1))
    base_model = get_base_model()
    # for layer in base_model.layers:
    #     layer.trainable = False
    encoded_sequence = TimeDistributed(base_model)(seq_input)
    encoded_sequence = SpatialDropout1D(rate=0.01)(Convolution1D(128, kernel_size=3, activation="relu", padding="same")(encoded_sequence))
    encoded_sequence = Dropout(rate=0.05)(Convolution1D(128, kernel_size=3, activation="linear", padding="same")(encoded_sequence))
    # out = TimeDistributed(Dense(nclass, activation="softmax"))(encoded_sequence)
    # out = Convolution1D(nclass, kernel_size=3, activation="linear", padding="same")(encoded_sequence)
    crf = CRF(nclass, sparse_target=True)
    out = crf(encoded_sequence)

    model = models.Model(seq_input, out)
    model.compile(optimizers.Adam(lr), crf.loss_function, metrics=[crf.accuracy])
    model.summary()
    return model

================================================
FILE: code/run.sh
================================================
#python cnn_model.py > cnn_logs.txt
#python cnn_crf_model.py > cnn_crf_logs.txt
#python lstm_model.py > lstm_logs.txt
python cnn_crf_model_20_folds.py > cnn_crf_folds_logs.txt

================================================
FILE: code/utils.py
================================================
import h5py
import numpy as np
import random

WINDOW_SIZE = 100


def rescale_array(X):
    X = X / 20
    X = np.clip(X, -5, 5)
    return X


def aug_X(X):
    scale = 1 + np.random.uniform(-0.1, 0.1)
    offset = np.random.uniform(-0.1, 0.1)
    noise = np.random.normal(scale=0.05, size=X.shape)
    X = scale * X + offset + noise
    return X


def gen(dict_files, aug=False):
    # endlessly sample random windows of WINDOW_SIZE consecutive epochs,
    # one record (night) at a time
    while True:
        record_name = random.choice(list(dict_files.keys()))
        batch_data = dict_files[record_name]
        all_rows = batch_data['x']

        for i in range(10):
            start_index = random.choice(range(all_rows.shape[0]-WINDOW_SIZE))

            X = all_rows[start_index:start_index+WINDOW_SIZE, ...]
            Y = batch_data['y'][start_index:start_index+WINDOW_SIZE]

            X = np.expand_dims(X, 0)
            Y = np.expand_dims(Y, -1)
            Y = np.expand_dims(Y, 0)

            if aug:
                X = aug_X(X)
            X = rescale_array(X)

            yield X, Y


def chunker(seq, size=WINDOW_SIZE):
    # split a sequence into consecutive chunks of WINDOW_SIZE for evaluation
    return (seq[pos:pos + size] for pos in range(0, len(seq), size))

================================================
FILE: deepsleepnet_data/dhedfreader.py
================================================
# Source : https://github.com/akaraspt/deepsleepnet
'''
Reader for EDF+ files.
TODO:
 - add support for log-transformed channels:
   http://www.edfplus.info/specs/edffloat.html and test with
   data generated with
   http://www.edfplus.info/downloads/software/NeuroLoopGain.zip.
 - check annotations with Schalk's Physiobank data.

Copyright (c) 2012 Boris Reuderink.
'''

import re, datetime, operator, logging
import numpy as np
from collections import namedtuple

EVENT_CHANNEL = 'EDF Annotations'
log = logging.getLogger(__name__)


class EDFEndOfData: pass


def tal(tal_str):
  '''Return a list with (onset, duration, annotation) tuples for an EDF+ TAL
  stream.
  '''
  exp = '(?P<onset>[+\-]\d+(?:\.\d*)?)' + \
    '(?:\x15(?P<duration>\d+(?:\.\d*)?))?' + \
    '(\x14(?P<annotation>[^\x00]*))?' + \
    '(?:\x14\x00)'

  def annotation_to_list(annotation):
    return unicode(annotation, 'utf-8').split('\x14') if annotation else []

  def parse(dic):
    return (
      float(dic['onset']),
      float(dic['duration']) if dic['duration'] else 0.,
      annotation_to_list(dic['annotation']))

  return [parse(m.groupdict()) for m in re.finditer(exp, tal_str)]


def edf_header(f):
  h = {}
  assert f.tell() == 0  # check file position
  assert f.read(8) == '0       '

  # recording info)
  h['local_subject_id'] = f.read(80).strip()
  h['local_recording_id'] = f.read(80).strip()

  # parse timestamp
  (day, month, year) = [int(x) for x in re.findall('(\d+)', f.read(8))]
  (hour, minute, sec) = [int(x) for x in re.findall('(\d+)', f.read(8))]
  h['date_time'] = str(datetime.datetime(year + 2000, month, day, hour, minute, sec))

  # misc
  header_nbytes = int(f.read(8))
  subtype = f.read(44)[:5]
  h['EDF+'] = subtype in ['EDF+C', 'EDF+D']
  h['contiguous'] = subtype != 'EDF+D'
  h['n_records'] = int(f.read(8))
  h['record_length'] = float(f.read(8))  # in seconds
  nchannels = h['n_channels'] = int(f.read(4))

  # read channel info
  channels = range(h['n_channels'])
  h['label'] = [f.read(16).strip() for n in channels]
  h['transducer_type'] = [f.read(80).strip() for n in channels]
  h['units'] = [f.read(8).strip() for n in channels]
  h['physical_min'] = np.asarray([float(f.read(8)) for n in channels])
  h['physical_max'] = np.asarray([float(f.read(8)) for n in channels])
  h['digital_min'] = np.asarray([float(f.read(8)) for n in channels])
  h['digital_max'] = np.asarray([float(f.read(8)) for n in channels])
  h['prefiltering'] = [f.read(80).strip() for n in channels]
  h['n_samples_per_record'] = [int(f.read(8)) for n in channels]
  f.read(32 * nchannels)  # reserved

  assert f.tell() == header_nbytes
  return h


class BaseEDFReader:
  def __init__(self, file):
    self.file = file

  def read_header(self):
    self.header = h = edf_header(self.file)

    # calculate ranges for rescaling
    self.dig_min = h['digital_min']
    self.phys_min = h['physical_min']
    phys_range = h['physical_max'] - h['physical_min']
    dig_range = h['digital_max'] - h['digital_min']
    assert np.all(phys_range > 0)
    assert np.all(dig_range > 0)
    self.gain = phys_range / dig_range
  def read_raw_record(self):
    '''Read a record with data and return a list containing arrays with raw
    bytes.
    '''
    result = []
    for nsamp in self.header['n_samples_per_record']:
      samples = self.file.read(nsamp * 2)
      if len(samples) != nsamp * 2:
        raise EDFEndOfData
      result.append(samples)
    return result

  def convert_record(self, raw_record):
    '''Convert a raw record to a (time, signals, events) tuple based on
    information in the header.
    '''
    h = self.header
    dig_min, phys_min, gain = self.dig_min, self.phys_min, self.gain
    time = float('nan')
    signals = []
    events = []
    for (i, samples) in enumerate(raw_record):
      if h['label'][i] == EVENT_CHANNEL:
        ann = tal(samples)
        time = ann[0][0]
        events.extend(ann[1:])
        # print(i, samples)
        # exit()
      else:
        # 2-byte little-endian integers
        dig = np.fromstring(samples, '<i2')

================================================
FILE: deepsleepnet_data/prepare_physionet.py
================================================
        if len(remove_idx) > 0:
            remove_idx = np.hstack(remove_idx)
            select_idx = np.setdiff1d(np.arange(len(raw_ch_df)), remove_idx)
        else:
            select_idx = np.arange(len(raw_ch_df))
        print "after remove unwanted: {}".format(select_idx.shape)

        # Select only the data with labels
        print "before intersect label: {}".format(select_idx.shape)
        label_idx = np.hstack(label_idx)
        select_idx = np.intersect1d(select_idx, label_idx)
        print "after intersect label: {}".format(select_idx.shape)

        # Remove extra index
        if len(label_idx) > len(select_idx):
            print "before remove extra labels: {}, {}".format(select_idx.shape, labels.shape)
            extra_idx = np.setdiff1d(label_idx, select_idx)
            # Trim the tail
            if np.all(extra_idx > select_idx[-1]):
                n_trims = len(select_idx) % int(EPOCH_SEC_SIZE * sampling_rate)
                n_label_trims = int(math.ceil(n_trims / (EPOCH_SEC_SIZE * sampling_rate)))
                select_idx = select_idx[:-n_trims]
                labels = labels[:-n_label_trims]
            print "after remove extra labels: {}, {}".format(select_idx.shape, labels.shape)

        # Remove movement and unknown stages if any
        raw_ch = raw_ch_df.values[select_idx]

        # Verify that we can split into 30-s epochs
        if len(raw_ch) % (EPOCH_SEC_SIZE * sampling_rate) != 0:
            raise Exception("Something wrong")
        n_epochs = len(raw_ch) / (EPOCH_SEC_SIZE * sampling_rate)

        # Get epochs and their corresponding labels
        x = np.asarray(np.split(raw_ch, n_epochs)).astype(np.float32)
        y = labels.astype(np.int32)

        assert len(x) == len(y)

        # Select on sleep periods
        w_edge_mins = 30
        nw_idx = np.where(y != stage_dict["W"])[0]
        start_idx = nw_idx[0] - (w_edge_mins * 2)
        end_idx = nw_idx[-1] + (w_edge_mins * 2)
        if start_idx < 0: start_idx = 0
        if end_idx >= len(y): end_idx = len(y) - 1
        select_idx = np.arange(start_idx, end_idx+1)
        print("Data before selection: {}, {}".format(x.shape, y.shape))
        x = x[select_idx]
        y = y[select_idx]
        print("Data after selection: {}, {}".format(x.shape, y.shape))

        # Save
        filename = ntpath.basename(psg_fnames[i]).replace("-PSG.edf", ".npz")
        save_dict = {
            "x": x,
            "y": y,
            "fs": sampling_rate,
            "ch_label": select_ch,
            "header_raw": h_raw,
            "header_annotation": h_ann,
        }
        np.savez(os.path.join(args.output_dir, filename), **save_dict)

        print "\n=======================================\n"


if __name__ == "__main__":
    main()

================================================
FILE: deepsleepnet_data/readme.md
================================================
The files in this folder were copied with minor modifications from

#Source : https://github.com/akaraspt/deepsleepnet

#To get the dataset :

cd data
chmod +x download_physionet.sh
./download_physionet.sh

### Those scripts taken from deepsleepnet only work with python2

python2 prepare_physionet.py --data_dir data --output_dir data/eeg_fpz_cz --select_ch 'EEG Fpz-Cz'

This subfolder is under the following
license : Copyright 2017 Akara Supratak and Hao Dong. All rights reserved. Apache License Version 2.0, January 2004 http://www.apache.org/licenses/ TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION 1. Definitions. "License" shall mean the terms and conditions for use, reproduction, and distribution as defined by Sections 1 through 9 of this document. "Licensor" shall mean the copyright owner or entity authorized by the copyright owner that is granting the License. "Legal Entity" shall mean the union of the acting entity and all other entities that control, are controlled by, or are under common control with that entity. For the purposes of this definition, "control" means (i) the power, direct or indirect, to cause the direction or management of such entity, whether by contract or otherwise, or (ii) ownership of fifty percent (50%) or more of the outstanding shares, or (iii) beneficial ownership of such entity. "You" (or "Your") shall mean an individual or Legal Entity exercising permissions granted by this License. "Source" form shall mean the preferred form for making modifications, including but not limited to software source code, documentation source, and configuration files. "Object" form shall mean any form resulting from mechanical transformation or translation of a Source form, including but not limited to compiled object code, generated documentation, and conversions to other media types. "Work" shall mean the work of authorship, whether in Source or Object form, made available under the License, as indicated by a copyright notice that is included in or attached to the work (an example is provided in the Appendix below). "Derivative Works" shall mean any work, whether in Source or Object form, that is based on (or derived from) the Work and for which the editorial revisions, annotations, elaborations, or other modifications represent, as a whole, an original work of authorship. For the purposes of this License, Derivative Works shall not include works that remain separable from, or merely link (or bind by name) to the interfaces of, the Work and Derivative Works thereof. "Contribution" shall mean any work of authorship, including the original version of the Work and any modifications or additions to that Work or Derivative Works thereof, that is intentionally submitted to Licensor for inclusion in the Work by the copyright owner or by an individual or Legal Entity authorized to submit on behalf of the copyright owner. For the purposes of this definition, "submitted" means any form of electronic, verbal, or written communication sent to the Licensor or its representatives, including but not limited to communication on electronic mailing lists, source code control systems, and issue tracking systems that are managed by, or on behalf of, the Licensor for the purpose of discussing and improving the Work, but excluding communication that is conspicuously marked or otherwise designated in writing by the copyright owner as "Not a Contribution." "Contributor" shall mean Licensor and any individual or Legal Entity on behalf of whom a Contribution has been received by Licensor and subsequently incorporated within the Work. 2. Grant of Copyright License. 
Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable copyright license to reproduce, prepare Derivative Works of, publicly display, publicly perform, sublicense, and distribute the Work and such Derivative Works in Source or Object form. 3. Grant of Patent License. Subject to the terms and conditions of this License, each Contributor hereby grants to You a perpetual, worldwide, non-exclusive, no-charge, royalty-free, irrevocable (except as stated in this section) patent license to make, have made, use, offer to sell, sell, import, and otherwise transfer the Work, where such license applies only to those patent claims licensable by such Contributor that are necessarily infringed by their Contribution(s) alone or by combination of their Contribution(s) with the Work to which such Contribution(s) was submitted. If You institute patent litigation against any entity (including a cross-claim or counterclaim in a lawsuit) alleging that the Work or a Contribution incorporated within the Work constitutes direct or contributory patent infringement, then any patent licenses granted to You under this License for that Work shall terminate as of the date such litigation is filed. 4. Redistribution. You may reproduce and distribute copies of the Work or Derivative Works thereof in any medium, with or without modifications, and in Source or Object form, provided that You meet the following conditions: (a) You must give any other recipients of the Work or Derivative Works a copy of this License; and (b) You must cause any modified files to carry prominent notices stating that You changed the files; and (c) You must retain, in the Source form of any Derivative Works that You distribute, all copyright, patent, trademark, and attribution notices from the Source form of the Work, excluding those notices that do not pertain to any part of the Derivative Works; and (d) If the Work includes a "NOTICE" text file as part of its distribution, then any Derivative Works that You distribute must include a readable copy of the attribution notices contained within such NOTICE file, excluding those notices that do not pertain to any part of the Derivative Works, in at least one of the following places: within a NOTICE text file distributed as part of the Derivative Works; within the Source form or documentation, if provided along with the Derivative Works; or, within a display generated by the Derivative Works, if and wherever such third-party notices normally appear. The contents of the NOTICE file are for informational purposes only and do not modify the License. You may add Your own attribution notices within Derivative Works that You distribute, alongside or as an addendum to the NOTICE text from the Work, provided that such additional attribution notices cannot be construed as modifying the License. You may add Your own copyright statement to Your modifications and may provide additional or different license terms and conditions for use, reproduction, or distribution of Your modifications, or for any such Derivative Works as a whole, provided Your use, reproduction, and distribution of the Work otherwise complies with the conditions stated in this License. 5. Submission of Contributions. 
Unless You explicitly state otherwise, any Contribution intentionally submitted for inclusion in the Work by You to the Licensor shall be under the terms and conditions of this License, without any additional terms or conditions. Notwithstanding the above, nothing herein shall supersede or modify the terms of any separate license agreement you may have executed with Licensor regarding such Contributions. 6. Trademarks. This License does not grant permission to use the trade names, trademarks, service marks, or product names of the Licensor, except as required for reasonable and customary use in describing the origin of the Work and reproducing the content of the NOTICE file. 7. Disclaimer of Warranty. Unless required by applicable law or agreed to in writing, Licensor provides the Work (and each Contributor provides its Contributions) on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied, including, without limitation, any warranties or conditions of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A PARTICULAR PURPOSE. You are solely responsible for determining the appropriateness of using or redistributing the Work and assume any risks associated with Your exercise of permissions under this License. 8. Limitation of Liability. In no event and under no legal theory, whether in tort (including negligence), contract, or otherwise, unless required by applicable law (such as deliberate and grossly negligent acts) or agreed to in writing, shall any Contributor be liable to You for damages, including any direct, indirect, special, incidental, or consequential damages of any character arising as a result of this License or out of the use or inability to use the Work (including but not limited to damages for loss of goodwill, work stoppage, computer failure or malfunction, or any and all other commercial damages or losses), even if such Contributor has been advised of the possibility of such damages. 9. Accepting Warranty or Additional Liability. While redistributing the Work or Derivative Works thereof, You may choose to offer, and charge a fee for, acceptance of support, warranty, indemnity, or other liability obligations and/or rights consistent with this License. However, in accepting such obligations, You may act only on Your own behalf and on Your sole responsibility, not on behalf of any other Contributor, and only if You agree to indemnify, defend, and hold each Contributor harmless for any liability incurred by, or claims asserted against, such Contributor by reason of your accepting any such warranty or additional liability. END OF TERMS AND CONDITIONS APPENDIX: How to apply the Apache License to your work. To apply the Apache License to your work, attach the following boilerplate notice, with the fields enclosed by brackets "{}" replaced with your own identifying information. (Don't include the brackets!) The text should be enclosed in the appropriate comment syntax for the file format. We also recommend that a file or class name and description of purpose be included on the same "printed page" as the copyright notice for easier identification within third-party archives. Copyright 2017 Akara Supratak and Hao Dong Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. 
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

================================================
FILE: requirements.txt
================================================
librosa==0.5.1
numpy==1.15.2
Keras==2.2.2
tqdm==4.23.2
keras_contrib==2.0.8
h5py==2.8.0
matplotlib==2.1.0
scikit_learn==0.20.0