Repository: vietnlp/etnlp
Branch: master
Commit: 88862f63d4a8
Files: 59
Total size: 13.1 MB
Directory structure:
gitextract_cg5yuxp9/
├── .gitignore
├── README.md
└── src/
    ├── codes/
    │   ├── 00.run_etnlp_preprocessing.sh
    │   ├── 01.run_etnlp_evaluator.sh
    │   ├── 02.run_etnlp_extractor.sh
    │   ├── 03.run_etnlp_visualizer_inter.sh
    │   ├── 04.run_etnlp_visualizer_sbs.sh
    │   ├── api/
    │   │   ├── __init__.py
    │   │   ├── embedding_evaluator.py
    │   │   ├── embedding_extractor.py
    │   │   ├── embedding_preprocessing.py
    │   │   └── embedding_visualizer.py
    │   ├── embeddings/
    │   │   ├── __init__.py
    │   │   ├── embedding_configs.py
    │   │   ├── embedding_models.py
    │   │   └── embedding_utils.py
    │   ├── etnlp_api.py
    │   ├── requirements.txt
    │   ├── setup.py
    │   ├── utils/
    │   │   ├── __init__.py
    │   │   ├── emb_utils.py
    │   │   ├── embedding_io.py
    │   │   ├── eval_utils.py
    │   │   ├── file_utils.py
    │   │   ├── string_utils.py
    │   │   ├── vectors.py
    │   │   └── word.py
    │   └── visualizer/
    │       ├── README.md
    │       ├── __init__.py
    │       ├── outof_w2vec.dict
    │       ├── static/
    │       │   └── style.css
    │       ├── templates/
    │       │   ├── app.html
    │       │   └── search.html
    │       └── visualizer_sbs.py
    ├── data/
    │   ├── embedding_analogies/
    │   │   ├── english/
    │   │   │   └── english-word-analogy.txt
    │   │   ├── portuguese/
    │   │   │   ├── LX-4WAnalogies-ETNLP.txt
    │   │   │   ├── LX-4WAnalogies.txt
    │   │   │   ├── POST_TAG_vocabulary.txt
    │   │   │   ├── evaluator_results.txt
    │   │   │   └── vocab.txt
    │   │   └── vi/
    │   │       ├── Multi_evaluator_results.txt
    │   │       ├── analogy_list_vi_ner.txt
    │   │       └── elmo_results_out_dict.txt
    │   ├── embedding_dicts/
    │   │   ├── C2V.vec
    │   │   ├── ELMO_23.vec
    │   │   ├── FastText_23.vec
    │   │   ├── MULTI_23.vec
    │   │   ├── W2V_C2V_23.vec
    │   │   ├── baomoi_c2v_dims_300.vec
    │   │   └── vn_elmo_medium_c2v.vec
    │   ├── glove2vec_dicts/
    │   │   ├── glove1.vec
    │   │   ├── glove1_w2v.vec
    │   │   ├── glove2.vec
    │   │   └── glove2_w2v.vec
    │   └── vocab.txt
    └── examples/
        ├── test1_etnlp_preprocessing.py
        ├── test2_etnlp_extractor.py
        ├── test3_etnlp_evaluator.py
        └── test4_etnlp_visualizer.py
================================================
FILE CONTENTS
================================================
================================================
FILE: .gitignore
================================================
__pycache__/
*.py[cod]
*$py.class
# C extensions
*.so
# Distribution / packaging
.Python
build/
dist/
develop-eggs/
downloads/
eggs/
.eggs/
lib/
lib64/
parts/
sdist/
var/
wheels/
pip-wheel-metadata/
share/python-wheels/
*.egg-info/
.installed.cfg
*.egg
MANIFEST
# PyInstaller
# Usually these files are written by a python script from a template
# before PyInstaller builds the exe, so as to inject date/other infos into it.
*.manifest
*.spec
# Installer logs
pip-log.txt
pip-delete-this-directory.txt
# Unit test / coverage reports
htmlcov/
.tox/
.nox/
.coverage
.coverage.*
.cache
nosetests.xml
coverage.xml
*.cover
.hypothesis/
.pytest_cache/
# Translations
*.mo
*.pot
# Django stuff:
*.log
local_settings.py
db.sqlite3
# Flask stuff:
instance/
.webassets-cache
# Scrapy stuff:
.scrapy
# Sphinx documentation
docs/_build/
# PyBuilder
target/
# Jupyter Notebook
.ipynb_checkpoints
# IPython
profile_default/
ipython_config.py
# pyenv
.python-version
# pipenv
# According to pypa/pipenv#598, it is recommended to include Pipfile.lock in version control.
# However, in case of collaboration, if having platform-specific dependencies or dependencies
# having no cross-platform support, pipenv may install dependencies that don’t work, or not
# install all needed dependencies.
#Pipfile.lock
# celery beat schedule file
celerybeat-schedule
# SageMath parsed files
*.sage.py
# Environments
.env
.venv
env/
venv/
ENV/
env.bak/
venv.bak/
# Spyder project settings
.spyderproject
.spyproject
# Rope project settings
.ropeproject
# mkdocs documentation
/site
# mypy
.mypy_cache/
.dmypy.json
dmypy.json
# Pyre type checker
.pyre/
================================================
FILE: README.md
================================================
ETNLP: A Toolkit for Extraction, Evaluation and Visualization of Pre-trained Word Embeddings
=====
# Table of contents
1. [Introduction](#introduction)
2. [More about ETNLP](#moreaboutETNLP)
3. [Installation and How to Use](#installation_and_howtouse)
4. [Download Resources](#Download_Resources)
# I. Overview <a name="introduction"></a>
## A glimpse of ETNLP:
- Github: https://github.com/vietnlp/etnlp
- Video: https://vimeo.com/317599106
- Paper: https://arxiv.org/abs/1903.04433
# II. How do I cite ETNLP?
Please cite the arXiv paper whenever ETNLP (or the pre-trained embeddings) is used to produce published results or is incorporated into other software:
```
@inproceedings{vu:2019n,
title={ETNLP: A Visual-Aided Systematic Approach to Select Pre-Trained Embeddings for a Downstream Task},
author={Vu, Xuan-Son and Vu, Thanh and Tran, Son N and Jiang, Lili},
booktitle={Proceedings of the International Conference Recent Advances in Natural Language Processing (RANLP)},
year={2019}
}
```
# III. More about ETNLP <a name="moreaboutETNLP"></a>:
## 1. Embedding Evaluator:
Compares the quality of embedding models on the word analogy task.
- Input: a pre-trained embedding vector file (word2vec format) and a word analogy file.
- Output: (1) the quality of each embedding model, measured by the MAP/P@10 score, and (2) paired t-tests showing the significance level of the differences between word embeddings.
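
As a rough illustration of how a single analogy question can be scored, the sketch below ranks candidates for `a : b :: c : ?` by cosine similarity and reports a hit rate at `k` together with a mean reciprocal rank. The toy vectors, the helper names (`unit`, `rank_candidates`, `evaluate`), and the metric definitions are assumptions made for illustration; the exact MAP/P@10 computation used by ETNLP may differ.

```python
import math

def unit(vec):
    """Scale a vector to unit length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def rank_candidates(emb, a, b, c):
    """Rank words for 'a : b :: c : ?' by cosine similarity to unit(b) + unit(c) - unit(a)."""
    ua, ub, uc = unit(emb[a]), unit(emb[b]), unit(emb[c])
    query = unit([y + z - x for x, y, z in zip(ua, ub, uc)])
    scored = []
    for word, vec in emb.items():
        if word in (a, b, c):  # input words are never returned as answers
            continue
        cosine = sum(q * v for q, v in zip(query, unit(vec)))
        scored.append((cosine, word))
    return [word for _, word in sorted(scored, reverse=True)]

def evaluate(emb, questions, k=10):
    """Return (hit rate at k, mean reciprocal rank) over (a, b, c, expected) tuples."""
    hits, reciprocal_ranks = 0, []
    for a, b, c, expected in questions:
        ranked = rank_candidates(emb, a, b, c)
        rank = ranked.index(expected) + 1 if expected in ranked else None
        if rank is not None and rank <= k:
            hits += 1
        reciprocal_ranks.append(1.0 / rank if rank else 0.0)
    return hits / len(questions), sum(reciprocal_ranks) / len(questions)

# Hypothetical 3-dimensional toy embeddings (axes: royalty, female, bias).
toy_emb = {
    "king": [1.0, 0.0, 1.0], "queen": [1.0, 1.0, 1.0],
    "man": [0.0, 0.0, 1.0], "woman": [0.0, 1.0, 1.0],
    "prince": [1.0, 0.0, 0.5], "apple": [0.2, 0.1, -1.0],
}
toy_questions = [("man", "woman", "king", "queen"), ("king", "queen", "man", "woman")]
hit_rate, mrr = evaluate(toy_emb, toy_questions, k=1)
print(hit_rate, mrr)
```

With real embeddings, the per-question scores of two models could then be compared with a paired t-test, as described in the output above.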
### 1.1. Note: The word analogy list is created by:
- Adopting the English list by selecting suitable categories and translating them into the target language (i.e., Vietnamese).
- Removing categories that are inappropriate in the target language (i.e., categories 6, 10, 11, 14 for Vietnamese).
- Adding custom categories that suit the target language (e.g., cities and their zones in Vietnam for Vietnamese).
Since most of this process is automated, it can be applied to other languages as well.
### 1.2. Selected categories for Vietnamese:
> 1. capital-common-countries
> 2. capital-world
> 3. currency: E.g., Algeria | dinar | Angola | kwanza
> 4. city-in-zone (Vietnam's cities and their zones)
> 5. family (boy|girl | brother | sister)
> 6. gram1-adjective-to-adverb (NOT USED)
> 7. gram2-opposite (e.g., acceptable | unacceptable | aware | unaware)
> 8. gram3-comparative (e.g., bad | worse | big | bigger)
> 9. gram4-superlative (e.g., bad | worst | big | biggest)
> 10. gram5-present-participle (NOT USED)
> 11. gram6-nationality-adjective-nguoi-tieng (e.g., Albania | Albanian | Argentina | Argentinean)
> 12. gram7-past-tense (NOT USED)
> 13. gram8-plural-cac-nhung (e.g., banana | bananas | bird | birds) (NOT USED)
> 14. gram9-plural-verbs (NOT USED)
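
The category selection above can be sketched as a filter over an analogy file. This is a minimal sketch that assumes the file layout read by the evaluator code in this repository (section headers of the form `: NAME`, questions as four items separated by ` | `); the function name `filter_analogy_sections` and the sample lines are hypothetical.

```python
def filter_analogy_sections(lines, unused_sections):
    """Drop whole sections (header plus questions) whose name is in unused_sections."""
    kept, keep_current = [], True
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith(": "):
            # A ': NAME' line opens a new section.
            keep_current = line[2:].strip() not in unused_sections
        if keep_current and line:
            kept.append(line)
    return kept

# Illustrative sample, not taken from the shipped analogy files.
sample = [
    ": family",
    "boy | girl | brother | sister",
    ": gram7-past-tense",
    "dancing | danced | decreasing | decreased",
    ": gram2-opposite",
    "acceptable | unacceptable | aware | unaware",
]
filtered = filter_analogy_sections(sample, {"gram7-past-tense"})
for line in filtered:
    print(line)
```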
### 1.3 Evaluation results (in detail)
* Analogy: Word Analogy Task
* NER (w): NER task with hyper-parameters selected from the best F1 on validation set.
* NER (w.o): NER task without selecting hyper-parameters from the validation set.
| Model | NER.w | NER.w.o | Analogy |
|------------------------------ |------------- | ------------------ |------------------ |
| BiLC3 + w2v | 89.01 | 89.41 | 0.4796 |
| BiLC3 + Bert_Base | 88.26 | 89.91 | 0.4609 |
| BiLC3 + w2v_c2v | 89.46 | 89.46 | 0.4796 |
| BiLC3 + fastText | 89.65 | 89.84 | 0.4970 |
| BiLC3 + Elmo | 89.67 | 90.84 | **0.4999** |
| BiLC3 + MULTI_WC_F_E_B | **91.09** | **91.75** | 0.4906|
## 2. Embedding Extractor: To extract embedding vectors for other tasks.
- Input: (1) list of input embeddings, (2) a vocabulary file.
- Output: embedding vectors of the given vocab file in `.txt`, i.e., each line contains the embedding for a word. The file is then compressed in `.gz` format. This format is widely used in existing NLP toolkits (e.g., Reimers et al. [1]).
### Extra options:
- `-input_c2v`: character embedding file
- `solveoov:1`: to solve OOV words of the 1st embedding. Similarly for more than one embedding: e.g., `solveoov:1:2`.
[1] Nils Reimers and Iryna Gurevych, Reporting Score Distributions Makes a Difference: Performance Study of LSTM-networks for Sequence Tagging, 2017, http://arxiv.org/abs/1707.09861, arXiv.
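
The output layout described above can be sketched with the standard library alone. This is an illustrative sketch, not the extractor's actual code: it assumes the word2vec text layout (a `<count> <dim>` header line followed by one `word v1 v2 ...` line per word), and the helper names and toy vectors are hypothetical. Words are assumed to contain no spaces.

```python
import gzip
import os
import tempfile

def write_w2v_text(path, vectors):
    """Write {word: [floats]} as a '<count> <dim>' header plus one line per word."""
    dim = len(next(iter(vectors.values())))
    with open(path, "w", encoding="utf-8") as f:
        f.write("%d %d\n" % (len(vectors), dim))
        for word, vec in vectors.items():
            f.write("%s %s\n" % (word, " ".join(str(x) for x in vec)))

def gzip_file(path):
    """Compress path to path + '.gz' and return the new file name."""
    gz_path = path + ".gz"
    with open(path, "rb") as src, gzip.open(gz_path, "wb") as dst:
        dst.write(src.read())
    return gz_path

def read_w2v_text_gz(gz_path):
    """Read the compressed file back into {word: [floats]}."""
    with gzip.open(gz_path, "rt", encoding="utf-8") as f:
        count, dim = (int(x) for x in f.readline().split())
        vectors = {}
        for line in f:
            parts = line.rstrip("\n").split(" ")
            vectors[parts[0]] = [float(x) for x in parts[1:]]
    if len(vectors) != count or any(len(v) != dim for v in vectors.values()):
        raise ValueError("header does not match file contents")
    return vectors

# Hypothetical toy vectors for two tokenized Vietnamese words.
toy = {"hà_nội": [0.1, 0.2, 0.3], "việt_nam": [0.4, 0.5, 0.6]}
with tempfile.TemporaryDirectory() as tmp:
    out = os.path.join(tmp, "toy.vec")
    write_w2v_text(out, toy)
    restored = read_w2v_text_gz(gzip_file(out))
print(restored == toy)
```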
## 3. Visualizer: to explore the embedding space and compare between different embeddings.
### Screenshot of viewing multiple-embeddings side-by-side (Vietnamese):

### Screenshot of viewing each embedding interactively (Vietnamese):

### Screenshot of viewing each embedding side-by-side (English):

# IV. Installation and How to use ETNLP <a name="installation_and_howtouse"></a>
## 1. Installation:
From source codes (Python 3.6.x):
> 1. cd src/codes/
> 2. pip install -r requirements.txt
> 3. python setup.py install
From pip (Python 3.6.x):
> 1. sudo apt-get install python3-dev
> 2. pip install cython
> 3. pip install git+https://github.com/vietnlp/etnlp.git
OR:
> 1. pip install etnlp
## 2. Examples
> 1. cd src/examples
> 2. python test1_etnlp_preprocessing.py
> 3. python test2_etnlp_extractor.py
> 4. python test3_etnlp_evaluator.py
> 5. python test4_etnlp_visualizer.py
### Example of using Fasttext-Sent2Vec:
- 01. Install: https://github.com/epfml/sent2vec
```
01. git clone https://github.com/epfml/sent2vec
02. cd sent2vec; pip install .
```
- 02. Extract embeddings for sentences (no requirement for tokenization before extracting embedding of sentences).
```
import sent2vec
model = sent2vec.Sent2vecModel()
model.load_model('opendata_wiki_lowercase_words.bin')
emb = model.embed_sentence("tôi là sinh viên đh công nghệ, đại học quôc gia hà nội")
embs = model.embed_sentences(["tôi là sinh viên", "tôi là nhà thơ", "tôi là bác sĩ"])
```
## 3. Visualization
Side-by-side visualization:
> 1. sh src/codes/04.run_etnlp_visualizer_sbs.sh
Interactive visualization:
> 1. sh src/codes/03.run_etnlp_visualizer_inter.sh
# V. Available Lexical Resources <a name="Download_Resources"></a>
## 1. Word Analogy List for Vietnamese
| Word Analogy List | Download Link (NER Task)| Download Link (General)|
|------------------------------|---------------|---------------|
| Vietnamese (This work) | [Link1](https://drive.google.com/file/d/1eA5yvla4BhAIfWsmZherT1GEW6gzDC-1/view?usp=sharing)| [Link1](https://drive.google.com/file/d/1YJ9d5rVKMMKF1xWWZi26_sNpgULTvxwg/view?usp=sharing)|
| English (Mikolov et al. [2]) | [Link2]| [Link2](https://drive.google.com/file/d/10rWxGu8-nbQmYC8wrIussSZjY0lDh6RP/view?usp=sharing)|
| Portuguese (Hartmann et al. [3]) | [Link3]| [Link3](https://github.com/nathanshartmann/portuguese_word_embeddings/blob/master/analogies/testset/LX-4WAnalogies.txt)|
## 2. Multiple pre-trained embedding models for Vietnamese
- Training data: Wiki in Vietnamese:
| # of sentences | # of tokenized words|
|------------------------------|---------------|
| 6,685,621 | 114,997,587 |
- Download Pre-trained Embeddings: <br>
(Note: The MULTI_WC_F_E_B is the concatenation of four embeddings: W2V_C2V, fastText, ELMO, and Bert_Base.)
| Embedding Model | Download Link (NER Task) | Download Link (AIVIVN SentiTask) | Download Link (General) |
|------------------------------|---------------|---------------|---------------|
| w2v | [Link1](https://drive.google.com/file/d/1LHaZ8LXxteHzod42naqJZYCwwq5mI9aL/view?usp=sharing) (dim=300)| [Link1] | [Link1] |
| w2v_c2v | [Link2](https://drive.google.com/file/d/1-M9Tb9l8mNmP3RKxZiZNK1Vpbng2yw4l/view?usp=sharing) (dim=300)| [Link2] | [Link2] |
| fastText | [Link3](https://drive.google.com/file/d/1dHCPhKFjtDjbrUeeymheDnlhjtaljPGE/view?usp=sharing) (dim=300)| [Link3] | [Link3] |
| fastText-[Sent2Vec](https://github.com/epfml/sent2vec) | [Link3]| [Link3] | [Link3](https://drive.google.com/file/d/1BzL1mpdfqCCJioCdAlTVshbrz0lGfP2D/view?usp=sharing) (dim=300, 6GB, trained on 20GB of [news data](https://github.com/binhvq/news-corpus) and Wiki-data of ETNLP.) |
| Elmo | [Link4](https://drive.google.com/file/d/1zDaSD8NsZNXGyd9iVOxTcb7CP61Ixo-r/view?usp=sharing) (dim=1024)| [Link4](https://drive.google.com/file/d/1jVJtF0f6SbtUd-t3bnywP6mFnz0QXPIx/view?usp=sharing) (dim=1024)| [Link4](https://drive.google.com/file/d/1XPsTzg1Gex-Hh2nl9344YlZc1orOVBDp/view?usp=sharing) (dim=1024, 731MB and 1.9GB after extraction.)|
| Bert_base | [Link5](https://drive.google.com/file/d/16fRkmIHiB16OlM8WdFmoApGtLMf6YJJ8/view?usp=sharing) (dim=768)| [Link5] | [Link5] |
| MULTI_WC_F_E_B | [Link6](https://drive.google.com/file/d/1gq7b8hs31VzoeO3n3C__ftlDnE_iBZW2/view?usp=sharing) (dim=2392)| [Link6] | [Link6] |
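
The dimension of MULTI_WC_F_E_B follows from concatenating the four component vectors (300 + 300 + 1024 + 768 = 2392). The sketch below shows that arithmetic on toy data; the function name `concat_embeddings`, the dictionary-based models, and the zero-filling of words missing from a model are assumptions for illustration and are not necessarily how ETNLP handles OOV words.

```python
def concat_embeddings(word, models, dims):
    """Concatenate the vectors of `word` from several models, zero-filling where a model lacks it."""
    combined = []
    for model, dim in zip(models, dims):
        combined.extend(model.get(word, [0.0] * dim))
    return combined

# Dimensions of W2V_C2V, fastText, ELMO, and Bert_Base as listed in the table.
dims = [300, 300, 1024, 768]
# Hypothetical models: the first three contain the word, the last one does not.
models = [{"sinh_viên": [0.1] * d} for d in dims[:3]] + [{}]
vec = concat_embeddings("sinh_viên", models, dims)
print(len(vec))  # 2392
```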
# VI. Versioning
For transparency and insight into our release cycle, and for striving to maintain backward compatibility, ETNLP will be maintained under the Semantic Versioning guidelines as much as possible.
Releases will be numbered with the following format:
`<major>.<minor>.<patch>`
And constructed with the following guidelines:
* Breaking backward compatibility bumps the major (and resets the minor and patch)
* New additions without breaking backward compatibility bumps the minor (and resets the patch)
* Bug fixes and misc changes bumps the patch
For more information on SemVer, please visit http://semver.org/.
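
The bump rules above can be written as a small helper. This is only a sketch of the rules as stated; the function name `bump` is hypothetical and is not part of ETNLP.

```python
def bump(version, part):
    """Return the next '<major>.<minor>.<patch>' version for the given kind of change."""
    major, minor, patch = (int(x) for x in version.split("."))
    if part == "major":   # breaking change: reset minor and patch
        return "%d.0.0" % (major + 1)
    if part == "minor":   # backward-compatible addition: reset patch
        return "%d.%d.0" % (major, minor + 1)
    if part == "patch":   # bug fixes and misc changes
        return "%d.%d.%d" % (major, minor, patch + 1)
    raise ValueError("part must be 'major', 'minor', or 'patch'")

print(bump("1.4.2", "minor"))  # 1.5.0
```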
================================================
FILE: src/codes/00.run_etnlp_preprocessing.sh
================================================
#!/bin/sh
export PYTHONPATH="$PYTHONPATH:$PWD"
INPUT_FILES="../data/glove2vec_dicts/glove1.vec;../data/glove2vec_dicts/glove2.vec"
OUTPUT_FILES="../data/glove2vec_dicts/glove1_w2v.vec;../data/glove2vec_dicts/glove2_w2v.vec"
# do_normalize: use this flag to normalize in case of multiple embeddings.
python ./etnlp_api.py -input $INPUT_FILES -output $OUTPUT_FILES -args "glove2w2v"
================================================
FILE: src/codes/01.run_etnlp_evaluator.sh
================================================
#!/bin/sh
export PYTHONPATH="$PYTHONPATH:$PWD"
INPUT_FILES="../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/FastText_23.vec;../data/embedding_dicts/W2V_C2V_23.vec;../data/embedding_dicts/MULTI_23.vec"
ANALOGY_FILE="../data/embedding_analogies/vi/solveable_analogies_vi.txt"
OUT_FILE="../data/embedding_analogies/vi/Multi_evaluator_results.txt"
python ./etnlp_api.py -input $INPUT_FILES -output $OUT_FILE -analoglist $ANALOGY_FILE -args eval
================================================
FILE: src/codes/02.run_etnlp_extractor.sh
================================================
#!/bin/sh
export PYTHONPATH="$PYTHONPATH:$PWD"
INPUT_FILES="../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/FastText_23.vec;../data/embedding_dicts/W2V_C2V_23.vec;../data/embedding_dicts/MULTI_23.vec"
C2V="../data/embedding_dicts/C2V.vec"
OUTPUT="../data/embedding_dicts/MULTI_W_F_B_E.vec"
VOCAB_FILE="../data/vocab.txt"
python ./etnlp_api.py -input $INPUT_FILES -vocab $VOCAB_FILE -input_c2v $C2V -args "extract" -output $OUTPUT
================================================
FILE: src/codes/03.run_etnlp_visualizer_inter.sh
================================================
#!/bin/sh
export PYTHONPATH="$PYTHONPATH:$PWD"
INPUT_FILES="../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/FastText_23.vec;../data/embedding_dicts/W2V_C2V_23.vec;../data/embedding_dicts/MULTI_23.vec"
python3 ./etnlp_api.py -input $INPUT_FILES -args visualizer -port 8889
================================================
FILE: src/codes/04.run_etnlp_visualizer_sbs.sh
================================================
#!/bin/sh
export PYTHONPATH="$PYTHONPATH:$PWD"
INPUT_FILES="../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/FastText_23.vec;../data/embedding_dicts/W2V_C2V_23.vec;../data/embedding_dicts/MULTI_23.vec"
# python ./visualizer/visualizer_sbs.py -input $INPUT_FILES -args visualizer
python3 ./visualizer/visualizer_sbs.py $INPUT_FILES
================================================
FILE: src/codes/api/__init__.py
================================================
================================================
FILE: src/codes/api/embedding_evaluator.py
================================================
import logging
import gensim
import argparse
from gensim.models.keyedvectors import WordEmbeddingsKeyedVectors, Word2VecKeyedVectors
from gensim import utils, matutils
from six import string_types
from numpy import dot, float32 as REAL, array, ndarray, argmax
from utils import embedding_io, emb_utils
from embeddings.embedding_configs import EmbeddingConfigs
logger = logging.getLogger(__name__)
class new_Word2VecKeyedVectors(Word2VecKeyedVectors):
    def __init__(self, vector_size):
        super(Word2VecKeyedVectors, self).__init__(vector_size=vector_size)
    def most_similar(self, positive=None, negative=None, topn=10, restrict_vocab=None, indexer=None):
        """
        Find the top-N most similar words. Positive words contribute positively towards the
        similarity, negative words negatively.
        This method computes cosine similarity between a simple mean of the projection
        weight vectors of the given words and the vectors for each word in the model.
        The method corresponds to the `word-analogy` and `distance` scripts in the original
        word2vec implementation.
        If topn is False, most_similar returns the vector of similarity scores.
        `restrict_vocab` is an optional integer which limits the range of vectors which
        are searched for most-similar values. For example, restrict_vocab=10000 would
        only check the first 10000 word vectors in the vocabulary order. (This may be
        meaningful if you've sorted the vocabulary by descending frequency.)
        Example::
          >>> trained_model.most_similar(positive=['woman', 'king'], negative=['man'])
          [('queen', 0.50882536), ...]
        """
        if positive is None:
            positive = []
        if negative is None:
            negative = []
        self.init_sims()
        if isinstance(positive, string_types) and not negative:
            # allow calls like most_similar('dog'), as a shorthand for most_similar(['dog'])
            positive = [positive]
        # add weights for each word, if not already present; default to 1.0 for positive and -1.0 for negative words
        positive = [
            (word, 1.0) if isinstance(word, string_types + (ndarray,)) else word
            for word in positive
        ]
        negative = [
            (word, -1.0) if isinstance(word, string_types + (ndarray,)) else word
            for word in negative
        ]
        # compute the weighted average of all words
        all_words, mean = set(), []
        for word, weight in positive + negative:
            if isinstance(word, ndarray):
                mean.append(weight * word)
            else:
                mean.append(weight * self.word_vec(word, use_norm=True))
                if word in self.vocab:
                    all_words.add(self.vocab[word].index)
        if not mean:
            raise ValueError("cannot compute similarity with no input")
        mean = matutils.unitvec(array(mean).mean(axis=0)).astype(REAL)
        if indexer is not None:
            return indexer.most_similar(mean, topn)
        limited = self.syn0norm if restrict_vocab is None else self.syn0norm[:restrict_vocab]
        dists = dot(limited, mean)
        if not topn:
            return dists
        best = matutils.argsort(dists, topn=topn + len(all_words), reverse=True)
        # ignore (don't return) words from the input
        result = [(self.index2word[sim], float(dists[sim])) for sim in best if sim not in all_words]
        return result[:topn]
    def new_accuracy(self, questions, restrict_vocab=30000, most_similar=most_similar, case_insensitive=True):
        """
        Compute accuracy of the model. `questions` is a filename where lines are
        4-tuples of words, split into sections by ": SECTION NAME" lines.
        See questions-words.txt in
        https://storage.googleapis.com/google-code-archive-source/v2/code.google.com/word2vec/source-archive.zip
        for an example.
        The accuracy is reported (=printed to log and returned as a list) for each
        section separately, plus there's one aggregate summary at the end.
        Use `restrict_vocab` to ignore all questions containing a word not in the first `restrict_vocab`
        words (default 30,000). This may be meaningful if you've sorted the vocabulary by descending frequency.
        In case `case_insensitive` is True, the first `restrict_vocab` words are taken first, and then
        case normalization is performed.
        Use `case_insensitive` to convert all words in questions and vocab to their uppercase form before
        evaluating the accuracy (default True). Useful in case of case-mismatch between training tokens
        and question words. In case of multiple case variants of a single word, the vector for the first
        occurrence (also the most frequent if vocabulary is sorted) is taken.
        This method corresponds to the `compute-accuracy` script of the original C word2vec.
        """
        print("INFO: Using new accuracy")
        ok_vocab = [(w, self.vocab[w]) for w in self.index2word[:restrict_vocab]]
        ok_vocab = {w.upper(): v for w, v in reversed(ok_vocab)} if case_insensitive else dict(ok_vocab)
        oov_counter, idx_cnt, is_vn_counter = 0, 0, 0
        sections, section = [], None
        for line_no, line in enumerate(utils.smart_open(questions)):
            # TODO: use level3 BLAS (=evaluate multiple questions at once), for speed
            line = utils.to_unicode(line)
            if line.startswith(': '):
                # a new section starts => store the old section
                if section:
                    sections.append(section)
                    self.log_accuracy(section)
                section = {'section': line.lstrip(': ').strip(), 'correct': [], 'incorrect': []}
            else:
                # Count number of analogy to check
                idx_cnt += 1
                if not section:
                    raise ValueError("missing section header before line #%i in %s" % (line_no, questions))
                try:
                    if case_insensitive:
                        a, b, c, expected = [word.upper() for word in line.split(" | ")]
                    else:
                        a, b, c, expected = [word for word in line.split(" | ")]
                    # print("Line : ", line)
                    # print("a, b, c, expected: %s, %s, %s, %s"%(a, b, c, expected))
                    # input(">>> Wait ...")
                except ValueError:
                    # a, b, c, expected are not bound when unpacking fails, so only the raw line
                    # is reported; the line is skipped without blocking on user input.
                    logger.info("SVX: ERROR skipping invalid line #%i in %s", line_no, questions)
                    print("Line : ", line)
                    continue
                # In case of Vietnamese, word analogy can be a phrase
                if " " in a or " " in b or " " in c or " " in expected:
                    is_vn_counter += 1
                    pass
                else:
                    if a not in ok_vocab or b not in ok_vocab or c not in ok_vocab or expected not in ok_vocab:
                        logger.debug("SVX: skipping line #%i with OOV words: %s", line_no, line.strip())
                        oov_counter += 1
                        continue
                original_vocab = self.vocab
                self.vocab = ok_vocab
                ignore = {a, b, c}  # input words to be ignored
                predicted = None
                # find the most likely prediction, ignoring OOV words and input words
                sims = most_similar(self, positive=[b, c], negative=[a], topn=False, restrict_vocab=restrict_vocab)
                self.vocab = original_vocab
                for index in matutils.argsort(sims, reverse=True):
                    predicted = self.index2word[index].upper() if case_insensitive else self.index2word[index]
                    if predicted in ok_vocab and predicted not in ignore:
                        if predicted != expected:
                            logger.debug("%s: expected %s, predicted %s", line.strip(), expected, predicted)
                        break
                if predicted == expected:
                    section['correct'].append((a, b, c, expected))
                else:
                    section['incorrect'].append((a, b, c, expected))
        if section:
            # store the last section, too
            sections.append(section)
            self.log_accuracy(section)
        total = {
            'OOV/Total/VNCompound_Words': [oov_counter, (idx_cnt), is_vn_counter],
            'section': 'total',
            'correct': sum((s['correct'] for s in sections), []),
            'incorrect': sum((s['incorrect'] for s in sections), []),
        }
        self.log_accuracy(total)
        sections.append(total)
        return sections
def convert_conll_format_to_normal(connl_file, out_file):
    """
    read file conll format
    return format : One sentence per line
    sentences_arr: [EU rejects German call .., ...]
    tags_arr: [B-ORG O B-MIST O ..., ...]
    """
    sentences = []
    sentence = ""
    with open(connl_file) as f:
        for line in f:
            # print("line: ", line)
            if len(line) == 0 or line.startswith('-DOCSTART') or line[0] == "\n":
                sentences.append(sentence.rstrip())
                sentence = ""
                continue
            else:
                splits = line.split('\t')
                sentence += splits[1].rstrip() + " "
    # To handle the last sentence.
    if len(sentence) > 0:
        sentences.append(sentence)
    del sentence
    # Write to output
    if out_file is None:
        out_file = connl_file + ".std.txt"
    with open(out_file, "w") as writer:
        for sen in sentences:
            writer.write(sen + "\n")
    return sentences
def verify_word_analogies(file):
    """
    Verify the word analogy file.
    :param file:
    :return:
    """
    valid_cnt, invalid_cnt = 0, 0
    with open(file, "r") as f_reader:
        for line in f_reader:
            # print("line: ", line)
            if len(line) == 0 or line.startswith('-DOCSTART') or line[0] == "\n":
                continue
            else:
                splits = line.split('\t')
                if len(splits) != 4:
                    invalid_cnt += 1
                else:
                    valid_cnt += 1
    print("Valid analogy: %s, invalid analogy: %s" % (valid_cnt, invalid_cnt))
def check_oov_of_word_analogies(w2v_format_emb_file, analogy_file, is_vn=True, case_sensitive=True):
    emb_model = gensim.models.KeyedVectors.load_word2vec_format(w2v_format_emb_file,
                                                                binary=False,
                                                                unicode_errors='ignore')
    vocab_arr = []
    with open(analogy_file, "r") as f_reader:
        for line in f_reader:
            if not case_sensitive:
                line = line.lower()
            if line.startswith(': '):
                continue
            else:
                # Strip the trailing newline so the last word of each line can match the vocabulary.
                for word in line.strip().split(" | "):
                    # In Vietnamese, we have compound and single word.
                    # if is_vn:
                    #     if " " in word:
                    #         print("I should not going here")
                    #         single_words = word.split(" ")
                    #         for single_word in single_words:
                    #             vocab_arr.append(single_word)
                    # For other languages.
                    # else:
                    vocab_arr.append(word)
    print("Before unique set: len = ", len(vocab_arr))
    unique_vocab_arr = set(vocab_arr)
    print("After unique set: len = ", len(unique_vocab_arr))
    valid_word_cnt = 0
    for word in unique_vocab_arr:
        if word in emb_model:
            valid_word_cnt += 1
    print("With Is_VN = %s, case_sensitive = %s, Valid word = %s/%s" % (is_vn,
                                                                      case_sensitive,
                                                                      valid_word_cnt,
                                                                      len(unique_vocab_arr)))
def evaluator_api(input_files, analoglist, output, embed_config=None):
    """
    :param input_files:
    :param analoglist:
    :param output:
    :param embed_config:
    :return:
    """
    if embed_config is None:
        embed_config = EmbeddingConfigs()  # Initialize default config for embedding.
    local_embedding_names, local_word_embeddings = embedding_io.load_word_embeddings(input_files, embed_config)
    # emb_utils.print_analogy('man', 'him', 'woman', emb_words)
    local_output_str = emb_utils.eval_word_analogy_4_all_embeddings(analoglist,
                                                                    local_embedding_names,
                                                                    local_word_embeddings,
                                                                    output_file=output)
    print("OUTPUT: ", local_output_str)
if __name__ == "__main__":
    """
    Evaluates a given word embedding model.
    To use:
    evaluate.py path_to_model [-restrict]
    optional restrict argument performs an evaluation using the original
    Mikolov restriction of vocabulary
    """
    desc = "Evaluates a word embedding model"
    parser = argparse.ArgumentParser(description=desc)
    parser.add_argument("-input",
                        required=True,
                        default="../data/embedding_dicts/ELMO_23.vec",
                        help="Input multiple word embeddings, each model separated by a `;`.")
    parser.add_argument("-analoglist",
                        nargs="?",
                        # default="../data/embedding_analogies/vi/analogy_vn_seg.txt.std.txt",
                        default="../data/embedding_analogies/vi/solveable_analogies_vi.txt",
                        help="Input analogy file to run the word analogy evaluation.")
    parser.add_argument("-r",
                        nargs="?",
                        default=False,
                        help="Vocabulary restriction")
    parser.add_argument("-checkoov",
                        nargs="?",
                        default=False,
                        help="Check OOV percentage")
    parser.add_argument("-lang",
                        nargs="?",
                        default="VI",
                        help="Specify language, by default, it's Vietnamese.")
    parser.add_argument("-lowercase",
                        nargs="?",
                        default=True,
                        help="Lowercase all word analogies? (depends on how the emb was trained).")
    parser.add_argument("-output",
                        nargs="?",
                        default="../data/embedding_analogies/vi/results_out.txt",
                        help="Output file of word analogy task")
    parser.add_argument("-remove_redundancy",
                        nargs="?",
                        default=True,
                        help="Remove redundancy in predicted words")
    print("Params: ", parser)
    args = parser.parse_args()
    embedding_config = EmbeddingConfigs()
    paths_of_models = args.input
    testset = args.analoglist
    # Compare against the language code; the raw string is always truthy.
    is_vietnamese = str(args.lang).upper() == "VI"
    output_file = args.output
    # use restriction?
    restriction = None
    if args.r:
        restriction = 30000
    # set logging definitions
    logging.basicConfig(format='%(asctime)s : %(levelname)s : %(message)s',
                        level=logging.INFO)
    if args.checkoov:
        print("Checking OOV ...")
        check_oov_of_word_analogies(paths_of_models, testset, is_vn=is_vietnamese)
    if not args.checkoov:
        print("Evaluating embeddings on the word analogy task ...")
        if is_vietnamese:
            print(" ... for ETNLP's evaluation approach.")
            embedding_names, word_embeddings = embedding_io.load_word_embeddings(paths_of_models, embedding_config)
            # emb_utils.print_analogy('man', 'him', 'woman', emb_words)
            # The parser defines `-output`, so the value lives in `output_file` (args.output).
            output_str = emb_utils.eval_word_analogy_4_all_embeddings(testset, embedding_names, word_embeddings,
                                                                      output_file=output_file)
            print("#" * 20)
            print(output_str)
            print("#" * 20)
        else:
            print(" ... for Mikolov et al.'s evaluation approach.")
            word_analogy_obj = new_Word2VecKeyedVectors(1024)
            # load and evaluate
            model = word_analogy_obj.load_word2vec_format(
                paths_of_models,
                binary=False,
                unicode_errors='ignore')
            model.accuracy = word_analogy_obj.new_accuracy
            acc = model.accuracy(testset, restrict_vocab=restriction, case_insensitive=False)
            print("Acc = ", acc)
    print("DONE")
================================================
FILE: src/codes/api/embedding_extractor.py
================================================
from embeddings import embedding_utils
from pathlib import Path
import numpy as np
import os
import logging
import gzip
from embeddings.embedding_configs import EmbeddingConfigs
def get_multi_embedding_models(config: EmbeddingConfigs):
    """
    :param config:
    :return:
    """
    model_paths_list = config.model_paths_list
    model_names_list = config.model_names_list
    model_dims_list = config.model_dims_list
    char_model_path = config.char_model_path
    char_model_dims = config.char_model_dims
    if char_model_path:
        char_model = embedding_utils.reload_char2vec_model(char_model_path, char_model_dims)
    else:
        char_model = None
    embedding_models = embedding_utils.reload_embedding_models(model_paths_list,
                                                               model_names_list,
                                                               model_dims_list,
                                                               char_model)
    # doc_vector = embedding_models.get_vector_of_document(tokenized_text)
    return embedding_models
def get_emb_dim(emb_file):
    # The first line of a word2vec text file is "<vocab_size> <dim>".
    with open(emb_file, "r") as reader:
        line = reader.readline().rstrip()
        dim = int(line.split(" ")[1])
    return dim
def extract_embedding_for_vocab_file(paths_of_emb_models, vocab_words_file, c2v_emb_file, output_file, output_format):
    """
    :param paths_of_emb_models:
    :param vocab_words_file:
    :param c2v_emb_file:
    :param output_file:
    :param output_format:
    :return:
    """
    config = EmbeddingConfigs()
    config.output_format = output_format
    config.model_paths_list = paths_of_emb_models.split(";")
    embedding_file_names = []
    embedding_dims = []
    if c2v_emb_file:
        config.char_model_path = c2v_emb_file
        config.char_model_dims = get_emb_dim(c2v_emb_file)
    print("02. Extracting word embeddings ...")
    if paths_of_emb_models and ";" in paths_of_emb_models:
        files = paths_of_emb_models.split(";")
        for emb_file in files:
            embedding_name = os.path.basename(os.path.normpath(emb_file))
            embedding_file_names.append(embedding_name)
            embedding_dim = get_emb_dim(emb_file)
            embedding_dims.append(embedding_dim)
    elif paths_of_emb_models:  # In case there is only one embedding
        embedding_name = os.path.basename(os.path.normpath(paths_of_emb_models))
        embedding_file_names.append(embedding_name)
        embedding_dim = get_emb_dim(paths_of_emb_models)
        embedding_dims.append(embedding_dim)
    else:
        raise Exception("List of embeddings cannot be None.")
    # Data type:
    embedding_names = ["word2vec"] * len(embedding_dims)  # embedding type, only support w2v and c2v type now
    config.model_names_list = embedding_names
    config.model_dims_list = embedding_dims
    # Do extracting embeddings
    extract_embedding_vectors(vocab_words_file, output_file, config)
    print("Done")
def extract_embedding_vectors(vocab_words_file, output_file, config: EmbeddingConfigs):
    """
    :param vocab_words_file:
    :param output_file:
    :param config:
    :return:
    """
    # Load vocab
    with Path(vocab_words_file).open() as f:
        word_to_idx = {line.strip(): idx for idx, line in enumerate(f)}
    size_vocab = len(word_to_idx)
    # Output writer
    fwriter = open(output_file, "w")
    # Array of zeros
    dim_size = sum(config.model_dims_list)
    found = 0
    print('Reading embedding file (may take a while)')
    embedding_models = get_multi_embedding_models(config)
    embeddings = np.zeros((size_vocab, dim_size))
    line_idx = 0
    for word in word_to_idx.keys():
        word_idx = word_to_idx[word]
        word = word.rstrip()
        try:
            if line_idx % 100000 == 0:
                print('- At line {}'.format(line_idx))
            w2v_vector = embedding_models.get_word_vector_of_multi_embeddings(word)
            if w2v_vector is not None and len(w2v_vector) > 0:
                embeddings[word_idx] = w2v_vector
                line = "%s %s" % (word, " ".join(str(scalar) for scalar in w2v_vector))
                fwriter.write(line + "\n")
                fwriter.flush()
                found += 1
                # logging needs a format placeholder for the extra argument.
                logging.debug("Embedding: %s", w2v_vector)
        except Exception as e:
            logging.debug("Unexpected error: word = %s, error = %s" % (word, e))
            pass
        line_idx += 1
    print('- done. Found {} vectors for {} words'.format(found, size_vocab))
    fwriter.close()
    # Open file again to add meta data:
    src = open(output_file, "r")
    meta_line = "%s %s\n" % (found, dim_size)
    oline = src.readlines()
    # Here, we prepend the string we want to on first line
    oline.insert(0, meta_line)
    src.close()
    # We again open the file in WRITE mode
    src = open(output_file, "w")
    src.writelines(oline)
    src.close()
    # Done with writing.
    if ".gz" in config.output_format:
        with open(output_file, "rb") as raw:
            content = raw.read()
        gzip_out_file = output_file + '.gz'
        with gzip.open(gzip_out_file, 'wb') as f:
            f.write(content)
        print("Saved embedding to %s" % (gzip_out_file))
    if ".npz" in config.output_format:
        npz_out_file = output_file + '.npz'
        np.savez_compressed(npz_out_file, embeddings=embeddings)
        print("Saved embedding to %s" % (npz_out_file))
    return
================================================
FILE: src/codes/api/embedding_preprocessing.py
================================================
# Convert to a standard word2vec format
import gensim
from utils import embedding_io
import sys
from threading import Thread
from embeddings.embedding_configs import EmbeddingConfigs
def convert_to_w2v(vocab_file, embedding_file, out_file):
"""
Export from a word2vec file by filtering out vocabs based on the input vocab file.
:param vocab_file:
:param embedding_file:
:param out_file:
:return: word2vec file
"""
std_vocab = []
with open(vocab_file) as f:
for word in f:
std_vocab.append(word)
print ("Loaded NER vocab_size = %s" % (len(std_vocab)))
is_binary = False
if embedding_file.endswith(".bin"):
is_binary = True
print("Loading w2v model ...")
emb_model = gensim.models.KeyedVectors.load_word2vec_format(embedding_file,
binary=is_binary,
unicode_errors='ignore')
print("LOADED model: vocab_size = %s" % (len(emb_model.wv.vocab)))
f_writer = open(out_file, "w")
for word in std_vocab:
word = word.rstrip()
line = None
if word in emb_model:
vector = " ".join(str(item) for item in emb_model[word])
# word = word.lower()
line = "%s %s" % (word, vector)
else:
word = word.lower()
if word in emb_model:
vector = " ".join(str(item) for item in emb_model[word])
line = "%s %s" % (word, vector)
# print("LINE: ", line)
if line:
f_writer.write(line + "\n")
f_writer.close()
def test():
vocab_file = "../data/vnner_BiLSTM_CRF/vocab.words.txt"
embedding_file = "../data/embedding_dicts/elmo_embeddings_large.txt"
out_file = "../data/embedding_dicts/elmo_1024dims_wiki_normalcase2lowercase_NER.vec"
convert_to_w2v(vocab_file, embedding_file, out_file)
print("Out file: ", out_file)
print("DONE")
def load_and_save_2_word2vec_model(input_model_path, output_model_path, embedding_config):
"""
Process one embedding model
:param input_model_path:
:param output_model_path:
:return:
"""
model_in = embedding_io.load_word_embedding(input_model_path, embedding_config)
embedding_io.save_model_to_file(model_in, output_model_path)
print("Write model back to ", output_model_path)
def load_and_save_2_word2vec_models(input_embedding_files_str, output_embedding_files_str, embedding_config):
"""
Multi-threaded processing to export to word2vec format
:param input_embedding_files_str:
:param output_embedding_files_str:
:return:
"""
if input_embedding_files_str.__contains__(";"):
input_model_files = input_embedding_files_str.split(";")
else:
input_model_files = [input_embedding_files_str]
if output_embedding_files_str.__contains__(";"):
output_model_files = output_embedding_files_str.split(";")
else:
output_model_files = [output_embedding_files_str]
# Double check input files and output files.
assert (len(output_model_files) == len(input_model_files)), \
"Number of input files and output files must be equal. Exiting ..."
# create a list of threads
threads = []
for model_in, model_out in zip(input_model_files, output_model_files):
# We start one thread per file.
process = Thread(target=load_and_save_2_word2vec_model, args=[model_in, model_out, embedding_config])
process.start()
threads.append(process)
# load_and_save_2_word2vec_model(model_in, model_out)
# This to ensure each thread has finished processing the input file.
for process in threads:
process.join()
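# --- Illustrative sketch (not part of the original ETNLP code) ---
# The conversion above is delegated to utils.embedding_io, which is not shown in this file.
# As a rough, standard-library-only illustration of what a GloVe -> word2vec text conversion
# usually involves, the helper below assumes the common layouts: GloVe text files hold one
# "word v1 v2 ... vN" row per line, and the word2vec text format adds a leading
# "<vocab_size> <dims>" header line. The name `sketch_glove_to_word2vec_lines` is hypothetical,
# and the real embedding_io implementation may differ (e.g. it can also normalize vectors).
def sketch_glove_to_word2vec_lines(glove_lines):
    """Return word2vec-format lines (header first) for an iterable of GloVe text rows."""
    # Drop blank lines and trailing newlines
    rows = [line.rstrip("\n") for line in glove_lines if line.strip()]
    if not rows:
        return ["0 0"]
    # Every row is "<word> <v1> ... <vN>", so dims = number of fields minus the word
    dims = len(rows[0].split()) - 1
    return ["%d %d" % (len(rows), dims)] + rows
# Example: sketch_glove_to_word2vec_lines(open("glove1.vec")) yields the header line
# followed by the unchanged vector rows, ready to be written to a .vec file.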
if __name__ == "__main__":
    if len(sys.argv) != 2:
        print("Missing input arguments. Input format: ./*.py <emb_file1;emb_file2;...>. Exiting ...")
        sys.exit(1)
    embedding_config = EmbeddingConfigs()
    # The inputs do not need to be in word2vec format for pre-processing, but a warning
    # is still shown if they are not.
    embedding_config.is_word2vec_format = True
    embedding_config.do_normalize_emb = False  # Set to True to normalize the embedding vectors.
    in_files_str = sys.argv[1]
    out_files_str = ";".join(path + ".extracted.vec" for path in in_files_str.split(";"))
    # load_and_save_2_word2vec_models expects ';'-separated strings plus the config object.
    load_and_save_2_word2vec_models(in_files_str, out_files_str, embedding_config)
================================================
FILE: src/codes/api/embedding_visualizer.py
================================================
# encoding: utf-8
# 1. Read embedding file
# 2. Convert to TensorBoard format
# 3. Visualize
import sys, os
import gensim
import tensorflow as tf
import numpy as np
from tensorflow.contrib.tensorboard.plugins import projector
import logging
from tensorboard import default
from tensorboard import program
class TensorBoardTool:
def __init__(self, dir_path):
self.dir_path = dir_path
def run(self, emb_name, port):
# Remove http messages
# log = logging.getLogger('sonvx').setLevel(logging.INFO)
logging.basicConfig(level=logging.INFO)
logging.propagate = False
# Start tensorboard server
tb = program.TensorBoard(default.get_plugins(), default.get_assets_zip_provider())
tb.configure(argv=[None, '--logdir', self.dir_path, '--port', str(port)])
url = tb.launch()
sys.stdout.write('TensorBoard of %s at %s \n' % (emb_name, url))
def convert_multiple_emb_models_2_tf(emb_name_arr, w2v_model_arr, output_path, port):
"""
:param emb_name_arr:
:param w2v_model_arr:
:param output_path:
:param port:
:return:
"""
idx = 0
# define the model without training
sess = tf.InteractiveSession()
config = projector.ProjectorConfig()
for w2v_model in w2v_model_arr:
emb_name = emb_name_arr[idx]
meta_file = "%s.tsv" % emb_name
placeholder = np.zeros((len(w2v_model.wv.index2word), w2v_model.vector_size))
with open(os.path.join(output_path, meta_file), 'wb') as file_metadata:
for i, word in enumerate(w2v_model.wv.index2word):
placeholder[i] = w2v_model[word]
# temporary solution for https://github.com/tensorflow/tensorflow/issues/9094
if word == '':
print("Empty line found; writing '<Empty Line>' instead to avoid a TensorBoard bug")
file_metadata.write(u"{0}".format('<Empty Line>').encode('utf-8') + b'\n')
else:
file_metadata.write(u"{0}".format(word).encode('utf-8') + b'\n')
word_embedding_var = tf.Variable(placeholder, trainable=False, name=emb_name)
tf.global_variables_initializer().run()
sess.run(word_embedding_var)
# adding into projector
embed = config.embeddings.add()
embed.tensor_name = emb_name
embed.metadata_path = meta_file
idx += 1
saver = tf.train.Saver()
writer = tf.summary.FileWriter(output_path, sess.graph)
# Specify the width and height of a single thumbnail.
projector.visualize_embeddings(writer, config)
all_emb_name = "_".join(emb_name for emb_name in emb_name_arr)
saver.save(sess, os.path.join(output_path, '%s.ckpt' % all_emb_name))
# tf.flags.FLAGS.logdir = output_path
# print('Running `tensorboard --logdir={0}` to run visualize result on tensorboard'.format(output_path))
# tb.run_main()
tb_tool = TensorBoardTool(output_path)
tb_tool.run(all_emb_name, port)
return
def convert_one_emb_model_2_tf(emb_name, model, output_path, port):
"""
:param model: Word2Vec model
:param output_path:
:return:
"""
# emb_name = "word_embedding"
meta_file = "%s.tsv"%emb_name
placeholder = np.zeros((len(model.wv.index2word), model.vector_size))
with open(os.path.join(output_path, meta_file), 'wb') as file_metadata:
for i, word in enumerate(model.wv.index2word):
placeholder[i] = model[word]
# temporary solution for https://github.com/tensorflow/tensorflow/issues/9094
if word == '':
print("Empty line found; writing '<Empty Line>' instead to avoid a TensorBoard bug")
file_metadata.write(u"{0}".format('<Empty Line>').encode('utf-8') + b'\n')
else:
file_metadata.write(u"{0}".format(word).encode('utf-8') + b'\n')
# define the model without training
sess = tf.InteractiveSession()
word_embedding_var = tf.Variable(placeholder, trainable=False, name=emb_name)
sess.run(word_embedding_var)
# tf.global_variables_initializer().run()
saver = tf.train.Saver()
writer = tf.summary.FileWriter(output_path, sess.graph)
# adding into projector
config = projector.ProjectorConfig()
embed = config.embeddings.add()
embed.tensor_name = emb_name
embed.metadata_path = meta_file
# Specify the width and height of a single thumbnail.
projector.visualize_embeddings(writer, config)
saver.save(sess, os.path.join(output_path, '%s.ckpt'%emb_name))
# tf.flags.FLAGS.logdir = output_path
# print('Running `tensorboard --logdir={0}` to run visualize result on tensorboard'.format(output_path))
# tb.run_main()
tb_tool = TensorBoardTool(output_path)
tb_tool.run(emb_name, port)
return
def visualize_multiple_embeddings_individually(paths_of_emb_models):
    """Visualize each embedding in its own TensorBoard instance, one port per embedding."""
    output_root_dir = "../data/embedding_tf_data/"
    starting_port = 6006
    embedding_names = []
    print("Loading word embeddings, going to visualize ...")
    # split(";") also covers the single-file case (no ';' in the string)
    files = paths_of_emb_models.split(";") if paths_of_emb_models else []
    for emb_file in files:
        embedding_name = os.path.basename(os.path.normpath(emb_file))
        tf_data_folder = output_root_dir + embedding_name
        if not os.path.exists(tf_data_folder):
            os.makedirs(tf_data_folder)
        is_binary = emb_file.endswith(".bin")
        emb_model = gensim.models.KeyedVectors.load_word2vec_format(emb_file, binary=is_binary)
        convert_one_emb_model_2_tf(embedding_name, emb_model, tf_data_folder, starting_port)
        embedding_names.append(embedding_name)
        starting_port += 1
    while True:
        print("Type exit to quit the visualizer: ")
        user_input = input()
        if user_input == "exit":
            break
def visualize_multiple_embeddings_all_in_one(paths_of_emb_models, port):
    """Visualize all embeddings in a single TensorBoard instance on the given port."""
    output_root_dir = "../data/embedding_tf_data/"
    print("Loading word embeddings, going to visualize ...")
    embedding_name_arr = []
    w2v_embedding_model_arr = []
    # split(";") also covers the single-file case (no ';' in the string)
    files = paths_of_emb_models.split(";") if paths_of_emb_models else []
    for emb_file in files:
        embedding_name = os.path.basename(os.path.normpath(emb_file))
        embedding_name_arr.append(embedding_name)
        is_binary = emb_file.endswith(".bin")
        emb_model = gensim.models.KeyedVectors.load_word2vec_format(emb_file, binary=is_binary)
        w2v_embedding_model_arr.append(emb_model)
    all_emb_name = "_".join(embedding_name_arr)
    tf_data_folder = output_root_dir + all_emb_name
    if not os.path.exists(tf_data_folder):
        os.makedirs(tf_data_folder)
    convert_multiple_emb_models_2_tf(embedding_name_arr, w2v_embedding_model_arr, tf_data_folder, port)
    while True:
        print("Type exit to quit the visualizer: ")
        user_input = input()
        if user_input == "exit":
            break
def visualize_multiple_embeddings(paths_of_emb_models, port):
"""
API to other part to call, don't modify this function.
:param paths_of_emb_models:
:param port:
:return:
"""
visualize_multiple_embeddings_all_in_one(paths_of_emb_models, port)
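if __name__ == "__main__":
    # Usage: python embedding_visualizer.py <word2vec_model.bin> <output_dir>
    if len(sys.argv) < 3:
        print("Please provide model path and output path.")
        sys.exit(1)
    model_path = sys.argv[1]
    output_path = sys.argv[2]
    # model = Word2Vec.load(model_path)
    model = gensim.models.KeyedVectors.load_word2vec_format(model_path, binary=True)
    # convert_one_emb_model_2_tf also needs an embedding name and a port;
    # 6006 is the same starting port used by visualize_multiple_embeddings_individually.
    emb_name = os.path.basename(os.path.normpath(model_path))
    convert_one_emb_model_2_tf(emb_name, model, output_path, 6006)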
================================================
FILE: src/codes/embeddings/__init__.py
================================================
================================================
FILE: src/codes/embeddings/embedding_configs.py
================================================
class EmbeddingConfigs(object):
"""
Configuration information
"""
is_word2vec_format = True
do_normalize_emb = True
model_paths_list = []
model_names_list = []
model_dims_list = []
char_model_path = None
char_model_dims = -1
output_format = ".txt;.npz;.gz"
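# --- Illustrative sketch (not part of the original ETNLP code) ---
# The list-valued settings above are class attributes, so all EmbeddingConfigs instances
# share the same list objects until an attribute is re-assigned on an instance; an in-place
# `append` on one instance is then visible on every other instance. The hypothetical
# stand-in below shows one way to avoid that (per-instance lists created in __init__),
# together with a small helper that splits the ';'-separated `output_format` string.
# `SketchConfigs` and `sketch_output_formats` are assumed names, not part of ETNLP.
class SketchConfigs(object):
    """Hypothetical variant of EmbeddingConfigs with per-instance lists."""
    output_format = ".txt;.npz;.gz"

    def __init__(self):
        # Fresh lists per instance, so appending to one config never leaks into another
        self.model_paths_list = []
        self.model_names_list = []
        self.model_dims_list = []


def sketch_output_formats(config):
    """Split the ';'-separated output_format string into a list of extensions."""
    return [fmt for fmt in config.output_format.split(";") if fmt]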
================================================
FILE: src/codes/embeddings/embedding_models.py
================================================
from gensim.models import KeyedVectors as Word2Vec
import numpy as np
from embeddings import embedding_utils
from utils import file_utils
import os, re
import logging
DEBUG = False
class Model_Constants(object):
word2vec = "word2vec"
char2vec = "char2vec"
private_word2vec = "private_word2vec"
elmo = "elmo"
class Embedding_Model(object):
def __init__(self, name, vector_dim):
self.name = name
self.model = None
self.char_model = None
self.vocabs_list = None
self.vector_dim = vector_dim
# TODO: update this changeable param later
# unk, random, mean, replace_by_character_embedding
self.unknown_word = "replace_by_character_embedding"
# self.MAX_DIM = 400 # No longer use MAX_DIM, now it depends on input dims
def load_model(self, model_path):
if self.name == Model_Constants.word2vec or self.name == Model_Constants.elmo:
if model_path.endswith(".bin"):
self.model = Word2Vec.load_word2vec_format(model_path, binary=True)
else:
self.model = Word2Vec.load_word2vec_format(model_path, binary=False)
elif self.name == Model_Constants.char2vec:
self.model = dict()
print("Loading model_path = ", model_path)
file = open(model_path, "r")
for line in file:
elements = line.split()
if len(elements) > 100: # because embedding dim is higher than 100.
# char_model[elements[0]] = np.array(map(float, elements[1:])).tolist()
self.model[elements[0]] = np.array([float(i) for i in elements[1:]]).tolist()
return self.model
elif self.name == Model_Constants.private_word2vec:
self.model, _, self.vocabs_list = embedding_utils.reload_embeddings(model_path)
else:
raise Exception("Unknown embedding models!")
def is_punct(self, word):
arr_list = [
'!',
'"',
'%',
'&',
"'",
"''",
'(',
'(.',
')',
'*',
'+',
',',
'-',
'---',
'.',
'..',
'...',
'....',
'/',
]
if word in arr_list:
return True
else:
return False
    def is_number(self, word):
        """Return True if the word (or any line of it) starts with a digit, e.g. "2019" or "3rd"."""
        return re.search(r"^[0-9]+", word, re.MULTILINE) is not None
def set_char_model(self, char_model):
self.char_model = char_model
def load_vocabs_list(self, vocab_file_path):
"""
Load vocabs list for private w2v model. Has to be pickle file.
:param vocab_file_path:
:return:
"""
if vocab_file_path:
self.vocabs_list = file_utils.load_obj(vocab_file_path)
def get_char_vector(self, char_model, word):
"""
char_model here is an instance of embedding_model
:param char_model: an instance of embedding_model
:param word:
:return:
"""
if char_model is None:
# Sonvx on March 20, 2019: we now allow the char_model is None,
# cannot call this get_char_vector in such case.
raise Exception("Char_model is None! Cannot use character-embedding.")
out_char_2_vec = []
char_vecs = []
chars = list(word)
vecs = []
for c in chars:
if c in char_model.model:
emb_vector = char_model.model[c]
vecs.append(emb_vector)
if DEBUG:
input(">>>>>>")
print("Char_emb_vector=", emb_vector)
# char_vecs.extend(list(vecs))
if len(vecs) > 0:
out_char_2_vec = np.mean(vecs, axis=0)
if DEBUG:
print(">>> Output of char2vec: %s"%(out_char_2_vec))
input(">>>> outc2v ...")
return out_char_2_vec
def is_unknown_word(self, word):
"""Check whether or not a word is unknown"""
is_unknown_word = False
if self.vocabs_list is not None:
if word not in self.vocabs_list:
is_unknown_word = True
else:
if word not in self.model:
is_unknown_word = True
return is_unknown_word
def get_word_vector(self, word):
"""
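Return the vector of a word, trying the original casing first and the lowercase form second.
Unknown words: for our private word2vec we check vocabs_list; for regular models we check inside the model.
When a char-model is set, it is used to build the vector of an unknown word. Without a char-model, looking up
an unknown word in a word2vec or elmo model raises KeyError, which the caller has to handle.
:param word:
:return: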
"""
rtn_vector = []
# try first time with normal case
is_unknown_word = self.is_unknown_word(word)
# try 2nd times with lowercase.
if is_unknown_word:
word = word.lower()
is_unknown_word = self.is_unknown_word(word)
# unknown word
if is_unknown_word and self.char_model:
# Sonvx on March 20, 2019: solve unknown only when char_model is SET.
rtn_vector = self.get_vector_of_unknown(word)
else:
# normal case
if self.name == Model_Constants.word2vec:
rtn_vector = self.model[word]
# For now we have self.vector_dim, max_dim, and len(rtn_vector)
# Update: move to use self.vector_dim only
if len(rtn_vector) > self.vector_dim:
print("Warning: auto trim to %s/%s dimensions"%(self.vector_dim, len(rtn_vector)))
rtn_vector = self.model[word][:self.vector_dim]
elif self.name == Model_Constants.elmo:
rtn_vector = self.model[word]
if self.vector_dim == len(rtn_vector)/2:
vector1 = rtn_vector[:self.vector_dim]
vector2 = rtn_vector[self.vector_dim:]
print("Notice: auto average to b[i] = (a[i] + a[i + %s])/2 /%s dimensions" % (self.vector_dim,
len(rtn_vector)))
rtn_vector = np.mean([vector1, vector2], 0)
elif len(rtn_vector) > self.vector_dim:
print("Warning: auto trim to %s/%s dimensions" % (self.vector_dim, len(rtn_vector)))
rtn_vector = self.model[word][:self.vector_dim]
elif self.name == Model_Constants.char2vec:
rtn_vector = self.get_char_vector(self, word)
elif self.name == Model_Constants.private_word2vec:
# Handle unknown word - Not need for now since we handle unknown words first
if word not in self.vocabs_list:
word = "UNK"
word_idx = self.vocabs_list.index(word)
emb_vector = self.model[word_idx]
rtn_vector = emb_vector
# final check before returning vector
if DEBUG:
print(">>> DEBUG: len(rtn_vector) = %s" % (len(rtn_vector)))
input(">>> before returning vector ...")
if len(rtn_vector) < 1:
return np.zeros(self.vector_dim)
else:
if len(rtn_vector) == self.vector_dim:
return rtn_vector
# TODO: find a better way to represent unknown word by character to have same-size with word-vector-size
# For now, I add 0 to the [current-len, expected-len]
else:
logging.debug("Model name = %s, Current word = %s, Current size = %s, expected size = %s"
%(self.name, word, len(rtn_vector), self.vector_dim))
return np.append(rtn_vector, np.zeros(self.vector_dim - len(rtn_vector)))
def get_vector_of_unknown(self, word):
"""
If word is UNK, use char_vector model instead.
:param word:
:return:
"""
# Here we handle features based on the w2v model where
# numbers and punctuations are encoded as <punct>, <number>
if self.name == Model_Constants.word2vec:
if self.is_number(word):
rtn_vector = self.model["<number>"]
elif self.is_punct(word):
rtn_vector = self.model["<punct>"]
else:
rtn_vector = self.get_char_vector(self.char_model, word)
if rtn_vector is not None:
if len(rtn_vector) > self.vector_dim:
print("Warning: auto trim to %s/%s dimensions"%(self.vector_dim, len(rtn_vector)))
return rtn_vector[:self.vector_dim]
else:
return rtn_vector
# otherwise, using c2v to build-up the embedding vector
else:
return self.get_char_vector(self.char_model, word)
class Embedding_Models(object):
"""
Using all available embedding models to generate vectors
"""
def __init__(self, list_models):
self.list_models = list_models # list of embedding_model_objs: ['word2vec', 'char2vec', 'private_word2vec']
def add_model(self, emb_model, char_model):
"""
Add new model into the collection of embedding models. Note that, every model has to add char_model to handle
unknown word.
:param emb_model:
:param char_model:
:return:
"""
if char_model is None:
print("Warning: char_model is None -> cannot solve OOV word. Keep going ...")
# Sonvx on March 20, 2019: change to allow None char_model
# raise Exception("char_model cannot be None.")
if isinstance(emb_model, Embedding_Model):
emb_model.set_char_model(char_model)
self.list_models.append(emb_model)
else:
raise Exception("Not an instance of embedding_model class.")
def get_vector_of_document(self, document):
"""
Get all embedding vectors for one document
:param document:
:return:
"""
doc_vector = []
# debug_dict = {}
# print ("len_doc = ", len(document))
for word in document:
all_vectors_of_word = []
# get all embedding vectors of a word
for emb_model in self.list_models:
emb_vector = emb_model.get_word_vector(word)
# print("len_emb_vector = ", len(emb_vector))
all_vectors_of_word.extend(emb_vector)
# if word in debug_dict.keys():
# debug_dict[word].append(len(emb_vector))
# else:
# debug_dict[word] = [len(emb_vector)]
# stack a combined vector of all words
doc_vector.append(all_vectors_of_word)
# print("list of words and emb size = ", debug_dict)
# get the mean of them to represent a document
doc_vector = np.mean(doc_vector, axis=0)
return doc_vector
def get_word_vector_of_multi_embeddings(self, word):
"""
Get the concatenated embedding vector of one word across all embedding models
:param word:
:return:
"""
word_vector = []
for emb_model in self.list_models:
emb_vector = emb_model.get_word_vector(word)
word_vector.extend(emb_vector)
return word_vector
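# --- Illustrative sketch (not part of the original ETNLP code) ---
# The unknown-word handling above averages the vectors of a word's characters and then
# zero-pads (or trims) the result to the expected dimension. The dependency-free helper
# below restates that idea with plain lists, assuming `char_vectors` maps a character to
# a list of floats. `sketch_char_mean_vector` is a hypothetical name; the real code works
# on NumPy arrays and applies trimming only in some branches.
def sketch_char_mean_vector(word, char_vectors, vector_dim):
    """Mean of the known character vectors of `word`, padded or trimmed to `vector_dim`."""
    vecs = [char_vectors[c] for c in word if c in char_vectors]
    if not vecs:
        # No known character: fall back to an all-zero vector
        return [0.0] * vector_dim
    # Column-wise mean over the character vectors
    mean = [sum(column) / len(vecs) for column in zip(*vecs)]
    # Pad with zeros up to vector_dim, then cut off anything beyond it
    return (mean + [0.0] * vector_dim)[:vector_dim]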
================================================
FILE: src/codes/embeddings/embedding_utils.py
================================================
import os
from utils import file_utils
from embeddings.embedding_models import Embedding_Model, Embedding_Models
def reload_char2vec_model(model_path, model_dim):
char_model = Embedding_Model("char2vec", model_dim)
char_model.load_model(model_path)
return char_model
def reload_embedding_models(model_paths_list, model_names_list, model_dims_list, char_model):
"""
Reload collection of embedding models to serve feature extraction task.
:param model_paths_list:
:param model_names_list:
:param model_dims_list:
:param char_model:
:return:
"""
# model path list and name list must be equal.
print("model_paths_list = ", model_paths_list)
print("model_formats_list = ", model_names_list)
assert (len(model_names_list) == len(model_paths_list)), "Not equal length"
assert (len(model_names_list) == len(model_dims_list)), "Not equal length"
all_emb_models = Embedding_Models([])
for model_idx in range(len(model_paths_list)):
# get model path based on index
model_path = model_paths_list[model_idx]
model_name = model_names_list[model_idx]
model_dim = model_dims_list[model_idx]
if model_path is not None:
emb_model = Embedding_Model(model_name, model_dim)
emb_model.load_model(model_path)
# add to final list of emb_models
all_emb_models.add_model(emb_model, char_model)
return all_emb_models
def save_embedding_models_tofolder(dir_path, final_embeddings, reverse_dictionary, vocabulary_size):
"""
Save all trained word-embedding model of the custom word2vec.
:param final_embeddings:
:param reverse_dictionary:
:param vocabulary_size:
:return:
"""
if not os.path.exists(dir_path):
os.makedirs(dir_path)
def save_to_word2vec_model(vocabs_list):
# print("Saving word2vec format ...")
filewriter = open(os.path.join(dir_path, "word2vec.txt"), "w", encoding="utf-8")
filewriter.write("%s %s\n" % (len(vocabs_list), len(final_embeddings[0])))
for word in vocabs_list:
word_idx = vocabs_list.index(word)
emb_vector = final_embeddings[word_idx]
line = ' '.join(["%s" % (x) for x in emb_vector])
filewriter.write(word + " " + line + "\n")
filewriter.close()
# print("Done!")
file_utils.save_obj(final_embeddings, os.path.join(dir_path, "final_embeddings"))
# We don't need to save reversed_dictionary
# file_utils.save_obj(reverse_dictionary, os.path.join(FLAGS.trained_models, "reversed_dictionary"))
vocab_list = [reverse_dictionary[i] for i in range(vocabulary_size)]
save_to_word2vec_model(vocab_list)
file_utils.save_obj(vocab_list, os.path.join(dir_path, "words_dictionary"))
def save_embedding_models(FLAGS, final_embeddings, reverse_dictionary, vocabulary_size):
"""
Keep for old implementation.
:param FLAGS:
:param final_embeddings:
:param reverse_dictionary:
:param vocabulary_size:
:return:
"""
save_embedding_models_tofolder(FLAGS.trained_models, final_embeddings,
reverse_dictionary, vocabulary_size)
def reload_embeddings(trained_models_dir):
"""
Reload trained word-embedding model of the custom word2vec.
:param trained_models_dir:
:return:
"""
final_embeddings = file_utils.load_obj(os.path.join(trained_models_dir, "final_embeddings"))
# reverse_dictionary = file_utils.load_obj(os.path.join(trained_models_dir, "reversed_dictionary"))
reverse_dictionary = None
labels = file_utils.load_obj(os.path.join(trained_models_dir, "words_dictionary"))
return final_embeddings, reverse_dictionary, labels
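def create_single_utf8_file(input_dir, output_file):
    """Lowercase and append all files matching the glob pattern `input_dir` to one UTF-8 file."""
    import glob
    # path = './wiki_data/*.txt'
    # out = './wiki_all.vi.utf8.txt'
    files = glob.glob(input_dir)
    with open(output_file, "a", encoding="utf-8") as myfile:
        for file in files:
            # errors="ignore" drops undecodable bytes; this replaces the Python 2 style
            # decode/encode round trip, which fails on Python 3 str objects.
            with open(file, "r", encoding="utf-8", errors="ignore") as fp:
                for line in fp:
                    # Keep one output line per input line
                    myfile.write(line.strip().lower() + "\n")
    print("done")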
================================================
FILE: src/codes/etnlp_api.py
================================================
import argparse
from api import embedding_preprocessing, embedding_evaluator, embedding_extractor, embedding_visualizer
from visualizer import visualizer_sbs
import logging
import os
from embeddings.embedding_configs import EmbeddingConfigs
__version__ = "0.1.3"
embedding_config = EmbeddingConfigs()
if __name__ == "__main__":
"""
ETNLP: a toolkit to evaluate, extract, and visualize multiple word embeddings
"""
_desc = "Evaluates a word embedding model"
_parser = argparse.ArgumentParser(description=_desc)
_parser.add_argument("-input",
required=True,
default="../data/embedding_dicts/elmo_embeddings.txt",
#
help="model")
_parser.add_argument("-analoglist",
nargs="?",
# default="../data/embedding_analogies/vi/analogy_vn_seg.txt.std.txt",
default="./data/embedding_analogy/solveable_analogies_vi.txt",
help="testset")
_parser.add_argument("-args",
nargs="?",
default="eval",
help="Run evaluation")
_parser.add_argument("-lang",
nargs="?",
default="VI",
help="Specify language, by default, it's Vietnamese.")
_parser.add_argument("-vocab",
nargs="?",
default="../data/vocab.txt",
help="Vocab to be extracted")
_parser.add_argument("-port",
nargs="?",
default=8889,
help="Port for visualization")
_parser.add_argument("-input_c2v",
nargs="?",
default=None,
help="C2V embedding")
_parser.add_argument("-output",
nargs="?",
default="../data/embedding_analogies/vi/results_out.txt",
help="Output file of word analogy task")
_parser.add_argument("-output_format",
nargs="?",
default=".txt",
help="Format of output file of the extracted embedding.")
_args = _parser.parse_args()
# Set logging level
logging.basicConfig(level=logging.INFO)
logging.disable(logging.INFO)
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '5'
input_embedding_files_str = _args.input
analoglist = _args.analoglist
is_vietnamese = _args.lang
output_files_str = _args.output
options_str = _args.args
vocab_file = _args.vocab
output_format = _args.output_format
port = _args.port
# By default, we process all embeddings as word2vec format.
embedding_config.is_word2vec_format = True
if options_str == 'eval':
print("Starting evaluator ...")
embedding_evaluator.evaluator_api(input_files=input_embedding_files_str, analoglist=analoglist,
output=output_files_str)
print("Done evaluator !")
elif options_str == 'visualizer':
print("Starting visualizer ...")
embedding_visualizer.visualize_multiple_embeddings(input_embedding_files_str, port)
print("Done visualizer !")
elif options_str.startswith("extract"):
print("Starting extractor ...")
embedding_extractor.extract_embedding_for_vocab_file(input_embedding_files_str, vocab_file,
_args.input_c2v, output_files_str, output_format)
print("Done extractor !")
elif options_str.startswith("glove2w2v"):
print("Starting pre-processing: convert to word2vec format ...")
embedding_config.is_word2vec_format = False
if options_str.__contains__("do_normalize"):
embedding_config.do_normalize_emb = True
else:
embedding_config.do_normalize_emb = False
embedding_preprocessing.load_and_save_2_word2vec_models(input_embedding_files_str,
output_files_str,
embedding_config)
else:
print("Invalid options")
print("Done!")
================================================
FILE: src/codes/requirements.txt
================================================
gensim==3.4.0
scipy==1.1.0
six==1.12.0
setuptools==40.6.2
tensorflow==1.12.0
Flask==1.0.2
tensorboard==1.12.0
numpy==1.15.4
scikit_learn==0.20.3
typing==3.6.6
================================================
FILE: src/codes/setup.py
================================================
from setuptools import setup, find_packages
from etnlp_api import __version__
with open("../../README.md", "r") as fh:
long_description = fh.read()
setup(
name='ETNLP',
version=__version__,
# packages=['api', 'utils', 'embeddings', 'visualizer'],
packages=find_packages(),
py_modules=['etnlp_api'],
long_description=long_description,
long_description_content_type="text/markdown",
url='https://github.com/vietnlp/etnlp',
license='MIT',
author='vietnlp',
author_email='sonvx.coltech@gmail.com',
description='ETNLP: Embedding Toolkit for NLP Tasks'
)
# from setuptools import setup, find_packages
# import sys
#
# with open('requirements.txt') as f:
# reqs = f.read()
# setup(
# name='ETNLP',
# version='0.1.0',
# description='ETNLP: Embedding Toolkit for NLP Tasks',
# python_requires='>=3.5',
# packages=find_packages(exclude=('data')),
# install_requires=reqs.strip().split('\n'),
# )
================================================
FILE: src/codes/utils/__init__.py
================================================
================================================
FILE: src/codes/utils/emb_utils.py
================================================
from sklearn.metrics.pairwise import cosine_similarity
from typing import Any, Iterable, List, Optional, Set, Tuple
from utils.vectors import Vector
from utils import vectors
from utils.word import Word
from utils import eval_utils
from gensim import utils as genutils
import logging
import numpy as np
from scipy import stats
# Timing info for most_similar (100k words):
# Original version: 7.3s
# Normalized vectors: 3.4s
logger = logging.getLogger(__name__)
def most_similar(base_vector: Vector, words: List[Word]) -> List[Tuple[float, Word]]:
"""Returns all words sorted by decreasing cosine similarity to the given vector"""
words_with_distance = [(vectors.cosine_similarity_normalized(base_vector, w.vector), w) for w in words]
# We want cosine similarity to be as large as possible (close to 1)
sorted_by_distance = sorted(words_with_distance, key=lambda t: t[0], reverse=True)
# Sonvx: remove duplications (not understand why yet, probably because the w2v?)
# sorted_by_distance = list(set(sorted_by_distance))
return sorted_by_distance
def print_most_similar(words: List[Word], text: str) -> None:
base_word = find_word(text, words)
if not base_word:
print("Unknown word: %s"%(text))
return
print("Words related to %s:" % (base_word.text))
sorted_by_distance = [
word.text for (dist, word) in
most_similar(base_word.vector, words)
if word.text.lower() != base_word.text.lower()
]
print(', '.join(sorted_by_distance[:10]))
def read_word() -> str:
return input("Type a word: ")
def find_word(text: str, words: List[Word]) -> Optional[Word]:
try:
return next(w for w in words if text == w.text)
except StopIteration:
return None
def closest_analogies_OLD(
left2: str, left1: str, right2: str, words: List[Word]
) -> List[Tuple[float, Word]]:
word_left1 = find_word(left1, words)
word_left2 = find_word(left2, words)
word_right2 = find_word(right2, words)
if (not word_left1) or (not word_left2) or (not word_right2):
return []
vector = vectors.add(
vectors.sub(word_left1.vector, word_left2.vector),
word_right2.vector)
closest = most_similar(vector, words)[:10]
def is_redundant(word: str) -> bool:
"""
Sometimes the two left vectors are so close the answer is e.g.
"shirt-clothing is like phone-phones". Skip 'phones' and get the next
suggestion, which might be more interesting.
"""
word_lower = word.lower()
return (
left1.lower() in word_lower or
left2.lower() in word_lower or
right2.lower() in word_lower)
closest_filtered = [(dist, w) for (dist, w) in closest if not is_redundant(w.text)]
return closest_filtered
def closest_analogies_vectors(
word_left2: Word, word_left1: Word, word_right2: Word, words: List[Word]) \
-> List[Tuple[float, Word]]:
"""
Sonvx: rank candidate words for the analogy word_left1 - word_left2 + word_right2.
:param word_left2:
:param word_left1:
:param word_right2:
:param words:
:return: the 10 closest (similarity, Word) pairs; the redundancy filter is currently disabled.
"""
# print(">>>> Remove redundancy = ", remove_redundancy)
# input(">>>>")
vector = vectors.add(
vectors.sub(word_left1.vector, word_left2.vector),
word_right2.vector)
closest = most_similar(vector, words)[:10]
def is_redundant(word: str) -> bool:
"""
Sometimes the two left vectors are so close the answer is e.g.
"shirt-clothing is like phone-phones". Skip 'phones' and get the next
suggestion, which might be more interesting.
"""
word_lower = word.lower()
return (
word_left1.text.lower() in word_lower or
word_left2.text.lower() in word_lower or
word_right2.text.lower() in word_lower)
# It doesn't work this way for Vietnamese, so we try both of this to test for now
if False:
closest_filtered = [(dist, w) for (dist, w) in closest if not is_redundant(w.text)]
else:
closest_filtered = closest
return closest_filtered
def get_avg_vector(word, embedding_words):
    """Returns a Word whose vector is the mean of the vectors of the space-separated tokens in `word`."""
    if " " in word:
        single_words = word.split(" ")
        list_vector = []
        for single_word in single_words:
            word_vec = find_word(single_word, embedding_words)
            if not word_vec:
                # Try again with lowercase
                word_vec = find_word(single_word.lower(), embedding_words)
            if word_vec:
                list_vector.append(word_vec.vector)
        if not list_vector:
            # None of the tokens is in the vocabulary: report the phrase as OOV
            return None
        returned_Word = Word(word, vectors.mean_list(list_vector), 1)
    else:
        returned_Word = find_word(word, embedding_words)
    return returned_Word
def run_paired_ttests(all_map_arr, embedding_names):
"""
Run Paired t-tests on MAP results
:param all_map_arr:
:param embedding_names:
:return:
"""
str_out = ""
num_embs = len(all_map_arr)
# Verify to make sure they have the same length
if all_map_arr and embedding_names:
for i in range(0, num_embs - 1):
for j in range(i + 1, num_embs):
if len(all_map_arr[i]) != len(all_map_arr[j]):
raise Exception("Two embeddings (%s, %s) have MAP lists of different sizes: %s vs. %s"
% (embedding_names[i], embedding_names[j], len(all_map_arr[i]), len(all_map_arr[j])))
else:
logging.error("Inputs are NULL")
result_str_ttest_arr = []
for i in range(0, num_embs - 1):
for j in range(i + 1, num_embs):
stat_test_ret = stats.ttest_rel(all_map_arr[i], all_map_arr[j])
# if stat_test_ret.pvalue >= 0.05:
result = "%s vs. %s: %s" % (embedding_names[i], embedding_names[j], stat_test_ret)
str_out += result + "\n"
return str_out
def eval_word_analogy_4_all_embeddings(word_analogies_file, embedding_names: List[str],
word_embeddings: List[List[Word]], output_file):
"""
Run word analogy for all embeddings
:param word_analogies_file:
:param embedding_names:
:param word_embeddings:
:param output_file:
:return:
"""
fwriter = open(output_file, "w")
idx = 0
all_map_arr = []
console_output_str = ""
category = ": | Word Analogy Task results\n"
fwriter.write(category)
console_output_str += category
for word_embedding in word_embeddings:
embedding_name = embedding_names[idx]
map_at_10, map_arr, result_str = eval_word_analogies(word_analogies_file, word_embedding, embedding_name)
all_map_arr.append(map_arr)
meta_info = "\nEmbedding: %s"%(embedding_names[idx])
fwriter.write(meta_info + "\n")
fwriter.write(result_str)
fwriter.write("MAP_arr = %s\n" % (map_arr))
fwriter.write("MAP@10 = %s\n" % (map_at_10))
fwriter.flush()
console_output_str += meta_info + "\n" + "MAP@10 = %s" % (map_at_10) + "\n"
idx += 1
# Getting significant Paired t-tests
category = "\n: | Paired t-tests results\n"
fwriter.write(category)
console_output_str += category
ttests_result = run_paired_ttests(all_map_arr, embedding_names)
console_output_str += ttests_result
fwriter.write(ttests_result)
fwriter.flush()
fwriter.close()
return console_output_str
def eval_word_analogies(word_analogies_file, words: List[Word], embedding_name):
"""
Sonvx: Evaluate word analogy for one embedding.
:param word_analogies_file:
:param words:
:return:
"""
# input("GO checking >>>>")
oov_counter, idx_cnt, is_vn_counter, phrase_cnt = 0, -1, 0, 0
sections, section = [], None
# map_arr = []
out_str = ""
map_ret_dict = {}
for line_no, line in enumerate(genutils.smart_open(word_analogies_file)):
# TODO: use level3 BLAS (=evaluate multiple questions at once), for speed
line = genutils.to_unicode(line)
line = line.rstrip()
if line.startswith(': |'):
# a new section starts => store the old section
if section:
sections.append(section)
section = {'section': line.lstrip(': ').strip(), 'correct': [], 'incorrect': []}
else:
# Count number of analogy to check
idx_cnt += 1
# Set default map value
map_ret_dict[idx_cnt] = 0.0
if not section:
raise ValueError("missing section header before line #%i in %s" % (line_no, word_analogies_file))
try:
# a - b + c = expected
# Input: Baghdad | Irac | Bangkok | Thai_Lan
# Baghdad - Irac = Bangkok - Thai_Lan
# -> Baghdad - Irac + Thai_Lan = Bangkok
# =>
a, b, expected, c = [word for word in line.split(" | ")]
except ValueError:
logger.debug("SVX: ERROR skipping invalid line #%i in %s", line_no, word_analogies_file)
# a, b, c and expected are not (re)bound when the unpacking fails, so only the raw line is printed
print("Line : ", line)
# input(">>> Wait ...")
continue
# In case of Vietnamese, word analogy can be a phrase
if " " in expected:
print("INFO: we don't support to find word analogies for phrase for NOW.")
phrase_cnt += 1
continue
elif " " in a or " " in b or " " in c:
is_vn_counter += 1
word_left1 = get_avg_vector(a, words)
word_left2 = get_avg_vector(b, words)
word_right2 = get_avg_vector(c, words)
else:
word_left1 = find_word(a, words)
word_left2 = find_word(b, words)
word_right2 = find_word(c, words)
if (not word_left1) or (not word_left2) or (not word_right2):
logger.debug("SVX: skipping line #%i with OOV words: %s", line_no, line.strip())
oov_counter += 1
continue
# Write solvable analogies to a file
# fsolveable_writer.write(line + "\n")
logger.debug("word_left1 = %s", word_left1.text)
logger.debug("word_left2 = %s", word_left2.text)
logger.debug("word_right2 = %s", word_right2.text)
# Start finding close word:
# Note: we can only find 1 expected word in Vietnamese for NOW
top10_candidate = closest_analogies_vectors(word_left2, word_left1,
word_right2, words)
list_candidate_arr = []
for tuple in top10_candidate:
list_candidate_arr.append(tuple[1].text)
logger.debug("Expected Word: %s, candidate = %s" % (expected, list_candidate_arr))
# input(">>>>>")
# Calculate MAP@10 score
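# Wrap the single expected word in a list so it is matched as a whole word
# (a bare string would be cut to 10 characters and matched by substring)
this_map_result = eval_utils.mapk([expected], list_candidate_arr, word_level=True)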
if this_map_result >= 0:
this_map_result = round(this_map_result, 6)
# map_arr[idx_cnt] = this_map_result
else:
this_map_result = 0.0
# map_arr.append(0.0)
# map_arr[idx_cnt] = this_map_result
map_ret_dict[idx_cnt] = this_map_result
if expected in list_candidate_arr:
section['correct'].append((a, b, c, expected))
out_line = "%s - %s + %s = ?; Expect: %s, candidate: %s" % \
(word_left1, word_left2, word_right2, expected, list_candidate_arr)
out_str += out_line + "\n"
# else:
# section['incorrect'].append((a, b, c, expected))
# fsolveable_writer.close()
if section:
# store the last section, too
sections.append(section)
map_arr = list(map_ret_dict.values())
logger.debug("map_arr = %s", map_arr)
logger.debug("MAP_RET_DICT = %s", map_ret_dict)
# input("Check result dict: >>>>>")
total = {
"Emb_Name: " + embedding_name + '/OOV/Total/VN_Solveable_Cases/VN_Phrase_Target':
[oov_counter, (idx_cnt + 1), is_vn_counter, phrase_cnt],
'MAP@10': np.mean(map_arr)
# ,
# 'section': 'total'
# ,
# 'correct': sum((s['correct'] for s in sections), []),
# 'incorrect': sum((s['incorrect'] for s in sections), []),
}
# print (out_str)
# print(total)
# logger.info(total)
sections.append(total)
sections_str = "\n%s\n" % sections
return np.mean(map_arr), map_arr, sections_str
def print_analogy(left2: str, left1: str, right2: str, words: List[Word]) -> None:
analogies = closest_analogies_OLD(left2, left1, right2, words)
if (len(analogies) == 0):
# print(f"{left2}-{left1} is like {right2}-?")
print("%s-%s is like %s-?"%(left2, left1, right2))
# man-king is like woman-king
# input: man is to king is like woman is to ___?(queen).
else:
(dist, w) = analogies[0]
# alternatives = ', '.join([f"{w.text} ({dist})" for (dist, w) in analogies])
# print(f"{left2}-{left1} is like {right2}-{w.text}")
print("%s-%s is like %s-%s"%(left2, left1, right2, w.text))
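The analogy functions above rank candidates by similarity to `left1 - left2 + right2`. A minimal sketch of that arithmetic on a toy vocabulary follows. It is not the repository's implementation: plain lists replace NumPy arrays, the names `analogy` and `toy` and all vector values are made up for illustration, and the three query words are excluded from the candidates, which `closest_analogies_vectors` currently does not do.

```python
import math
from typing import Dict, List, Tuple


def normalize(v: List[float]) -> List[float]:
    """Scale v to unit L2 length (assumes v is not the zero vector)."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v]


def analogy(a: str, b: str, c: str, emb: Dict[str, List[float]], topn: int = 3) -> List[Tuple[float, str]]:
    """Rank words by cosine similarity to emb[a] - emb[b] + emb[c]."""
    target = normalize([x - y + z for x, y, z in zip(emb[a], emb[b], emb[c])])
    scored = [
        (sum(t * w for t, w in zip(target, normalize(vec))), word)
        for word, vec in emb.items()
        if word not in (a, b, c)  # assumption: skip the query words themselves
    ]
    return sorted(scored, reverse=True)[:topn]


# Hypothetical 2-D embedding, chosen so that king - man + woman lands on queen.
toy = {
    "king": [0.9, 0.8],
    "man": [0.9, 0.1],
    "woman": [0.1, 0.1],
    "queen": [0.1, 0.8],
    "apple": [-0.5, -0.5],
}
best = analogy("king", "man", "woman", toy)
print(best[0][1])
```

Normalizing both sides makes the dot product a true cosine, so the ranking does not depend on vector length.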
================================================
FILE: src/codes/utils/embedding_io.py
================================================
from typing import Iterable, List, Set
from itertools import groupby
import numpy as np
import re
import utils.vectors as v
from utils.word import Word
import logging
import os
from embeddings.embedding_configs import EmbeddingConfigs
def save_model_to_file(embedding_model: List[Word], model_file_out: str):
"""
Save loaded model back to file (to remove duplicated items).
:param embedding_model:
:param model_file_out:
:return:
"""
fwriter = open(model_file_out, "w")
meta_data = "%s %s\n"%(len(embedding_model), len(embedding_model[0].vector))
fwriter.write(meta_data)
fwriter.flush()
for w_Word in embedding_model:
line = w_Word.text + " " + " ".join(str(scalar) for scalar in w_Word.vector.tolist())
fwriter.write(line + "\n")
fwriter.flush()
fwriter.close()
def load_word_embeddings(file_paths: str, emb_config: EmbeddingConfigs):
    """
    Sonvx: load one or more embeddings: e.g., <emb_file1>;<emb_file2>
    :param file_paths: a single path, or several paths separated by ";"
    :param emb_config:
    :return: tuple (embedding_names, embedding_models)
    """
    embedding_models = []
    embedding_names = []
    # A single path (no ";") is handled as a one-element list
    files = file_paths.split(";") if file_paths else []
    for emb_file in files:
        emb_file = emb_file.replace("\"", "")
        word_embedding = load_word_embedding(emb_file, emb_config)
        embedding_name = os.path.basename(os.path.normpath(emb_file))
        embedding_models.append(word_embedding)
        embedding_names.append(embedding_name)
    return embedding_names, embedding_models
def load_word_embedding(file_path: str, emb_config: EmbeddingConfigs) -> List[Word]:
"""
Load and cleanup the data.
:param file_path:
:param emb_config:
:return:
"""
# print(f"Loading {file_path}...")
print("Loading %s ..."%(file_path))
words = load_words_raw(file_path, emb_config)
# print(f"Loaded {len(words)} words.")
print("Loaded %s words." %(len(words)))
# Test
word1 = words[1]
print("Vec Len(word1) = ", len(word1.vector))
# num_dimensions = most_common_dimension(words)
# words = [w for w in words if len(w.vector) == dims]
# print(f"Using {num_dimensions}-dimensional vectors, {len(words)} remain.")
# words = remove_stop_words(words)
# print(f"Removed stop words, {len(words)} remain.")
# ords = remove_duplicates(words)
# print(f"Removed duplicates, {len(words)} remain.")
logging.debug("Embedding words: %s", words[:10])
print("Emb_vocab_size = ", len(words))
# input("Done loading embedding: >>>>")
return words
def load_words_raw(file_path: str, emb_config: EmbeddingConfigs) -> List[Word]:
"""
Load the file as-is, without doing any validation or cleanup.
:param file_path:
:param emb_config:
:return:
"""
def parse_line(line: str, frequency: int) -> Word:
# print("Line=", line)
tokens = line.split(" ")
word = tokens[0]
if emb_config.do_normalize_emb:
vector = v.normalize(np.array([float(x) for x in tokens[1:]]))
else:
vector = np.array([float(x) for x in tokens[1:]])
return Word(word, vector, frequency)
# Sonvx: NOT loading the same word twice.
unique_dict = {}
words = []
# Words are sorted from the most common to the least common ones
frequency = 1
duplicated_entry = 0
idx_counter, vocab_size, emb_dim = 0, 0, 0
with open(file_path) as f:
for line in f:
line = line.rstrip()
# print("Processing line: ", line)
if idx_counter == 0 and emb_config.is_word2vec_format:
try:
meta_info = line.split(" ")
vocab_size = int(meta_info[0])
emb_dim = int(meta_info[1])
idx_counter += 1
continue
except Exception as e:
print("meta_info = %s" % (meta_info))
logging.error("Input embedding has format issue: Error = %s" % (e))
# if len(line) < 20: # Ignore the first line of w2v format.
# continue
w = parse_line(line, frequency)
# Svx: only load the word if it is not already in the list.
if w.text not in unique_dict:
unique_dict[w.text] = frequency
words.append(w)
frequency += 1
else:
duplicated_entry += 1
# print("Loading the same word again")
# # Svx: check if the embedding dim is the same with the metadata, random check only
if idx_counter == 10:
if len(w.vector) != emb_dim:
message = "Metadata and the real vector size do not match: meta:real = %s:%s" \
% (emb_dim, len(w.vector))
logging.error(message)
raise ValueError(message)
idx_counter += 1
if duplicated_entry > 0:
logging.debug("Loading the same word again: %s"%(duplicated_entry))
# Final check:
if (frequency - 1) != vocab_size:
msg = "Loaded %s/%s unique vocab." % ((frequency - 1), vocab_size)
logging.info(msg)
return words
def iter_len(iter: Iterable[complex]) -> int:
return sum(1 for _ in iter)
def most_common_dimension(words: List[Word]) -> int:
"""
There is a line in the input file which is missing a word
(search -0.0739, -0.135, 0.0584).
"""
lengths = sorted([len(word.vector) for word in words])
dimensions = [(k, iter_len(v)) for k, v in groupby(lengths)]
print("Dimensions:")
for (dim, num_vectors) in dimensions:
# print(f"{num_vectors} {dim}-dimensional vectors")
print("%s %s-dimensional vectors"%(num_vectors, dim))
most_common = sorted(dimensions, key=lambda t: t[1], reverse=True)[0]
return most_common[0]
# We want to ignore these characters,
# so that e.g. "U.S.", "U.S", "US_" and "US" are the same word.
ignore_char_regex = re.compile(r"[\W_]")
# Has to start and end with an alphanumeric character
is_valid_word = re.compile(r"^[^\W_].*[^\W_]$")
def remove_duplicates(words: List[Word]) -> List[Word]:
seen_words: Set[str] = set()
unique_words: List[Word] = []
for w in words:
canonical = ignore_char_regex.sub("", w.text)
if canonical not in seen_words:
seen_words.add(canonical)
# Keep the original ordering
unique_words.append(w)
return unique_words
def remove_stop_words(words: List[Word]) -> List[Word]:
return [w for w in words if (
len(w.text) > 1 and is_valid_word.match(w.text))]
# Run "smoke tests" on import
assert [w.text for w in remove_stop_words([
Word('a', [], 1),
Word('ab', [], 1),
Word('-ab', [], 1),
Word('ab_', [], 1),
Word('a.', [], 1),
Word('.a', [], 1),
Word('ab', [], 1),
])] == ['ab', 'ab']
assert [w.text for w in remove_duplicates([
Word('a.b', [], 1),
Word('-a-b', [], 1),
Word('ab_+', [], 1),
Word('.abc...', [], 1),
])] == ['a.b', '.abc...']
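The loader above expects the word2vec text layout: a header line with the vocabulary size and the dimension, then one word and its vector per line. A minimal, dependency-free sketch of that parsing logic follows. The function name `read_word2vec_text` and the sample data are hypothetical, plain lists stand in for the NumPy arrays used by the repository, and every line is checked against the declared dimension, whereas `load_words_raw` only spot-checks one line.

```python
import io
from typing import Dict, List, TextIO, Tuple


def read_word2vec_text(stream: TextIO) -> Tuple[Dict[str, List[float]], int]:
    """Parse word2vec text format; return (vectors, number of duplicate entries)."""
    header = stream.readline().split()
    # The header holds "<vocab_size> <dimension>"; only the dimension is enforced here.
    _declared_size, dim = int(header[0]), int(header[1])
    vectors: Dict[str, List[float]] = {}
    duplicates = 0
    for line_no, line in enumerate(stream, start=2):
        tokens = line.rstrip().split(" ")
        if len(tokens) <= 1:
            continue  # blank line or a word without a vector
        word, values = tokens[0], tokens[1:]
        if len(values) != dim:
            raise ValueError("line %d: expected %d values, got %d" % (line_no, dim, len(values)))
        if word in vectors:
            duplicates += 1  # keep the first occurrence, as the loader above does
            continue
        vectors[word] = [float(x) for x in values]
    return vectors, duplicates


sample = io.StringIO("3 2\nheo 0.1 0.2\nga 0.3 0.4\nheo 0.5 0.6\n")
vectors, duplicates = read_word2vec_text(sample)
print(len(vectors), duplicates)
```

Checking every line costs little and reports a malformed row at the line where it occurs instead of later, during vector arithmetic.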
================================================
FILE: src/codes/utils/eval_utils.py
================================================
"""
MAP@K word level and character level are explained in detail in this paper:
dpUGC: Learn Differentially Private Representation for User Generated Contents
Xuan-Son Vu, Son N. Tran, Lili Jiang
In: Proceedings of the 20th International Conference on Computational Linguistics and
Intelligent Text Processing, April, 2019, (to appear)
Please cite the above paper if you use codes in this file.
"""
def apk(actual, predicted, k=10):
"""
Computes the average precision at k.
This function computes the average precision at k between two lists of
items.
Parameters
----------
actual : list
A list of elements that are to be predicted (order doesn't matter)
predicted : list
A list of predicted elements (order does matter)
k : int, optional
The maximum number of predicted elements
Returns
-------
score : double
The average precision at k over the input lists
"""
if len(predicted) > k:
predicted = predicted[:k]
score = 0.0
num_hits = 0.0
for i, p in enumerate(predicted):
if p in actual and p not in predicted[:i]:
num_hits += 1.0
score += num_hits / (i + 1.0)
if not actual:
return 0.0
return score / min(len(actual), k)
def mapk(actual, predicted, k=10, word_level=True):
"""
Computes the mean average precision at k.
This function computes the mean average precision at k between two lists
of lists of items.
Parameters
----------
actual : list
A list of lists of elements that are to be predicted
(order doesn't matter in the lists)
predicted : list
A list of lists of predicted elements
(order matters in the lists)
k : int, optional
The maximum number of predicted elements
Returns
-------
score : double
The mean average precision at k over the input lists
"""
# print("Sending arr = ", arr)
if word_level:
return calc_map(actual, predicted, topK=k)
else:
# arr = [apk(a, p, k) for a, p in zip(actual, predicted)]
# return np.mean(arr)
return calc_map_character_level(actual, predicted, topK=k)
def calc_map(actual, predicted, topK=10):
"""
:param actual:
:param predicted:
:param topK:
:return:
"""
# print("Input: actual %s, predicted %s"%(actual, predicted))
if len(predicted) > topK:
predicted = predicted[:topK]
idx = 1
hit = 0
map_arr = []
for answer in predicted:
if answer in actual[:topK]:
hit += 1
val = (hit * 1.0) / (idx * 1.0)
# print("hit = %s, idx = %s"%(hit, idx))
map_arr.append(val)
# print("hit: %s, map_arr = %s"%(answer, map_arr))
idx += 1
# print("map_arr = %s done", map_arr)
if len(map_arr) > 0:
return np.mean(map_arr)
else:
return 0.0
def calc_map_character_level(actual, predicted, topK=10):
"""
:param actual:
:param predicted:
:param topK:
:return:
"""
# print("Input: actual %s, predicted %s" % (actual, predicted))
if len(predicted) > topK:
predicted = predicted[:topK]
if len(actual) > topK:
actual = actual[:topK]
rank = 1
hit = 0
actual_seq = ''.join([word for word in actual])
predicted_seq = ''.join([word for word in predicted])
map_arr = []
for char in predicted_seq:
if char in actual_seq[:rank]:
hit += 1
val = (hit * 1.0) / (rank * 1.0)
# print("hit = %s, idx = %s" % (hit, rank))
map_arr.append(val)
# print("hit: %s, map_arr = %s" % (char, map_arr))
rank += 1
# print("map_arr = %s done", map_arr)
return np.mean(map_arr)
import unittest
import numpy as np
def test_apk(self):
self.assertAlmostEqual(apk(range(1, 6), [6, 4, 7, 1, 2], 2), 0.25)
self.assertAlmostEqual(apk(range(1, 6), [1, 1, 1, 1, 1], 5), 0.2)
predicted = list(range(1, 21))
predicted.extend(range(200, 600))
self.assertAlmostEqual(apk(range(1, 100), predicted, 20), 1.0)
def test_mapk(self):
self.assertAlmostEqual(mapk([range(1, 5)], [range(1, 5)], 3), 1.0)
self.assertAlmostEqual(mapk([[1, 3, 4], [1, 2, 4], [1, 3]],
[range(1, 6), range(1, 6), range(1, 6)], 3), 0.685185185185185)
self.assertAlmostEqual(mapk([range(1, 6), range(1, 6)],
[[6, 4, 7, 1, 2], [1, 1, 1, 1, 1]], 5), 0.26)
self.assertAlmostEqual(mapk([[1, 3], [1, 2, 3], [1, 2, 3]],
[range(1, 6), [1, 1, 1], [1, 2, 1]], 3), 11.0 / 18)
if __name__ == '__main__':
a1 = ["1", '2', '3', '4']
b1 = ['1', '5', '2', '8']
print(mapk(a1, b1, 4))
a1 = ["15"]
b1 = ["1", "2", "3", "4", "5","6","7","8","9","10"]
print("MapK:", mapk(a1, b1, 4))
# unittest.main()
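A worked example may make the average-precision computation in `apk` easier to follow. The sketch below restates the same logic in a standalone function so it runs on its own. The name `average_precision_at_k` and the analogy-style inputs in the last call are illustrative, while the first two calls reuse the inputs from `test_apk`.

```python
from typing import Sequence


def average_precision_at_k(actual: Sequence, predicted: Sequence, k: int = 10) -> float:
    """Average precision over the first k predictions, counting each hit once."""
    predicted = list(predicted)[:k]
    if not actual:
        return 0.0
    score, hits = 0.0, 0
    for i, p in enumerate(predicted):
        # A repeated prediction does not count as a second hit.
        if p in actual and p not in predicted[:i]:
            hits += 1
            score += hits / (i + 1)
    return score / min(len(actual), k)


# Only rank 2 is a hit within the top 2: (1/2) / min(5, 2) = 0.25
first = average_precision_at_k(range(1, 6), [6, 4, 7, 1, 2], k=2)
# The repeated "1" counts once: (1/1) / min(5, 5) = 0.2
second = average_precision_at_k(range(1, 6), [1, 1, 1, 1, 1], k=5)
# One expected word found at rank 2: (1/2) / min(1, 10) = 0.5
third = average_precision_at_k(["queen"], ["king", "queen", "woman"])
print(first, second, third)
```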
================================================
FILE: src/codes/utils/file_utils.py
================================================
import pickle
def save_obj(obj, file_path):
with open(file_path + '.pkl', 'wb') as f:
pickle.dump(obj, f, pickle.HIGHEST_PROTOCOL)
def load_obj(file_path):
with open(file_path + '.pkl', 'rb') as f:
return pickle.load(f)
def get_unique_vocab(analogy_file_path, write_out_file):
    """
    Collect the unique words of an analogy file and write them out, one per line.
    :param analogy_file_path: analogy file with " | "-separated words
    :param write_out_file: path of the vocabulary file to write
    :return: dict whose keys are the unique words
    """
    vocab_dict = {}
    with open(analogy_file_path, "r") as freader:
        for line in freader:
            # Section headers (": | <name>") also contain " | " but hold no vocabulary
            if line.startswith(": |"):
                continue
            if " | " in line:
                for word in line.split(" | "):
                    vocab_dict[word.rstrip()] = 0
    with open(write_out_file, "w") as fwriter:
        for word in vocab_dict.keys():
            fwriter.write(word + "\n")
    print("Write dictionary file to %s"%(write_out_file))
    return vocab_dict
if __name__ == '__main__':
get_unique_vocab("../data/embedding_analogies/portuguese/LX-4WAnalogies-ETNLP.txt",
"../data/embedding_analogies/portuguese/vocab.txt")
================================================
FILE: src/codes/utils/string_utils.py
================================================
import six
def convert_to_unicode(text):
"""Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
if six.PY3:
if isinstance(text, str):
return text
elif isinstance(text, bytes):
return text.decode("utf-8", "ignore")
else:
raise ValueError("Unsupported string type: %s" % (type(text)))
elif six.PY2:
if isinstance(text, str):
return text.decode("utf-8", "ignore")
elif isinstance(text, unicode):
return text
else:
raise ValueError("Unsupported string type: %s" % (type(text)))
else:
raise ValueError("Not running on Python2 or Python 3?")
================================================
FILE: src/codes/utils/vectors.py
================================================
from typing import List, Any, Optional
import math
import numpy as np
# Adapted from https://github.com/mkonicek/nlp/vecters.py
# Vector = np.ndarray[float]
Vector = 'np.ndarray[float]'
vector_type = 'np.ndarray[float]'
# Vector = np.ndarray(dtype=float)
def l2_len(v: vector_type) -> float:
return math.sqrt(np.dot(v, v))
def dot(v1: vector_type, v2: vector_type) -> float:
assert v1.shape == v2.shape
return np.dot(v1, v2)
def mean(v1: vector_type, v2: vector_type) -> Vector:
"""
Added by Sonvx: get mean of 2 vectors.
:param v1:
:param v2:
:return:
"""
assert v1.shape == v2.shape
return np.mean([v1, v2], axis=0)
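def mean_list(v1: List[Vector]) -> Optional[Vector]:
    """
    Added by Sonvx: get the element-wise mean of a list of vectors.
    :param v1: list of vectors with identical shapes
    :return: the mean vector, or None if the list is empty
    """
    if len(v1) > 0:
        return np.mean(v1, axis=0)
    else:
        return None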
def add(v1: vector_type, v2: vector_type) -> Vector:
assert v1.shape == v2.shape
return np.add(v1, v2)
def sub(v1: vector_type, v2: vector_type) -> Vector:
assert v1.shape == v2.shape
return np.subtract(v1, v2)
def normalize(v: vector_type) -> Vector:
return v / l2_len(v)
def cosine_similarity_normalized(v1: vector_type, v2: vector_type) -> float:
"""
Returns the cosine of the angle between the two vectors.
Each of the vectors must have length (L2-norm) equal to 1.
Results range from -1 (very different) to 1 (very similar).
"""
return dot(v1, v2)
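`cosine_similarity_normalized` relies on both inputs having unit length, and `normalize` divides by the L2 norm without guarding against a zero vector. The sketch below shows one possible way to make both assumptions explicit. The helper names `safe_normalize` and `cosine` are hypothetical, and plain lists replace NumPy arrays so the snippet has no dependencies.

```python
import math
from typing import List, Optional


def safe_normalize(v: List[float]) -> Optional[List[float]]:
    """Return v scaled to unit L2 length, or None for the zero vector."""
    length = math.sqrt(sum(x * x for x in v))
    if length == 0.0:
        return None
    return [x / length for x in v]


def cosine(v1: List[float], v2: List[float]) -> Optional[float]:
    """Cosine of the angle between v1 and v2, or None if either is the zero vector."""
    u1, u2 = safe_normalize(v1), safe_normalize(v2)
    if u1 is None or u2 is None:
        return None
    # For unit vectors the dot product is the cosine similarity.
    return sum(a * b for a, b in zip(u1, u2))


parallel = cosine([3.0, 4.0], [6.0, 8.0])     # same direction, different lengths
orthogonal = cosine([1.0, 0.0], [0.0, 1.0])   # perpendicular vectors
undefined = cosine([0.0, 0.0], [1.0, 0.0])    # zero vector has no direction
print(parallel, orthogonal, undefined)
```

Returning `None` for the zero vector is a design choice made for this sketch; raising an error would be an equally reasonable convention.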
================================================
FILE: src/codes/utils/word.py
================================================
from typing import List
from utils.vectors import Vector
# Adapted from https://github.com/mkonicek/nlp/Word.py
class Word:
"""A single word (one line of the input vector embedding file)"""
def __init__(self, text: str, vector: Vector, frequency: int) -> None:
self.text = text
self.vector = vector
self.frequency = frequency
def __repr__(self) -> str:
vector_preview = ', '.join(map(str, self.vector[:2]))
# return f"{self.text} [{vector_preview}, ...]"
return "%s [%s, ...]"%(self.text, vector_preview)
================================================
FILE: src/codes/visualizer/README.md
================================================
# Requirements:
- ```pip install gensim flask```
- Download any pre-trained embeddings and set their paths in ../03.run_etnlp_visualizer_inter.sh
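
The visualizer takes its embedding files as one argument in which the paths are separated by `;` (see the usage message in `visualizer_sbs.py`). The sketch below shows roughly how such an argument maps to display names and paths. The helper name `split_embedding_arg` and the two example paths are assumptions for illustration, not part of the tool.

```python
import os
from typing import List, Tuple


def split_embedding_arg(arg: str) -> Tuple[List[str], List[str]]:
    """Split "<emb_file1>;<emb_file2>;..." into (display names, file paths)."""
    # Drop empty segments and surrounding quotes left over from shell quoting.
    paths = [p.strip().strip('"') for p in arg.split(";") if p.strip()]
    names = [os.path.basename(os.path.normpath(p)) for p in paths]
    return names, paths


names, paths = split_embedding_arg(
    "../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/MULTI_23.vec"
)
print(names)
```

Each file name becomes one column header in the side-by-side view.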
# How to run
> 1. sh ../03.run_etnlp_visualizer_inter.sh
> 2. Visit http://localhost:8089
# Screenshot

================================================
FILE: src/codes/visualizer/__init__.py
================================================
================================================
FILE: src/codes/visualizer/outof_w2vec.dict
================================================
'news'
'news'
'news'
'news'
'news'
'news'
'news'
'back'
'back'
'back'
'back'
'news'
'news'
'back'
'back'
'back'
'back'
'news'
'news'
'back'
'back'
'back'
'back'
'news'
'news'
'lovely'
'lovely'
'lovely'
'lovely'
'love'
'love'
================================================
FILE: src/codes/visualizer/static/style.css
================================================
.container-4{
overflow: hidden;
width: 300px;
vertical-align: middle;
white-space: nowrap;
}
.container-4 input#search{
width: 300px;
height: 50px;
background: #2b303b;
border: none;
font-size: 10pt;
float: left;
color: #fff;
padding-left: 15px;
-webkit-border-radius: 5px;
-moz-border-radius: 5px;
border-radius: 5px;
}
.container-4 input#search::-webkit-input-placeholder {
color: #65737e;
}
.container-4 input#search:-moz-placeholder { /* Firefox 18- */
color: #65737e;
}
.container-4 input#search::-moz-placeholder { /* Firefox 19+ */
color: #65737e;
}
.container-4 input#search:-ms-input-placeholder {
color: #65737e;
}
.container-4 button.icon{
-webkit-border-top-right-radius: 5px;
-webkit-border-bottom-right-radius: 5px;
-moz-border-radius-topright: 5px;
-moz-border-radius-bottomright: 5px;
border-top-right-radius: 5px;
border-bottom-right-radius: 5px;
border: none;
background: #232833;
height: 50px;
width: 50px;
color: #4f5b66;
opacity: 0;
font-size: 10pt;
-webkit-transition: all .55s ease;
-moz-transition: all .55s ease;
-ms-transition: all .55s ease;
-o-transition: all .55s ease;
transition: all .55s ease;
}
.container-4:hover button.icon, .container-4:active button.icon, .container-4:focus button.icon{
outline: none;
opacity: 1;
margin-left: -50px;
}
.container-4:hover button.icon:hover{
background: white;
}
div#answers {
background-color: #f2f2f2;
padding-top: 2px;
padding-bottom: 2px;
padding-left: 100px;
}
================================================
FILE: src/codes/visualizer/templates/app.html
================================================
<!DOCTYPE html>
<html lang="en">
<head>
<link rel="stylesheet" type="text/css" href="static/style.css">
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>
<link href="//maxcdn.bootstrapcdn.com/font-awesome/4.1.0/css/font-awesome.min.css" rel="stylesheet">
<div class="box">
<div class="container-4">
<input type="search" id="search" placeholder="Search..." />
<button class="icon"><i class="fa fa-search"></i></button>
</div>
</div>
</body>
</html>
================================================
FILE: src/codes/visualizer/templates/search.html
================================================
{% block content %}
<div class="search">
<div class="container">
<form action="/search" method="post" role="form">
<div class="form-group">
<label for="name">Search:</label>
<input type="text" class="form-control" id="name" name="search"
placeholder="Type a word, e.g., heo"
value="{{ request.form.search}}">
</div>
<button type="submit" class="btn btn-success">Search</button>
</form>
</div>
</div>
<html>
<head>
<link rel="stylesheet" type="text/css" href="static/style.css">
<meta charset="UTF-8">
<title>ETNLP's Side-by-Side Visualizer</title>
<link rel="stylesheet" media="screen" href ="static/bootstrap.min.css">
<link rel="stylesheet" href="static/bootstrap-theme.min.css">
<meta name="viewport" content = "width=device-width, initial-scale=1.0">
<style>
.answers {
padding-top: 20px;
padding-bottom: 100px;
padding-left: 50px;
}
.testimonial-group > .row {
overflow-x: auto;
white-space: nowrap;
}
.testimonial-group > .row > .col-xs-4 {
display: inline-block;
float: none;
}
</style>
</head>
<div class="answers testimonial-group">
{% for emb_name in embedding_names_arr %}
<table style="float: left">
<tr><td> </td></tr>
<tr><td> </td></tr>
<tr><td>{{ emb_name }}</td></tr>
{% for page in output_arr[loop.index0] %}
<tr><td>{{ page }}</td></tr>
{% endfor %}
</table>
{% endfor %}
</div>
{% for message in get_flashed_messages() %}
<div class=flash>
{{ message }}
</div>
{% endfor %}
{% endblock %}
</html>
================================================
FILE: src/codes/visualizer/visualizer_sbs.py
================================================
from flask import Flask, render_template
from flask import request
import gensim
from distutils.version import LooseVersion
from utils import string_utils
import sys
app = Flask(__name__)
app.config.from_object(__name__)
app.config['SECRET_KEY'] = '7d441f27d441f27567d441f2b6176a'
global embedding_models
@app.route('/search', methods=['GET', 'POST'])
def search():
    """
    Get the input query and return the list of top similar words in all embeddings.
    :return: the rendered search page
    """
    # Stays empty on GET requests, so the template renders without results
    output_arr = []
    if request.method == "POST":
        query = request.values['search'] or ''
        # query = unicode(query, "utf-8")
        # query = query.decode().encode("utf-8")
        # Python 2.7
        try:
            # Old
            # query = unicode(query).lower()
            query = string_utils.convert_to_unicode(query)
        except Exception as e:
            raise Exception("Something went wrong: msg = %s, query = %s."%(e, query))
        print('query = ' + query)
        for embedding_model in embedding_models:
            try:
                output = []
                sim_list = embedding_model.most_similar(query, topn=50)
                for wordsimilar in sim_list:
                    output.append(wordsimilar[0] + ' - ' + str(round(wordsimilar[1], 6)))
                output_arr.append(output)
            except Exception as e:
                # Keep one list per embedding so the template shows the message as a single row
                output_arr.append(['Err: %s, Not found query = %s' % (e, query)])
    return render_template('search.html',
                           embedding_names_arr=embedding_names_arr,
                           output_arr=output_arr
                           )
@app.route("/")
def get_index():
return render_template('search.html')
@app.route("/multi_search")
def multi_search():
return render_template('multi_search.html')
if __name__ == "__main__":
import os
dir_path = os.path.dirname(os.path.realpath(__file__))
# download pre-trained_models at https://github.com/vietnlp/etnlp
if len(sys.argv) < 2:
print("Missing input arguments. Input format: ./*.py <emb_file1;emb_file2;...>. Exiting ...")
sys.exit(1)
if sys.argv[1].__contains__(";"):
model_files = sys.argv[1].split(";")
else:
model_files = [sys.argv[1]]
embedding_names_arr = [os.path.basename(file_path) for file_path in model_files]
embedding_models = []
idx = 0
for model in model_files:
# model = root_dir + model
if os.path.isfile(model):
print('Loading embedding model ... %s' % (idx))
isBinary = False
if model.endswith(".bin"):
isBinary = True
if LooseVersion(gensim.__version__) >= LooseVersion("1.0.1"):
from gensim.models import KeyedVectors
embedding_models.append(KeyedVectors.load_word2vec_format(model, binary=isBinary))
else:
from gensim.models import Word2Vec
embedding_models.append(Word2Vec.load_word2vec_format(model, binary=isBinary))
idx += 1
else:
print(
"Download word2vec model and put into ../data/. File: https://github.com/vietnlp/etnlp")
app.run(debug=False, port=8089, host='0.0.0.0')
================================================
FILE: src/data/embedding_analogies/english/english-word-analogy.txt
================================================
: | capital-common-countries
Athens | Greece | Baghdad | Iraq
Athens | Greece | Bangkok | Thailand
Athens | Greece | Beijing | China
Athens | Greece | Berlin | Germany
Athens | Greece | Bern | Switzerland
Athens | Greece | Cairo | Egypt
Athens | Greece | Canberra | Australia
Athens | Greece | Hanoi | Vietnam
Athens | Greece | Havana | Cuba
Athens | Greece | Helsinki | Finland
Athens | Greece | Islamabad | Pakistan
Athens | Greece | Kabul | Afghanistan
Athens | Greece | London | England
Athens | Greece | Madrid | Spain
Athens | Greece | Moscow | Russia
Athens | Greece | Oslo | Norway
Athens | Greece | Ottawa | Canada
Athens | Greece | Paris | France
Athens | Greece | Rome | Italy
Athens | Greece | Stockholm | Sweden
Athens | Greece | Tehran | Iran
Athens | Greece | Tokyo | Japan
Baghdad | Iraq | Bangkok | Thailand
Baghdad | Iraq | Beijing | China
Baghdad | Iraq | Berlin | Germany
Baghdad | Iraq | Bern | Switzerland
Baghdad | Iraq | Cairo | Egypt
Baghdad | Iraq | Canberra | Australia
Baghdad | Iraq | Hanoi | Vietnam
Baghdad | Iraq | Havana | Cuba
Baghdad | Iraq | Helsinki | Finland
Baghdad | Iraq | Islamabad | Pakistan
Baghdad | Iraq | Kabul | Afghanistan
Baghdad | Iraq | London | England
Baghdad | Iraq | Madrid | Spain
Baghdad | Iraq | Moscow | Russia
Baghdad | Iraq | Oslo | Norway
Baghdad | Iraq | Ottawa | Canada
Baghdad | Iraq | Paris | France
Baghdad | Iraq | Rome | Italy
Baghdad | Iraq | Stockholm | Sweden
Baghdad | Iraq | Tehran | Iran
Baghdad | Iraq | Tokyo | Japan
Baghdad | Iraq | Athens | Greece
Bangkok | Thailand | Beijing | China
Bangkok | Thailand | Berlin | Germany
Bangkok | Thailand | Bern | Switzerland
Bangkok | Thailand | Cairo | Egypt
Bangkok | Thailand | Canberra | Australia
Bangkok | Thailand | Hanoi | Vietnam
Bangkok | Thailand | Havana | Cuba
Bangkok | Thailand | Helsinki | Finland
Bangkok | Thailand | Islamabad | Pakistan
Bangkok | Thailand | Kabul | Afghanistan
Bangkok | Thailand | London | England
Bangkok | Thailand | Madrid | Spain
Bangkok | Thailand | Moscow | Russia
Bangkok | Thailand | Oslo | Norway
Bangkok | Thailand | Ottawa | Canada
Bangkok | Thailand | Paris | France
Bangkok | Thailand | Rome | Italy
Bangkok | Thailand | Stockholm | Sweden
Bangkok | Thailand | Tehran | Iran
Bangkok | Thailand | Tokyo | Japan
Bangkok | Thailand | Athens | Greece
Bangkok | Thailand | Baghdad | Iraq
Beijing | China | Berlin | Germany
Beijing | China | Bern | Switzerland
Beijing | China | Cairo | Egypt
Beijing | China | Canberra | Australia
Beijing | China | Hanoi | Vietnam
Beijing | China | Havana | Cuba
Beijing | China | Helsinki | Finland
Beijing | China | Islamabad | Pakistan
Beijing | China | Kabul | Afghanistan
Beijing | China | London | England
Beijing | China | Madrid | Spain
Beijing | China | Moscow | Russia
Beijing | China | Oslo | Norway
Beijing | China | Ottawa | Canada
Beijing | China | Paris | France
Beijing | China | Rome | Italy
Beijing | China | Stockholm | Sweden
Beijing | China | Tehran | Iran
Beijing | China | Tokyo | Japan
Beijing | China | Athens | Greece
Beijing | China | Baghdad | Iraq
Beijing | China | Bangkok | Thailand
Berlin | Germany | Bern | Switzerland
Berlin | Germany | Cairo | Egypt
Berlin | Germany | Canberra | Australia
Berlin | Germany | Hanoi | Vietnam
Berlin | Germany | Havana | Cuba
Berlin | Germany | Helsinki | Finland
Berlin | Germany | Islamabad | Pakistan
Berlin | Germany | Kabul | Afghanistan
Berlin | Germany | London | England
Berlin | Germany | Madrid | Spain
Berlin | Germany | Moscow | Russia
Berlin | Germany | Oslo | Norway
Berlin | Germany | Ottawa | Canada
Berlin | Germany | Paris | France
Berlin | Germany | Rome | Italy
Berlin | Germany | Stockholm | Sweden
Berlin | Germany | Tehran | Iran
Berlin | Germany | Tokyo | Japan
Berlin | Germany | Athens | Greece
Berlin | Germany | Baghdad | Iraq
Berlin | Germany | Bangkok | Thailand
Berlin | Germany | Beijing | China
Bern | Switzerland | Cairo | Egypt
Bern | Switzerland | Canberra | Australia
Bern | Switzerland | Hanoi | Vietnam
Bern | Switzerland | Havana | Cuba
Bern | Switzerland | Helsinki | Finland
Bern | Switzerland | Islamabad | Pakistan
Bern | Switzerland | Kabul | Afghanistan
Bern | Switzerland | London | England
Bern | Switzerland | Madrid | Spain
Bern | Switzerland | Moscow | Russia
Bern | Switzerland | Oslo | Norway
Bern | Switzerland | Ottawa | Canada
Bern | Switzerland | Paris | France
Bern | Switzerland | Rome | Italy
Bern | Switzerland | Stockholm | Sweden
Bern | Switzerland | Tehran | Iran
Bern | Switzerland | Tokyo | Japan
Bern | Switzerland | Athens | Greece
Bern | Switzerland | Baghdad | Iraq
Bern | Switzerland | Bangkok | Thailand
Bern | Switzerland | Beijing | China
Bern | Switzerland | Berlin | Germany
Cairo | Egypt | Canberra | Australia
Cairo | Egypt | Hanoi | Vietnam
Cairo | Egypt | Havana | Cuba
Cairo | Egypt | Helsinki | Finland
Cairo | Egypt | Islamabad | Pakistan
Cairo | Egypt | Kabul | Afghanistan
Cairo | Egypt | London | England
Cairo | Egypt | Madrid | Spain
Cairo | Egypt | Moscow | Russia
Cairo | Egypt | Oslo | Norway
Cairo | Egypt | Ottawa | Canada
Cairo | Egypt | Paris | France
Cairo | Egypt | Rome | Italy
Cairo | Egypt | Stockholm | Sweden
Cairo | Egypt | Tehran | Iran
Cairo | Egypt | Tokyo | Japan
Cairo | Egypt | Athens | Greece
Cairo | Egypt | Baghdad | Iraq
Cairo | Egypt | Bangkok | Thailand
Cairo | Egypt | Beijing | China
Cairo | Egypt | Berlin | Germany
Cairo | Egypt | Bern | Switzerland
Canberra | Australia | Hanoi | Vietnam
Canberra | Australia | Havana | Cuba
Canberra | Australia | Helsinki | Finland
Canberra | Australia | Islamabad | Pakistan
Canberra | Australia | Kabul | Afghanistan
Canberra | Australia | London | England
Canberra | Australia | Madrid | Spain
Canberra | Australia | Moscow | Russia
Canberra | Australia | Oslo | Norway
Canberra | Australia | Ottawa | Canada
Canberra | Australia | Paris | France
Canberra | Australia | Rome | Italy
Canberra | Australia | Stockholm | Sweden
Canberra | Australia | Tehran | Iran
Canberra | Australia | Tokyo | Japan
Canberra | Australia | Athens | Greece
Canberra | Australia | Baghdad | Iraq
Canberra | Australia | Bangkok | Thailand
Canberra | Australia | Beijing | China
Canberra | Australia | Berlin | Germany
Canberra | Australia | Bern | Switzerland
Canberra | Australia | Cairo | Egypt
Hanoi | Vietnam | Havana | Cuba
Hanoi | Vietnam | Helsinki | Finland
Hanoi | Vietnam | Islamabad | Pakistan
Hanoi | Vietnam | Kabul | Afghanistan
Hanoi | Vietnam | London | England
Hanoi | Vietnam | Madrid | Spain
Hanoi | Vietnam | Moscow | Russia
Hanoi | Vietnam | Oslo | Norway
Hanoi | Vietnam | Ottawa | Canada
Hanoi | Vietnam | Paris | France
Hanoi | Vietnam | Rome | Italy
Hanoi | Vietnam | Stockholm | Sweden
Hanoi | Vietnam | Tehran | Iran
Hanoi | Vietnam | Tokyo | Japan
Hanoi | Vietnam | Athens | Greece
Hanoi | Vietnam | Baghdad | Iraq
Hanoi | Vietnam | Bangkok | Thailand
Hanoi | Vietnam | Beijing | China
Hanoi | Vietnam | Berlin | Germany
Hanoi | Vietnam | Bern | Switzerland
Hanoi | Vietnam | Cairo | Egypt
Hanoi | Vietnam | Canberra | Australia
Havana | Cuba | Helsinki | Finland
Havana | Cuba | Islamabad | Pakistan
Havana | Cuba | Kabul | Afghanistan
Havana | Cuba | London | England
Havana | Cuba | Madrid | Spain
Havana | Cuba | Moscow | Russia
Havana | Cuba | Oslo | Norway
Havana | Cuba | Ottawa | Canada
Havana | Cuba | Paris | France
Havana | Cuba | Rome | Italy
Havana | Cuba | Stockholm | Sweden
Havana | Cuba | Tehran | Iran
Havana | Cuba | Tokyo | Japan
Havana | Cuba | Athens | Greece
Havana | Cuba | Baghdad | Iraq
Havana | Cuba | Bangkok | Thailand
Havana | Cuba | Beijing | China
Havana | Cuba | Berlin | Germany
Havana | Cuba | Bern | Switzerland
Havana | Cuba | Cairo | Egypt
Havana | Cuba | Canberra | Australia
Havana | Cuba | Hanoi | Vietnam
Helsinki | Finland | Islamabad | Pakistan
Helsinki | Finland | Kabul | Afghanistan
Helsinki | Finland | London | England
Helsinki | Finland | Madrid | Spain
Helsinki | Finland | Moscow | Russia
Helsinki | Finland | Oslo | Norway
Helsinki | Finland | Ottawa | Canada
Helsinki | Finland | Paris | France
Helsinki | Finland | Rome | Italy
Helsinki | Finland | Stockholm | Sweden
Helsinki | Finland | Tehran | Iran
Helsinki | Finland | Tokyo | Japan
Helsinki | Finland | Athens | Greece
Helsinki | Finland | Baghdad | Iraq
Helsinki | Finland | Bangkok | Thailand
Helsinki | Finland | Beijing | China
Helsinki | Finland | Berlin | Germany
Helsinki | Finland | Bern | Switzerland
Helsinki | Finland | Cairo | Egypt
Helsinki | Finland | Canberra | Australia
Helsinki | Finland | Hanoi | Vietnam
Helsinki | Finland | Havana | Cuba
Islamabad | Pakistan | Kabul | Afghanistan
Islamabad | Pakistan | London | England
Islamabad | Pakistan | Madrid | Spain
Islamabad | Pakistan | Moscow | Russia
Islamabad | Pakistan | Oslo | Norway
Islamabad | Pakistan | Ottawa | Canada
Islamabad | Pakistan | Paris | France
Islamabad | Pakistan | Rome | Italy
Islamabad | Pakistan | Stockholm | Sweden
Islamabad | Pakistan | Tehran | Iran
Islamabad | Pakistan | Tokyo | Japan
Islamabad | Pakistan | Athens | Greece
Islamabad | Pakistan | Baghdad | Iraq
Islamabad | Pakistan | Bangkok | Thailand
Islamabad | Pakistan | Beijing | China
Islamabad | Pakistan | Berlin | Germany
Islamabad | Pakistan | Bern | Switzerland
Islamabad | Pakistan | Cairo | Egypt
Islamabad | Pakistan | Canberra | Australia
Islamabad | Pakistan | Hanoi | Vietnam
Islamabad | Pakistan | Havana | Cuba
Islamabad | Pakistan | Helsinki | Finland
Kabul | Afghanistan | London | England
Kabul | Afghanistan | Madrid | Spain
Kabul | Afghanistan | Moscow | Russia
Kabul | Afghanistan | Oslo | Norway
Kabul | Afghanistan | Ottawa | Canada
Kabul | Afghanistan | Paris | France
Kabul | Afghanistan | Rome | Italy
Kabul | Afghanistan | Stockholm | Sweden
Kabul | Afghanistan | Tehran | Iran
Kabul | Afghanistan | Tokyo | Japan
Kabul | Afghanistan | Athens | Greece
Kabul | Afghanistan | Baghdad | Iraq
Kabul | Afghanistan | Bangkok | Thailand
Kabul | Afghanistan | Beijing | China
Kabul | Afghanistan | Berlin | Germany
Kabul | Afghanistan | Bern | Switzerland
Kabul | Afghanistan | Cairo | Egypt
Kabul | Afghanistan | Canberra | Australia
Kabul | Afghanistan | Hanoi | Vietnam
Kabul | Afghanistan | Havana | Cuba
Kabul | Afghanistan | Helsinki | Finland
Kabul | Afghanistan | Islamabad | Pakistan
London | England | Madrid | Spain
London | England | Moscow | Russia
London | England | Oslo | Norway
London | England | Ottawa | Canada
London | England | Paris | France
London | England | Rome | Italy
London | England | Stockholm | Sweden
London | England | Tehran | Iran
London | England | Tokyo | Japan
London | England | Athens | Greece
London | England | Baghdad | Iraq
London | England | Bangkok | Thailand
London | England | Beijing | China
London | England | Berlin | Germany
London | England | Bern | Switzerland
London | England | Cairo | Egypt
London | England | Canberra | Australia
London | England | Hanoi | Vietnam
London | England | Havana | Cuba
London | England | Helsinki | Finland
London | England | Islamabad | Pakistan
London | England | Kabul | Afghanistan
Madrid | Spain | Moscow | Russia
Madrid | Spain | Oslo | Norway
Madrid | Spain | Ottawa | Canada
Madrid | Spain | Paris | France
Madrid | Spain | Rome | Italy
Madrid | Spain | Stockholm | Sweden
Madrid | Spain | Tehran | Iran
Madrid | Spain | Tokyo | Japan
Madrid | Spain | Athens | Greece
Madrid | Spain | Baghdad | Iraq
Madrid | Spain | Bangkok | Thailand
Madrid | Spain | Beijing | China
Madrid | Spain | Berlin | Germany
Madrid | Spain | Bern | Switzerland
Madrid | Spain | Cairo | Egypt
Madrid | Spain | Canberra | Australia
Madrid | Spain | Hanoi | Vietnam
Madrid | Spain | Havana | Cuba
Madrid | Spain | Helsinki | Finland
Madrid | Spain | Islamabad | Pakistan
Madrid | Spain | Kabul | Afghanistan
Madrid | Spain | London | England
Moscow | Russia | Oslo | Norway
Moscow | Russia | Ottawa | Canada
Moscow | Russia | Paris | France
Moscow | Russia | Rome | Italy
Moscow | Russia | Stockholm | Sweden
Moscow | Russia | Tehran | Iran
Moscow | Russia | Tokyo | Japan
Moscow | Russia | Athens | Greece
Moscow | Russia | Baghdad | Iraq
Moscow | Russia | Bangkok | Thailand
Moscow | Russia | Beijing | China
Moscow | Russia | Berlin | Germany
Moscow | Russia | Bern | Switzerland
Moscow | Russia | Cairo | Egypt
Moscow | Russia | Canberra | Australia
Moscow | Russia | Hanoi | Vietnam
Moscow | Russia | Havana | Cuba
Moscow | Russia | Helsinki | Finland
Moscow | Russia | Islamabad | Pakistan
Moscow | Russia | Kabul | Afghanistan
Moscow | Russia | London | England
Moscow | Russia | Madrid | Spain
Oslo | Norway | Ottawa | Canada
Oslo | Norway | Paris | France
Oslo | Norway | Rome | Italy
Oslo | Norway | Stockholm | Sweden
Oslo | Norway | Tehran | Iran
Oslo | Norway | Tokyo | Japan
Oslo | Norway | Athens | Greece
Oslo | Norway | Baghdad | Iraq
Oslo | Norway | Bangkok | Thailand
Oslo | Norway | Beijing | China
Oslo | Norway | Berlin | Germany
Oslo | Norway | Bern | Switzerland
Oslo | Norway | Cairo | Egypt
Oslo | Norway | Canberra | Australia
Oslo | Norway | Hanoi | Vietnam
Oslo | Norway | Havana | Cuba
Oslo | Norway | Helsinki | Finland
Oslo | Norway | Islamabad | Pakistan
Oslo | Norway | Kabul | Afghanistan
Oslo | Norway | London | England
Oslo | Norway | Madrid | Spain
Oslo | Norway | Moscow | Russia
Ottawa | Canada | Paris | France
Ottawa | Canada | Rome | Italy
Ottawa | Canada | Stockholm | Sweden
Ottawa | Canada | Tehran | Iran
Ottawa | Canada | Tokyo | Japan
Ottawa | Canada | Athens | Greece
Ottawa | Canada | Baghdad | Iraq
Ottawa | Canada | Bangkok | Thailand
Ottawa | Canada | Beijing | China
Ottawa | Canada | Berlin | Germany
Ottawa | Canada | Bern | Switzerland
Ottawa | Canada | Cairo | Egypt
Ottawa | Canada | Canberra | Australia
Ottawa | Canada | Hanoi | Vietnam
Ottawa | Canada | Havana | Cuba
Ottawa | Canada | Helsinki | Finland
Ottawa | Canada | Islamabad | Pakistan
Ottawa | Canada | Kabul | Afghanistan
Ottawa | Canada | London | England
Ottawa | Canada | Madrid | Spain
Ottawa | Canada | Moscow | Russia
Ottawa | Canada | Oslo | Norway
Paris | France | Rome | Italy
Paris | France | Stockholm | Sweden
Paris | France | Tehran | Iran
Paris | France | Tokyo | Japan
Paris | France | Athens | Greece
Paris | France | Baghdad | Iraq
Paris | France | Bangkok | Thailand
Paris | France | Beijing | China
Paris | France | Berlin | Germany
Paris | France | Bern | Switzerland
Paris | France | Cairo | Egypt
Paris | France | Canberra | Australia
Paris | France | Hanoi | Vietnam
Paris | France | Havana | Cuba
Paris | France | Helsinki | Finland
Paris | France | Islamabad | Pakistan
Paris | France | Kabul | Afghanistan
Paris | France | London | England
Paris | France | Madrid | Spain
Paris | France | Moscow | Russia
Paris | France | Oslo | Norway
Paris | France | Ottawa | Canada
Rome | Italy | Stockholm | Sweden
Rome | Italy | Tehran | Iran
Rome | Italy | Tokyo | Japan
Rome | Italy | Athens | Greece
Rome | Italy | Baghdad | Iraq
Rome | Italy | Bangkok | Thailand
Rome | Italy | Beijing | China
Rome | Italy | Berlin | Germany
Rome | Italy | Bern | Switzerland
Rome | Italy | Cairo | Egypt
Rome | Italy | Canberra | Australia
Rome | Italy | Hanoi | Vietnam
Rome | Italy | Havana | Cuba
Rome | Italy | Helsinki | Finland
Rome | Italy | Islamabad | Pakistan
Rome | Italy | Kabul | Afghanistan
Rome | Italy | London | England
Rome | Italy | Madrid | Spain
Rome | Italy | Moscow | Russia
Rome | Italy | Oslo | Norway
Rome | Italy | Ottawa | Canada
Rome | Italy | Paris | France
Stockholm | Sweden | Tehran | Iran
Stockholm | Sweden | Tokyo | Japan
Stockholm | Sweden | Athens | Greece
Stockholm | Sweden | Baghdad | Iraq
Stockholm | Sweden | Bangkok | Thailand
Stockholm | Sweden | Beijing | China
Stockholm | Sweden | Berlin | Germany
Stockholm | Sweden | Bern | Switzerland
Stockholm | Sweden | Cairo | Egypt
Stockholm | Sweden | Canberra | Australia
Stockholm | Sweden | Hanoi | Vietnam
Stockholm | Sweden | Havana | Cuba
Stockholm | Sweden | Helsinki | Finland
Stockholm | Sweden | Islamabad | Pakistan
Stockholm | Sweden | Kabul | Afghanistan
Stockholm | Sweden | London | England
Stockholm | Sweden | Madrid | Spain
Stockholm | Sweden | Moscow | Russia
Stockholm | Sweden | Oslo | Norway
Stockholm | Sweden | Ottawa | Canada
Stockholm | Sweden | Paris | France
Stockholm | Sweden | Rome | Italy
Tehran | Iran | Tokyo | Japan
Tehran | Iran | Athens | Greece
Tehran | Iran | Baghdad | Iraq
Tehran | Iran | Bangkok | Thailand
Tehran | Iran | Beijing | China
Tehran | Iran | Berlin | Germany
Tehran | Iran | Bern | Switzerland
Tehran | Iran | Cairo | Egypt
Tehran | Iran | Canberra | Australia
Tehran | Iran | Hanoi | Vietnam
Tehran | Iran | Havana | Cuba
Tehran | Iran | Helsinki | Finland
Tehran | Iran | Islamabad | Pakistan
Tehran | Iran | Kabul | Afghanistan
Tehran | Iran | London | England
Tehran | Iran | Madrid | Spain
Tehran | Iran | Moscow | Russia
Tehran | Iran | Oslo | Norway
Tehran | Iran | Ottawa | Canada
Tehran | Iran | Paris | France
Tehran | Iran | Rome | Italy
Tehran | Iran | Stockholm | Sweden
Tokyo | Japan | Athens | Greece
Tokyo | Japan | Baghdad | Iraq
Tokyo | Japan | Bangkok | Thailand
Tokyo | Japan | Beijing | China
Tokyo | Japan | Berlin | Germany
Tokyo | Japan | Bern | Switzerland
Tokyo | Japan | Cairo | Egypt
Tokyo | Japan | Canberra | Australia
Tokyo | Japan | Hanoi | Vietnam
Tokyo | Japan | Havana | Cuba
Tokyo | Japan | Helsinki | Finland
Tokyo | Japan | Islamabad | Pakistan
Tokyo | Japan | Kabul | Afghanistan
Tokyo | Japan | London | England
Tokyo | Japan | Madrid | Spain
Tokyo | Japan | Moscow | Russia
Tokyo | Japan | Oslo | Norway
Tokyo | Japan | Ottawa | Canada
Tokyo | Japan | Paris | France
Tokyo | Japan | Rome | Italy
Tokyo | Japan | Stockholm | Sweden
Tokyo | Japan | Tehran | Iran
: | capital-world
Abuja | Nigeria | Accra | Ghana
Abuja | Nigeria | Algiers | Algeria
Abuja | Nigeria | Amman | Jordan
Abuja | Nigeria | Ankara | Turkey
Abuja | Nigeria | Antananarivo | Madagascar
Abuja | Nigeria | Apia | Samoa
Abuja | Nigeria | Ashgabat | Turkmenistan
Abuja | Nigeria | Asmara | Eritrea
Abuja | Nigeria | Astana | Kazakhstan
Abuja | Nigeria | Athens | Greece
Abuja | Nigeria | Baghdad | Iraq
Abuja | Nigeria | Baku | Azerbaijan
Abuja | Nigeria | Bamako | Mali
Abuja | Nigeria | Bangkok | Thailand
Abuja | Nigeria | Banjul | Gambia
Abuja | Nigeria | Beijing | China
Abuja | Nigeria | Beirut | Lebanon
Abuja | Nigeria | Belgrade | Serbia
Abuja | Nigeria | Belmopan | Belize
Abuja | Nigeria | Berlin | Germany
Abuja | Nigeria | Bern | Switzerland
Abuja | Nigeria | Bishkek | Kyrgyzstan
Abuja | Nigeria | Bratislava | Slovakia
Abuja | Nigeria | Brussels | Belgium
Abuja | Nigeria | Bucharest | Romania
Abuja | Nigeria | Budapest | Hungary
Abuja | Nigeria | Bujumbura | Burundi
Abuja | Nigeria | Cairo | Egypt
Abuja | Nigeria | Canberra | Australia
Abuja | Nigeria | Caracas | Venezuela
Abuja | Nigeria | Chisinau | Moldova
Abuja | Nigeria | Conakry | Guinea
Abuja | Nigeria | Copenhagen | Denmark
Abuja | Nigeria | Dakar | Senegal
Abuja | Nigeria | Damascus | Syria
Abuja | Nigeria | Dhaka | Bangladesh
Abuja | Nigeria | Doha | Qatar
Abuja | Nigeria | Dublin | Ireland
Abuja | Nigeria | Dushanbe | Tajikistan
Accra | Ghana | Algiers | Algeria
Accra | Ghana | Amman | Jordan
Accra | Ghana | Ankara | Turkey
Accra | Ghana | Antananarivo | Madagascar
Accra | Ghana | Apia | Samoa
Accra | Ghana | Ashgabat | Turkmenistan
Accra | Ghana | Asmara | Eritrea
Accra | Ghana | Astana | Kazakhstan
Accra | Ghana | Athens | Greece
Accra | Ghana | Baghdad | Iraq
Accra | Ghana | Baku | Azerbaijan
Accra | Ghana | Bamako | Mali
Accra | Ghana | Bangkok | Thailand
Accra | Ghana | Banjul | Gambia
Accra | Ghana | Beijing | China
Accra | Ghana | Beirut | Lebanon
Accra | Ghana | Belgrade | Serbia
Accra | Ghana | Belmopan | Belize
Accra | Ghana | Berlin | Germany
Accra | Ghana | Bern | Switzerland
Accra | Ghana | Bishkek | Kyrgyzstan
Accra | Ghana | Bratislava | Slovakia
Accra | Ghana | Brussels | Belgium
Accra | Ghana | Bucharest | Romania
Accra | Ghana | Budapest | Hungary
Accra | Ghana | Bujumbura | Burundi
Accra | Ghana | Cairo | Egypt
Accra | Ghana | Canberra | Australia
Accra | Ghana | Caracas | Venezuela
Accra | Ghana | Chisinau | Moldova
Accra | Ghana | Conakry | Guinea
Accra | Ghana | Copenhagen | Denmark
Accra | Ghana | Dakar | Senegal
Accra | Ghana | Damascus | Syria
Accra | Ghana | Dhaka | Bangladesh
Accra | Ghana | Doha | Qatar
Accra | Ghana | Dublin | Ireland
Accra | Ghana | Dushanbe | Tajikistan
Accra | Ghana | Funafuti | Tuvalu
Algiers | Algeria | Amman | Jordan
Algiers | Algeria | Ankara | Turkey
Algiers | Algeria | Antananarivo | Madagascar
Algiers | Algeria | Apia | Samoa
Algiers | Algeria | Ashgabat | Turkmenistan
Algiers | Algeria | Asmara | Eritrea
Algiers | Algeria | Astana | Kazakhstan
Algiers | Algeria | Athens | Greece
Algiers | Algeria | Baghdad | Iraq
Algiers | Algeria | Baku | Azerbaijan
Algiers | Algeria | Bamako | Mali
Algiers | Algeria | Bangkok | Thailand
Algiers | Algeria | Banjul | Gambia
Algiers | Algeria | Beijing | China
Algiers | Algeria | Beirut | Lebanon
Algiers | Algeria | Belgrade | Serbia
Algiers | Algeria | Belmopan | Belize
Algiers | Algeria | Berlin | Germany
Algiers | Algeria | Bern | Switzerland
Algiers | Algeria | Bishkek | Kyrgyzstan
Algiers | Algeria | Bratislava | Slovakia
Algiers | Algeria | Brussels | Belgium
Algiers | Algeria | Bucharest | Romania
Algiers | Algeria | Budapest | Hungary
Algiers | Algeria | Bujumbura | Burundi
Algiers | Algeria | Cairo | Egypt
Algiers | Algeria | Canberra | Australia
Algiers | Algeria | Caracas | Venezuela
Algiers | Algeria | Chisinau | Moldova
Algiers | Algeria | Conakry | Guinea
Algiers | Algeria | Copenhagen | Denmark
Algiers | Algeria | Dakar | Senegal
Algiers | Algeria | Damascus | Syria
Algiers | Algeria | Dhaka | Bangladesh
Algiers | Algeria | Doha | Qatar
Algiers | Algeria | Dublin | Ireland
Algiers | Algeria | Dushanbe | Tajikistan
Algiers | Algeria | Funafuti | Tuvalu
Algiers | Algeria | Gaborone | Botswana
Amman | Jordan | Ankara | Turkey
Amman | Jordan | Antananarivo | Madagascar
Amman | Jordan | Apia | Samoa
Amman | Jordan | Ashgabat | Turkmenistan
Amman | Jordan | Asmara | Eritrea
Amman | Jordan | Astana | Kazakhstan
Amman | Jordan | Athens | Greece
Amman | Jordan | Baghdad | Iraq
Amman | Jordan | Baku | Azerbaijan
Amman | Jordan | Bamako | Mali
Amman | Jordan | Bangkok | Thailand
Amman | Jordan | Banjul | Gambia
Amman | Jordan | Beijing | China
Amman | Jordan | Beirut | Lebanon
Amman | Jordan | Belgrade | Serbia
Amman | Jordan | Belmopan | Belize
Amman | Jordan | Berlin | Germany
Amman | Jordan | Bern | Switzerland
Amman | Jordan | Bishkek | Kyrgyzstan
Amman | Jordan | Bratislava | Slovakia
Amman | Jordan | Brussels | Belgium
Amman | Jordan | Bucharest | Romania
Amman | Jordan | Budapest | Hungary
Amman | Jordan | Bujumbura | Burundi
Amman | Jordan | Cairo | Egypt
Amman | Jordan | Canberra | Australia
Amman | Jordan | Caracas | Venezuela
Amman | Jordan | Chisinau | Moldova
Amman | Jordan | Conakry | Guinea
Amman | Jordan | Copenhagen | Denmark
Amman | Jordan | Dakar | Senegal
Amman | Jordan | Damascus | Syria
Amman | Jordan | Dhaka | Bangladesh
Amman | Jordan | Doha | Qatar
Amman | Jordan | Dublin | Ireland
Amman | Jordan | Dushanbe | Tajikistan
Amman | Jordan | Funafuti | Tuvalu
Amman | Jordan | Gaborone | Botswana
Amman | Jordan | Georgetown | Guyana
Ankara | Turkey | Antananarivo | Madagascar
Ankara | Turkey | Apia | Samoa
Ankara | Turkey | Ashgabat | Turkmenistan
Ankara | Turkey | Asmara | Eritrea
Ankara | Turkey | Astana | Kazakhstan
Ankara | Turkey | Athens | Greece
Ankara | Turkey | Baghdad | Iraq
Ankara | Turkey | Baku | Azerbaijan
Ankara | Turkey | Bamako | Mali
Ankara | Turkey | Bangkok | Thailand
Ankara | Turkey | Banjul | Gambia
Ankara | Turkey | Beijing | China
Ankara | Turkey | Beirut | Lebanon
Ankara | Turkey | Belgrade | Serbia
Ankara | Turkey | Belmopan | Belize
Ankara | Turkey | Berlin | Germany
Ankara | Turkey | Bern | Switzerland
Ankara | Turkey | Bishkek | Kyrgyzstan
Ankara | Turkey | Bratislava | Slovakia
Ankara | Turkey | Brussels | Belgium
Ankara | Turkey | Bucharest | Romania
Ankara | Turkey | Budapest | Hungary
Ankara | Turkey | Bujumbura | Burundi
Ankara | Turkey | Cairo | Egypt
Ankara | Turkey | Canberra | Australia
Ankara | Turkey | Caracas | Venezuela
Ankara | Turkey | Chisinau | Moldova
Ankara | Turkey | Conakry | Guinea
Ankara | Turkey | Copenhagen | Denmark
Ankara | Turkey | Dakar | Senegal
Ankara | Turkey | Damascus | Syria
Ankara | Turkey | Dhaka | Bangladesh
Ankara | Turkey | Doha | Qatar
Ankara | Turkey | Dublin | Ireland
Ankara | Turkey | Dushanbe | Tajikistan
Ankara | Turkey | Funafuti | Tuvalu
Ankara | Turkey | Gaborone | Botswana
Ankara | Turkey | Georgetown | Guyana
Ankara | Turkey | Hanoi | Vietnam
Antananarivo | Madagascar | Apia | Samoa
Antananarivo | Madagascar | Ashgabat | Turkmenistan
Antananarivo | Madagascar | Asmara | Eritrea
Antananarivo | Madagascar | Astana | Kazakhstan
Antananarivo | Madagascar | Athens | Greece
Antananarivo | Madagascar | Baghdad | Iraq
Antananarivo | Madagascar | Baku | Azerbaijan
Antananarivo | Madagascar | Bamako | Mali
Antananarivo | Madagascar | Bangkok | Thailand
Antananarivo | Madagascar | Banjul | Gambia
Antananarivo | Madagascar | Beijing | China
Antananarivo | Madagascar | Beirut | Lebanon
Antananarivo | Madagascar | Belgrade | Serbia
Antananarivo | Madagascar | Belmopan | Belize
Antananarivo | Madagascar | Berlin | Germany
Antananarivo | Madagascar | Bern | Switzerland
Antananarivo | Madagascar | Bishkek | Kyrgyzstan
Antananarivo | Madagascar | Bratislava | Slovakia
Antananarivo | Madagascar | Brussels | Belgium
Antananarivo | Madagascar | Bucharest | Romania
Antananarivo | Madagascar | Budapest | Hungary
Antananarivo | Madagascar | Bujumbura | Burundi
Antananarivo | Madagascar | Cairo | Egypt
Antananarivo | Madagascar | Canberra | Australia
Antananarivo | Madagascar | Caracas | Venezuela
Antananarivo | Madagascar | Chisinau | Moldova
Antananarivo | Madagascar | Conakry | Guinea
Antananarivo | Madagascar | Copenhagen | Denmark
Antananarivo | Madagascar | Dakar | Senegal
Antananarivo | Madagascar | Damascus | Syria
Antananarivo | Madagascar | Dhaka | Bangladesh
Antananarivo | Madagascar | Doha | Qatar
Antananarivo | Madagascar | Dublin | Ireland
Antananarivo | Madagascar | Dushanbe | Tajikistan
Antananarivo | Madagascar | Funafuti | Tuvalu
Antananarivo | Madagascar | Gaborone | Botswana
Antananarivo | Madagascar | Georgetown | Guyana
Antananarivo | Madagascar | Hanoi | Vietnam
Antananarivo | Madagascar | Harare | Zimbabwe
Apia | Samoa | Ashgabat | Turkmenistan
Apia | Samoa | Asmara | Eritrea
Apia | Samoa | Astana | Kazakhstan
Apia | Samoa | Athens | Greece
Apia | Samoa | Baghdad | Iraq
Apia | Samoa | Baku | Azerbaijan
Apia | Samoa | Bamako | Mali
Apia | Samoa | Bangkok | Thailand
Apia | Samoa | Banjul | Gambia
Apia | Samoa | Beijing | China
Apia | Samoa | Beirut | Lebanon
Apia | Samoa | Belgrade | Serbia
Apia | Samoa | Belmopan | Belize
Apia | Samoa | Berlin | Germany
Apia | Samoa | Bern | Switzerland
Apia | Samoa | Bishkek | Kyrgyzstan
Apia | Samoa | Bratislava | Slovakia
Apia | Samoa | Brussels | Belgium
Apia | Samoa | Bucharest | Romania
Apia | Samoa | Budapest | Hungary
Apia | Samoa | Bujumbura | Burundi
Apia | Samoa | Cairo | Egypt
Apia | Samoa | Canberra | Australia
Apia | Samoa | Caracas | Venezuela
Apia | Samoa | Chisinau | Moldova
Apia | Samoa | Conakry | Guinea
Apia | Samoa | Copenhagen | Denmark
Apia | Samoa | Dakar | Senegal
Apia | Samoa | Damascus | Syria
Apia | Samoa | Dhaka | Bangladesh
Apia | Samoa | Doha | Qatar
Apia | Samoa | Dublin | Ireland
Apia | Samoa | Dushanbe | Tajikistan
Apia | Samoa | Funafuti | Tuvalu
Apia | Samoa | Gaborone | Botswana
Apia | Samoa | Georgetown | Guyana
Apia | Samoa | Hanoi | Vietnam
Apia | Samoa | Harare | Zimbabwe
Apia | Samoa | Havana | Cuba
Ashgabat | Turkmenistan | Asmara | Eritrea
Ashgabat | Turkmenistan | Astana | Kazakhstan
Ashgabat | Turkmenistan | Athens | Greece
Ashgabat | Turkmenistan | Baghdad | Iraq
Ashgabat | Turkmenistan | Baku | Azerbaijan
Ashgabat | Turkmenistan | Bamako | Mali
Ashgabat | Turkmenistan | Bangkok | Thailand
Ashgabat | Turkmenistan | Banjul | Gambia
Ashgabat | Turkmenistan | Beijing | China
Ashgabat | Turkmenistan | Beirut | Lebanon
Ashgabat | Turkmenistan | Belgrade | Serbia
Ashgabat | Turkmenistan | Belmopan | Belize
Ashgabat | Turkmenistan | Berlin | Germany
Ashgabat | Turkmenistan | Bern | Switzerland
Ashgabat | Turkmenistan | Bishkek | Kyrgyzstan
Ashgabat | Turkmenistan | Bratislava | Slovakia
Ashgabat | Turkmenistan | Brussels | Belgium
Ashgabat | Turkmenistan | Bucharest | Romania
Ashgabat | Turkmenistan | Budapest | Hungary
Ashgabat | Turkmenistan | Bujumbura | Burundi
Ashgabat | Turkmenistan | Cairo | Egypt
Ashgabat | Turkmenistan | Canberra | Australia
Ashgabat | Turkmenistan | Caracas | Venezuela
Ashgabat | Turkmenistan | Chisinau | Moldova
Ashgabat | Turkmenistan | Conakry | Guinea
Ashgabat | Turkmenistan | Copenhagen | Denmark
Ashgabat | Turkmenistan | Dakar | Senegal
Ashgabat | Turkmenistan | Damascus | Syria
Ashgabat | Turkmenistan | Dhaka | Bangladesh
Ashgabat | Turkmenistan | Doha | Qatar
Ashgabat | Turkmenistan | Dublin | Ireland
Ashgabat | Turkmenistan | Dushanbe | Tajikistan
Ashgabat | Turkmenistan | Funafuti | Tuvalu
Ashgabat | Turkmenistan | Gaborone | Botswana
Ashgabat | Turkmenistan | Georgetown | Guyana
Ashgabat | Turkmenistan | Hanoi | Vietnam
Ashgabat | Turkmenistan | Harare | Zimbabwe
Ashgabat | Turkmenistan | Havana | Cuba
Ashgabat | Turkmenistan | Helsinki | Finland
Asmara | Eritrea | Astana | Kazakhstan
Asmara | Eritrea | Athens | Greece
Asmara | Eritrea | Baghdad | Iraq
Asmara | Eritrea | Baku | Azerbaijan
Asmara | Eritrea | Bamako | Mali
Asmara | Eritrea | Bangkok | Thailand
Asmara | Eritrea | Banjul | Gambia
Asmara | Eritrea | Beijing | China
Asmara | Eritrea | Beirut | Lebanon
Asmara | Eritrea | Belgrade | Serbia
Asmara | Eritrea | Belmopan | Belize
Asmara | Eritrea | Berlin | Germany
Asmara | Eritrea | Bern | Switzerland
Asmara | Eritrea | Bishkek | Kyrgyzstan
Asmara | Eritrea | Bratislava | Slovakia
Asmara | Eritrea | Brussels | Belgium
Asmara | Eritrea | Bucharest | Romania
Asmara | Eritrea | Budapest | Hungary
Asmara | Eritrea | Bujumbura | Burundi
Asmara | Eritrea | Cairo | Egypt
Asmara | Eritrea | Canberra | Australia
Asmara | Eritrea | Caracas | Venezuela
Asmara | Eritrea | Chisinau | Moldova
Asmara | Eritrea | Conakry | Guinea
Asmara | Eritrea | Copenhagen | Denmark
Asmara | Eritrea | Dakar | Senegal
Asmara | Eritrea | Damascus | Syria
Asmara | Eritrea | Dhaka | Bangladesh
Asmara | Eritrea | Doha | Qatar
Asmara | Eritrea | Dublin | Ireland
Asmara | Eritrea | Dushanbe | Tajikistan
Asmara | Eritrea | Funafuti | Tuvalu
Asmara | Eritrea | Gaborone | Botswana
Asmara | Eritrea | Georgetown | Guyana
Asmara | Eritrea | Hanoi | Vietnam
Asmara | Eritrea | Harare | Zimbabwe
Asmara | Eritrea | Havana | Cuba
Asmara | Eritrea | Helsinki | Finland
Asmara | Eritrea | Islamabad | Pakistan
Astana | Kazakhstan | Athens | Greece
Astana | Kazakhstan | Baghdad | Iraq
Astana | Kazakhstan | Baku | Azerbaijan
Astana | Kazakhstan | Bamako | Mali
Astana | Kazakhstan | Bangkok | Thailand
Astana | Kazakhstan | Banjul | Gambia
Astana | Kazakhstan | Beijing | China
Astana | Kazakhstan | Beirut | Lebanon
Astana | Kazakhstan | Belgrade | Serbia
Astana | Kazakhstan | Belmopan | Belize
Astana | Kazakhstan | Berlin | Germany
Astana | Kazakhstan | Bern | Switzerland
Astana | Kazakhstan | Bishkek | Kyrgyzstan
Astana | Kazakhstan | Bratislava | Slovakia
Astana | Kazakhstan | Brussels | Belgium
Astana | Kazakhstan | Bucharest | Romania
Astana | Kazakhstan | Budapest | Hungary
Astana | Kazakhstan | Bujumbura | Burundi
Astana | Kazakhstan | Cairo | Egypt
Astana | Kazakhstan | Canberra | Australia
Astana | Kazakhstan | Caracas | Venezuela
Astana | Kazakhstan | Chisinau | Moldova
Astana | Kazakhstan | Conakry | Guinea
Astana | Kazakhstan | Copenhagen | Denmark
Astana | Kazakhstan | Dakar | Senegal
Astana | Kazakhstan | Damascus | Syria
Astana | Kazakhstan | Dhaka | Bangladesh
Astana | Kazakhstan | Doha | Qatar
Astana | Kazakhstan | Dublin | Ireland
Astana | Kazakhstan | Dushanbe | Tajikistan
Astana | Kazakhstan | Funafuti | Tuvalu
Astana | Kazakhstan | Gaborone | Botswana
Astana | Kazakhstan | Georgetown | Guyana
Astana | Kazakhstan | Hanoi | Vietnam
Astana | Kazakhstan | Harare | Zimbabwe
Astana | Kazakhstan | Havana | Cuba
Astana | Kazakhstan | Helsinki | Finland
Astana | Kazakhstan | Islamabad | Pakistan
Astana | Kazakhstan | Jakarta | Indonesia
Athens | Greece | Baghdad | Iraq
Athens | Greece | Baku | Azerbaijan
Athens | Greece | Bamako | Mali
Athens | Greece | Bangkok | Thailand
Athens | Greece | Banjul | Gambia
Athens | Greece | Beijing | China
Athens | Greece | Beirut | Lebanon
Athens | Greece | Belgrade | Serbia
Athens | Greece | Belmopan | Belize
Athens | Greece | Berlin | Germany
Athens | Greece | Bern | Switzerland
Athens | Greece | Bishkek | Kyrgyzstan
Athens | Greece | Bratislava | Slovakia
Athens | Greece | Brussels | Belgium
Athens | Greece | Bucharest | Romania
Athens | Greece | Budapest | Hungary
Athens | Greece | Bujumbura | Burundi
Athens | Greece | Cairo | Egypt
Athens | Greece | Canberra | Australia
Athens | Greece | Caracas | Venezuela
Athens | Greece | Chisinau | Moldova
Athens | Greece | Conakry | Guinea
Athens | Greece | Copenhagen | Denmark
Athens | Greece | Dakar | Senegal
Athens | Greece | Damascus | Syria
Athens | Greece | Dhaka | Bangladesh
Athens | Greece | Doha | Qatar
Athens | Greece | Dublin | Ireland
Athens | Greece | Dushanbe | Tajikistan
Athens | Greece | Funafuti | Tuvalu
Athens | Greece | Gaborone | Botswana
Athens | Greece | Georgetown | Guyana
Athens | Greece | Hanoi | Vietnam
Athens | Greece | Harare | Zimbabwe
Athens | Greece | Havana | Cuba
Athens | Greece | Helsinki | Finland
Athens | Greece | Islamabad | Pakistan
Athens | Greece | Jakarta | Indonesia
Athens | Greece | Kabul | Afghanistan
Baghdad | Iraq | Baku | Azerbaijan
Baghdad | Iraq | Bamako | Mali
Baghdad | Iraq | Bangkok | Thailand
Baghdad | Iraq | Banjul | Gambia
Baghdad | Iraq | Beijing | China
Baghdad | Iraq | Beirut | Lebanon
Baghdad | Iraq | Belgrade | Serbia
Baghdad | Iraq | Belmopan | Belize
Baghdad | Iraq | Berlin | Germany
Baghdad | Iraq | Bern | Switzerland
Baghdad | Iraq | Bishkek | Kyrgyzstan
Baghdad | Iraq | Bratislava | Slovakia
Baghdad | Iraq | Brussels | Belgium
Baghdad | Iraq | Bucharest | Romania
Baghdad | Iraq | Budapest | Hungary
Baghdad | Iraq | Bujumbura | Burundi
Baghdad | Iraq | Cairo | Egypt
Baghdad | Iraq | Canberra | Australia
Baghdad | Iraq | Caracas | Venezuela
Baghdad | Iraq | Chisinau | Moldova
Baghdad | Iraq | Conakry | Guinea
Baghdad | Iraq | Copenhagen | Denmark
Baghdad | Iraq | Dakar | Senegal
Baghdad | Iraq | Damascus | Syria
Baghdad | Iraq | Dhaka | Bangladesh
Baghdad | Iraq | Doha | Qatar
Baghdad | Iraq | Dublin | Ireland
Baghdad | Iraq | Dushanbe | Tajikistan
Baghdad | Iraq | Funafuti | Tuvalu
Baghdad | Iraq | Gaborone | Botswana
Baghdad | Iraq | Georgetown | Guyana
Baghdad | Iraq | Hanoi | Vietnam
Baghdad | Iraq | Harare | Zimbabwe
Baghdad | Iraq | Havana | Cuba
Baghdad | Iraq | Helsinki | Finland
Baghdad | Iraq | Islamabad | Pakistan
Baghdad | Iraq | Jakarta | Indonesia
Baghdad | Iraq | Kabul | Afghanistan
Baghdad | Iraq | Kampala | Uganda
Baku | Azerbaijan | Bamako | Mali
Baku | Azerbaijan | Bangkok | Thailand
Baku | Azerbaijan | Banjul | Gambia
Baku | Azerbaijan | Beijing | China
Baku | Azerbaijan | Beirut | Lebanon
Baku | Azerbaijan | Belgrade | Serbia
Baku | Azerbaijan | Belmopan | Belize
Baku | Azerbaijan | Berlin | Germany
Baku | Azerbaijan | Bern | Switzerland
Baku | Azerbaijan | Bishkek | Kyrgyzstan
Baku | Azerbaijan | Bratislava | Slovakia
Baku | Azerbaijan | Brussels | Belgium
Baku | Azerbaijan | Bucharest | Romania
Baku | Azerbaijan | Budapest | Hungary
Baku | Azerbaijan | Bujumbura | Burundi
Baku | Azerbaijan | Cairo | Egypt
Baku | Azerbaijan | Canberra | Australia
Baku | Azerbaijan | Caracas | Venezuela
Baku | Azerbaijan | Chisinau | Moldova
Baku | Azerbaijan | Conakry | Guinea
Baku | Azerbaijan | Copenhagen | Denmark
Baku | Azerbaijan | Dakar | Senegal
Baku | Azerbaijan | Damascus | Syria
Baku | Azerbaijan | Dhaka | Bangladesh
Baku | Azerbaijan | Doha | Qatar
Baku | Azerbaijan | Dublin | Ireland
Baku | Azerbaijan | Dushanbe | Tajikistan
Baku | Azerbaijan | Funafuti | Tuvalu
Baku | Azerbaijan | Gaborone | Botswana
Baku | Azerbaijan | Georgetown | Guyana
Baku | Azerbaijan | Hanoi | Vietnam
Baku | Azerbaijan | Harare | Zimbabwe
Baku | Azerbaijan | Havana | Cuba
Baku | Azerbaijan | Helsinki | Finland
Baku | Azerbaijan | Islamabad | Pakistan
Baku | Azerbaijan | Jakarta | Indonesia
Baku | Azerbaijan | Kabul | Afghanistan
Baku | Azerbaijan | Kampala | Uganda
Baku | Azerbaijan | Kathmandu | Nepal
Bamako | Mali | Bangkok | Thailand
Bamako | Mali | Banjul | Gambia
Bamako | Mali | Beijing | China
Bamako | Mali | Beirut | Lebanon
Bamako | Mali | Belgrade | Serbia
Bamako | Mali | Belmopan | Belize
Bamako | Mali | Berlin | Germany
Bamako | Mali | Bern | Switzerland
Bamako | Mali | Bishkek | Kyrgyzstan
Bamako | Mali | Bratislava | Slovakia
Bamako | Mali | Brussels | Belgium
Bamako | Mali | Bucharest | Romania
Bamako | Mali | Budapest | Hungary
Bamako | Mali | Bujumbura | Burundi
Bamako | Mali | Cairo | Egypt
Bamako | Mali | Canberra | Australia
Bamako | Mali | Caracas | Venezuela
Bamako | Mali | Chisinau | Moldova
Bamako | Mali | Conakry | Guinea
Bamako | Mali | Copenhagen | Denmark
Bamako | Mali | Dakar | Senegal
Bamako | Mali | Damascus | Syria
Bamako | Mali | Dhaka | Bangladesh
Bamako | Mali | Doha | Qatar
Bamako | Mali | Dublin | Ireland
Bamako | Mali | Dushanbe | Tajikistan
Bamako | Mali | Funafuti | Tuvalu
Bamako | Mali | Gaborone | Botswana
Bamako | Mali | Georgetown | Guyana
Bamako | Mali | Hanoi | Vietnam
Bamako | Mali | Harare | Zimbabwe
Bamako | Mali | Havana | Cuba
Bamako | Mali | Helsinki | Finland
Bamako | Mali | Islamabad | Pakistan
Bamako | Mali | Jakarta | Indonesia
Bamako | Mali | Kabul | Afghanistan
Bamako | Mali | Kampala | Uganda
Bamako | Mali | Kathmandu | Nepal
Bamako | Mali | Khartoum | Sudan
Bangkok | Thailand | Banjul | Gambia
Bangkok | Thailand | Beijing | China
Bangkok | Thailand | Beirut | Lebanon
Bangkok | Thailand | Belgrade | Serbia
Bangkok | Thailand | Belmopan | Belize
Bangkok | Thailand | Berlin | Germany
Bangkok | Thailand | Bern | Switzerland
Bangkok | Thailand | Bishkek | Kyrgyzstan
Bangkok | Thailand | Bratislava | Slovakia
Bangkok | Thailand | Brussels | Belgium
Bangkok | Thailand | Bucharest | Romania
Bangkok | Thailand | Budapest | Hungary
Bangkok | Thailand | Bujumbura | Burundi
Bangkok | Thailand | Cairo | Egypt
Bangkok | Thailand | Canberra | Australia
Bangkok | Thailand | Caracas | Venezuela
Bangkok | Thailand | Chisinau | Moldova
Bangkok | Thailand | Conakry | Guinea
Bangkok | Thailand | Copenhagen | Denmark
Bangkok | Thailand | Dakar | Senegal
Bangkok | Thailand | Damascus | Syria
Bangkok | Thailand | Dhaka | Bangladesh
Bangkok | Thailand | Doha | Qatar
Bangkok | Thailand | Dublin | Ireland
Bangkok | Thailand | Dushanbe | Tajikistan
Bangkok | Thailand | Funafuti | Tuvalu
Bangkok | Thailand | Gaborone | Botswana
Bangkok | Thailand | Georgetown | Guyana
Bangkok | Thailand | Hanoi | Vietnam
Bangkok | Thailand | Harare | Zimbabwe
Bangkok | Thailand | Havana | Cuba
Bangkok | Thailand | Helsinki | Finland
Bangkok | Thailand | Islamabad | Pakistan
Bangkok | Thailand | Jakarta | Indonesia
Bangkok | Thailand | Kabul | Afghanistan
Bangkok | Thailand | Kampala | Uganda
Bangkok | Thailand | Kathmandu | Nepal
Bangkok | Thailand | Khartoum | Sudan
Bangkok | Thailand | Kiev | Ukraine
Banjul | Gambia | Beijing | China
Banjul | Gambia | Beirut | Lebanon
Banjul | Gambia | Belgrade | Serbia
Banjul | Gambia | Belmopan | Belize
Banjul | Gambia | Berlin | Germany
Banjul | Gambia | Bern | Switzerland
Banjul | Gambia | Bishkek | Kyrgyzstan
Banjul | Gambia | Bratislava | Slovakia
Banjul | Gambia | Brussels | Belgium
Banjul | Gambia | Bucharest | Romania
Banjul | Gambia | Budapest | Hungary
Banjul | Gambia | Bujumbura | Burundi
Banjul | Gambia | Cairo | Egypt
Banjul | Gambia | Canberra | Australia
Banjul | Gambia | Caracas | Venezuela
Banjul | Gambia | Chisinau | Moldova
Banjul | Gambia | Conakry | Guinea
Banjul | Gambia | Copenhagen | Denmark
Banjul | Gambia | Dakar | Senegal
Banjul | Gambia | Damascus | Syria
Banjul | Gambia | Dhaka | Bangladesh
Banjul | Gambia | Doha | Qatar
Banjul | Gambia | Dublin | Ireland
Banjul | Gambia | Dushanbe | Tajikistan
Banjul | Gambia | Funafuti | Tuvalu
Banjul | Gambia | Gaborone | Botswana
Banjul | Gambia | Georgetown | Guyana
Banjul | Gambia | Hanoi | Vietnam
Banjul | Gambia | Harare | Zimbabwe
Banjul | Gambia | Havana | Cuba
Banjul | Gambia | Helsinki | Finland
Banjul | Gambia | Islamabad | Pakistan
Banjul | Gambia | Jakarta | Indonesia
Banjul | Gambia | Kabul | Afghanistan
Banjul | Gambia | Kampala | Uganda
Banjul | Gambia | Kathmandu | Nepal
Banjul | Gambia | Khartoum | Sudan
Banjul | Gambia | Kiev | Ukraine
Banjul | Gambia | Kigali | Rwanda
Beijing | China | Beirut | Lebanon
Beijing | China | Belgrade | Serbia
Beijing | China | Belmopan | Belize
Beijing | China | Berlin | Germany
Beijing | China | Bern | Switzerland
Beijing | China | Bishkek | Kyrgyzstan
Beijing | China | Bratislava | Slovakia
Beijing | China | Brussels | Belgium
Beijing | China | Bucharest | Romania
Beijing | China | Budapest | Hungary
Beijing | China | Bujumbura | Burundi
Beijing | China | Cairo | Egypt
Beijing | China | Canberra | Australia
Beijing | China | Caracas | Venezuela
Beijing | China | Chisinau | Moldova
Beijing | China | Conakry | Guinea
Beijing | China | Copenhagen | Denmark
Beijing | China | Dakar | Senegal
Beijing | China | Damascus | Syria
Beijing | China | Dhaka | Bangladesh
Beijing | China | Doha | Qatar
Beijing | China | Dublin | Ireland
Beijing | China | Dushanbe | Tajikistan
Beijing | China | Funafuti | Tuvalu
Beijing | China | Gaborone | Botswana
Beijing | China | Georgetown | Guyana
Beijing | China | Hanoi | Vietnam
Beijing | China | Harare | Zimbabwe
Beijing | China | Havana | Cuba
Beijing | China | Helsinki | Finland
Beijing | China | Islamabad | Pakistan
Beijing | China | Jakarta | Indonesia
Beijing | China | Kabul | Afghanistan
Beijing | China | Kampala | Uganda
Beijing | China | Kathmandu | Nepal
Beijing | China | Khartoum | Sudan
Beijing | China | Kiev | Ukraine
Beijing | China | Kigali | Rwanda
Beijing | China | Kingston | Jamaica
Beirut | Lebanon | Belgrade | Serbia
Beirut | Lebanon | Belmopan | Belize
Beirut | Lebanon | Berlin | Germany
Beirut | Lebanon | Bern | Switzerland
Beirut | Lebanon | Bishkek | Kyrgyzstan
Beirut | Lebanon | Bratislava | Slovakia
Beirut | Lebanon | Brussels | Belgium
Beirut | Lebanon | Bucharest | Romania
Beirut | Lebanon | Budapest | Hungary
Beirut | Lebanon | Bujumbura | Burundi
Beirut | Lebanon | Cairo | Egypt
Beirut | Lebanon | Canberra | Australia
Beirut | Lebanon | Caracas | Venezuela
Beirut | Lebanon | Chisinau | Moldova
Beirut | Lebanon | Conakry | Guinea
Beirut | Lebanon | Copenhagen | Denmark
Beirut | Lebanon | Dakar | Senegal
Beirut | Lebanon | Damascus | Syria
Beirut | Lebanon | Dhaka | Bangladesh
Beirut | Lebanon | Doha | Qatar
Beirut | Lebanon | Dublin | Ireland
Beirut | Lebanon | Dushanbe | Tajikistan
Beirut | Lebanon | Funafuti | Tuvalu
Beirut | Lebanon | Gaborone | Botswana
Beirut | Lebanon | Georgetown | Guyana
Beirut | Lebanon | Hanoi | Vietnam
Beirut | Lebanon | Harare | Zimbabwe
Beirut | Lebanon | Havana | Cuba
Beirut | Lebanon | Helsinki | Finland
Beirut | Lebanon | Islamabad | Pakistan
Beirut | Lebanon | Jakarta | Indonesia
Beirut | Lebanon | Kabul | Afghanistan
Beirut | Lebanon | Kampala | Uganda
Beirut | Lebanon | Kathmandu | Nepal
Beirut | Lebanon | Khartoum | Sudan
Beirut | Lebanon | Kiev | Ukraine
Beirut | Lebanon | Kigali | Rwanda
Beirut | Lebanon | Kingston | Jamaica
Beirut | Lebanon | Libreville | Gabon
Belgrade | Serbia | Belmopan | Belize
Belgrade | Serbia | Berlin | Germany
Belgrade | Serbia | Bern | Switzerland
Belgrade | Serbia | Bishkek | Kyrgyzstan
Belgrade | Serbia | Bratislava | Slovakia
Belgrade | Serbia | Brussels | Belgium
Belgrade | Serbia | Bucharest | Romania
Belgrade | Serbia | Budapest | Hungary
Belgrade | Serbia | Bujumbura | Burundi
Belgrade | Serbia | Cairo | Egypt
Belgrade | Serbia | Canberra | Australia
Belgrade | Serbia | Caracas | Venezuela
Belgrade | Serbia | Chisinau | Moldova
Belgrade | Serbia | Conakry | Guinea
Belgrade | Serbia | Copenhagen | Denmark
Belgrade | Serbia | Dakar | Senegal
Belgrade | Serbia | Damascus | Syria
Belgrade | Serbia | Dhaka | Bangladesh
Belgrade | Serbia | Doha | Qatar
Belgrade | Serbia | Dublin | Ireland
Belgrade | Serbia | Dushanbe | Tajikistan
Belgrade | Serbia | Funafuti | Tuvalu
Belgrade | Serbia | Gaborone | Botswana
Belgrade | Serbia | Georgetown | Guyana
Belgrade | Serbia | Hanoi | Vietnam
Belgrade | Serbia | Harare | Zimbabwe
Belgrade | Serbia | Havana | Cuba
Belgrade | Serbia | Helsinki | Finland
Belgrade | Serbia | Islamabad | Pakistan
Belgrade | Serbia | Jakarta | Indonesia
Belgrade | Serbia | Kabul | Afghanistan
Belgrade | Serbia | Kampala | Uganda
Belgrade | Serbia | Kathmandu | Nepal
Belgrade | Serbia | Khartoum | Sudan
Belgrade | Serbia | Kiev | Ukraine
Belgrade | Serbia | Kigali | Rwanda
Belgrade | Serbia | Kingston | Jamaica
Belgrade | Serbia | Libreville | Gabon
Belgrade | Serbia | Lilongwe | Malawi
Belmopan | Belize | Berlin | Germany
Belmopan | Belize | Bern | Switzerland
Belmopan | Belize | Bishkek | Kyrgyzstan
Belmopan | Belize | Bratislava | Slovakia
Belmopan | Belize | Brussels | Belgium
Belmopan | Belize | Bucharest | Romania
Belmopan | Belize | Budapest | Hungary
Belmopan | Belize | Bujumbura | Burundi
Belmopan | Belize | Cairo | Egypt
Belmopan | Belize | Canberra | Australia
Belmopan | Belize | Caracas | Venezuela
Belmopan | Belize | Chisinau | Moldova
Belmopan | Belize | Conakry | Guinea
Belmopan | Belize | Copenhagen | Denmark
Belmopan | Belize | Dakar | Senegal
Belmopan | Belize | Damascus | Syria
Belmopan | Belize | Dhaka | Bangladesh
Belmopan | Belize | Doha | Qatar
Belmopan | Belize | Dublin | Ireland
Belmopan | Belize | Dushanbe | Tajikistan
Belmopan | Belize | Funafuti | Tuvalu
Belmopan | Belize | Gaborone | Botswana
Belmopan | Belize | Georgetown | Guyana
Belmopan | Belize | Hanoi | Vietnam
Belmopan | Belize | Harare | Zimbabwe
Belmopan | Belize | Havana | Cuba
Belmopan | Belize | Helsinki | Finland
Belmopan | Belize | Islamabad | Pakistan
Belmopan | Belize | Jakarta | Indonesia
Belmopan | Belize | Kabul | Afghanistan
Belmopan | Belize | Kampala | Uganda
Belmopan | Belize | Kathmandu | Nepal
Belmopan | Belize | Khartoum | Sudan
Belmopan | Belize | Kiev | Ukraine
Belmopan | Belize | Kigali | Rwanda
Belmopan | Belize | Kingston | Jamaica
Belmopan | Belize | Libreville | Gabon
Belmopan | Belize | Lilongwe | Malawi
Belmopan | Belize | Lima | Peru
Berlin | Germany | Bern | Switzerland
Berlin | Germany | Bishkek | Kyrgyzstan
Berlin | Germany | Bratislava | Slovakia
Berlin | Germany | Brussels | Belgium
Berlin | Germany | Bucharest | Romania
Berlin | Germany | Budapest | Hungary
Berlin | Germany | Bujumbura | Burundi
Berlin | Germany | Cairo | Egypt
Berlin | Germany | Canberra | Australia
Berlin | Germany | Caracas | Venezuela
Berlin | Germany | Chisinau | Moldova
Berlin | Germany | Conakry | Guinea
Berlin | Germany | Copenhagen | Denmark
Berlin | Germany | Dakar | Senegal
Berlin | Germany | Damascus | Syria
Berlin | Germany | Dhaka | Bangladesh
Berlin | Germany | Doha | Qatar
Berlin | Germany | Dublin | Ireland
Berlin | Germany | Dushanbe | Tajikistan
Berlin | Germany | Funafuti | Tuvalu
Berlin | Germany | Gaborone | Botswana
Berlin | Germany | Georgetown | Guyana
Berlin | Germany | Hanoi | Vietnam
Berlin | Germany | Harare | Zimbabwe
Berlin | Germany | Havana | Cuba
Berlin | Germany | Helsinki | Finland
Berlin | Germany | Islamabad | Pakistan
Berlin | Germany | Jakarta | Indonesia
Berlin | Germany | Kabul | Afghanistan
Berlin | Germany | Kampala | Uganda
Berlin | Germany | Kathmandu | Nepal
Berlin | Germany | Khartoum | Sudan
Berlin | Germany | Kiev | Ukraine
Berlin | Germany | Kigali | Rwanda
Berlin | Germany | Kingston | Jamaica
Berlin | Germany | Libreville | Gabon
Berlin | Germany | Lilongwe | Malawi
Berlin | Germany | Lima | Peru
Berlin | Germany | Lisbon | Portugal
Bern | Switzerland | Bishkek | Kyrgyzstan
Bern | Switzerland | Bratislava | Slovakia
Bern | Switzerland | Brussels | Belgium
Bern | Switzerland | Bucharest | Romania
Bern | Switzerland | Budapest | Hungary
Bern | Switzerland | Bujumbura | Burundi
Bern | Switzerland | Cairo | Egypt
Bern | Switzerland | Canberra | Australia
Bern | Switzerland | Caracas | Venezuela
Bern | Switzerland | Chisinau | Moldova
Bern | Switzerland | Conakry | Guinea
Bern | Switzerland | Copenhagen | Denmark
Bern | Switzerland | Dakar | Senegal
Bern | Switzerland | Damascus | Syria
Bern | Switzerland | Dhaka | Bangladesh
Bern | Switzerland | Doha | Qatar
Bern | Switzerland | Dublin | Ireland
Bern | Switzerland | Dushanbe | Tajikistan
Bern | Switzerland | Funafuti | Tuvalu
Bern | Switzerland | Gaborone | Botswana
Bern | Switzerland | Georgetown | Guyana
Bern | Switzerland | Hanoi | Vietnam
Bern | Switzerland | Harare | Zimbabwe
Bern | Switzerland | Havana | Cuba
Bern | Switzerland | Helsinki | Finland
Bern | Switzerland | Islamabad | Pakistan
Bern | Switzerland | Jakarta | Indonesia
Bern | Switzerland | Kabul | Afghanistan
Bern | Switzerland | Kampala | Uganda
Bern | Switzerland | Kathmandu | Nepal
Bern | Switzerland | Khartoum | Sudan
Bern | Switzerland | Kiev | Ukraine
Bern | Switzerland | Kigali | Rwanda
Bern | Switzerland | Kingston | Jamaica
Bern | Switzerland | Libreville | Gabon
Bern | Switzerland | Lilongwe | Malawi
Bern | Switzerland | Lima | Peru
Bern | Switzerland | Lisbon | Portugal
Bern | Switzerland | Ljubljana | Slovenia
Bishkek | Kyrgyzstan | Bratislava | Slovakia
Bishkek | Kyrgyzstan | Brussels | Belgium
Bishkek | Kyrgyzstan | Bucharest | Romania
Bishkek | Kyrgyzstan | Budapest | Hungary
Bishkek | Kyrgyzstan | Bujumbura | Burundi
Bishkek | Kyrgyzstan | Cairo | Egypt
Bishkek | Kyrgyzstan | Canberra | Australia
Bishkek | Kyrgyzstan | Caracas | Venezuela
Bishkek | Kyrgyzstan | Chisinau | Moldova
Bishkek | Kyrgyzstan | Conakry | Guinea
Bishkek | Kyrgyzstan | Copenhagen | Denmark
Bishkek | Kyrgyzstan | Dakar | Senegal
Bishkek | Kyrgyzstan | Damascus | Syria
Bishkek | Kyrgyzstan | Dhaka | Bangladesh
Bishkek | Kyrgyzstan | Doha | Qatar
Bishkek | Kyrgyzstan | Dublin | Ireland
Bishkek | Kyrgyzstan | Dushanbe | Tajikistan
Bishkek | Kyrgyzstan | Funafuti | Tuvalu
Bishkek | Kyrgyzstan | Gaborone | Botswana
Bishkek | Kyrgyzstan | Georgetown | Guyana
Bishkek | Kyrgyzstan | Hanoi | Vietnam
Bishkek | Kyrgyzstan | Harare | Zimbabwe
Bishkek | Kyrgyzstan | Havana | Cuba
Bishkek | Kyrgyzstan | Helsinki | Finland
Bishkek | Kyrgyzstan | Islamabad | Pakistan
Bishkek | Kyrgyzstan | Jakarta | Indonesia
Bishkek | Kyrgyzstan | Kabul | Afghanistan
Bishkek | Kyrgyzstan | Kampala | Uganda
Bishkek | Kyrgyzstan | Kathmandu | Nepal
Bishkek | Kyrgyzstan | Khartoum | Sudan
Bishkek | Kyrgyzstan | Kiev | Ukraine
Bishkek | Kyrgyzstan | Kigali | Rwanda
Bishkek | Kyrgyzstan | Kingston | Jamaica
Bishkek | Kyrgyzstan | Libreville | Gabon
Bishkek | Kyrgyzstan | Lilongwe | Malawi
Bishkek | Kyrgyzstan | Lima | Peru
Bishkek | Kyrgyzstan | Lisbon | Portugal
Bishkek | Kyrgyzstan | Ljubljana | Slovenia
Bishkek | Kyrgyzstan | London | England
Bratislava | Slovakia | Brussels | Belgium
Bratislava | Slovakia | Bucharest | Romania
Bratislava | Slovakia | Budapest | Hungary
Bratislava | Slovakia | Bujumbura | Burundi
Bratislava | Slovakia | Cairo | Egypt
Bratislava | Slovakia | Canberra | Australia
Bratislava | Slovakia | Caracas | Venezuela
Bratislava | Slovakia | Chisinau | Moldova
Bratislava | Slovakia | Conakry | Guinea
Bratislava | Slovakia | Copenhagen | Denmark
Bratislava | Slovakia | Dakar | Senegal
Bratislava | Slovakia | Damascus | Syria
Bratislava | Slovakia | Dhaka | Bangladesh
Bratislava | Slovakia | Doha | Qatar
Bratislava | Slovakia | Dublin | Ireland
Bratislava | Slovakia | Dushanbe | Tajikistan
Bratislava | Slovakia | Funafuti | Tuvalu
Bratislava | Slovakia | Gaborone | Botswana
Bratislava | Slovakia | Georgetown | Guyana
Bratislava | Slovakia | Hanoi | Vietnam
Bratislava | Slovakia | Harare | Zimbabwe
Bratislava | Slovakia | Havana | Cuba
Bratislava | Slovakia | Helsinki | Finland
Bratislava | Slovakia | Islamabad | Pakistan
Bratislava | Slovakia | Jakarta | Indonesia
Bratislava | Slovakia | Kabul | Afghanistan
Bratislava | Slovakia | Kampala | Uganda
Bratislava | Slovakia | Kathmandu | Nepal
Bratislava | Slovakia | Khartoum | Sudan
Bratislava | Slovakia | Kiev | Ukraine
Bratislava | Slovakia | Kigali | Rwanda
Bratislava | Slovakia | Kingston | Jamaica
Bratislava | Slovakia | Libreville | Gabon
Bratislava | Slovakia | Lilongwe | Malawi
Bratislava | Slovakia | Lima | Peru
Bratislava | Slovakia | Lisbon | Portugal
Bratislava | Slovakia | Ljubljana | Slovenia
Bratislava | Slovakia | London | England
Bratislava | Slovakia | Luanda | Angola
Brussels | Belgium | Bucharest | Romania
Brussels | Belgium | Budapest | Hungary
Brussels | Belgium | Bujumbura | Burundi
Brussels | Belgium | Cairo | Egypt
Brussels | Belgium | Canberra | Australia
Brussels | Belgium | Caracas | Venezuela
Brussels | Belgium | Chisinau | Moldova
Brussels | Belgium | Conakry | Guinea
Brussels | Belgium | Copenhagen | Denmark
Brussels | Belgium | Dakar | Senegal
Brussels | Belgium | Damascus | Syria
Brussels | Belgium | Dhaka | Bangladesh
Brussels | Belgium | Doha | Qatar
Brussels | Belgium | Dublin | Ireland
Brussels | Belgium | Dushanbe | Tajikistan
Brussels | Belgium | Funafuti | Tuvalu
Brussels | Belgium | Gaborone | Botswana
Brussels | Belgium | Georgetown | Guyana
Brussels | Belgium | Hanoi | Vietnam
Brussels | Belgium | Harare | Zimbabwe
Brussels | Belgium | Havana | Cuba
Brussels | Belgium | Helsinki | Finland
Brussels | Belgium | Islamabad | Pakistan
Brussels | Belgium | Jakarta | Indonesia
Brussels | Belgium | Kabul | Afghanistan
Brussels | Belgium | Kampala | Uganda
Brussels | Belgium | Kathmandu | Nepal
Brussels | Belgium | Khartoum | Sudan
Brussels | Belgium | Kiev | Ukraine
Brussels | Belgium | Kigali | Rwanda
Brussels | Belgium | Kingston | Jamaica
Brussels | Belgium | Libreville | Gabon
Brussels | Belgium | Lilongwe | Malawi
Brussels | Belgium | Lima | Peru
Brussels | Belgium | Lisbon | Portugal
Brussels | Belgium | Ljubljana | Slovenia
Brussels | Belgium | London | England
Brussels | Belgium | Luanda | Angola
Brussels | Belgium | Lusaka | Zambia
Bucharest | Romania | Budapest | Hungary
Bucharest | Romania | Bujumbura | Burundi
Bucharest | Romania | Cairo | Egypt
Bucharest | Romania | Canberra | Australia
Bucharest | Romania | Caracas | Venezuela
Bucharest | Romania | Chisinau | Moldova
Bucharest | Romania | Conakry | Guinea
Bucharest | Romania | Copenhagen | Denmark
Bucharest | Romania | Dakar | Senegal
Bucharest | Romania | Damascus | Syria
Bucharest | Romania | Dhaka | Bangladesh
Bucharest | Romania | Doha | Qatar
Bucharest | Romania | Dublin | Ireland
Bucharest | Romania | Dushanbe | Tajikistan
Bucharest | Romania | Funafuti | Tuvalu
Bucharest | Romania | Gaborone | Botswana
Bucharest | Romania | Georgetown | Guyana
Bucharest | Romania | Hanoi | Vietnam
Bucharest | Romania | Harare | Zimbabwe
Bucharest | Romania | Havana | Cuba
Bucharest | Romania | Helsinki | Finland
Bucharest | Romania | Islamabad | Pakistan
Bucharest | Romania | Jakarta | Indonesia
Bucharest | Romania | Kabul | Afghanistan
Bucharest | Romania | Kampala | Uganda
Bucharest | Romania | Kathmandu | Nepal
Bucharest | Romania | Khartoum | Sudan
Bucharest | Romania | Kiev | Ukraine
Bucharest | Romania | Kigali | Rwanda
Bucharest | Romania | Kingston | Jamaica
Bucharest | Romania | Libreville | Gabon
Bucharest | Romania | Lilongwe | Malawi
Bucharest | Romania | Lima | Peru
Bucharest | Romania | Lisbon | Portugal
Bucharest | Romania | Ljubljana | Slovenia
Bucharest | Romania | London | England
Bucharest | Romania | Luanda | Angola
Bucharest | Romania | Lusaka | Zambia
Bucharest | Romania | Madrid | Spain
Budapest | Hungary | Bujumbura | Burundi
Budapest | Hungary | Cairo | Egypt
Budapest | Hungary | Canberra | Australia
Budapest | Hungary | Caracas | Venezuela
Budapest | Hungary | Chisinau | Moldova
Budapest | Hungary | Conakry | Guinea
Budapest | Hungary | Copenhagen | Denmark
Budapest | Hungary | Dakar | Senegal
Budapest | Hungary | Damascus | Syria
Budapest | Hungary | Dhaka | Bangladesh
Budapest | Hungary | Doha | Qatar
Budapest | Hungary | Dublin | Ireland
Budapest | Hungary | Dushanbe | Tajikistan
Budapest | Hungary | Funafuti | Tuvalu
Budapest | Hungary | Gaborone | Botswana
Budapest | Hungary | Georgetown | Guyana
Budapest | Hungary | Hanoi | Vietnam
Budapest | Hungary | Harare | Zimbabwe
Budapest | Hungary | Havana | Cuba
Budapest | Hungary | Helsinki | Finland
Budapest | Hungary | Islamabad | Pakistan
Budapest | Hungary | Jakarta | Indonesia
Budapest | Hungary | Kabul | Afghanistan
Budapest | Hungary | Kampala | Uganda
Budapest | Hungary | Kathmandu | Nepal
Budapest | Hungary | Khartoum | Sudan
Budapest | Hungary | Kiev | Ukraine
Budapest | Hungary | Kigali | Rwanda
Budapest | Hungary | Kingston | Jamaica
Budapest | Hungary | Libreville | Gabon
Budapest | Hungary | Lilongwe | Malawi
Budapest | Hungary | Lima | Peru
Budapest | Hungary | Lisbon | Portugal
Budapest | Hungary | Ljubljana | Slovenia
Budapest | Hungary | London | England
Budapest | Hungary | Luanda | Angola
Budapest | Hungary | Lusaka | Zambia
Budapest | Hungary | Madrid | Spain
Budapest | Hungary | Managua | Nicaragua
Bujumbura | Burundi | Cairo | Egypt
Bujumbura | Burundi | Canberra | Australia
Bujumbura | Burundi | Caracas | Venezuela
Bujumbura | Burundi | Chisinau | Moldova
Bujumbura | Burundi | Conakry | Guinea
Bujumbura | Burundi | Copenhagen | Denmark
Bujumbura | Burundi | Dakar | Senegal
Bujumbura | Burundi | Damascus | Syria
Bujumbura | Burundi | Dhaka | Bangladesh
Bujumbura | Burundi | Doha | Qatar
Bujumbura | Burundi | Dublin | Ireland
Bujumbura | Burundi | Dushanbe | Tajikistan
Bujumbura | Burundi | Funafuti | Tuvalu
Bujumbura | Burundi | Gaborone | Botswana
Bujumbura | Burundi | Georgetown | Guyana
Bujumbura | Burundi | Hanoi | Vietnam
Bujumbura | Burundi | Harare | Zimbabwe
Bujumbura | Burundi | Havana | Cuba
Bujumbura | Burundi | Helsinki | Finland
Bujumbura | Burundi | Islamabad | Pakistan
Bujumbura | Burundi | Jakarta | Indonesia
Bujumbura | Burundi | Kabul | Afghanistan
Bujumbura | Burundi | Kampala | Uganda
Bujumbura | Burundi | Kathmandu | Nepal
Bujumbura | Burundi | Khartoum | Sudan
Bujumbura | Burundi | Kiev | Ukraine
Bujumbura | Burundi | Kigali | Rwanda
Bujumbura | Burundi | Kingston | Jamaica
Bujumbura | Burundi | Libreville | Gabon
Bujumbura | Burundi | Lilongwe | Malawi
Bujumbura | Burundi | Lima | Peru
Bujumbura | Burundi | Lisbon | Portugal
Bujumbura | Burundi | Ljubljana | Slovenia
Bujumbura | Burundi | London | England
Bujumbura | Burundi | Luanda | Angola
Bujumbura | Burundi | Lusaka | Zambia
Bujumbura | Burundi | Madrid | Spain
Bujumbura | Burundi | Managua | Nicaragua
Bujumbura | Burundi | Manama | Bahrain
Cairo | Egypt | Canberra | Australia
Cairo | Egypt | Caracas | Venezuela
Cairo | Egypt | Chisinau | Moldova
Cairo | Egypt | Conakry | Guinea
Cairo | Egypt | Copenhagen | Denmark
Cairo | Egypt | Dakar | Senegal
Cairo | Egypt | Damascus | Syria
Cairo | Egypt | Dhaka | Bangladesh
Cairo | Egypt | Doha | Qatar
Cairo | Egypt | Dublin | Ireland
Cairo | Egypt | Dushanbe | Tajikistan
Cairo | Egypt | Funafuti | Tuvalu
Cairo | Egypt | Gaborone | Botswana
Cairo | Egypt | Georgetown | Guyana
Cairo | Egypt | Hanoi | Vietnam
Cairo | Egypt | Harare | Zimbabwe
Cairo | Egypt | Havana | Cuba
Cairo | Egypt | Helsinki | Finland
Cairo | Egypt | Islamabad | Pakistan
Cairo | Egypt | Jakarta | Indonesia
Cairo | Egypt | Kabul | Afghanistan
Cairo | Egypt | Kampala | Uganda
Cairo | Egypt | Kathmandu | Nepal
Cairo | Egypt | Khartoum | Sudan
Cairo | Egypt | Kiev | Ukraine
Cairo | Egypt | Kigali | Rwanda
Cairo | Egypt | Kingston | Jamaica
Cairo | Egypt | Libreville | Gabon
Cairo | Egypt | Lilongwe | Malawi
Cairo | Egypt | Lima | Peru
Cairo | Egypt | Lisbon | Portugal
Cairo | Egypt | Ljubljana | Slovenia
Cairo | Egypt | London | England
Cairo | Egypt | Luanda | Angola
Cairo | Egypt | Lusaka | Zambia
Cairo | Egypt | Madrid | Spain
Cairo | Egypt | Managua | Nicaragua
Cairo | Egypt | Manama | Bahrain
Cairo | Egypt | Manila | Philippines
Canberra | Australia | Caracas | Venezuela
Canberra | Australia | Chisinau | Moldova
Canberra | Australia | Conakry | Guinea
Canberra | Australia | Copenhagen | Denmark
Canberra | Australia | Dakar | Senegal
Canberra | Australia | Damascus | Syria
Canberra | Australia | Dhaka | Bangladesh
Canberra | Australia | Doha | Qatar
Canberra | Australia | Dublin | Ireland
Canberra | Australia | Dushanbe | Tajikistan
Canberra | Australia | Funafuti | Tuvalu
Canberra | Australia | Gaborone | Botswana
Canberra | Australia | Georgetown | Guyana
Canberra | Australia | Hanoi | Vietnam
Canberra | Australia | Harare | Zimbabwe
Canberra | Australia | Havana | Cuba
Canberra | Australia | Helsinki | Finland
Canberra | Australia | Islamabad | Pakistan
Canberra | Australia | Jakarta | Indonesia
Canberra | Australia | Kabul | Afghanistan
Canberra | Australia | Kampala | Uganda
Canberra | Australia | Kathmandu | Nepal
Canberra | Australia | Khartoum | Sudan
Canberra | Australia | Kiev | Ukraine
Canberra | Australia | Kigali | Rwanda
Canberra | Australia | Kingston | Jamaica
Canberra | Australia | Libreville | Gabon
Canberra | Australia | Lilongwe | Malawi
Canberra | Australia | Lima | Peru
Canberra | Australia | Lisbon | Portugal
Canberra | Australia | Ljubljana | Slovenia
Canberra | Australia | London | England
Canberra | Australia | Luanda | Angola
Canberra | Australia | Lusaka | Zambia
Canberra | Australia | Madrid | Spain
Canberra | Australia | Managua | Nicaragua
Canberra | Australia | Manama | Bahrain
Canberra | Australia | Manila | Philippines
Canberra | Australia | Maputo | Mozambique
Caracas | Venezuela | Chisinau | Moldova
Caracas | Venezuela | Conakry | Guinea
Caracas | Venezuela | Copenhagen | Denmark
Caracas | Venezuela | Dakar | Senegal
Caracas | Venezuela | Damascus | Syria
Caracas | Venezuela | Dhaka | Bangladesh
Caracas | Venezuela | Doha | Qatar
Caracas | Venezuela | Dublin | Ireland
Caracas | Venezuela | Dushanbe | Tajikistan
Caracas | Venezuela | Funafuti | Tuvalu
Caracas | Venezuela | Gaborone | Botswana
Caracas | Venezuela | Georgetown | Guyana
Caracas | Venezuela | Hanoi | Vietnam
Caracas | Venezuela | Harare | Zimbabwe
Caracas | Venezuela | Havana | Cuba
Caracas | Venezuela | Helsinki | Finland
Caracas | Venezuela | Islamabad | Pakistan
Caracas | Venezuela | Jakarta | Indonesia
Caracas | Venezuela | Kabul | Afghanistan
Caracas | Venezuela | Kampala | Uganda
Caracas | Venezuela | Kathmandu | Nepal
Caracas | Venezuela | Khartoum | Sudan
Caracas | Venezuela | Kiev | Ukraine
Caracas | Venezuela | Kigali | Rwanda
Caracas | Venezuela | Kingston | Jamaica
Caracas | Venezuela | Libreville | Gabon
Caracas | Venezuela | Lilongwe | Malawi
Caracas | Venezuela | Lima | Peru
Caracas | Venezuela | Lisbon | Portugal
Caracas | Venezuela | Ljubljana | Slovenia
Caracas | Venezuela | London | England
Caracas | Venezuela | Luanda | Angola
Caracas | Venezuela | Lusaka | Zambia
Caracas | Venezuela | Madrid | Spain
Caracas | Venezuela | Managua | Nicaragua
Caracas | Venezuela | Manama | Bahrain
Caracas | Venezuela | Manila | Philippines
Caracas | Venezuela | Maputo | Mozambique
Caracas | Venezuela | Minsk | Belarus
Chisinau | Moldova | Conakry | Guinea
Chisinau | Moldova | Copenhagen | Denmark
Chisinau | Moldova | Dakar | Senegal
Chisinau | Moldova | Damascus | Syria
Chisinau | Moldova | Dhaka | Bangladesh
Chisinau | Moldova | Doha | Qatar
Chisinau | Moldova | Dublin | Ireland
Chisinau | Moldova | Dushanbe | Tajikistan
Chisinau | Moldova | Funafuti | Tuvalu
Chisinau | Moldova | Gaborone | Botswana
Chisinau | Moldova | Georgetown | Guyana
Chisinau | Moldova | Hanoi | Vietnam
Chisinau | Moldova | Harare | Zimbabwe
Chisinau | Moldova | Havana | Cuba
Chisinau | Moldova | Helsinki | Finland
Chisinau | Moldova | Islamabad | Pakistan
Chisinau | Moldova | Jakarta | Indonesia
Chisinau | Moldova | Kabul | Afghanistan
Chisinau | Moldova | Kampala | Uganda
Chisinau | Moldova | Kathmandu | Nepal
Chisinau | Moldova | Khartoum | Sudan
Chisinau | Moldova | Kiev | Ukraine
Chisinau | Moldova | Kigali | Rwanda
Chisinau | Moldova | Kingston | Jamaica
Chisinau | Moldova | Libreville | Gabon
Chisinau | Moldova | Lilongwe | Malawi
Chisinau | Moldova | Lima | Peru
Chisinau | Moldova | Lisbon | Portugal
Chisinau | Moldova | Ljubljana | Slovenia
Chisinau | Moldova | London | England
Chisinau | Moldova | Luanda | Angola
Chisinau | Moldova | Lusaka | Zambia
Chisinau | Moldova | Madrid | Spain
Chisinau | Moldova | Managua | Nicaragua
Chisinau | Moldova | Manama | Bahrain
Chisinau | Moldova | Manila | Philippines
Chisinau | Moldova | Maputo | Mozambique
Chisinau | Moldova | Minsk | Belarus
Chisinau | Moldova | Mogadishu | Somalia
Conakry | Guinea | Copenhagen | Denmark
Conakry | Guinea | Dakar | Senegal
Conakry | Guinea | Damascus | Syria
Conakry | Guinea | Dhaka | Bangladesh
Conakry | Guinea | Doha | Qatar
Conakry | Guinea | Dublin | Ireland
Conakry | Guinea | Dushanbe | Tajikistan
Conakry | Guinea | Funafuti | Tuvalu
Conakry | Guinea | Gaborone | Botswana
Conakry | Guinea | Georgetown | Guyana
Conakry | Guinea | Hanoi | Vietnam
Conakry | Guinea | Harare | Zimbabwe
Conakry | Guinea | Havana | Cuba
Conakry | Guinea | Helsinki | Finland
Conakry | Guinea | Islamabad | Pakistan
Conakry | Guinea | Jakarta | Indonesia
Conakry | Guinea | Kabul | Afghanistan
Conakry | Guinea | Kampala | Uganda
Conakry | Guinea | Kathmandu | Nepal
Conakry | Guinea | Khartoum | Sudan
Conakry | Guinea | Kiev | Ukraine
Conakry | Guinea | Kigali | Rwanda
Conakry | Guinea | Kingston | Jamaica
Conakry | Guinea | Libreville | Gabon
Conakry | Guinea | Lilongwe | Malawi
Conakry | Guinea | Lima | Peru
Conakry | Guinea | Lisbon | Portugal
Conakry | Guinea | Ljubljana | Slovenia
Conakry | Guinea | London | England
Conakry | Guinea | Luanda | Angola
Conakry | Guinea | Lusaka | Zambia
Conakry | Guinea | Madrid | Spain
Conakry | Guinea | Managua | Nicaragua
Conakry | Guinea | Manama | Bahrain
Conakry | Guinea | Manila | Philippines
Conakry | Guinea | Maputo | Mozambique
Conakry | Guinea | Minsk | Belarus
Conakry | Guinea | Mogadishu | Somalia
Conakry | Guinea | Monrovia | Liberia
Copenhagen | Denmark | Dakar | Senegal
Copenhagen | Denmark | Damascus | Syria
Copenhagen | Denmark | Dhaka | Bangladesh
Copenhagen | Denmark | Doha | Qatar
Copenhagen | Denmark | Dublin | Ireland
Copenhagen | Denmark | Dushanbe | Tajikistan
Copenhagen | Denmark | Funafuti | Tuvalu
Copenhagen | Denmark | Gaborone | Botswana
Copenhagen | Denmark | Georgetown | Guyana
Copenhagen | Denmark | Hanoi | Vietnam
Copenhagen | Denmark | Harare | Zimbabwe
Copenhagen | Denmark | Havana | Cuba
Copenhagen | Denmark | Helsinki | Finland
Copenhagen | Denmark | Islamabad | Pakistan
Copenhagen | Denmark | Jakarta | Indonesia
Copenhagen | Denmark | Kabul | Afghanistan
Copenhagen | Denmark | Kampala | Uganda
Copenhagen | Denmark | Kathmandu | Nepal
Copenhagen | Denmark | Khartoum | Sudan
Copenhagen | Denmark | Kiev | Ukraine
Copenhagen | Denmark | Kigali | Rwanda
Copenhagen | Denmark | Kingston | Jamaica
Copenhagen | Denmark | Libreville | Gabon
Copenhagen | Denmark | Lilongwe | Malawi
Copenhagen | Denmark | Lima | Peru
Copenhagen | Denmark | Lisbon | Portugal
Copenhagen | Denmark | Ljubljana | Slovenia
Copenhagen | Denmark | London | England
Copenhagen | Denmark | Luanda | Angola
Copenhagen | Denmark | Lusaka | Zambia
Copenhagen | Denmark | Madrid | Spain
Copenhagen | Denmark | Managua | Nicaragua
Copenhagen | Denmark | Manama | Bahrain
Copenhagen | Denmark | Manila | Philippines
Copenhagen | Denmark | Maputo | Mozambique
Copenhagen | Denmark | Minsk | Belarus
Copenhagen | Denmark | Mogadishu | Somalia
Copenhagen | Denmark | Monrovia | Liberia
Copenhagen | Denmark | Montevideo | Uruguay
Dakar | Senegal | Damascus | Syria
Dakar | Senegal | Dhaka | Bangladesh
Dakar | Senegal | Doha | Qatar
Dakar | Senegal | Dublin | Ireland
Dakar | Senegal | Dushanbe | Tajikistan
Dakar | Senegal | Funafuti | Tuvalu
Dakar | Senegal | Gaborone | Botswana
Dakar | Senegal | Georgetown | Guyana
Dakar | Senegal | Hanoi | Vietnam
Dakar | Senegal | Harare | Zimbabwe
Dakar | Senegal | Havana | Cuba
Dakar | Senegal | Helsinki | Finland
Dakar | Senegal | Islamabad | Pakistan
Dakar | Senegal | Jakarta | Indonesia
Dakar | Senegal | Kabul | Afghanistan
Dakar | Senegal | Kampala | Uganda
Dakar | Senegal | Kathmandu | Nepal
Dakar | Senegal | Khartoum | Sudan
Dakar | Senegal | Kiev | Ukraine
Dakar | Senegal | Kigali | Rwanda
Dakar | Senegal | Kingston | Jamaica
Dakar | Senegal | Libreville | Gabon
Dakar | Senegal | Lilongwe | Malawi
Dakar | Senegal | Lima | Peru
Dakar | Senegal | Lisbon | Portugal
Dakar | Senegal | Ljubljana | Slovenia
Dakar | Senegal | London | England
Dakar | Senegal | Luanda | Angola
Dakar | Senegal | Lusaka | Zambia
Dakar | Senegal | Madrid | Spain
Dakar | Senegal | Managua | Nicaragua
Dakar | Senegal | Manama | Bahrain
Dakar | Senegal | Manila | Philippines
Dakar | Senegal | Maputo | Mozambique
Dakar | Senegal | Minsk | Belarus
Dakar | Senegal | Mogadishu | Somalia
Dakar | Senegal | Monrovia | Liberia
Dakar | Senegal | Montevideo | Uruguay
Dakar | Senegal | Moscow | Russia
Damascus | Syria | Dhaka | Bangladesh
Damascus | Syria | Doha | Qatar
Damascus | Syria | Dublin | Ireland
Damascus | Syria | Dushanbe | Tajikistan
Damascus | Syria | Funafuti | Tuvalu
Damascus | Syria | Gaborone | Botswana
Damascus | Syria | Georgetown | Guyana
Damascus | Syria | Hanoi | Vietnam
Damascus | Syria | Harare | Zimbabwe
Damascus | Syria | Havana | Cuba
Damascus | Syria | Helsinki | Finland
Damascus | Syria | Islamabad | Pakistan
Damascus | Syria | Jakarta | Indonesia
Damascus | Syria | Kabul | Afghanistan
Damascus | Syria | Kampala | Uganda
Damascus | Syria | Kathmandu | Nepal
Damascus | Syria | Khartoum | Sudan
Damascus | Syria | Kiev | Ukraine
Damascus | Syria | Kigali | Rwanda
Damascus | Syria | Kingston | Jamaica
Damascus | Syria | Libreville | Gabon
Damascus | Syria | Lilongwe | Malawi
Damascus | Syria | Lima | Peru
Damascus | Syria | Lisbon | Portugal
Damascus | Syria | Ljubljana | Slovenia
Damascus | Syria | London | England
Damascus | Syria | Luanda | Angola
Damascus | Syria | Lusaka | Zambia
Damascus | Syria | Madrid | Spain
Damascus | Syria | Managua | Nicaragua
Damascus | Syria | Manama | Bahrain
Damascus | Syria | Manila | Philippines
Damascus | Syria | Maputo | Mozambique
Damascus | Syria | Minsk | Belarus
Damascus | Syria | Mogadishu | Somalia
Damascus | Syria | Monrovia | Liberia
Damascus | Syria | Montevideo | Uruguay
Damascus | Syria | Moscow | Russia
Damascus | Syria | Muscat | Oman
Dhaka | Bangladesh | Doha | Qatar
Dhaka | Bangladesh | Dublin | Ireland
Dhaka | Bangladesh | Dushanbe | Tajikistan
Dhaka | Bangladesh | Funafuti | Tuvalu
Dhaka | Bangladesh | Gaborone | Botswana
Dhaka | Bangladesh | Georgetown | Guyana
Dhaka | Bangladesh | Hanoi | Vietnam
Dhaka | Bangladesh | Harare | Zimbabwe
Dhaka | Bangladesh | Havana | Cuba
Dhaka | Bangladesh | Helsinki | Finland
Dhaka | Bangladesh | Islamabad | Pakistan
Dhaka | Bangladesh | Jakarta | Indonesia
Dhaka | Bangladesh | Kabul | Afghanistan
Dhaka | Bangladesh | Kampala | Uganda
Dhaka | Bangladesh | Kathmandu | Nepal
Dhaka | Bangladesh | Khartoum | Sudan
Dhaka | Bangladesh | Kiev | Ukraine
Dhaka | Bangladesh | Kigali | Rwanda
Dhaka | Bangladesh | Kingston | Jamaica
Dhaka | Bangladesh | Libreville | Gabon
Dhaka | Bangladesh | Lilongwe | Malawi
Dhaka | Bangladesh | Lima | Peru
Dhaka | Bangladesh | Lisbon | Portugal
Dhaka | Bangladesh | Ljubljana | Slovenia
Dhaka | Bangladesh | London | England
Dhaka | Bangladesh | Luanda | Angola
Dhaka | Bangladesh | Lusaka | Zambia
Dhaka | Bangladesh | Madrid | Spain
Dhaka | Bangladesh | Managua | Nicaragua
Dhaka | Bangladesh | Manama | Bahrain
Dhaka | Bangladesh | Manila | Philippines
Dhaka | Bangladesh | Maputo | Mozambique
Dhaka | Bangladesh | Minsk | Belarus
Dhaka | Bangladesh | Mogadishu | Somalia
Dhaka | Bangladesh | Monrovia | Liberia
Dhaka | Bangladesh | Montevideo | Uruguay
Dhaka | Bangladesh | Moscow | Russia
Dhaka | Bangladesh | Muscat | Oman
Dhaka | Bangladesh | Nairobi | Kenya
Doha | Qatar | Dublin | Ireland
Doha | Qatar | Dushanbe | Tajikistan
Doha | Qatar | Funafuti | Tuvalu
Doha | Qatar | Gaborone | Botswana
Doha | Qatar | Georgetown | Guyana
Doha | Qatar | Hanoi | Vietnam
Doha | Qatar | Harare | Zimbabwe
Doha | Qatar | Havana | Cuba
Doha | Qatar | Helsinki | Finland
Doha | Qatar | Islamabad | Pakistan
Doha | Qatar | Jakarta | Indonesia
Doha | Qatar | Kabul | Afghanistan
Doha | Qatar | Kampala | Uganda
Doha | Qatar | Kathmandu | Nepal
Doha | Qatar | Khartoum | Sudan
Doha | Qatar | Kiev | Ukraine
Doha | Qatar | Kigali | Rwanda
Doha | Qatar | Kingston | Jamaica
Doha | Qatar | Libreville | Gabon
Doha | Qatar | Lilongwe | Malawi
Doha | Qatar | Lima | Peru
Doha | Qatar | Lisbon | Portugal
Doha | Qatar | Ljubljana | Slovenia
Doha | Qatar | London | England
Doha | Qatar | Luanda | Angola
Doha | Qatar | Lusaka | Zambia
Doha | Qatar | Madrid | Spain
Doha | Qatar | Managua | Nicaragua
Doha | Qatar | Manama | Bahrain
Doha | Qatar | Manila | Philippines
Doha | Qatar | Maputo | Mozambique
Doha | Qatar | Minsk | Belarus
Doha | Qatar | Mogadishu | Somalia
Doha | Qatar | Monrovia | Liberia
Doha | Qatar | Montevideo | Uruguay
Doha | Qatar | Moscow | Russia
Doha | Qatar | Muscat | Oman
Doha | Qatar | Nairobi | Kenya
Doha | Qatar | Nassau | Bahamas
Dublin | Ireland | Dushanbe | Tajikistan
Dublin | Ireland | Funafuti | Tuvalu
Dublin | Ireland | Gaborone | Botswana
Dublin | Ireland | Georgetown | Guyana
Dublin | Ireland | Hanoi | Vietnam
Dublin | Ireland | Harare | Zimbabwe
Dublin | Ireland | Havana | Cuba
Dublin | Ireland | Helsinki | Finland
Dublin | Ireland | Islamabad | Pakistan
Dublin | Ireland | Jakarta | Indonesia
Dublin | Ireland | Kabul | Afghanistan
Dublin | Ireland | Kampala | Uganda
Dublin | Ireland | Kathmandu | Nepal
Dublin | Ireland | Khartoum | Sudan
Dublin | Ireland | Kiev | Ukraine
Dublin | Ireland | Kigali | Rwanda
Dublin | Ireland | Kingston | Jamaica
Dublin | Ireland | Libreville | Gabon
Dublin | Ireland | Lilongwe | Malawi
Dublin | Ireland | Lima | Peru
Dublin | Ireland | Lisbon | Portugal
Dublin | Ireland | Ljubljana | Slovenia
Dublin | Ireland | London | England
Dublin | Ireland | Luanda | Angola
Dublin | Ireland | Lusaka | Zambia
Dublin | Ireland | Madrid | Spain
Dublin | Ireland | Managua | Nicaragua
Dublin | Ireland | Manama | Bahrain
Dublin | Ireland | Manila | Philippines
Dublin | Ireland | Maputo | Mozambique
Dublin | Ireland | Minsk | Belarus
Dublin | Ireland | Mogadishu | Somalia
Dublin | Ireland | Monrovia | Liberia
Dublin | Ireland | Montevideo | Uruguay
Dublin | Ireland | Moscow | Russia
Dublin | Ireland | Muscat | Oman
Dublin | Ireland | Nairobi | Kenya
Dublin | Ireland | Nassau | Bahamas
Dublin | Ireland | Niamey | Niger
Dushanbe | Tajikistan | Funafuti | Tuvalu
Dushanbe | Tajikistan | Gaborone | Botswana
Dushanbe | Tajikistan | Georgetown | Guyana
Dushanbe | Tajikistan | Hanoi | Vietnam
Dushanbe | Tajikistan | Harare | Zimbabwe
Dushanbe | Tajikistan | Havana | Cuba
Dushanbe | Tajikistan | Helsinki | Finland
Dushanbe | Tajikistan | Islamabad | Pakistan
Dushanbe | Tajikistan | Jakarta | Indonesia
Dushanbe | Tajikistan | Kabul | Afghanistan
Dushanbe | Tajikistan | Kampala | Uganda
Dushanbe | Tajikistan | Kathmandu | Nepal
Dushanbe | Tajikistan | Khartoum | Sudan
Dushanbe | Tajikistan | Kiev | Ukraine
Dushanbe | Tajikistan | Kigali | Rwanda
Dushanbe | Tajikistan | Kingston | Jamaica
Dushanbe | Tajikistan | Libreville | Gabon
Dushanbe | Tajikistan | Lilongwe | Malawi
Dushanbe | Tajikistan | Lima | Peru
Dushanbe | Tajikistan | Lisbon | Portugal
Dushanbe | Tajikistan | Ljubljana | Slovenia
Dushanbe | Tajikistan | London | England
Dushanbe | Tajikistan | Luanda | Angola
Dushanbe | Tajikistan | Lusaka | Zambia
Dushanbe | Tajikistan | Madrid | Spain
Dushanbe | Tajikistan | Managua | Nicaragua
Dushanbe | Tajikistan | Manama | Bahrain
Dushanbe | Tajikistan | Manila | Philippines
Dushanbe | Tajikistan | Maputo | Mozambique
Dushanbe | Tajikistan | Minsk | Belarus
Dushanbe | Tajikistan | Mogadishu | Somalia
Dushanbe | Tajikistan | Monrovia | Liberia
Dushanbe | Tajikistan | Montevideo | Uruguay
Dushanbe | Tajikistan | Moscow | Russia
Dushanbe | Tajikistan | Muscat | Oman
Dushanbe | Tajikistan | Nairobi | Kenya
Dushanbe | Tajikistan | Nassau | Bahamas
Dushanbe | Tajikistan | Niamey | Niger
Dushanbe | Tajikistan | Nicosia | Cyprus
Funafuti | Tuvalu | Gaborone | Botswana
Funafuti | Tuvalu | Georgetown | Guyana
Funafuti | Tuvalu | Hanoi | Vietnam
Funafuti | Tuvalu | Harare | Zimbabwe
Funafuti | Tuvalu | Havana | Cuba
Funafuti | Tuvalu | Helsinki | Finland
Funafuti | Tuvalu | Islamabad | Pakistan
Funafuti | Tuvalu | Jakarta | Indonesia
Funafuti | Tuvalu | Kabul | Afghanistan
Funafuti | Tuvalu | Kampala | Uganda
Funafuti | Tuvalu | Kathmandu | Nepal
Funafuti | Tuvalu | Khartoum | Sudan
Funafuti | Tuvalu | Kiev | Ukraine
Funafuti | Tuvalu | Kigali | Rwanda
Funafuti | Tuvalu | Kingston | Jamaica
Funafuti | Tuvalu | Libreville | Gabon
Funafuti | Tuvalu | Lilongwe | Malawi
Funafuti | Tuvalu | Lima | Peru
Funafuti | Tuvalu | Lisbon | Portugal
Funafuti | Tuvalu | Ljubljana | Slovenia
Funafuti | Tuvalu | London | England
Funafuti | Tuvalu | Luanda | Angola
Funafuti | Tuvalu | Lusaka | Zambia
Funafuti | Tuvalu | Madrid | Spain
Funafuti | Tuvalu | Managua | Nicaragua
Funafuti | Tuvalu | Manama | Bahrain
Funafuti | Tuvalu | Manila | Philippines
Funafuti | Tuvalu | Maputo | Mozambique
Funafuti | Tuvalu | Minsk | Belarus
Funafuti | Tuvalu | Mogadishu | Somalia
Funafuti | Tuvalu | Monrovia | Liberia
Funafuti | Tuvalu | Montevideo | Uruguay
Funafuti | Tuvalu | Moscow | Russia
Funafuti | Tuvalu | Muscat | Oman
Funafuti | Tuvalu | Nairobi | Kenya
Funafuti | Tuvalu | Nassau | Bahamas
Funafuti | Tuvalu | Niamey | Niger
Funafuti | Tuvalu | Nicosia | Cyprus
Funafuti | Tuvalu | Nouakchott | Mauritania
Gaborone | Botswana | Georgetown | Guyana
Gaborone | Botswana | Hanoi | Vietnam
Gaborone | Botswana | Harare | Zimbabwe
Gaborone | Botswana | Havana | Cuba
Gaborone | Botswana | Helsinki | Finland
Gaborone | Botswana | Islamabad | Pakistan
Gaborone | Botswana | Jakarta | Indonesia
Gaborone | Botswana | Kabul | Afghanistan
Gaborone | Botswana | Kampala | Uganda
Gaborone | Botswana | Kathmandu | Nepal
Gaborone | Botswana | Khartoum | Sudan
Gaborone | Botswana | Kiev | Ukraine
Gaborone | Botswana | Kigali | Rwanda
Gaborone | Botswana | Kingston | Jamaica
Gaborone | Botswana | Libreville | Gabon
Gaborone | Botswana | Lilongwe | Malawi
Gaborone | Botswana | Lima | Peru
Gaborone | Botswana | Lisbon | Portugal
Gaborone | Botswana | Ljubljana | Slovenia
Gaborone | Botswana | London | England
Gaborone | Botswana | Luanda | Angola
Gaborone | Botswana | Lusaka | Zambia
Gaborone | Botswana | Madrid | Spain
Gaborone | Botswana | Managua | Nicaragua
Gaborone | Botswana | Manama | Bahrain
Gaborone | Botswana | Manila | Philippines
Gaborone | Botswana | Maputo | Mozambique
Gaborone | Botswana | Minsk | Belarus
Gaborone | Botswana | Mogadishu | Somalia
Gaborone | Botswana | Monrovia | Liberia
Gaborone | Botswana | Montevideo | Uruguay
Gaborone | Botswana | Moscow | Russia
Gaborone | Botswana | Muscat | Oman
Gaborone | Botswana | Nairobi | Kenya
Gaborone | Botswana | Nassau | Bahamas
Gaborone | Botswana | Niamey | Niger
Gaborone | Botswana | Nicosia | Cyprus
Gaborone | Botswana | Nouakchott | Mauritania
Gaborone | Botswana | Nuuk | Greenland
Georgetown | Guyana | Hanoi | Vietnam
Georgetown | Guyana | Harare | Zimbabwe
Georgetown | Guyana | Havana | Cuba
Georgetown | Guyana | Helsinki | Finland
Georgetown | Guyana | Islamabad | Pakistan
Georgetown | Guyana | Jakarta | Indonesia
Georgetown | Guyana | Kabul | Afghanistan
Georgetown | Guyana | Kampala | Uganda
Georgetown | Guyana | Kathmandu | Nepal
Georgetown | Guyana | Khartoum | Sudan
Georgetown | Guyana | Kiev | Ukraine
Georgetown | Guyana | Kigali | Rwanda
Georgetown | Guyana | Kingston | Jamaica
Georgetown | Guyana | Libreville | Gabon
Georgetown | Guyana | Lilongwe | Malawi
Georgetown | Guyana | Lima | Peru
Georgetown | Guyana | Lisbon | Portugal
Georgetown | Guyana | Ljubljana | Slovenia
Georgetown | Guyana | London | England
Georgetown | Guyana | Luanda | Angola
Georgetown | Guyana | Lusaka | Zambia
Georgetown | Guyana | Madrid | Spain
Georgetown | Guyana | Managua | Nicaragua
Georgetown | Guyana | Manama | Bahrain
Georgetown | Guyana | Manila | Philippines
Georgetown | Guyana | Maputo | Mozambique
Georgetown | Guyana | Minsk | Belarus
Georgetown | Guyana | Mogadishu | Somalia
Georgetown | Guyana | Monrovia | Liberia
Georgetown | Guyana | Montevideo | Uruguay
Georgetown | Guyana | Moscow | Russia
Georgetown | Guyana | Muscat | Oman
Georgetown | Guyana | Nairobi | Kenya
Georgetown | Guyana | Nassau | Bahamas
Georgetown | Guyana | Niamey | Niger
Georgetown | Guyana | Nicosia | Cyprus
Georgetown | Guyana | Nouakchott | Mauritania
Georgetown | Guyana | Nuuk | Greenland
Georgetown | Guyana | Oslo | Norway
Hanoi | Vietnam | Harare | Zimbabwe
Hanoi | Vietnam | Havana | Cuba
Hanoi | Vietnam | Helsinki | Finland
Hanoi | Vietnam | Islamabad | Pakistan
Hanoi | Vietnam | Jakarta | Indonesia
Hanoi | Vietnam | Kabul | Afghanistan
Hanoi | Vietnam | Kampala | Uganda
Hanoi | Vietnam | Kathmandu | Nepal
Hanoi | Vietnam | Khartoum | Sudan
Hanoi | Vietnam | Kiev | Ukraine
Hanoi | Vietnam | Kigali | Rwanda
Hanoi | Vietnam | Kingston | Jamaica
Hanoi | Vietnam | Libreville | Gabon
Hanoi | Vietnam | Lilongwe | Malawi
Hanoi | Vietnam | Lima | Peru
Hanoi | Vietnam | Lisbon | Portugal
Hanoi | Vietnam | Ljubljana | Slovenia
Hanoi | Vietnam | London | England
Hanoi | Vietnam | Luanda | Angola
Hanoi | Vietnam | Lusaka | Zambia
Hanoi | Vietnam | Madrid | Spain
Hanoi | Vietnam | Managua | Nicaragua
Hanoi | Vietnam | Manama | Bahrain
Hanoi | Vietnam | Manila | Philippines
Hanoi | Vietnam | Maputo | Mozambique
Hanoi | Vietnam | Minsk | Belarus
Hanoi | Vietnam | Mogadishu | Somalia
Hanoi | Vietnam | Monrovia | Liberia
Hanoi | Vietnam | Montevideo | Uruguay
Hanoi | Vietnam | Moscow | Russia
Hanoi | Vietnam | Muscat | Oman
Hanoi | Vietnam | Nairobi | Kenya
Hanoi | Vietnam | Nassau | Bahamas
Hanoi | Vietnam | Niamey | Niger
Hanoi | Vietnam | Nicosia | Cyprus
Hanoi | Vietnam | Nouakchott | Mauritania
Hanoi | Vietnam | Nuuk | Greenland
Hanoi | Vietnam | Oslo | Norway
Hanoi | Vietnam | Ottawa | Canada
Harare | Zimbabwe | Havana | Cuba
Harare | Zimbabwe | Helsinki | Finland
Harare | Zimbabwe | Islamabad | Pakistan
Harare | Zimbabwe | Jakarta | Indonesia
Harare | Zimbabwe | Kabul | Afghanistan
Harare | Zimbabwe | Kampala | Uganda
Harare | Zimbabwe | Kathmandu | Nepal
Harare | Zimbabwe | Khartoum | Sudan
Harare | Zimbabwe | Kiev | Ukraine
Harare | Zimbabwe | Kigali | Rwanda
Harare | Zimbabwe | Kingston | Jamaica
Harare | Zimbabwe | Libreville | Gabon
Harare | Zimbabwe | Lilongwe | Malawi
Harare | Zimbabwe | Lima | Peru
Harare | Zimbabwe | Lisbon | Portugal
Harare | Zimbabwe | Ljubljana | Slovenia
Harare | Zimbabwe | London | England
Harare | Zimbabwe | Luanda | Angola
Harare | Zimbabwe | Lusaka | Zambia
Harare | Zimbabwe | Madrid | Spain
Harare | Zimbabwe | Managua | Nicaragua
Harare | Zimbabwe | Manama | Bahrain
Harare | Zimbabwe | Manila | Philippines
Harare | Zimbabwe | Maputo | Mozambique
Harare | Zimbabwe | Minsk | Belarus
Harare | Zimbabwe | Mogadishu | Somalia
Harare | Zimbabwe | Monrovia | Liberia
Harare | Zimbabwe | Montevideo | Uruguay
Harare | Zimbabwe | Moscow | Russia
Harare | Zimbabwe | Muscat | Oman
Harare | Zimbabwe | Nairobi | Kenya
Harare | Zimbabwe | Nassau | Bahamas
Harare | Zimbabwe | Niamey | Niger
Harare | Zimbabwe | Nicosia | Cyprus
Harare | Zimbabwe | Nouakchott | Mauritania
Harare | Zimbabwe | Nuuk | Greenland
Harare | Zimbabwe | Oslo | Norway
Harare | Zimbabwe | Ottawa | Canada
Harare | Zimbabwe | Paramaribo | Suriname
Havana | Cuba | Helsinki | Finland
Havana | Cuba | Islamabad | Pakistan
Havana | Cuba | Jakarta | Indonesia
Havana | Cuba | Kabul | Afghanistan
Havana | Cuba | Kampala | Uganda
Havana | Cuba | Kathmandu | Nepal
Havana | Cuba | Khartoum | Sudan
Havana | Cuba | Kiev | Ukraine
Havana | Cuba | Kigali | Rwanda
Havana | Cuba | Kingston | Jamaica
Havana | Cuba | Libreville | Gabon
Havana | Cuba | Lilongwe | Malawi
Havana | Cuba | Lima | Peru
Havana | Cuba | Lisbon | Portugal
Havana | Cuba | Ljubljana | Slovenia
Havana | Cuba | London | England
Havana | Cuba | Luanda | Angola
Havana | Cuba | Lusaka | Zambia
Havana | Cuba | Madrid | Spain
Havana | Cuba | Managua | Nicaragua
Havana | Cuba | Manama | Bahrain
Havana | Cuba | Manila | Philippines
Havana | Cuba | Maputo | Mozambique
Havana | Cuba | Minsk | Belarus
Havana | Cuba | Mogadishu | Somalia
Havana | Cuba | Monrovia | Liberia
Havana | Cuba | Montevideo | Uruguay
Havana | Cuba | Moscow | Russia
Havana | Cuba | Muscat | Oman
Havana | Cuba | Nairobi | Kenya
Havana | Cuba | Nassau | Bahamas
Havana | Cuba | Niamey | Niger
Havana | Cuba | Nicosia | Cyprus
Havana | Cuba | Nouakchott | Mauritania
Havana | Cuba | Nuuk | Greenland
Havana | Cuba | Oslo | Norway
Havana | Cuba | Ottawa | Canada
Havana | Cuba | Paramaribo | Suriname
Havana | Cuba | Paris | France
Helsinki | Finland | Islamabad | Pakistan
Helsinki | Finland | Jakarta | Indonesia
Helsinki | Finland | Kabul | Afghanistan
Helsinki | Finland | Kampala | Uganda
Helsinki | Finland | Kathmandu | Nepal
Helsinki | Finland | Khartoum | Sudan
Helsinki | Finland | Kiev | Ukraine
Helsinki | Finland | Kigali | Rwanda
Helsinki | Finland | Kingston | Jamaica
Helsinki | Finland | Libreville | Gabon
Helsinki | Finland | Lilongwe | Malawi
Helsinki | Finland | Lima | Peru
Helsinki | Finland | Lisbon | Portugal
Helsinki | Finland | Ljubljana | Slovenia
Helsinki | Finland | London | England
Helsinki | Finland | Luanda | Angola
Helsinki | Finland | Lusaka | Zambia
Helsinki | Finland | Madrid | Spain
Helsinki | Finland | Managua | Nicaragua
Helsinki | Finland | Manama | Bahrain
Helsinki | Finland | Manila | Philippines
Helsinki | Finland | Maputo | Mozambique
Helsinki | Finland | Minsk | Belarus
Helsinki | Finland | Mogadishu | Somalia
Helsinki | Finland | Monrovia | Liberia
Helsinki | Finland | Montevideo | Uruguay
Helsinki | Finland | Moscow | Russia
Helsinki | Finland | Muscat | Oman
Helsinki | Finland | Nairobi | Kenya
Helsinki | Finland | Nassau | Bahamas
Helsinki | Finland | Niamey | Niger
Helsinki | Finland | Nicosia | Cyprus
Helsinki | Finland | Nouakchott | Mauritania
Helsinki | Finland | Nuuk | Greenland
Helsinki | Finland | Oslo | Norway
Helsinki | Finland | Ottawa | Canada
Helsinki | Finland | Paramaribo | Suriname
Helsinki | Finland | Paris | France
Helsinki | Finland | Podgorica | Montenegro
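Each row above has the form `a | b | c | d`, read as "a is to b as c is to d" (here, capital to country). The following is a minimal sketch of reading such rows, assuming the `|` separator seen in this preview. `parse_analogy_line` and `parse_analogies` are hypothetical helpers written for illustration, not functions from the repository.

```python
from typing import List, Optional, Tuple

# One analogy row: (a, b, c, d), meaning "a is to b as c is to d".
Analogy = Tuple[str, str, str, str]


def parse_analogy_line(line: str, sep: str = "|") -> Optional[Analogy]:
    """Split one 'a | b | c | d' row into four stripped fields.

    Returns None for blank or malformed rows instead of raising.
    (Hypothetical helper; the separator is assumed from the preview.)
    """
    parts = [p.strip() for p in line.split(sep)]
    if len(parts) != 4 or not all(parts):
        return None
    return (parts[0], parts[1], parts[2], parts[3])


def parse_analogies(lines: List[str]) -> List[Analogy]:
    """Parse many rows, skipping any that do not have four non-empty fields."""
    parsed = (parse_analogy_line(line) for line in lines)
    return [row for row in parsed if row is not None]


if __name__ == "__main__":
    sample = [
        "Caracas | Venezuela | Hanoi | Vietnam",
        "Helsinki | Finland | Paris | France",
        "",
    ]
    for a, b, c, d in parse_analogies(sample):
        print(f"{a} : {b} :: {c} : {d}")
```

Returning `None` for malformed rows keeps the caller in control of whether to skip, count, or report them.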
SYMBOL INDEX (91 symbols across 15 files)
FILE: src/codes/api/embedding_evaluator.py
class new_Word2VecKeyedVectors (line 14) | class new_Word2VecKeyedVectors(Word2VecKeyedVectors):
method __init__ (line 15) | def __init__(self, vector_size):
method most_similar (line 18) | def most_similar(self, positive=None, negative=None, topn=10, restrict...
method new_accuracy (line 87) | def new_accuracy(self, questions, restrict_vocab=30000, most_similar=m...
function convert_conll_format_to_normal (line 191) | def convert_conll_format_to_normal(connl_file, out_file):
function verify_word_analogies (line 228) | def verify_word_analogies(file):
function check_oov_of_word_analogies (line 252) | def check_oov_of_word_analogies(w2v_format_emb_file, analogy_file, is_vn...
function evaluator_api (line 292) | def evaluator_api(input_files, analoglist, output, embed_config=None):
FILE: src/codes/api/embedding_extractor.py
function get_multi_embedding_models (line 10) | def get_multi_embedding_models(config: EmbeddingConfigs):
function get_emb_dim (line 35) | def get_emb_dim(emb_file):
function extract_embedding_for_vocab_file (line 45) | def extract_embedding_for_vocab_file(paths_of_emb_models, vocab_words_fi...
function extract_embedding_vectors (line 91) | def extract_embedding_vectors(vocab_words_file, output_file, config: Emb...
FILE: src/codes/api/embedding_preprocessing.py
function convert_to_w2v (line 10) | def convert_to_w2v(vocab_file, embedding_file, out_file):
function test (line 54) | def test():
function load_and_save_2_word2vec_model (line 63) | def load_and_save_2_word2vec_model(input_model_path, output_model_path, ...
function load_and_save_2_word2vec_models (line 75) | def load_and_save_2_word2vec_models(input_embedding_files_str, output_em...
FILE: src/codes/api/embedding_visualizer.py
class TensorBoardTool (line 16) | class TensorBoardTool:
method __init__ (line 18) | def __init__(self, dir_path):
method run (line 21) | def run(self, emb_name, port):
function convert_multiple_emb_models_2_tf (line 33) | def convert_multiple_emb_models_2_tf(emb_name_arr, w2v_model_arr, output...
function convert_one_emb_model_2_tf (line 88) | def convert_one_emb_model_2_tf(emb_name, model, output_path, port):
function visualize_multiple_embeddings_individually (line 136) | def visualize_multiple_embeddings_individually(paths_of_emb_models):
function visualize_multiple_embeddings_all_in_one (line 174) | def visualize_multiple_embeddings_all_in_one(paths_of_emb_models, port):
function visualize_multiple_embeddings (line 216) | def visualize_multiple_embeddings(paths_of_emb_models, port):
FILE: src/codes/embeddings/embedding_configs.py
class EmbeddingConfigs (line 3) | class EmbeddingConfigs(object):
FILE: src/codes/embeddings/embedding_models.py
class Model_Constants (line 12) | class Model_Constants(object):
class Embedding_Model (line 19) | class Embedding_Model(object):
method __init__ (line 20) | def __init__(self, name, vector_dim):
method load_model (line 31) | def load_model(self, model_path):
method is_punct (line 52) | def is_punct(self, word):
method is_number (line 79) | def is_number(self, word):
method set_char_model (line 90) | def set_char_model(self, char_model):
method load_vocabs_list (line 93) | def load_vocabs_list(self, vocab_file_path):
method get_char_vector (line 102) | def get_char_vector(self, char_model, word):
method is_unknown_word (line 137) | def is_unknown_word(self, word):
method get_word_vector (line 148) | def get_word_vector(self, word):
method get_vector_of_unknown (line 222) | def get_vector_of_unknown(self, word):
class Embedding_Models (line 249) | class Embedding_Models(object):
method __init__ (line 253) | def __init__(self, list_models):
method add_model (line 256) | def add_model(self, emb_model, char_model):
method get_vector_of_document (line 274) | def get_vector_of_document(self, document):
method get_word_vector_of_multi_embeddings (line 307) | def get_word_vector_of_multi_embeddings(self, word):
FILE: src/codes/embeddings/embedding_utils.py
function reload_char2vec_model (line 6) | def reload_char2vec_model(model_path, model_dim):
function reload_embedding_models (line 12) | def reload_embedding_models(model_paths_list, model_names_list, model_di...
function save_embedding_models_tofolder (line 46) | def save_embedding_models_tofolder(dir_path, final_embeddings, reverse_d...
function save_embedding_models (line 80) | def save_embedding_models(FLAGS, final_embeddings, reverse_dictionary, v...
function reload_embeddings (line 93) | def reload_embeddings(trained_models_dir):
function create_single_utf8_file (line 106) | def create_single_utf8_file(input_dir, output_file):
FILE: src/codes/utils/emb_utils.py
function most_similar (line 19) | def most_similar(base_vector: Vector, words: List[Word]) -> List[Tuple[f...
function print_most_similar (line 29) | def print_most_similar(words: List[Word], text: str) -> None:
function read_word (line 43) | def read_word() -> str:
function find_word (line 47) | def find_word(text: str, words: List[Word]) -> Optional[Word]:
function closest_analogies_OLD (line 54) | def closest_analogies_OLD(
function closest_analogies_vectors (line 83) | def closest_analogies_vectors(
function get_avg_vector (line 121) | def get_avg_vector(word, embedding_words):
function run_paired_ttests (line 151) | def run_paired_ttests(all_map_arr, embedding_names):
function eval_word_analogy_4_all_embeddings (line 182) | def eval_word_analogy_4_all_embeddings(word_analogies_file, embedding_na...
function eval_word_analogies (line 226) | def eval_word_analogies(word_analogies_file, words: List[Word], embeddin...
function print_analogy (line 360) | def print_analogy(left2: str, left1: str, right2: str, words: List[Word]...
FILE: src/codes/utils/embedding_io.py
function save_model_to_file (line 13) | def save_model_to_file(embedding_model: List[Word], model_file_out: str):
function load_word_embeddings (line 32) | def load_word_embeddings(file_paths: str, emb_config: EmbeddingConfigs) ...
function load_word_embedding (line 54) | def load_word_embedding(file_path: str, emb_config: EmbeddingConfigs) ->...
function load_words_raw (line 88) | def load_words_raw(file_path: str, emb_config: EmbeddingConfigs) -> List...
function iter_len (line 168) | def iter_len(iter: Iterable[complex]) -> int:
function most_common_dimension (line 172) | def most_common_dimension(words: List[Word]) -> int:
function remove_duplicates (line 195) | def remove_duplicates(words: List[Word]) -> List[Word]:
function remove_stop_words (line 207) | def remove_stop_words(words: List[Word]) -> List[Word]:
FILE: src/codes/utils/eval_utils.py
function apk (line 15) | def apk(actual, predicted, k=10):
function mapk (line 50) | def mapk(actual, predicted, k=10, word_level=True):
function calc_map (line 79) | def calc_map(actual, predicted, topK=10):
function calc_map_character_level (line 108) | def calc_map_character_level(actual, predicted, topK=10):
function test_apk (line 144) | def test_apk(self):
function test_mapk (line 152) | def test_mapk(self):
FILE: src/codes/utils/file_utils.py
function save_obj (line 4) | def save_obj(obj, file_path):
function load_obj (line 9) | def load_obj(file_path):
function get_unique_vocab (line 14) | def get_unique_vocab(analogy_file_path, write_out_file):
FILE: src/codes/utils/string_utils.py
function convert_to_unicode (line 4) | def convert_to_unicode(text):
FILE: src/codes/utils/vectors.py
function l2_len (line 15) | def l2_len(v: vector_type) -> float:
function dot (line 19) | def dot(v1: vector_type, v2: vector_type) -> float:
function mean (line 24) | def mean(v1: vector_type, v2: vector_type) -> Vector:
function mean_list (line 35) | def mean_list(v1: List[Vector]) -> Vector:
function add (line 47) | def add(v1: vector_type, v2: vector_type) -> Vector:
function sub (line 52) | def sub(v1: vector_type, v2: vector_type) -> Vector:
function normalize (line 57) | def normalize(v: vector_type) -> Vector:
function cosine_similarity_normalized (line 61) | def cosine_similarity_normalized(v1: vector_type, v2: vector_type) -> fl...
FILE: src/codes/utils/word.py
class Word (line 7) | class Word:
method __init__ (line 10) | def __init__(self, text: str, vector: Vector, frequency: int) -> None:
method __repr__ (line 15) | def __repr__(self) -> str:
FILE: src/codes/visualizer/visualizer_sbs.py
function search (line 17) | def search():
function get_index (line 57) | def get_index():
function multi_search (line 62) | def multi_search():
Condensed preview of 59 files, each showing path, character count, and a content snippet.
[
{
"path": ".gitignore",
"chars": 1654,
"preview": "__pycache__/\n*.py[cod]\n*$py.class\n\n# C extensions\n*.so\n\n# Distribution / packaging\n.Python\nbuild/\ndist/\ndevelop-eggs/\ndo"
},
{
"path": "README.md",
"chars": 9698,
"preview": "ETNLP: A Toolkit for Extraction, Evaluation and Visualization of Pre-trained Word Embeddings\n=====\n\n# Table of contents\n"
},
{
"path": "src/codes/00.run_etnlp_preprocessing.sh",
"chars": 382,
"preview": "#!/bin/sh\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\nINPUT_FILES=\"../data/glove2vec_dicts/glove1.vec;../data/glove2vec_dicts/g"
},
{
"path": "src/codes/01.run_etnlp_evaluator.sh",
"chars": 453,
"preview": "#!/bin/sh\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\nINPUT_FILES=\"../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/"
},
{
"path": "src/codes/02.run_etnlp_extractor.sh",
"chars": 443,
"preview": "#!/bin/sh\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\nINPUT_FILES=\"../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/"
},
{
"path": "src/codes/03.run_etnlp_visualizer_inter.sh",
"chars": 285,
"preview": "#!/bin/sh\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\nINPUT_FILES=\"../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/"
},
{
"path": "src/codes/04.run_etnlp_visualizer_sbs.sh",
"chars": 343,
"preview": "#!/bin/sh\nexport PYTHONPATH=\"$PYTHONPATH:$PWD\"\nINPUT_FILES=\"../data/embedding_dicts/ELMO_23.vec;../data/embedding_dicts/"
},
{
"path": "src/codes/api/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/codes/api/embedding_evaluator.py",
"chars": 16905,
"preview": "import logging\nimport gensim\nimport argparse\nfrom gensim.models.keyedvectors import WordEmbeddingsKeyedVectors, Word2Vec"
},
{
"path": "src/codes/api/embedding_extractor.py",
"chars": 5528,
"preview": "from embeddings import embedding_utils\nfrom pathlib import Path\nimport numpy as np\nimport os\nimport logging\nimport gzip\n"
},
{
"path": "src/codes/api/embedding_preprocessing.py",
"chars": 4597,
"preview": "# Convert to a standard word2vec format\n\nimport gensim\nfrom utils import embedding_io\nimport sys\nfrom threading import T"
},
{
"path": "src/codes/api/embedding_visualizer.py",
"chars": 8237,
"preview": "# 1. Read embedding file\n# 2. Convert to tensorboard\n# 3. Visualize\n\n# encoding: utf-8\nimport sys, os\nimport gensim\nimpo"
},
{
"path": "src/codes/embeddings/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/codes/embeddings/embedding_configs.py",
"chars": 307,
"preview": "\n\nclass EmbeddingConfigs(object):\n \"\"\"\n Configuration information\n \"\"\"\n is_word2vec_format = True\n do"
},
{
"path": "src/codes/embeddings/embedding_models.py",
"chars": 11661,
"preview": "from gensim.models import KeyedVectors as Word2Vec\nimport numpy as np\nfrom embeddings import embedding_utils\nfrom utils "
},
{
"path": "src/codes/embeddings/embedding_utils.py",
"chars": 4261,
"preview": "import os\nfrom utils import file_utils\nfrom embeddings.embedding_models import Embedding_Model, Embedding_Models\n\n\ndef r"
},
{
"path": "src/codes/etnlp_api.py",
"chars": 4388,
"preview": "import argparse\nfrom api import embedding_preprocessing, embedding_evaluator, embedding_extractor, embedding_visualizer\n"
},
{
"path": "src/codes/requirements.txt",
"chars": 159,
"preview": "gensim==3.4.0\nscipy==1.1.0\nsix==1.12.0\nsetuptools==40.6.2\ntensorflow==1.12.0\nFlask==1.0.2\ntensorboard==1.12.0\nnumpy==1.1"
},
{
"path": "src/codes/setup.py",
"chars": 969,
"preview": "from setuptools import setup, find_packages\nfrom etnlp_api import __version__\n\n\nwith open(\"../../README.md\", \"r\") as fh:"
},
{
"path": "src/codes/utils/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/codes/utils/emb_utils.py",
"chars": 13702,
"preview": "from sklearn.metrics.pairwise import cosine_similarity\nfrom typing import Any, Iterable, List, Optional, Set, Tuple\n\nfro"
},
{
"path": "src/codes/utils/embedding_io.py",
"chars": 7221,
"preview": "from typing import Iterable, List, Set\n\nfrom itertools import groupby\nimport numpy as np\nimport re\nimport utils.vectors "
},
{
"path": "src/codes/utils/eval_utils.py",
"chars": 4984,
"preview": "\n\"\"\"\nMAP@K word level and character level are explained in detail in this paper:\n\ndpUGC: Learn Differentially Private Re"
},
{
"path": "src/codes/utils/file_utils.py",
"chars": 1107,
"preview": "import pickle\n\n\ndef save_obj(obj, file_path):\n with open(file_path + '.pkl', 'wb') as f:\n pickle.dump(obj, f, "
},
{
"path": "src/codes/utils/string_utils.py",
"chars": 707,
"preview": "import six\n\n\ndef convert_to_unicode(text):\n \"\"\"Converts `text` to Unicode (if it's not already), assuming utf-8 input"
},
{
"path": "src/codes/utils/vectors.py",
"chars": 1492,
"preview": "from typing import List, Any, Optional\n\nimport math\nimport numpy as np\n\n# Adopt from https://github.com/mkonicek/nlp/vec"
},
{
"path": "src/codes/utils/word.py",
"chars": 568,
"preview": "from typing import List\nfrom utils.vectors import Vector\n\n# Adopt from https://github.com/mkonicek/nlp/Word.py\n\n\nclass W"
},
{
"path": "src/codes/visualizer/README.md",
"chars": 383,
"preview": "# Requirements:\n- ```pip install gensim flask```\n- Download any pre-trained embeddings and put it into ../03.run_etnlp_v"
},
{
"path": "src/codes/visualizer/__init__.py",
"chars": 0,
"preview": ""
},
{
"path": "src/codes/visualizer/outof_w2vec.dict",
"chars": 298,
"preview": "\n\n\n\n\n\n\n\n\n\n'news' \n\n\n\n\n\n'news' \n\n\n\n\n\n'news' \n\n\n\n\n\n'news' \n\n\n\n\n\n'news' \n\n\n\n\n\n\n'news' \n\n\n\n\n\n'news' \n\n'back' \n'back' \n'back'"
},
{
"path": "src/codes/visualizer/static/style.css",
"chars": 1543,
"preview": ".container-4{\n overflow: hidden;\n width: 300px;\n vertical-align: middle;\n white-space: nowrap;\n}\n\n.container-4 input"
},
{
"path": "src/codes/visualizer/templates/app.html",
"chars": 480,
"preview": "<!DOCTYPE html>\n<html lang=\"en\">\n<head>\n <link rel=\"stylesheet\" type=\"text/css\" href=\"static/style.css\">\n <meta ch"
},
{
"path": "src/codes/visualizer/templates/search.html",
"chars": 1686,
"preview": "{% block content %}\n<div class=\"search\">\n\n<div class=\"container\">\n<form action=\"/search\" method=\"post\" role=\"form\">\n<div"
},
{
"path": "src/codes/visualizer/visualizer_sbs.py",
"chars": 3307,
"preview": "from flask import Flask, render_template\nfrom flask import request\nimport gensim\nfrom distutils.version import LooseVers"
},
{
"path": "src/data/embedding_analogies/english/english-word-analogy.txt",
"chars": 721247,
"preview": ": | capital-common-countries\nAthens | Greece | Baghdad | Iraq\nAthens | Greece | Bangkok | Thailand\nAthens | Greece | Bei"
},
{
"path": "src/data/embedding_analogies/portuguese/LX-4WAnalogies-ETNLP.txt",
"chars": 684462,
"preview": ": | capital-common-countries\nAtenas | Grécia | Bagdade | Iraque\nAtenas | Grécia | Banguecoque | Tailândia\nAtenas | Gréci"
},
{
"path": "src/data/embedding_analogies/portuguese/LX-4WAnalogies.txt",
"chars": 579086,
"preview": ": capital-common-countries\nAtenas Grécia Bagdade Iraque\nAtenas Grécia Banguecoque Tailândia\nAtenas Grécia Pequim China\nA"
},
{
"path": "src/data/embedding_analogies/portuguese/POST_TAG_vocabulary.txt",
"chars": 1428388,
"preview": "*rare*\nbloqueou\nfundation\npaidéia\nxodó\ngainsbourg\nmiudamente\nbenedikt\nimplantando\npaleolíticas\nfluorita\ndeclaram\nbalada\n"
},
{
"path": "src/data/embedding_analogies/portuguese/evaluator_results.txt",
"chars": 1159516,
"preview": ": | Word Analogy Task results\n\nEmbedding: glove_s300_analogy_POSTAG_vocab.vec\n\n[{'section': '| capital-common-countries'"
},
{
"path": "src/data/embedding_analogies/portuguese/vocab.txt",
"chars": 6948,
"preview": "Atenas\nGrécia\nBagdade\nIraque\nBanguecoque\nTailândia\nPequim\nChina\nBerlim\nAlemanha\nBerna\nSuíça\nCairo\nEgito\nCamberra\nAustrál"
},
{
"path": "src/data/embedding_analogies/vi/Multi_evaluator_results.txt",
"chars": 30,
"preview": ": | Word Analogy Task results\n"
},
{
"path": "src/data/embedding_analogies/vi/analogy_list_vi_ner.txt",
"chars": 125189,
"preview": ": | countries\nViên | Áo | Berlin | Đức\ncậu bé | cô gái | anh_trai | em_gái\ncậu bé | cô gái | anh_em | chị_em gái\ncậu bé "
},
{
"path": "src/data/embedding_analogies/vi/elmo_results_out_dict.txt",
"chars": 218938,
"preview": ": | Word Analogy Task results\nEmbedding: fastText_wiki_lowercase_300_NER.vec\n[{'section': '| countries', 'correct': [('c"
},
{
"path": "src/data/embedding_dicts/C2V.vec",
"chars": 4914691,
"preview": "1743 300\n̂ 0.010822 -0.026347 -0.015967 -0.015494 0.005062 0.028075 -0.007074 -0.007489 -0.009271 -0.031056 0.012506 0.0"
},
{
"path": "src/data/embedding_dicts/ELMO_23.vec",
"chars": 277537,
"preview": "23 1024\ntôi 0.027776154 0.02129553 -0.030680764 -0.01855098 -0.013765394 -0.017949548 -0.03710915 -0.010346569 0.0076523"
},
{
"path": "src/data/embedding_dicts/FastText_23.vec",
"chars": 88620,
"preview": "23 300\ntôi 0.01981081 0.020889733 -0.03780313 -0.0005338048 -0.044294827 0.14188156 0.006977781 -0.047029607 0.061489426"
},
{
"path": "src/data/embedding_dicts/MULTI_23.vec",
"chars": 434405,
"preview": "23 1624\ntôi 0.0025431418 0.0026816446 -0.004852841 -6.852528e-05 -0.00568619 0.018213535 0.0008957476 -0.0060372576 0.00"
},
{
"path": "src/data/embedding_dicts/W2V_C2V_23.vec",
"chars": 88576,
"preview": "23 300\ntôi 0.07352853 0.08678736 0.0036432447 -0.07279798 0.15125743 0.010289791 0.019780997 -0.1430503 -0.027089203 -0."
},
{
"path": "src/data/embedding_dicts/baomoi_c2v_dims_300.vec",
"chars": 1028529,
"preview": "367 300\n< -0.003531 0.04465 -0.062746 0.02843 0.035368 -0.126503 0.011095 0.044803 0.149568 0.005739 0.032506 0.006824 -"
},
{
"path": "src/data/embedding_dicts/vn_elmo_medium_c2v.vec",
"chars": 1821561,
"preview": "194 1024\n2 1.965785 1.67536 1.894273 1.703323 2.555757 3.085151 1.985448 1.962525 1.874219 1.813196 2.503823 1.641451 2."
},
{
"path": "src/data/glove2vec_dicts/glove1.vec",
"chars": 154,
"preview": "word10 0.123 0.134 0.532 0.152\nword20 0.934 0.412 0.532 0.159\nword30 0.334 0.241 0.324 0.188\nword90 0.334 0.241 0.324 0."
},
{
"path": "src/data/glove2vec_dicts/glove1_w2v.vec",
"chars": 159,
"preview": "5 4\nword10 0.123 0.134 0.532 0.152\nword20 0.934 0.412 0.532 0.159\nword30 0.334 0.241 0.324 0.188\nword90 0.334 0.241 0.32"
},
{
"path": "src/data/glove2vec_dicts/glove2.vec",
"chars": 119,
"preview": "word1 0.123 0.134 0.532 0.152\nword2 0.934 0.412 0.532 0.159\nword3 0.334 0.241 0.324 0.188\nword9 0.334 0.241 0.324 0.188"
},
{
"path": "src/data/glove2vec_dicts/glove2_w2v.vec",
"chars": 124,
"preview": "4 4\nword1 0.123 0.134 0.532 0.152\nword2 0.934 0.412 0.532 0.159\nword3 0.334 0.241 0.324 0.188\nword9 0.334 0.241 0.324 0."
},
{
"path": "src/data/vocab.txt",
"chars": 136,
"preview": "tôi\nyêu\nhà_nội\nghét\nem\niphone\nthích\nhận\nđắm_say\nđẹp\ngiận\nđà_nẵng\ncậu\nbé\ncô\ngái\nanh_trai\nem_gái\nngười \nđàn_ông\nphụ_nữ\nhoà"
},
{
"path": "src/examples/test1_etnlp_preprocessing.py",
"chars": 606,
"preview": "from etnlp_api import embedding_preprocessing as emb_prep\nfrom etnlp_api import embedding_config\n\nINPUT_FILES=\"../data/g"
},
{
"path": "src/examples/test2_etnlp_extractor.py",
"chars": 974,
"preview": "from etnlp_api import embedding_config\nfrom etnlp_api import embedding_extractor\n\n\nemb1 = \"<point_to_your_downloaded_fil"
},
{
"path": "src/examples/test3_etnlp_evaluator.py",
"chars": 579,
"preview": "from etnlp_api import embedding_evaluator\nimport os\nimport tensorflow as tf\nos.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'\ntf."
},
{
"path": "src/examples/test4_etnlp_visualizer.py",
"chars": 350,
"preview": "# from etnlp_api import embedding_config\nfrom etnlp_api import embedding_visualizer\n\nINPUT_FILES = \"../data/embedding_di"
}
]
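The previews for `src/data/glove2vec_dicts/` show the two text layouts the preprocessing step deals with: `glove2.vec` has one word per line followed by its vector components, while `glove2_w2v.vec` carries the same rows under a `4 4` header (vocabulary size, then dimension). The sketch below illustrates that relationship in plain Python. It is not the repository's implementation; the function names `glove_to_word2vec_text` and `parse_word2vec_text` and the constant `GLOVE_SAMPLE` are hypothetical, and the sample rows are copied from the `glove2.vec` preview.

```python
from typing import Dict, List

# Rows copied from the glove2.vec preview (headerless GloVe-style text).
GLOVE_SAMPLE = (
    "word1 0.123 0.134 0.532 0.152\n"
    "word2 0.934 0.412 0.532 0.159\n"
    "word3 0.334 0.241 0.324 0.188\n"
    "word9 0.334 0.241 0.324 0.188"
)


def glove_to_word2vec_text(glove_text: str) -> str:
    """Prepend the '<vocab_size> <dim>' header used by the word2vec text layout."""
    rows = [line for line in glove_text.splitlines() if line.strip()]
    if not rows:
        raise ValueError("no embedding rows found")
    # The first token on a row is the word; the remaining tokens are components.
    dim = len(rows[0].split()) - 1
    return f"{len(rows)} {dim}\n" + "\n".join(rows) + "\n"


def parse_word2vec_text(text: str) -> Dict[str, List[float]]:
    """Parse word2vec-layout text into a word -> vector mapping, checking the header."""
    lines = [line for line in text.splitlines() if line.strip()]
    count, dim = (int(token) for token in lines[0].split())
    vectors: Dict[str, List[float]] = {}
    for line in lines[1:]:
        # Split from the right so a word that itself contains spaces stays intact.
        parts = line.rstrip().rsplit(" ", dim)
        if len(parts) != dim + 1:
            raise ValueError(f"expected {dim} components: {line!r}")
        vectors[parts[0]] = [float(value) for value in parts[1:]]
    if len(vectors) != count:
        raise ValueError(f"header declares {count} words, found {len(vectors)}")
    return vectors
```

Calling `parse_word2vec_text(glove_to_word2vec_text(GLOVE_SAMPLE))` yields a four-entry mapping whose header check passes, mirroring the `4 4` first line seen in the `glove2_w2v.vec` preview.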
About this extraction
This page contains the full source code of the vietnlp/etnlp GitHub repository, extracted and formatted as plain text. The extraction includes 59 files (13.1 MB), approximately 3.4M tokens, and a symbol index with 91 extracted functions, classes, methods, constants, and types.
Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.