SYMBOL INDEX (15 symbols across 3 files)

FILE: make_corpus.py
  function main (line 18) | def main(args):

FILE: tokenization.py
  class BaseTokenizer (line 4) | class BaseTokenizer(object):
    method __init__ (line 5) | def __init__(self, do_lower_case=False, preserved_pattern=None):
    method tokenize_words (line 9) | def tokenize_words(self, text):
    method tokenize (line 12) | def tokenize(self, text):
  class RegExpTokenizer (line 36) | class RegExpTokenizer(BaseTokenizer):
    method __init__ (line 37) | def __init__(self, pattern=r'\w+|\S', do_lower_case=False, preserved_p...
    method tokenize_words (line 41) | def tokenize_words(self, text):
  class NLTKTokenizer (line 46) | class NLTKTokenizer(BaseTokenizer):
    method __init__ (line 47) | def __init__(self, do_lower_case=False, preserved_pattern=None):
    method tokenize_words (line 52) | def tokenize_words(self, text):
  class MeCabTokenizer (line 57) | class MeCabTokenizer(BaseTokenizer):
    method __init__ (line 58) | def __init__(self, mecab_option='', do_lower_case=False, preserved_pat...
    method tokenize_words (line 64) | def tokenize_words(self, text):

FILE: train.py
  function main (line 16) | def main(args):