Repository: brightmart/roberta_zh
Branch: master
Commit: 438476f7da16
Files: 10
Total size: 228.5 KB

Directory structure:
gitextract_gfbv6jnd/
├── README.md
├── create_pretrain_data.sh
├── create_pretraining_data.py
├── modeling.py
├── optimization.py
├── optimization_finetuning.py
├── resources/
│   └── vocab.txt
├── run_classifier.py
├── run_pretraining.py
└── tokenization.py

================================================
FILE CONTENTS
================================================

================================================
FILE: README.md
================================================
RoBERTa for Chinese, TensorFlow & PyTorch

Pre-trained RoBERTa models for Chinese
-------------------------------------------------
RoBERTa is an improved version of BERT. It reaches state-of-the-art results by changing the training task and the way training data is generated, training longer, using larger batches, and using more data. The models can be loaded directly with BERT code. This project implements RoBERTa pre-training on a large-scale Chinese corpus in TensorFlow, and also provides PyTorch pre-trained models and loading instructions.

*** 2019-10-12: added a comparison of different models on reading comprehension ***

*** 2019-09-08: added download links hosted in China, a PyTorch version, and a preliminary comparison with bert-wwm, xlnet, and other models ***

NLP auto-labeling tool (up to 100X efficiency gain): sign-up

A pre-trained Chinese version of albert is also available now.

Pre-trained Chinese RoBERTa models: download
-------------------------------------------------
*** 6-layer RoBERTa trial version ***
RoBERTa-zh-Layer6: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT, size 200M

###### ** Recommended: RoBERTa-zh-Large (validated) **
RoBERTa-zh-Large: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT

RoBERTa-zh-Large: Google Drive or Baidu Netdisk, PyTorch version, loads directly with the PyTorch version of BERT

Training data for the 24-layer and 12-layer RoBERTa models: 30G of raw text, nearly 300 million sentences, 10 billion Chinese characters (tokens), yielding 250 million training instances. It covers news, community Q&A, several encyclopedias, and more. This project uses the same training data as XLNet_zh, the pre-trained 24-layer Chinese XLNet model.

RoBERTa_zh_L12: Google Drive or Baidu Netdisk, TensorFlow version, loads directly with BERT

RoBERTa_zh_L12: Google Drive or Baidu Netdisk, PyTorch version, loads directly with the PyTorch version of BERT

---------------------------------------------------------------
Roberta_l24_zh_base, TensorFlow version, loads directly with BERT

Training data for the 24-layer base version: 10G of text, including news, community Q&A, several encyclopedias, and more.

What is RoBERTa:
-------------------------------------------------
A robustly optimized method for pretraining natural language processing (NLP) systems that improves on Bidirectional Encoder Representations from Transformers, or BERT, the self-supervised method released by Google in 2018.

RoBERTa produces state-of-the-art results on the widely used NLP benchmark, General Language Understanding Evaluation (GLUE). The model delivered state-of-the-art performance on the MNLI, QNLI, RTE, STS-B, and RACE tasks and a sizable performance improvement on the GLUE benchmark. With a score of 88.5, RoBERTa reached the top position on the GLUE leaderboard, matching the performance of the previous leader, XLNet-Large.
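(Introduction from Facebook blog)

Release Plan
-------------------------------------------------
1. 24-layer RoBERTa model (roberta_l24_zh), trained on 30G of text: September 8

2. 12-layer RoBERTa model (roberta_l12_zh), trained on 30G of text: September 8

3. 6-layer RoBERTa model (roberta_l6_zh), trained on 30G of text: September 8

4. PyTorch version of the model (roberta_l6_zh_pytorch): September 8

5. 30G Chinese corpus in pre-training format, ready for training (bert, xlnet, gpt2): to be determined

6. Test-set evaluation and comparison: September 14

Performance
-------------------------------------------------

### Internet news sentiment analysis: CCF-Sentiment-Analysis

| Model | Online F1 |
| :------- | :---------: |
| BERT | 80.3 |
| Bert-wwm-ext | 80.5 |
| XLNet | 79.6 |
| Roberta-mid | 80.5 |
| Roberta-large (max_seq_length=512, split_num=1) | 81.25 |

Note: the results come from guoday's open-source project. For the dataset and task description, see: CCF Internet News Sentiment Analysis

### Natural language inference: XNLI

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 77.8 (77.4) | 77.8 (77.5) |
| ERNIE | 79.7 (79.4) | 78.6 (78.2) |
| BERT-wwm | 79.0 (78.4) | 78.2 (78.0) |
| BERT-wwm-ext | 79.4 (78.6) | 78.7 (78.3) |
| XLNet | 79.2 | 78.7 |
| RoBERTa-zh-base | 79.8 | 78.8 |
| **RoBERTa-zh-Large** | **80.2 (80.0)** | **79.9 (79.5)** |

Note: RoBERTa_l24_zh was run only twice, so its performance may still improve. The BERT-wwm-ext and XLNet figures are each taken from a linked external source. RoBERTa-zh-base refers to the 12-layer Chinese RoBERTa model.

### Question matching task: LCQMC (Sentence Pair Matching)

| Model | Dev | Test |
| :------- | :---------: | :---------: |
| BERT | 89.4 (88.4) | 86.9 (86.4) |
| ERNIE | 89.8 (89.6) | **87.2** (87.0) |
| BERT-wwm | 89.4 (89.2) | 87.0 (86.8) |
| BERT-wwm-ext | - | - |
| RoBERTa-zh-base | 88.7 | 87.0 |
| **RoBERTa-zh-Large** | **89.9** (89.6) | **87.2** (86.7) |
| RoBERTa-zh-Large (20w_steps) | 89.7 | 87.0 |

Note: RoBERTa_l24_zh was run only twice, so its performance may still improve. The number of training epochs was kept consistent with the paper.

### Reading comprehension tests
For reading comprehension tasks, the best hyperparameters found so far for both bert and roberta are epoch=2, batch=32, lr=3e-5, warmup=0.1

#### cmrc2018 (reading comprehension)

| models | DEV |
| ------ | ------ |
| sibert_base | F1:87.521(88.628) EM:67.381(69.152) |
| sialbert_middle | F1:87.6956(87.878) EM:67.897(68.624) |
| HIT-iFLYTEK roberta_wwm_ext_base | F1:87.521(88.628) EM:67.381(69.152) |
| brightmart roberta_middle | F1:86.841(87.242) EM:67.195(68.313) |
| brightmart roberta_large | **F1:88.608(89.431) EM:69.935(72.538)** |

#### DRCD (reading comprehension)

| models | DEV |
| ------ | ------ |
| siBert_base | F1:93.343(93.524) EM:87.968(88.28) |
| siALBert_middle | F1:93.865(93.975) EM:88.723(88.961) |
| HIT-iFLYTEK roberta_wwm_ext_base | F1:94.257(94.48) EM:89.291(89.642) |
| brightmart roberta_large | **F1:94.933(95.057) EM:90.113(90.238)** |

#### CJRC (reading comprehension with yes, no, and unknown answers)

| models | DEV |
| ------ | ------ |
| siBert_base | F1:80.714(81.14) EM:64.44(65.04) |
| siALBert_middle | F1:80.9838(81.299) EM:63.796(64.202) |
| HIT-iFLYTEK roberta_wwm_ext_base | F1:81.510(81.684) EM:64.924(65.574) |
| brightmart roberta_large | F1:80.16(80.475) EM:65.249(66.133) |

The reading comprehension comparison data comes from bert_cn_finetune

Entries marked ? will be updated with actual values soon.

RoBERTa Chinese Version
-------------------------------------------------
The Chinese pre-trained RoBERTa models in this project are strictly models trained following the main ideas of the RoBERTa paper. This includes:

1. Improved data generation and task: next sentence prediction is removed, and data is drawn contiguously from a single document (see: Model Input Format and Next Sentence Prediction, DOC-SENTENCES)

2. Larger and more diverse data: trained on 30G of Chinese text containing 300 million sentences and 10 billion characters (i.e. tokens). The data comes from news, community discussions, and several encyclopedias, spanning hundreds of thousands of topics, so it is diverse (for even more diversity, web books, novels, story-style literature, microblogs, and so on could be added).

3. Longer training: nearly 200k steps in total, seeing nearly 1.6 billion training instances. Training took 24 hours on a Cloud TPU v3-256, equivalent to about one month on a TPU v3-8 (128G memory).

4. Larger batches: a very large batch size (8k) is used.

5. Adjusted optimizer and other hyperparameters.

In addition, the Chinese version in this project uses whole word masking. With whole word masking, if some of the WordPiece subwords of a complete word are masked, the remaining parts of the same word are masked as well.

This project does not implement dynamic masking directly. Instead, each training sample is duplicated into several copies, each copy with a different mask; increasing the number of copies gives an effect close to dynamic masking.

##### Instructions for Use
The models in this project are currently trained with a sequence length of 256, so they are likely to work well for inputs within that length. If the inputs for your task are longer (for example, sequence length 512), results may suffer. Some users have split sequences with a sliding window and still obtained good results, see #issue-16

##### Chinese Whole Word Mask

| Description | Example |
| :------- | :--------- |
| Original text | 使用语言模型来预测下一个词的probability。 |
| Word-segmented text | 使用 语言 模型 来 预测 下 一个 词 的 probability 。 |
| Original mask input | 使 用 语 言 [MASK] 型 来 [MASK] 测 下 一 个 词 的 pro [MASK] ##lity 。 |
| Whole word mask input | 使 用 语 言 [MASK] [MASK] 来 [MASK] [MASK] 下 一 个 词 的 [MASK] [MASK] [MASK] 。 |

Model loading (using the sentence pair matching task LCQMC as an example)
-------------------------------------------------

Download the LCQMC dataset, which contains training, validation, and test sets. The training set contains 240k Chinese sentence pairs written in colloquial language, labeled 1 or 0: 1 means the two sentences are semantically similar, 0 means they are not.

TensorFlow version: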

1. Clone this project:

    git clone https://github.com/brightmart/roberta_zh

2. Enter the project directory (roberta_zh).

Assume you have downloaded the RoBERTa pre-trained model and unpacked it into the roberta_zh_large directory of this project, i.e. roberta_zh/roberta_zh_large

Run the commands:

    export BERT_BASE_DIR=./roberta_zh_large
    export MY_DATA_DIR=./data/lcqmc
    python run_classifier.py \
      --task_name=lcqmc_pair \
      --do_train=true \
      --do_eval=true \
      --data_dir=$MY_DATA_DIR \
      --vocab_file=$BERT_BASE_DIR/vocab.txt \
      --bert_config_file=$BERT_BASE_DIR/bert_config_large.json \
      --init_checkpoint=$BERT_BASE_DIR/roberta_zh_large_model.ckpt \
      --max_seq_length=128 \
      --train_batch_size=64 \
      --learning_rate=2e-5 \
      --num_train_epochs=3 \
      --output_dir=./checkpoint_lcqmc

Note: task_name is lcqmc_pair. A processor has already been added to run_classifier.py and registered in processors; it selects the lcqmc task and loads the training and validation data.

For loading in PyTorch, refer to issue 9 for now; more specific instructions will be provided soon.

Pre-training
-------------------------------------------------
#### 1) Data for pre-training
You can train on data from the domain of your task, or filter a general corpus for the portion related to your domain and train on that.

For general corpus data, see nlp_chinese_corpus: a collection of datasets, several of which contain tens of millions of sentences.

#### 2) Generate data for pre-training
This step draws data contiguously from a single document, following the DOC-SENTENCES format, and applies whole word masking.

Shell script: converts multiple txt files to tfrecord data in batch. For example, to convert txt files 1 through 10 to tfrecords files:

    nohup bash create_pretrain_data.sh 1 10 &

Note: in our experiments, whole word masking at a 15% rate made the task hard for the model to learn and convergence was difficult, so we used a 10% rate.

#### 3) Run the pre-training command
The next sentence prediction task is removed.

    export BERT_BASE_DIR=
    nohup python3 run_pretraining.py --input_file=./tf_records_all/tf*.tfrecord \
    --output_dir=my_new_model_path --do_train=True --do_eval=True --bert_config_file=$BERT_BASE_DIR/bert_config.json \
    --train_batch_size=8192 --max_seq_length=256 --max_predictions_per_seq=23 \
    --num_train_steps=200000 --num_warmup_steps=10000 --learning_rate=1e-4 \
    --save_checkpoints_steps=3000 --init_checkpoint=$BERT_BASE_DIR/bert_model.ckpt &

Note: if you train from scratch, you do not need to specify init_checkpoint. If you continue training from an existing model, set BERT_BASE_DIR (left empty above) to its path and make sure the bert_config_file and init_checkpoint arguments point to existing files. Domain-specific pre-training does not need to run for very long.

Learning Curve
-------------------------------------------------
[Figure: learning curve (image not included)]
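
The flags in the pre-training command of step 3 can be cross-checked with a little arithmetic. The sketch below is illustrative only: the helper names `instances_seen` and `num_to_predict` are hypothetical and not part of this repository, and the masking formula is copied from `create_masked_lm_predictions` in `create_pretraining_data.py`. Under those assumptions, 200,000 steps at a batch size of 8,192 works out to about 1.6 billion instances, in line with the figure quoted in the README, and a full 256-token sequence at a 10% masking rate would ask for 26 predictions, which `max_predictions_per_seq=23` then caps at 23.

```python
def instances_seen(num_train_steps, train_batch_size):
    """Total training instances processed, assuming one full batch per step."""
    return num_train_steps * train_batch_size


def num_to_predict(num_tokens, masked_lm_prob, max_predictions_per_seq):
    """Masked positions per sequence (same formula as create_masked_lm_predictions)."""
    return min(max_predictions_per_seq,
               max(1, int(round(num_tokens * masked_lm_prob))))


if __name__ == "__main__":
    # Values taken from the pre-training command in step 3.
    print(instances_seen(200000, 8192))    # 1638400000, about 1.6 billion
    print(num_to_predict(256, 0.10, 23))   # 23 (26 before the cap)
    print(10000 / 200000)                  # 0.05: warmup covers 5% of the steps
```

If the cap is meant to match the masking rate, either value could be adjusted; the repository's own settings are kept here as given.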

Memory requirements: trade-off between batch size and sequence length
-------------------------------------------------

System       | Seq Length | Max Batch Size
------------ | ---------- | --------------
`RoBERTa-Base` | 64 | 64
... | 128 | 32
... | 256 | 16
... | 320 | 14
... | 384 | 12
... | 512 | 6
`RoBERTa-Large` | 64 | 12
... | 128 | 6
... | 256 | 2
... | 320 | 1
... | 384 | 0
... | 512 | 0

#### QQ group for technical exchange and discussion: 836811304

If you have any questions, you can raise an issue or send me an email: brightmart@hotmail.com

You can also send a pull request to report your performance on your task, or to add methods for loading the models in PyTorch and so on.

If you have ideas for building a better-performing pre-trained Chinese model, please also let me know.

Please report the accuracy on your task and how it compares with other models.

Project contributors also include:
-------------------------------------------------
skyhawk1990

##### Research supported with Cloud TPUs from Google's TensorFlow Research Cloud (TFRC)

Reference
-------------------------------------------------
1. RoBERTa: A Robustly Optimized BERT Pretraining Approach

2. Pre-Training with Whole Word Masking for Chinese BERT

3. BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

4. LCQMC: A Large-scale Chinese Question Matching Corpus


================================================
FILE: create_pretrain_data.sh
================================================
#!/usr/bin/env bash
# Usage: bash create_pretrain_data.sh <first_index> <last_index>
echo $1,$2

for((i=$1;i<=$2;i++));
do
python3 create_pretraining_data.py --do_whole_word_mask=True --input_file=./raw_text/news2016zh_$i.txt \
--output_file=./tf_records_all/tf_news2016zh_$i.tfrecord --vocab_file=./resources/vocab.txt \
--do_lower_case=True --max_seq_length=256 --max_predictions_per_seq=23 --masked_lm_prob=0.10 --random_seed=12345 --dupe_factor=5
done


================================================
FILE: create_pretraining_data.py
================================================
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Create masked LM/next sentence masked_lm TF examples for BERT."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections
import random
import re

import tokenization
import tensorflow as tf
import jieba

flags = tf.flags

FLAGS = flags.FLAGS

flags.DEFINE_string("input_file", None,
                    "Input raw text file (or comma-separated list of files).")

flags.DEFINE_string(
    "output_file", None,
    "Output TF example file (or comma-separated list of files).")

flags.DEFINE_string("vocab_file", None,
                    "The vocabulary file that the BERT model was trained on.")

flags.DEFINE_bool(
    "do_lower_case", True,
    "Whether to lower case the input text. Should be True for uncased "
    "models and False for cased models.")

flags.DEFINE_bool(
    "do_whole_word_mask", False,
    "Whether to use whole word masking rather than per-WordPiece masking.")

flags.DEFINE_integer("max_seq_length", 128, "Maximum sequence length.")

flags.DEFINE_integer("max_predictions_per_seq", 20,
                     "Maximum number of masked LM predictions per sequence.")

flags.DEFINE_integer("random_seed", 12345, "Random seed for data generation.")

flags.DEFINE_integer(
    "dupe_factor", 10,
    "Number of times to duplicate the input data (with different masks).")

flags.DEFINE_float("masked_lm_prob", 0.15, "Masked LM probability.")

flags.DEFINE_float(
    "short_seq_prob", 0.1,
    "Probability of creating sequences which are shorter than the "
    "maximum length.")


class TrainingInstance(object):
  """A single training instance (sentence pair)."""

  def __init__(self, tokens, segment_ids, masked_lm_positions, masked_lm_labels,
               is_random_next):
    self.tokens = tokens
    self.segment_ids = segment_ids
    self.is_random_next = is_random_next
    self.masked_lm_positions = masked_lm_positions
    self.masked_lm_labels = masked_lm_labels

  def __str__(self):
    s = ""
    s += "tokens: %s\n" % (" ".join(
        [tokenization.printable_text(x) for x in self.tokens]))
    s += "segment_ids: %s\n" % (" ".join([str(x) for x in self.segment_ids]))
    s += "is_random_next: %s\n" % self.is_random_next
    s += "masked_lm_positions: %s\n" % (" ".join(
        [str(x) for x in self.masked_lm_positions]))
    s += "masked_lm_labels: %s\n" % (" ".join(
        [tokenization.printable_text(x) for x in self.masked_lm_labels]))
    s += "\n"
    return s

  def __repr__(self):
    return self.__str__()


def write_instance_to_example_files(instances, tokenizer, max_seq_length,
                                    max_predictions_per_seq, output_files):
  """Create TF example files from `TrainingInstance`s."""
  writers = []
  for output_file in output_files:
    writers.append(tf.python_io.TFRecordWriter(output_file))

  writer_index = 0

  total_written = 0
  for (inst_index, instance) in enumerate(instances):
    input_ids = tokenizer.convert_tokens_to_ids(instance.tokens)
    input_mask = [1] * len(input_ids)
    segment_ids = list(instance.segment_ids)
    assert len(input_ids) <= max_seq_length

    # Zero-pad up to the maximum sequence length.
    while len(input_ids) < max_seq_length:
      input_ids.append(0)
      input_mask.append(0)
      segment_ids.append(0)

    assert len(input_ids) == max_seq_length
    assert len(input_mask) == max_seq_length
    # print("length of segment_ids:",len(segment_ids),"max_seq_length:", max_seq_length)
    assert len(segment_ids) == max_seq_length

    masked_lm_positions = list(instance.masked_lm_positions)
    masked_lm_ids = tokenizer.convert_tokens_to_ids(instance.masked_lm_labels)
    masked_lm_weights = [1.0] * len(masked_lm_ids)

    # Padded prediction slots get weight 0.0 so they do not affect the loss.
    while len(masked_lm_positions) < max_predictions_per_seq:
      masked_lm_positions.append(0)
      masked_lm_ids.append(0)
      masked_lm_weights.append(0.0)

    next_sentence_label = 1 if instance.is_random_next else 0

    features = collections.OrderedDict()
    features["input_ids"] = create_int_feature(input_ids)
    features["input_mask"] = create_int_feature(input_mask)
    features["segment_ids"] = create_int_feature(segment_ids)
    features["masked_lm_positions"] = create_int_feature(masked_lm_positions)
    features["masked_lm_ids"] = create_int_feature(masked_lm_ids)
    features["masked_lm_weights"] = create_float_feature(masked_lm_weights)
    features["next_sentence_labels"] = create_int_feature([next_sentence_label])

    tf_example = tf.train.Example(features=tf.train.Features(feature=features))

    # Distribute examples round-robin across the output files.
    writers[writer_index].write(tf_example.SerializeToString())
    writer_index = (writer_index + 1) % len(writers)

    total_written += 1

    if inst_index < 20:
      tf.logging.info("*** Example ***")
      tf.logging.info("tokens: %s" % " ".join(
          [tokenization.printable_text(x) for x in instance.tokens]))

      for feature_name in features.keys():
        feature = features[feature_name]
        values = []
        if feature.int64_list.value:
          values = feature.int64_list.value
        elif feature.float_list.value:
          values = feature.float_list.value
        tf.logging.info(
            "%s: %s" % (feature_name, " ".join([str(x) for x in values])))

  for writer in writers:
    writer.close()

  tf.logging.info("Wrote %d total instances", total_written)


def create_int_feature(values):
  feature = tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))
  return feature


def create_float_feature(values):
  feature = tf.train.Feature(float_list=tf.train.FloatList(value=list(values)))
  return feature


def create_training_instances(input_files, tokenizer, max_seq_length,
                              dupe_factor, short_seq_prob, masked_lm_prob,
                              max_predictions_per_seq, rng):
  """Create `TrainingInstance`s from raw text."""
  all_documents = [[]]

  # Input file format:
  # (1) One sentence per line. These should ideally be actual sentences, not
  # entire paragraphs or arbitrary spans of text. (Because we use the
  # sentence boundaries for the "next sentence prediction" task).
  # (2) Blank lines between documents. Document boundaries are needed so
  # that the "next sentence prediction" task doesn't span between documents.
  print("create_training_instances.started...")
  for input_file in input_files:
    with tf.gfile.GFile(input_file, "r") as reader:
      while True:
        # The comment on this call says it strips a marker (and optionally ”)
        # from each line. The first argument of replace() is empty in this
        # extract, so as written the call is a no-op.
        line = tokenization.convert_to_unicode(reader.readline().replace("",""))# .replace("”",""))
        if not line:
          break
        line = line.strip()

        # Empty lines are used as document delimiters
        if not line:
          all_documents.append([])
        tokens = tokenizer.tokenize(line)
        if tokens:
          all_documents[-1].append(tokens)

  # Remove empty documents
  all_documents = [x for x in all_documents if x]
  rng.shuffle(all_documents)

  vocab_words = list(tokenizer.vocab.keys())
  instances = []
  # Each pass re-masks every document, so dupe_factor controls how many
  # differently masked copies of the data are produced.
  for _ in range(dupe_factor):
    for document_index in range(len(all_documents)):
      instances.extend(
          create_instances_from_document(
              all_documents, document_index, max_seq_length, short_seq_prob,
              masked_lm_prob, max_predictions_per_seq, vocab_words, rng))

  rng.shuffle(instances)
  print("create_training_instances.ended...")

  return instances


def _is_chinese_char(cp):
  """Checks whether CP is the codepoint of a CJK character."""
  # This defines a "chinese character" as anything in the CJK Unicode block:
  #   https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block)
  #
  # Note that the CJK Unicode block is NOT all Japanese and Korean characters,
  # despite its name. The modern Korean Hangul alphabet is a different block,
  # as is Japanese Hiragana and Katakana. Those alphabets are used to write
  # space-separated words, so they are not treated specially and handled
  # like all of the other languages.
  if ((cp >= 0x4E00 and cp <= 0x9FFF) or  #
      (cp >= 0x3400 and cp <= 0x4DBF) or  #
      (cp >= 0x20000 and cp <= 0x2A6DF) or  #
      (cp >= 0x2A700 and cp <= 0x2B73F) or  #
      (cp >= 0x2B740 and cp <= 0x2B81F) or  #
      (cp >= 0x2B820 and cp <= 0x2CEAF) or
      (cp >= 0xF900 and cp <= 0xFAFF) or  #
      (cp >= 0x2F800 and cp <= 0x2FA1F)):  #
    return True

  # Explicit False instead of falling through and returning None.
  return False


def get_new_segment(segment):  # newly added method
  """Marks word-internal Chinese characters for whole word masking.

  Takes a sentence and returns a processed sentence. To support Chinese whole
  word masking, the non-initial characters of a word found by the word
  segmenter get a special marker ("##") so that later processing steps know
  which characters belong to the same word.

  :param segment: a sentence (list of tokens)
  :return: the processed sentence (list of tokens)
  """
  seq_cws = jieba.lcut("".join(segment))
  seq_cws_dict = {x: 1 for x in seq_cws}
  new_segment = []
  i = 0
  while i < len(segment):
    if len(re.findall('[\u4E00-\u9FA5]', segment[i])) == 0:  # not Chinese: keep the token as-is
      new_segment.append(segment[i])
      i += 1
      continue

    has_add = False
    # Try the longest match first: words of 3, then 2, then 1 tokens.
    for length in range(3, 0, -1):
      if i + length > len(segment):
        continue
      if ''.join(segment[i:i + length]) in seq_cws_dict:
        new_segment.append(segment[i])
        for l in range(1, length):
          new_segment.append('##' + segment[i + l])
        i += length
        has_add = True
        break
    if not has_add:
      new_segment.append(segment[i])
      i += 1
  return new_segment


def get_raw_instance(document, max_sequence_length):  # newly added method
  """Builds preliminary training instances from one document.

  Splits the whole passage into parts of at most max_sequence_length tokens
  and returns them as a list of prepared instances.

  :param document: a whole passage (list of tokenized sentences)
  :param max_sequence_length: maximum sequence length
  :return: a list. each element is a sequence of text
  """
  max_sequence_length_allowed = max_sequence_length - 2
  # NOTE: in this extract the text between `len(seq)` and the final length
  # check was lost. The lines from here to the end of the while loop are a
  # reconstruction based on the docstring and the surviving fragments, and
  # have not been verified against the original file.
  document = [seq for seq in document if len(seq) < max_sequence_length_allowed]
  sizes = [len(seq) for seq in document]

  result_list = []
  curr_seq = []  # sequence currently being built
  sz_idx = 0
  while sz_idx < len(sizes):
    # Append the next sentence while the result still fits; otherwise close
    # the current sequence and start a new one.
    if len(curr_seq) + sizes[sz_idx] <= max_sequence_length_allowed:
      curr_seq += document[sz_idx]
      sz_idx += 1
    else:
      result_list.append(curr_seq)
      curr_seq = []
  # Keep the last sequence only if it is longer than half the allowed length.
  if len(curr_seq) > max_sequence_length_allowed / 2:  # /2
    result_list.append(curr_seq)

  # # Compute how many parts can be obtained in total
  # num_instance=int(len(big_list)/max_sequence_length_allowed)+1
  # print("num_instance:",num_instance)
  # # Split into multiple parts and add them to the list
  # result_list=[]
  # for j in range(num_instance):
  #     index=j*max_sequence_length_allowed
  #     end_index=index+max_sequence_length_allowed if j!=num_instance-1 else -1
  #     result_list.append(big_list[index:end_index])
  return result_list


def create_instances_from_document(  # newly added method
    # Goal: follow the RoBERTa approach, use DOC-SENTENCES and drop the NSP
    # task: take text contiguously from one document until the maximum length
    # is reached. If text is taken from the next document, add a separator.
    # A document is a whole passage containing several sentences. Each
    # sentence is called a segment.
    # Given a document, i.e. a whole passage, generate some instances.
    all_documents, document_index, max_seq_length, short_seq_prob,
    masked_lm_prob, max_predictions_per_seq, vocab_words, rng):
  """Creates `TrainingInstance`s for a single document."""
  document = all_documents[document_index]

  # Account for [CLS], [SEP], [SEP]
  max_num_tokens = max_seq_length - 3

  # We *usually* want to fill up the entire sequence since we are padding
  # to `max_seq_length` anyways, so short sequences are generally wasted
  # computation. However, we *sometimes*
  # (i.e., short_seq_prob == 0.1 == 10% of the time) want to use shorter
  # sequences to minimize the mismatch between pre-training and fine-tuning.
  # The `target_seq_length` is just a rough target however, whereas
  # `max_seq_length` is a hard limit.

  #target_seq_length = max_num_tokens
  #if rng.random() < short_seq_prob:
  #  target_seq_length = rng.randint(2, max_num_tokens)

  instances = []
  # A document is a whole passage containing several sentences (segments).
  raw_text_list_list = get_raw_instance(document, max_seq_length)
  for j, raw_text_list in enumerate(raw_text_list_list):
    ############################################################################
    # Combine with word segmentation: add "##" where needed so that Chinese
    # whole word masking can group the characters of a word.
    raw_text_list = get_new_segment(raw_text_list)
    # 1. Set tokens and segment_ids
    is_random_next = True  # this will not be used, so its value doesn't matter
    tokens = []
    segment_ids = []
    tokens.append("[CLS]")
    segment_ids.append(0)
    for token in raw_text_list:
      tokens.append(token)
      segment_ids.append(0)
    tokens.append("[SEP]")
    segment_ids.append(0)
    ############################################################################
    # 2. Call the original masking method
    (tokens, masked_lm_positions,
     masked_lm_labels) = create_masked_lm_predictions(
         tokens, masked_lm_prob, max_predictions_per_seq, vocab_words, rng)

    instance = TrainingInstance(
        tokens=tokens,
        segment_ids=segment_ids,
        is_random_next=is_random_next,
        masked_lm_positions=masked_lm_positions,
        masked_lm_labels=masked_lm_labels)
    instances.append(instance)

  return instances


def create_instances_from_document_original(
    all_documents, document_index, max_seq_length, short_seq_prob,
    masked_lm_prob, max_predictions_per_seq, vocab_words, rng):
  """Creates `TrainingInstance`s for a single document."""
  document = all_documents[document_index]

  # Account for [CLS], [SEP], [SEP]
  max_num_tokens = max_seq_length - 3

  # We *usually* want to fill up the entire sequence since we are padding
  # to `max_seq_length` anyways, so short sequences are generally wasted
  # computation. However, we *sometimes*
  # (i.e., short_seq_prob == 0.1 == 10% of the time) want to use shorter
  # sequences to minimize the mismatch between pre-training and fine-tuning.
  # The `target_seq_length` is just a rough target however, whereas
  # `max_seq_length` is a hard limit.
  target_seq_length = max_num_tokens
  if rng.random() < short_seq_prob:
    target_seq_length = rng.randint(2, max_num_tokens)

  # We DON'T just concatenate all of the tokens from a document into a long
  # sequence and choose an arbitrary split point because this would make the
  # next sentence prediction task too easy. Instead, we split the input into
  # segments "A" and "B" based on the actual "sentences" provided by the user
  # input.
  instances = []
  current_chunk = []
  current_length = 0
  i = 0
  # A document is a whole passage containing several sentences (segments).
  print("document_index:",document_index,"document:",type(document)," ;document:",document)
  while i < len(document):
    segment = document[i]  # take one part (possibly a whole paragraph)
    print("i:",i," ;segment:",segment)
    ############################################################################
    # Combine with word segmentation: add "##" where needed so that Chinese
    # whole word masking can group the characters of a word.
    segment = get_new_segment(segment)
    ############################################################################
    current_chunk.append(segment)
    current_length += len(segment)
    print("#####condition:",i == len(document) - 1 or current_length >= target_seq_length)
    if i == len(document) - 1 or current_length >= target_seq_length:
      if current_chunk:
        # `a_end` is how many segments from `current_chunk` go into the `A`
        # (first) sentence.
        a_end = 1
        if len(current_chunk) >= 2:
          a_end = rng.randint(1, len(current_chunk) - 1)

        tokens_a = []
        for j in range(a_end):
          tokens_a.extend(current_chunk[j])

        tokens_b = []
        # Random next
        is_random_next = False
        if len(current_chunk) == 1 or rng.random() < 0.5:
          is_random_next = True
          target_b_length = target_seq_length - len(tokens_a)

          # This should rarely go for more than one iteration for large
          # corpora. However, just to be careful, we try to make sure that
          # the random document is not the same as the document
          # we're processing.
          for _ in range(10):
            random_document_index = rng.randint(0, len(all_documents) - 1)
            if random_document_index != document_index:
              break

          random_document = all_documents[random_document_index]
          random_start = rng.randint(0, len(random_document) - 1)
          for j in range(random_start, len(random_document)):
            tokens_b.extend(random_document[j])
            if len(tokens_b) >= target_b_length:
              break
          # We didn't actually use these segments so we "put them back" so
          # they don't go to waste.
          num_unused_segments = len(current_chunk) - a_end
          i -= num_unused_segments
        # Actual next
        else:
          is_random_next = False
          for j in range(a_end, len(current_chunk)):
            tokens_b.extend(current_chunk[j])
        truncate_seq_pair(tokens_a, tokens_b, max_num_tokens, rng)

        assert len(tokens_a) >= 1
        assert len(tokens_b) >= 1

        tokens = []
        segment_ids = []
        tokens.append("[CLS]")
        segment_ids.append(0)
        for token in tokens_a:
          tokens.append(token)
          segment_ids.append(0)

        tokens.append("[SEP]")
        segment_ids.append(0)

        for token in tokens_b:
          tokens.append(token)
          segment_ids.append(1)
        tokens.append("[SEP]")
        segment_ids.append(1)

        (tokens, masked_lm_positions,
         masked_lm_labels) = create_masked_lm_predictions(
             tokens, masked_lm_prob, max_predictions_per_seq, vocab_words, rng)
        instance = TrainingInstance(
            tokens=tokens,
            segment_ids=segment_ids,
            is_random_next=is_random_next,
            masked_lm_positions=masked_lm_positions,
            masked_lm_labels=masked_lm_labels)
        instances.append(instance)
      current_chunk = []
      current_length = 0
    i += 1

  return instances


MaskedLmInstance = collections.namedtuple("MaskedLmInstance",
                                          ["index", "label"])


def create_masked_lm_predictions(tokens, masked_lm_prob,
                                 max_predictions_per_seq, vocab_words, rng):
  """Creates the predictions for the masked LM objective."""

  cand_indexes = []
  for (i, token) in enumerate(tokens):
    if token == "[CLS]" or token == "[SEP]":
      continue
    # Whole Word Masking means that we mask all of the wordpieces
    # corresponding to an original word. When a word has been split into
    # WordPieces, the first token does not have any marker and any subsequent
    # tokens are prefixed with ##. So whenever we see the ## token, we
    # append it to the previous set of word indexes.
    #
    # Note that Whole Word Masking does *not* change the training code
    # at all -- we still predict each WordPiece independently, softmaxed
    # over the entire vocabulary.
    if (FLAGS.do_whole_word_mask and len(cand_indexes) >= 1 and
        token.startswith("##")):
      cand_indexes[-1].append(i)
    else:
      cand_indexes.append([i])

  rng.shuffle(cand_indexes)

  # Strip the "##" marker that get_new_segment added to Chinese characters.
  output_tokens = [t[2:] if len(re.findall('##[\u4E00-\u9FA5]', t))>0 else t for t in tokens]

  num_to_predict = min(max_predictions_per_seq,
                       max(1, int(round(len(tokens) * masked_lm_prob))))

  masked_lms = []
  covered_indexes = set()
  for index_set in cand_indexes:
    if len(masked_lms) >= num_to_predict:
      break
    # If adding a whole-word mask would exceed the maximum number of
    # predictions, then just skip this candidate.
    if len(masked_lms) + len(index_set) > num_to_predict:
      continue
    is_any_index_covered = False
    for index in index_set:
      if index in covered_indexes:
        is_any_index_covered = True
        break
    if is_any_index_covered:
      continue
    for index in index_set:
      covered_indexes.add(index)

      masked_token = None
      # 80% of the time, replace with [MASK]
      if rng.random() < 0.8:
        masked_token = "[MASK]"
      else:
        # 10% of the time, keep original
        if rng.random() < 0.5:
          masked_token = tokens[index][2:] if len(re.findall('##[\u4E00-\u9FA5]', tokens[index]))>0 else tokens[index]
        # 10% of the time, replace with random word
        else:
          masked_token = vocab_words[rng.randint(0, len(vocab_words) - 1)]

      output_tokens[index] = masked_token

      masked_lms.append(MaskedLmInstance(index=index, label=tokens[index]))
  assert len(masked_lms) <= num_to_predict
  masked_lms = sorted(masked_lms, key=lambda x: x.index)

  masked_lm_positions = []
  masked_lm_labels = []
  for p in masked_lms:
    masked_lm_positions.append(p.index)
    masked_lm_labels.append(p.label)

  # tf.logging.info('%s' % (tokens))
  # tf.logging.info('%s' % (output_tokens))
  return (output_tokens, masked_lm_positions, masked_lm_labels)


def truncate_seq_pair(tokens_a, tokens_b, max_num_tokens, rng):
  """Truncates a pair of sequences to a maximum sequence length."""
  while True:
    total_length = len(tokens_a) + len(tokens_b)
    if total_length <= max_num_tokens:
      break

    trunc_tokens = tokens_a if len(tokens_a) > len(tokens_b) else tokens_b
    assert len(trunc_tokens) >= 1

    # We want to sometimes truncate from the front and sometimes from the
    # back to add more randomness and avoid biases.
    if rng.random() < 0.5:
      del trunc_tokens[0]
    else:
      trunc_tokens.pop()


def main(_):
  tf.logging.set_verbosity(tf.logging.INFO)

  tokenizer = tokenization.FullTokenizer(
      vocab_file=FLAGS.vocab_file, do_lower_case=FLAGS.do_lower_case)

  input_files = []
  for input_pattern in FLAGS.input_file.split(","):
    input_files.extend(tf.gfile.Glob(input_pattern))

  tf.logging.info("*** Reading from input files ***")
  for input_file in input_files:
    tf.logging.info("  %s", input_file)

  rng = random.Random(FLAGS.random_seed)
  instances = create_training_instances(
      input_files, tokenizer, FLAGS.max_seq_length, FLAGS.dupe_factor,
      FLAGS.short_seq_prob, FLAGS.masked_lm_prob, FLAGS.max_predictions_per_seq,
      rng)

  output_files = FLAGS.output_file.split(",")
  tf.logging.info("*** Writing to output files ***")
  for output_file in output_files:
    tf.logging.info("  %s", output_file)

  write_instance_to_example_files(instances, tokenizer, FLAGS.max_seq_length,
                                  FLAGS.max_predictions_per_seq, output_files)


if __name__ == "__main__":
  flags.mark_flag_as_required("input_file")
  flags.mark_flag_as_required("output_file")
  flags.mark_flag_as_required("vocab_file")
  tf.app.run()


================================================
FILE: modeling.py
================================================
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors.
# # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. """The main BERT model and related functions.""" from __future__ import absolute_import from __future__ import division from __future__ import print_function import collections import copy import json import math import re import numpy as np import six import tensorflow as tf class BertConfig(object): """Configuration for `BertModel`.""" def __init__(self, vocab_size, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, hidden_act="gelu", hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, max_position_embeddings=512, type_vocab_size=16, initializer_range=0.02): """Constructs BertConfig. Args: vocab_size: Vocabulary size of `inputs_ids` in `BertModel`. hidden_size: Size of the encoder layers and the pooler layer. num_hidden_layers: Number of hidden layers in the Transformer encoder. num_attention_heads: Number of attention heads for each attention layer in the Transformer encoder. intermediate_size: The size of the "intermediate" (i.e., feed-forward) layer in the Transformer encoder. hidden_act: The non-linear activation function (function or string) in the encoder and pooler. hidden_dropout_prob: The dropout probability for all fully connected layers in the embeddings, encoder, and pooler. attention_probs_dropout_prob: The dropout ratio for the attention probabilities. max_position_embeddings: The maximum sequence length that this model might ever be used with. 
Typically set this to something large just in case (e.g., 512 or 1024 or 2048). type_vocab_size: The vocabulary size of the `token_type_ids` passed into `BertModel`. initializer_range: The stdev of the truncated_normal_initializer for initializing all weight matrices. """ self.vocab_size = vocab_size self.hidden_size = hidden_size self.num_hidden_layers = num_hidden_layers self.num_attention_heads = num_attention_heads self.hidden_act = hidden_act self.intermediate_size = intermediate_size self.hidden_dropout_prob = hidden_dropout_prob self.attention_probs_dropout_prob = attention_probs_dropout_prob self.max_position_embeddings = max_position_embeddings self.type_vocab_size = type_vocab_size self.initializer_range = initializer_range @classmethod def from_dict(cls, json_object): """Constructs a `BertConfig` from a Python dictionary of parameters.""" config = BertConfig(vocab_size=None) for (key, value) in six.iteritems(json_object): config.__dict__[key] = value return config @classmethod def from_json_file(cls, json_file): """Constructs a `BertConfig` from a json file of parameters.""" with tf.gfile.GFile(json_file, "r") as reader: text = reader.read() return cls.from_dict(json.loads(text)) def to_dict(self): """Serializes this instance to a Python dictionary.""" output = copy.deepcopy(self.__dict__) return output def to_json_string(self): """Serializes this instance to a JSON string.""" return json.dumps(self.to_dict(), indent=2, sort_keys=True) + "\n" class BertModel(object): """BERT model ("Bidirectional Encoder Representations from Transformers"). 
  Example usage:

  ```python
  # Already been converted into WordPiece token ids
  input_ids = tf.constant([[31, 51, 99], [15, 5, 0]])
  input_mask = tf.constant([[1, 1, 1], [1, 1, 0]])
  token_type_ids = tf.constant([[0, 0, 1], [0, 2, 0]])

  # `hidden_size` must be divisible by `num_attention_heads`.
  config = modeling.BertConfig(vocab_size=32000, hidden_size=512,
    num_hidden_layers=8, num_attention_heads=8, intermediate_size=1024)

  model = modeling.BertModel(config=config, is_training=True,
    input_ids=input_ids, input_mask=input_mask, token_type_ids=token_type_ids)

  label_embeddings = tf.get_variable(...)
  pooled_output = model.get_pooled_output()
  logits = tf.matmul(pooled_output, label_embeddings)
  ...
  ```
  """

  def __init__(self,
               config,
               is_training,
               input_ids,
               input_mask=None,
               token_type_ids=None,
               use_one_hot_embeddings=False,
               scope=None):
    """Constructor for BertModel.

    Args:
      config: `BertConfig` instance.
      is_training: bool. True for training model, False for eval model. Controls
        whether dropout will be applied.
      input_ids: int32 Tensor of shape [batch_size, seq_length].
      input_mask: (optional) int32 Tensor of shape [batch_size, seq_length].
      token_type_ids: (optional) int32 Tensor of shape [batch_size, seq_length].
      use_one_hot_embeddings: (optional) bool. Whether to use one-hot word
        embeddings or `tf.gather()` for the word embeddings.
      scope: (optional) variable scope. Defaults to "bert".

    Raises:
      ValueError: The config is invalid or one of the input tensor shapes
        is invalid.
    """
    config = copy.deepcopy(config)
    if not is_training:
      config.hidden_dropout_prob = 0.0
      config.attention_probs_dropout_prob = 0.0

    input_shape = get_shape_list(input_ids, expected_rank=2)
    batch_size = input_shape[0]
    seq_length = input_shape[1]

    if input_mask is None:
      input_mask = tf.ones(shape=[batch_size, seq_length], dtype=tf.int32)

    if token_type_ids is None:
      token_type_ids = tf.zeros(shape=[batch_size, seq_length], dtype=tf.int32)

    with tf.variable_scope(scope, default_name="bert"):
      with tf.variable_scope("embeddings"):
        # Perform embedding lookup on the word ids.
(self.embedding_output, self.embedding_table) = embedding_lookup( input_ids=input_ids, vocab_size=config.vocab_size, embedding_size=config.hidden_size, initializer_range=config.initializer_range, word_embedding_name="word_embeddings", use_one_hot_embeddings=use_one_hot_embeddings) # Add positional embeddings and token type embeddings, then layer # normalize and perform dropout. self.embedding_output = embedding_postprocessor( input_tensor=self.embedding_output, use_token_type=True, token_type_ids=token_type_ids, token_type_vocab_size=config.type_vocab_size, token_type_embedding_name="token_type_embeddings", use_position_embeddings=True, position_embedding_name="position_embeddings", initializer_range=config.initializer_range, max_position_embeddings=config.max_position_embeddings, dropout_prob=config.hidden_dropout_prob) with tf.variable_scope("encoder"): # This converts a 2D mask of shape [batch_size, seq_length] to a 3D # mask of shape [batch_size, seq_length, seq_length] which is used # for the attention scores. attention_mask = create_attention_mask_from_input_mask( input_ids, input_mask) # Run the stacked transformer. # `sequence_output` shape = [batch_size, seq_length, hidden_size]. self.all_encoder_layers = transformer_model( input_tensor=self.embedding_output, attention_mask=attention_mask, hidden_size=config.hidden_size, num_hidden_layers=config.num_hidden_layers, num_attention_heads=config.num_attention_heads, intermediate_size=config.intermediate_size, intermediate_act_fn=get_activation(config.hidden_act), hidden_dropout_prob=config.hidden_dropout_prob, attention_probs_dropout_prob=config.attention_probs_dropout_prob, initializer_range=config.initializer_range, do_return_all_layers=True) self.sequence_output = self.all_encoder_layers[-1] # [batch_size, seq_length, hidden_size] # The "pooler" converts the encoded sequence tensor of shape # [batch_size, seq_length, hidden_size] to a tensor of shape # [batch_size, hidden_size]. 
This is necessary for segment-level # (or segment-pair-level) classification tasks where we need a fixed # dimensional representation of the segment. with tf.variable_scope("pooler"): # We "pool" the model by simply taking the hidden state corresponding # to the first token. We assume that this has been pre-trained first_token_tensor = tf.squeeze(self.sequence_output[:, 0:1, :], axis=1) self.pooled_output = tf.layers.dense( first_token_tensor, config.hidden_size, activation=tf.tanh, kernel_initializer=create_initializer(config.initializer_range)) def get_pooled_output(self): return self.pooled_output def get_sequence_output(self): """Gets final hidden layer of encoder. Returns: float Tensor of shape [batch_size, seq_length, hidden_size] corresponding to the final hidden of the transformer encoder. """ return self.sequence_output def get_all_encoder_layers(self): return self.all_encoder_layers def get_embedding_output(self): """Gets output of the embedding lookup (i.e., input to the transformer). Returns: float Tensor of shape [batch_size, seq_length, hidden_size] corresponding to the output of the embedding layer, after summing the word embeddings with the positional embeddings and the token type embeddings, then performing layer normalization. This is the input to the transformer. """ return self.embedding_output def get_embedding_table(self): return self.embedding_table def gelu(x): """Gaussian Error Linear Unit. This is a smoother version of the RELU. Original paper: https://arxiv.org/abs/1606.08415 Args: x: float Tensor to perform activation. Returns: `x` with the GELU activation applied. """ cdf = 0.5 * (1.0 + tf.tanh( (np.sqrt(2 / np.pi) * (x + 0.044715 * tf.pow(x, 3))))) return x * cdf def get_activation(activation_string): """Maps a string to a Python function, e.g., "relu" => `tf.nn.relu`. Args: activation_string: String name of the activation function. Returns: A Python function corresponding to the activation function. 
    If `activation_string` is None, empty, or "linear", this will return None.
    If `activation_string` is not a string, it will return `activation_string`.

  Raises:
    ValueError: The `activation_string` does not correspond to a known
      activation.
  """

  # We assume that anything that's not a string is already an activation
  # function, so we just return it.
  if not isinstance(activation_string, six.string_types):
    return activation_string

  if not activation_string:
    return None

  act = activation_string.lower()
  if act == "linear":
    return None
  elif act == "relu":
    return tf.nn.relu
  elif act == "gelu":
    return gelu
  elif act == "tanh":
    return tf.tanh
  else:
    raise ValueError("Unsupported activation: %s" % act)


def get_assignment_map_from_checkpoint(tvars, init_checkpoint):
  """Compute the intersection of the current variables and checkpoint variables."""
  assignment_map = {}
  initialized_variable_names = {}

  name_to_variable = collections.OrderedDict()
  for var in tvars:
    name = var.name
    m = re.match("^(.*):\\d+$", name)
    if m is not None:
      name = m.group(1)
    name_to_variable[name] = var

  init_vars = tf.train.list_variables(init_checkpoint)

  assignment_map = collections.OrderedDict()
  for x in init_vars:
    (name, var) = (x[0], x[1])
    if name not in name_to_variable:
      continue
    assignment_map[name] = name
    initialized_variable_names[name] = 1
    initialized_variable_names[name + ":0"] = 1

  return (assignment_map, initialized_variable_names)


def dropout(input_tensor, dropout_prob):
  """Perform dropout.

  Args:
    input_tensor: float Tensor.
    dropout_prob: Python float. The probability of dropping out a value (NOT of
      *keeping* a dimension as in `tf.nn.dropout`).

  Returns:
    A version of `input_tensor` with dropout applied.
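
Example (an illustrative sketch, not part of the original module):
`dropout_prob` is the probability of dropping a value, while the TF 1.x
`tf.nn.dropout` call used in the body takes the probability of keeping one,
hence the `1.0 - dropout_prob` conversion. The pure-Python helper
`inverted_dropout` below is a hypothetical stand-in, assumed only for
illustration, that mimics the keep-and-rescale behavior on a flat list.

```python
import random


def inverted_dropout(values, dropout_prob, rng):
    # Hypothetical stand-in for tf.nn.dropout(values, keep_prob) on a list.
    if dropout_prob is None or dropout_prob == 0.0:
        return list(values)
    keep_prob = 1.0 - dropout_prob
    # Kept values are scaled by 1 / keep_prob so the expected value of each
    # element stays the same; dropped values become 0.0.
    return [v / keep_prob if rng.random() < keep_prob else 0.0 for v in values]


rng = random.Random(0)
out = inverted_dropout([1.0] * 10000, dropout_prob=0.1, rng=rng)
mean = sum(out) / len(out)  # stays close to 1.0 because of the rescaling
```

With `dropout_prob=0.0` (which `BertModel` sets when `is_training` is False)
the input passes through unchanged.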
  """
  if dropout_prob is None or dropout_prob == 0.0:
    return input_tensor

  output = tf.nn.dropout(input_tensor, 1.0 - dropout_prob)
  return output


def layer_norm(input_tensor, name=None):
  """Run layer normalization on the last dimension of the tensor."""
  return tf.contrib.layers.layer_norm(
      inputs=input_tensor, begin_norm_axis=-1, begin_params_axis=-1, scope=name)


def layer_norm_and_dropout(input_tensor, dropout_prob, name=None):
  """Runs layer normalization followed by dropout."""
  output_tensor = layer_norm(input_tensor, name)
  output_tensor = dropout(output_tensor, dropout_prob)
  return output_tensor


def create_initializer(initializer_range=0.02):
  """Creates a `truncated_normal_initializer` with the given range."""
  return tf.truncated_normal_initializer(stddev=initializer_range)


def embedding_lookup(input_ids,
                     vocab_size,
                     embedding_size=128,
                     initializer_range=0.02,
                     word_embedding_name="word_embeddings",
                     use_one_hot_embeddings=False):
  """Looks up word embeddings for an id tensor.

  Args:
    input_ids: int32 Tensor of shape [batch_size, seq_length] containing word
      ids.
    vocab_size: int. Size of the embedding vocabulary.
    embedding_size: int. Width of the word embeddings.
    initializer_range: float. Embedding initialization range.
    word_embedding_name: string. Name of the embedding table.
    use_one_hot_embeddings: bool. If True, use one-hot method for word
      embeddings. If False, use `tf.gather()`.

  Returns:
    float Tensor of shape [batch_size, seq_length, embedding_size].
  """
  # This function assumes that the input is of shape [batch_size, seq_length,
  # num_inputs].
  #
  # If the input is a 2D tensor of shape [batch_size, seq_length], we
  # reshape to [batch_size, seq_length, 1].
if input_ids.shape.ndims == 2: input_ids = tf.expand_dims(input_ids, axis=[-1]) embedding_table = tf.get_variable( name=word_embedding_name, shape=[vocab_size, embedding_size], initializer=create_initializer(initializer_range)) flat_input_ids = tf.reshape(input_ids, [-1]) if use_one_hot_embeddings: one_hot_input_ids = tf.one_hot(flat_input_ids, depth=vocab_size) output = tf.matmul(one_hot_input_ids, embedding_table) else: output = tf.gather(embedding_table, flat_input_ids) input_shape = get_shape_list(input_ids) output = tf.reshape(output, input_shape[0:-1] + [input_shape[-1] * embedding_size]) return (output, embedding_table) def embedding_postprocessor(input_tensor, use_token_type=False, token_type_ids=None, token_type_vocab_size=16, token_type_embedding_name="token_type_embeddings", use_position_embeddings=True, position_embedding_name="position_embeddings", initializer_range=0.02, max_position_embeddings=512, dropout_prob=0.1): """Performs various post-processing on a word embedding tensor. Args: input_tensor: float Tensor of shape [batch_size, seq_length, embedding_size]. use_token_type: bool. Whether to add embeddings for `token_type_ids`. token_type_ids: (optional) int32 Tensor of shape [batch_size, seq_length]. Must be specified if `use_token_type` is True. token_type_vocab_size: int. The vocabulary size of `token_type_ids`. token_type_embedding_name: string. The name of the embedding table variable for token type ids. use_position_embeddings: bool. Whether to add position embeddings for the position of each token in the sequence. position_embedding_name: string. The name of the embedding table variable for positional embeddings. initializer_range: float. Range of the weight initialization. max_position_embeddings: int. Maximum sequence length that might ever be used with this model. This can be longer than the sequence length of input_tensor, but cannot be shorter. dropout_prob: float. Dropout probability applied to the final output tensor. 
  Returns:
    float tensor with same shape as `input_tensor`.

  Raises:
    ValueError: One of the tensor shapes or input values is invalid.
  """
  input_shape = get_shape_list(input_tensor, expected_rank=3)
  batch_size = input_shape[0]
  seq_length = input_shape[1]
  width = input_shape[2]

  output = input_tensor

  if use_token_type:
    if token_type_ids is None:
      raise ValueError("`token_type_ids` must be specified if "
                       "`use_token_type` is True.")
    token_type_table = tf.get_variable(
        name=token_type_embedding_name,
        shape=[token_type_vocab_size, width],
        initializer=create_initializer(initializer_range))
    # This vocab will be small so we always do one-hot here, since it is always
    # faster for a small vocabulary.
    flat_token_type_ids = tf.reshape(token_type_ids, [-1])
    one_hot_ids = tf.one_hot(flat_token_type_ids, depth=token_type_vocab_size)
    token_type_embeddings = tf.matmul(one_hot_ids, token_type_table)
    token_type_embeddings = tf.reshape(token_type_embeddings,
                                       [batch_size, seq_length, width])
    output += token_type_embeddings

  if use_position_embeddings:
    assert_op = tf.assert_less_equal(seq_length, max_position_embeddings)
    with tf.control_dependencies([assert_op]):
      full_position_embeddings = tf.get_variable(
          name=position_embedding_name,
          shape=[max_position_embeddings, width],
          initializer=create_initializer(initializer_range))
      # Since the position embedding table is a learned variable, we create it
      # using a (long) sequence length `max_position_embeddings`. The actual
      # sequence length might be shorter than this, for faster training of
      # tasks that do not have long sequences.
      #
      # So `full_position_embeddings` is effectively an embedding table
      # for position [0, 1, 2, ..., max_position_embeddings-1], and the current
      # sequence has positions [0, 1, 2, ... seq_length-1], so we can just
      # perform a slice.
      position_embeddings = tf.slice(full_position_embeddings, [0, 0],
                                     [seq_length, -1])
      num_dims = len(output.shape.as_list())

      # Only the last two dimensions are relevant (`seq_length` and `width`), so
      # we broadcast among the first dimensions, which is typically just
      # the batch size.
      position_broadcast_shape = []
      for _ in range(num_dims - 2):
        position_broadcast_shape.append(1)
      position_broadcast_shape.extend([seq_length, width])
      position_embeddings = tf.reshape(position_embeddings,
                                       position_broadcast_shape)
      output += position_embeddings

  output = layer_norm_and_dropout(output, dropout_prob)
  return output


def create_attention_mask_from_input_mask(from_tensor, to_mask):
  """Create 3D attention mask from a 2D tensor mask.

  Args:
    from_tensor: 2D or 3D Tensor of shape [batch_size, from_seq_length, ...].
    to_mask: int32 Tensor of shape [batch_size, to_seq_length].

  Returns:
    float Tensor of shape [batch_size, from_seq_length, to_seq_length].
  """
  from_shape = get_shape_list(from_tensor, expected_rank=[2, 3])
  batch_size = from_shape[0]
  from_seq_length = from_shape[1]

  to_shape = get_shape_list(to_mask, expected_rank=2)
  to_seq_length = to_shape[1]

  to_mask = tf.cast(
      tf.reshape(to_mask, [batch_size, 1, to_seq_length]), tf.float32)

  # We don't assume that `from_tensor` is a mask (although it could be). We
  # don't actually care if we attend *from* padding tokens (only *to* padding
  # tokens), so we create a tensor of all ones.
  #
  # `broadcast_ones` = [batch_size, from_seq_length, 1]
  broadcast_ones = tf.ones(
      shape=[batch_size, from_seq_length, 1], dtype=tf.float32)

  # Here we broadcast along two dimensions to create the mask.
  mask = broadcast_ones * to_mask

  return mask


def attention_layer(from_tensor,
                    to_tensor,
                    attention_mask=None,
                    num_attention_heads=1,
                    size_per_head=512,
                    query_act=None,
                    key_act=None,
                    value_act=None,
                    attention_probs_dropout_prob=0.0,
                    initializer_range=0.02,
                    do_return_2d_tensor=False,
                    batch_size=None,
                    from_seq_length=None,
                    to_seq_length=None):
  """Performs multi-headed attention from `from_tensor` to `to_tensor`.

  This is an implementation of multi-headed attention based on "Attention
  is all you Need". If `from_tensor` and `to_tensor` are the same, then
  this is self-attention. Each timestep in `from_tensor` attends to the
  corresponding sequence in `to_tensor`, and returns a fixed-width vector.

  This function first projects `from_tensor` into a "query" tensor and
  `to_tensor` into "key" and "value" tensors. These are (effectively) a list
  of tensors of length `num_attention_heads`, where each tensor is of shape
  [batch_size, seq_length, size_per_head].

  Then, the query and key tensors are dot-producted and scaled. These are
  softmaxed to obtain attention probabilities. The value tensors are then
  interpolated by these probabilities, then concatenated back to a single
  tensor and returned.

  In practice, the multi-headed attention is done with transposes and
  reshapes rather than actual separate tensors.

  Args:
    from_tensor: float Tensor of shape [batch_size, from_seq_length,
      from_width].
    to_tensor: float Tensor of shape [batch_size, to_seq_length, to_width].
    attention_mask: (optional) int32 Tensor of shape [batch_size,
      from_seq_length, to_seq_length]. The values should be 1 or 0. The
      attention scores will effectively be set to -infinity for any positions in
      the mask that are 0, and will be unchanged for positions that are 1.
    num_attention_heads: int. Number of attention heads.
    size_per_head: int. Size of each attention head.
    query_act: (optional) Activation function for the query transform.
    key_act: (optional) Activation function for the key transform.
value_act: (optional) Activation function for the value transform. attention_probs_dropout_prob: (optional) float. Dropout probability of the attention probabilities. initializer_range: float. Range of the weight initializer. do_return_2d_tensor: bool. If True, the output will be of shape [batch_size * from_seq_length, num_attention_heads * size_per_head]. If False, the output will be of shape [batch_size, from_seq_length, num_attention_heads * size_per_head]. batch_size: (Optional) int. If the input is 2D, this might be the batch size of the 3D version of the `from_tensor` and `to_tensor`. from_seq_length: (Optional) If the input is 2D, this might be the seq length of the 3D version of the `from_tensor`. to_seq_length: (Optional) If the input is 2D, this might be the seq length of the 3D version of the `to_tensor`. Returns: float Tensor of shape [batch_size, from_seq_length, num_attention_heads * size_per_head]. (If `do_return_2d_tensor` is true, this will be of shape [batch_size * from_seq_length, num_attention_heads * size_per_head]). Raises: ValueError: Any of the arguments or tensor shapes are invalid. 
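
Example (an illustrative sketch, not part of the original module): the
scaling, additive masking, and softmax steps described above, written for a
single query vector in plain Python. The helper name `masked_attention_probs`
and the list-based inputs are assumptions made for illustration; the real
function operates on batched, multi-head tensors.

```python
import math


def masked_attention_probs(query, keys, mask):
    # query: H floats; keys: T lists of H floats; mask: T ints (1 = attend).
    scale = 1.0 / math.sqrt(float(len(query)))
    scores = [scale * sum(q * k for q, k in zip(query, key)) for key in keys]
    # Additive mask: 0.0 where mask is 1, -10000.0 where mask is 0.
    scores = [s + (1.0 - m) * -10000.0 for s, m in zip(scores, mask)]
    # Softmax, shifted by the maximum score for numerical stability.
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


probs = masked_attention_probs(
    query=[1.0, 0.0],
    keys=[[1.0, 0.0], [0.0, 1.0], [5.0, 5.0]],
    mask=[1, 1, 0])
```

The third key has the largest raw score, but because its mask value is 0 it
receives an attention probability of effectively zero.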
""" def transpose_for_scores(input_tensor, batch_size, num_attention_heads, seq_length, width): output_tensor = tf.reshape( input_tensor, [batch_size, seq_length, num_attention_heads, width]) output_tensor = tf.transpose(output_tensor, [0, 2, 1, 3]) return output_tensor from_shape = get_shape_list(from_tensor, expected_rank=[2, 3]) to_shape = get_shape_list(to_tensor, expected_rank=[2, 3]) if len(from_shape) != len(to_shape): raise ValueError( "The rank of `from_tensor` must match the rank of `to_tensor`.") if len(from_shape) == 3: batch_size = from_shape[0] from_seq_length = from_shape[1] to_seq_length = to_shape[1] elif len(from_shape) == 2: if (batch_size is None or from_seq_length is None or to_seq_length is None): raise ValueError( "When passing in rank 2 tensors to attention_layer, the values " "for `batch_size`, `from_seq_length`, and `to_seq_length` " "must all be specified.") # Scalar dimensions referenced here: # B = batch size (number of sequences) # F = `from_tensor` sequence length # T = `to_tensor` sequence length # N = `num_attention_heads` # H = `size_per_head` from_tensor_2d = reshape_to_matrix(from_tensor) to_tensor_2d = reshape_to_matrix(to_tensor) # `query_layer` = [B*F, N*H] query_layer = tf.layers.dense( from_tensor_2d, num_attention_heads * size_per_head, activation=query_act, name="query", kernel_initializer=create_initializer(initializer_range)) # `key_layer` = [B*T, N*H] key_layer = tf.layers.dense( to_tensor_2d, num_attention_heads * size_per_head, activation=key_act, name="key", kernel_initializer=create_initializer(initializer_range)) # `value_layer` = [B*T, N*H] value_layer = tf.layers.dense( to_tensor_2d, num_attention_heads * size_per_head, activation=value_act, name="value", kernel_initializer=create_initializer(initializer_range)) # `query_layer` = [B, N, F, H] query_layer = transpose_for_scores(query_layer, batch_size, num_attention_heads, from_seq_length, size_per_head) # `key_layer` = [B, N, T, H] key_layer = 
transpose_for_scores(key_layer, batch_size, num_attention_heads, to_seq_length, size_per_head) # Take the dot product between "query" and "key" to get the raw # attention scores. # `attention_scores` = [B, N, F, T] attention_scores = tf.matmul(query_layer, key_layer, transpose_b=True) attention_scores = tf.multiply(attention_scores, 1.0 / math.sqrt(float(size_per_head))) if attention_mask is not None: # `attention_mask` = [B, 1, F, T] attention_mask = tf.expand_dims(attention_mask, axis=[1]) # Since attention_mask is 1.0 for positions we want to attend and 0.0 for # masked positions, this operation will create a tensor which is 0.0 for # positions we want to attend and -10000.0 for masked positions. adder = (1.0 - tf.cast(attention_mask, tf.float32)) * -10000.0 # Since we are adding it to the raw scores before the softmax, this is # effectively the same as removing these entirely. attention_scores += adder # Normalize the attention scores to probabilities. # `attention_probs` = [B, N, F, T] attention_probs = tf.nn.softmax(attention_scores) # This is actually dropping out entire tokens to attend to, which might # seem a bit unusual, but is taken from the original Transformer paper. 
attention_probs = dropout(attention_probs, attention_probs_dropout_prob) # `value_layer` = [B, T, N, H] value_layer = tf.reshape( value_layer, [batch_size, to_seq_length, num_attention_heads, size_per_head]) # `value_layer` = [B, N, T, H] value_layer = tf.transpose(value_layer, [0, 2, 1, 3]) # `context_layer` = [B, N, F, H] context_layer = tf.matmul(attention_probs, value_layer) # `context_layer` = [B, F, N, H] context_layer = tf.transpose(context_layer, [0, 2, 1, 3]) if do_return_2d_tensor: # `context_layer` = [B*F, N*H] context_layer = tf.reshape( context_layer, [batch_size * from_seq_length, num_attention_heads * size_per_head]) else: # `context_layer` = [B, F, N*H] context_layer = tf.reshape( context_layer, [batch_size, from_seq_length, num_attention_heads * size_per_head]) return context_layer def transformer_model(input_tensor, attention_mask=None, hidden_size=768, num_hidden_layers=12, num_attention_heads=12, intermediate_size=3072, intermediate_act_fn=gelu, hidden_dropout_prob=0.1, attention_probs_dropout_prob=0.1, initializer_range=0.02, do_return_all_layers=False): """Multi-headed, multi-layer Transformer from "Attention is All You Need". This is almost an exact implementation of the original Transformer encoder. See the original paper: https://arxiv.org/abs/1706.03762 Also see: https://github.com/tensorflow/tensor2tensor/blob/master/tensor2tensor/models/transformer.py Args: input_tensor: float Tensor of shape [batch_size, seq_length, hidden_size]. attention_mask: (optional) int32 Tensor of shape [batch_size, seq_length, seq_length], with 1 for positions that can be attended to and 0 in positions that should not be. hidden_size: int. Hidden size of the Transformer. num_hidden_layers: int. Number of layers (blocks) in the Transformer. num_attention_heads: int. Number of attention heads in the Transformer. intermediate_size: int. The size of the "intermediate" (a.k.a., feed forward) layer. intermediate_act_fn: function. 
The non-linear activation function to apply to the output of the intermediate/feed-forward layer. hidden_dropout_prob: float. Dropout probability for the hidden layers. attention_probs_dropout_prob: float. Dropout probability of the attention probabilities. initializer_range: float. Range of the initializer (stddev of truncated normal). do_return_all_layers: Whether to also return all layers or just the final layer. Returns: float Tensor of shape [batch_size, seq_length, hidden_size], the final hidden layer of the Transformer. Raises: ValueError: A Tensor shape or parameter is invalid. """ if hidden_size % num_attention_heads != 0: raise ValueError( "The hidden size (%d) is not a multiple of the number of attention " "heads (%d)" % (hidden_size, num_attention_heads)) attention_head_size = int(hidden_size / num_attention_heads) input_shape = get_shape_list(input_tensor, expected_rank=3) batch_size = input_shape[0] seq_length = input_shape[1] input_width = input_shape[2] # The Transformer performs sum residuals on all layers so the input needs # to be the same as the hidden size. if input_width != hidden_size: raise ValueError("The width of the input tensor (%d) != hidden size (%d)" % (input_width, hidden_size)) # We keep the representation as a 2D tensor to avoid re-shaping it back and # forth from a 3D tensor to a 2D tensor. Re-shapes are normally free on # the GPU/CPU but may not be free on the TPU, so we want to minimize them to # help the optimizer. 
prev_output = reshape_to_matrix(input_tensor) all_layer_outputs = [] for layer_idx in range(num_hidden_layers): with tf.variable_scope("layer_%d" % layer_idx): layer_input = prev_output with tf.variable_scope("attention"): attention_heads = [] with tf.variable_scope("self"): attention_head = attention_layer( from_tensor=layer_input, to_tensor=layer_input, attention_mask=attention_mask, num_attention_heads=num_attention_heads, size_per_head=attention_head_size, attention_probs_dropout_prob=attention_probs_dropout_prob, initializer_range=initializer_range, do_return_2d_tensor=True, batch_size=batch_size, from_seq_length=seq_length, to_seq_length=seq_length) attention_heads.append(attention_head) attention_output = None if len(attention_heads) == 1: attention_output = attention_heads[0] else: # In the case where we have other sequences, we just concatenate # them to the self-attention head before the projection. attention_output = tf.concat(attention_heads, axis=-1) # Run a linear projection of `hidden_size` then add a residual # with `layer_input`. with tf.variable_scope("output"): attention_output = tf.layers.dense( attention_output, hidden_size, kernel_initializer=create_initializer(initializer_range)) attention_output = dropout(attention_output, hidden_dropout_prob) attention_output = layer_norm(attention_output + layer_input) # The activation is only applied to the "intermediate" hidden layer. with tf.variable_scope("intermediate"): intermediate_output = tf.layers.dense( attention_output, intermediate_size, activation=intermediate_act_fn, kernel_initializer=create_initializer(initializer_range)) # Down-project back to `hidden_size` then add the residual. 
      with tf.variable_scope("output"):
        layer_output = tf.layers.dense(
            intermediate_output,
            hidden_size,
            kernel_initializer=create_initializer(initializer_range))
        layer_output = dropout(layer_output, hidden_dropout_prob)
        layer_output = layer_norm(layer_output + attention_output)
        prev_output = layer_output
        all_layer_outputs.append(layer_output)

  if do_return_all_layers:
    final_outputs = []
    for layer_output in all_layer_outputs:
      final_output = reshape_from_matrix(layer_output, input_shape)
      final_outputs.append(final_output)
    return final_outputs
  else:
    final_output = reshape_from_matrix(prev_output, input_shape)
    return final_output


def get_shape_list(tensor, expected_rank=None, name=None):
  """Returns a list of the shape of tensor, preferring static dimensions.

  Args:
    tensor: A tf.Tensor object to find the shape of.
    expected_rank: (optional) int. The expected rank of `tensor`. If this is
      specified and the `tensor` has a different rank, an exception will be
      thrown.
    name: Optional name of the tensor for the error message.

  Returns:
    A list of dimensions of the shape of tensor. All static dimensions will
    be returned as python integers, and dynamic dimensions will be returned
    as tf.Tensor scalars.
  """
  if name is None:
    name = tensor.name

  if expected_rank is not None:
    assert_rank(tensor, expected_rank, name)

  shape = tensor.shape.as_list()

  non_static_indexes = []
  for (index, dim) in enumerate(shape):
    if dim is None:
      non_static_indexes.append(index)

  if not non_static_indexes:
    return shape

  dyn_shape = tf.shape(tensor)
  for index in non_static_indexes:
    shape[index] = dyn_shape[index]
  return shape


def reshape_to_matrix(input_tensor):
  """Reshapes a >= rank 2 tensor to a rank 2 tensor (i.e., a matrix)."""
  ndims = input_tensor.shape.ndims
  if ndims < 2:
    raise ValueError("Input tensor must have at least rank 2.
Shape = %s" % (input_tensor.shape)) if ndims == 2: return input_tensor width = input_tensor.shape[-1] output_tensor = tf.reshape(input_tensor, [-1, width]) return output_tensor def reshape_from_matrix(output_tensor, orig_shape_list): """Reshapes a rank 2 tensor back to its original rank >= 2 tensor.""" if len(orig_shape_list) == 2: return output_tensor output_shape = get_shape_list(output_tensor) orig_dims = orig_shape_list[0:-1] width = output_shape[-1] return tf.reshape(output_tensor, orig_dims + [width]) def assert_rank(tensor, expected_rank, name=None): """Raises an exception if the tensor rank is not of the expected rank. Args: tensor: A tf.Tensor to check the rank of. expected_rank: Python integer or list of integers, expected rank. name: Optional name of the tensor for the error message. Raises: ValueError: If the expected shape doesn't match the actual shape. """ if name is None: name = tensor.name expected_rank_dict = {} if isinstance(expected_rank, six.integer_types): expected_rank_dict[expected_rank] = True else: for x in expected_rank: expected_rank_dict[x] = True actual_rank = tensor.shape.ndims if actual_rank not in expected_rank_dict: scope_name = tf.get_variable_scope().name raise ValueError( "For the tensor `%s` in scope `%s`, the actual rank " "`%d` (shape = %s) is not equal to the expected rank `%s`" % (name, scope_name, actual_rank, str(tensor.shape), str(expected_rank))) ================================================ FILE: optimization.py ================================================ # coding=utf-8 # Copyright 2018 The Google AI Language Team Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
# You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. """Functions and classes related to optimization (weight updates).""" from __future__ import absolute_import from __future__ import division from __future__ import print_function import re import tensorflow as tf def create_optimizer(loss, init_lr, num_train_steps, num_warmup_steps, use_tpu): """Creates an optimizer training op.""" global_step = tf.train.get_or_create_global_step() learning_rate = tf.constant(value=init_lr, shape=[], dtype=tf.float32) # Implements linear decay of the learning rate. learning_rate = tf.train.polynomial_decay( learning_rate, global_step, num_train_steps, end_learning_rate=0.0, power=1.0, cycle=False) # Implements linear warmup. I.e., if global_step < num_warmup_steps, the # learning rate will be `global_step/num_warmup_steps * init_lr`. if num_warmup_steps: global_steps_int = tf.cast(global_step, tf.int32) warmup_steps_int = tf.constant(num_warmup_steps, dtype=tf.int32) global_steps_float = tf.cast(global_steps_int, tf.float32) warmup_steps_float = tf.cast(warmup_steps_int, tf.float32) warmup_percent_done = global_steps_float / warmup_steps_float warmup_learning_rate = init_lr * warmup_percent_done is_warmup = tf.cast(global_steps_int < warmup_steps_int, tf.float32) learning_rate = ( (1.0 - is_warmup) * learning_rate + is_warmup * warmup_learning_rate) # It is recommended that you use this optimizer for fine tuning, since this # is how the model was trained (note that the Adam m/v variables are NOT # loaded from init_checkpoint.) 
  optimizer = AdamWeightDecayOptimizer(
      learning_rate=learning_rate,
      weight_decay_rate=0.01,
      beta_1=0.9,
      beta_2=0.98,  # 0.98 is used ONLY for pre-training; change to 0.999 for fine-tuning.
      epsilon=1e-6,
      exclude_from_weight_decay=["LayerNorm", "layer_norm", "bias"])

  if use_tpu:
    optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)

  tvars = tf.trainable_variables()

  # Uncomment to freeze the layers below layer 9 and train only the rest:
  # tvars = find_train_variables(tvars)

  grads = tf.gradients(loss, tvars)

  # This is how the model was pre-trained.
  (grads, _) = tf.clip_by_global_norm(grads, clip_norm=1.0)

  train_op = optimizer.apply_gradients(
      zip(grads, tvars), global_step=global_step)

  # Normally the global step update is done inside of `apply_gradients`.
  # However, `AdamWeightDecayOptimizer` doesn't do this. But if you use
  # a different optimizer, you should probably take this line out.
  new_global_step = global_step + 1
  train_op = tf.group(train_op, [global_step.assign(new_global_step)])
  return train_op


def find_train_variables(tvars):
  """
  Select the variables to train: encoder layers 9 through the last layer,
  plus the prediction head and the pooler.
  :param tvars: a list of trainable variables
  :return: a new list containing only the variables to train
  """
  # Example variable names: bert/encoder/layer_21, bert/encoder/layer_9,
  # bert/encoder/layer_20/attention/output/dense/bias:0, bert/encoder/layer_20/attention/output/dense/kernel:
  tvars_result_list = []
  for var in tvars:
    if 'cls/predictions' in var.name or 'bert/pooler/dense' in var.name:  # the last few layers
      tvars_result_list.append(var)
    else:  # parameters of the second half of the network
      # re.findall returns strings, so match digits only and convert below.
      layer_number_list = re.findall(r"layer_(\d+)/", var.name)
      if layer_number_list:  # matched a layer number
        layer_number = int(layer_number_list[0])
        if layer_number >= 9:
          tvars_result_list.append(var)
  # print train variables
  for i, var_ in enumerate(tvars_result_list):
    print("####find_train_variables.i:", i, "variable name:", var_.name)
  print("####find_train_variables: length of tvars_result_list:", len(tvars_result_list))
  return tvars_result_list


class AdamWeightDecayOptimizer(tf.train.Optimizer):
  """A basic Adam optimizer
that includes "correct" L2 weight decay."""

  def __init__(self,
               learning_rate,
               weight_decay_rate=0.0,
               beta_1=0.9,
               beta_2=0.999,
               epsilon=1e-6,
               exclude_from_weight_decay=None,
               name="AdamWeightDecayOptimizer"):
    """Constructs an AdamWeightDecayOptimizer."""
    super(AdamWeightDecayOptimizer, self).__init__(False, name)

    self.learning_rate = learning_rate
    self.weight_decay_rate = weight_decay_rate
    self.beta_1 = beta_1
    self.beta_2 = beta_2
    self.epsilon = epsilon
    self.exclude_from_weight_decay = exclude_from_weight_decay

  def apply_gradients(self, grads_and_vars, global_step=None, name=None):
    """See base class."""
    assignments = []
    for (grad, param) in grads_and_vars:
      if grad is None or param is None:
        continue

      param_name = self._get_variable_name(param.name)

      m = tf.get_variable(
          name=param_name + "/adam_m",
          shape=param.shape.as_list(),
          dtype=tf.float32,
          trainable=False,
          initializer=tf.zeros_initializer())
      v = tf.get_variable(
          name=param_name + "/adam_v",
          shape=param.shape.as_list(),
          dtype=tf.float32,
          trainable=False,
          initializer=tf.zeros_initializer())

      # Standard Adam update.
      next_m = (
          tf.multiply(self.beta_1, m) + tf.multiply(1.0 - self.beta_1, grad))
      next_v = (
          tf.multiply(self.beta_2, v) + tf.multiply(1.0 - self.beta_2,
                                                    tf.square(grad)))

      update = next_m / (tf.sqrt(next_v) + self.epsilon)

      # Just adding the square of the weights to the loss function is *not*
      # the correct way of using L2 regularization/weight decay with Adam,
      # since that will interact with the m and v parameters in strange ways.
      #
      # Instead we want to decay the weights in a manner that doesn't interact
      # with the m/v parameters. This is equivalent to adding the square
      # of the weights to the loss with plain (non-momentum) SGD.
      if self._do_use_weight_decay(param_name):
        update += self.weight_decay_rate * param

      update_with_lr = self.learning_rate * update

      next_param = param - update_with_lr

      assignments.extend(
          [param.assign(next_param),
           m.assign(next_m),
           v.assign(next_v)])
    return tf.group(*assignments, name=name)

  def _do_use_weight_decay(self, param_name):
    """Whether to use L2 weight decay for `param_name`."""
    if not self.weight_decay_rate:
      return False
    if self.exclude_from_weight_decay:
      for r in self.exclude_from_weight_decay:
        if re.search(r, param_name) is not None:
          return False
    return True

  def _get_variable_name(self, param_name):
    """Get the variable name from the tensor name."""
    m = re.match("^(.*):\\d+$", param_name)
    if m is not None:
      param_name = m.group(1)
    return param_name

================================================
FILE: optimization_finetuning.py
================================================
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Functions and classes related to optimization (weight updates)."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import re
import tensorflow as tf


def create_optimizer(loss, init_lr, num_train_steps, num_warmup_steps, use_tpu):
  """Creates an optimizer training op."""
  global_step = tf.train.get_or_create_global_step()

  learning_rate = tf.constant(value=init_lr, shape=[], dtype=tf.float32)

  # Implements linear decay of the learning rate.
  learning_rate = tf.train.polynomial_decay(
      learning_rate,
      global_step,
      num_train_steps,
      end_learning_rate=0.0,
      power=1.0,
      cycle=False)

  # Implements linear warmup. I.e., if global_step < num_warmup_steps, the
  # learning rate will be `global_step/num_warmup_steps * init_lr`.
  if num_warmup_steps:
    global_steps_int = tf.cast(global_step, tf.int32)
    warmup_steps_int = tf.constant(num_warmup_steps, dtype=tf.int32)

    global_steps_float = tf.cast(global_steps_int, tf.float32)
    warmup_steps_float = tf.cast(warmup_steps_int, tf.float32)

    warmup_percent_done = global_steps_float / warmup_steps_float
    warmup_learning_rate = init_lr * warmup_percent_done

    is_warmup = tf.cast(global_steps_int < warmup_steps_int, tf.float32)
    learning_rate = (
        (1.0 - is_warmup) * learning_rate + is_warmup * warmup_learning_rate)

  # It is recommended that you use this optimizer for fine tuning, since this
  # is how the model was trained (note that the Adam m/v variables are NOT
  # loaded from init_checkpoint.)
  optimizer = AdamWeightDecayOptimizer(
      learning_rate=learning_rate,
      weight_decay_rate=0.01,
      beta_1=0.9,
      beta_2=0.999,  # 0.999 for fine-tuning; 0.98 was used ONLY for pre-training.
      epsilon=1e-6,
      exclude_from_weight_decay=["LayerNorm", "layer_norm", "bias"])

  if use_tpu:
    optimizer = tf.contrib.tpu.CrossShardOptimizer(optimizer)

  tvars = tf.trainable_variables()
  grads = tf.gradients(loss, tvars)

  # This is how the model was pre-trained.
  (grads, _) = tf.clip_by_global_norm(grads, clip_norm=1.0)

  train_op = optimizer.apply_gradients(
      zip(grads, tvars), global_step=global_step)

  # Normally the global step update is done inside of `apply_gradients`.
  # However, `AdamWeightDecayOptimizer` doesn't do this. But if you use
  # a different optimizer, you should probably take this line out.
  new_global_step = global_step + 1
  train_op = tf.group(train_op, [global_step.assign(new_global_step)])
  return train_op


class AdamWeightDecayOptimizer(tf.train.Optimizer):
  """A basic Adam optimizer that includes "correct" L2 weight decay."""

  def __init__(self,
               learning_rate,
               weight_decay_rate=0.0,
               beta_1=0.9,
               beta_2=0.999,
               epsilon=1e-6,
               exclude_from_weight_decay=None,
               name="AdamWeightDecayOptimizer"):
    """Constructs an AdamWeightDecayOptimizer."""
    super(AdamWeightDecayOptimizer, self).__init__(False, name)

    self.learning_rate = learning_rate
    self.weight_decay_rate = weight_decay_rate
    self.beta_1 = beta_1
    self.beta_2 = beta_2
    self.epsilon = epsilon
    self.exclude_from_weight_decay = exclude_from_weight_decay

  def apply_gradients(self, grads_and_vars, global_step=None, name=None):
    """See base class."""
    assignments = []
    for (grad, param) in grads_and_vars:
      if grad is None or param is None:
        continue

      param_name = self._get_variable_name(param.name)

      m = tf.get_variable(
          name=param_name + "/adam_m",
          shape=param.shape.as_list(),
          dtype=tf.float32,
          trainable=False,
          initializer=tf.zeros_initializer())
      v = tf.get_variable(
          name=param_name + "/adam_v",
          shape=param.shape.as_list(),
          dtype=tf.float32,
          trainable=False,
          initializer=tf.zeros_initializer())

      # Standard Adam update.
      next_m = (
          tf.multiply(self.beta_1, m) + tf.multiply(1.0 - self.beta_1, grad))
      next_v = (
          tf.multiply(self.beta_2, v) + tf.multiply(1.0 - self.beta_2,
                                                    tf.square(grad)))

      update = next_m / (tf.sqrt(next_v) + self.epsilon)

      # Just adding the square of the weights to the loss function is *not*
      # the correct way of using L2 regularization/weight decay with Adam,
      # since that will interact with the m and v parameters in strange ways.
      #
      # Instead we want to decay the weights in a manner that doesn't interact
      # with the m/v parameters. This is equivalent to adding the square
      # of the weights to the loss with plain (non-momentum) SGD.
      if self._do_use_weight_decay(param_name):
        update += self.weight_decay_rate * param

      update_with_lr = self.learning_rate * update

      next_param = param - update_with_lr

      assignments.extend(
          [param.assign(next_param),
           m.assign(next_m),
           v.assign(next_v)])
    return tf.group(*assignments, name=name)

  def _do_use_weight_decay(self, param_name):
    """Whether to use L2 weight decay for `param_name`."""
    if not self.weight_decay_rate:
      return False
    if self.exclude_from_weight_decay:
      for r in self.exclude_from_weight_decay:
        if re.search(r, param_name) is not None:
          return False
    return True

  def _get_variable_name(self, param_name):
    """Get the variable name from the tensor name."""
    m = re.match("^(.*):\\d+$", param_name)
    if m is not None:
      param_name = m.group(1)
    return param_name

================================================
FILE: resources/vocab.txt
================================================
[PAD] [unused1] [unused2] [unused3] [unused4] [unused5] [unused6] [unused7] [unused8] [unused9] [unused10] [unused11] [unused12] [unused13] [unused14] [unused15] [unused16] [unused17] [unused18] [unused19] [unused20] [unused21] [unused22] [unused23] [unused24] [unused25] [unused26] [unused27] [unused28] [unused29] [unused30] [unused31] [unused32] [unused33] [unused34] [unused35] [unused36] [unused37] [unused38] [unused39] [unused40] [unused41] [unused42] [unused43]
[unused44] [unused45] [unused46] [unused47] [unused48] [unused49] [unused50] [unused51] [unused52] [unused53] [unused54] [unused55] [unused56] [unused57] [unused58] [unused59] [unused60] [unused61] [unused62] [unused63] [unused64] [unused65] [unused66] [unused67] [unused68] [unused69] [unused70] [unused71] [unused72] [unused73] [unused74] [unused75] [unused76] [unused77] [unused78] [unused79] [unused80] [unused81] [unused82] [unused83] [unused84] [unused85] [unused86] [unused87] [unused88] [unused89] [unused90] [unused91] [unused92] [unused93] [unused94] [unused95] [unused96] [unused97] [unused98] [unused99] [UNK] [CLS] [SEP] [MASK] ! " # $ % & ' ( ) * + , - . / 0 1 2 3 4 5 6 7 8 9 : ; < = > ? @ [ \ ] ^ _ a b c d e f g h i j k l m n o p q r s t u v w x y z { | } ~ £ ¤ ¥ § © « ® ° ± ² ³ µ · ¹ º » ¼ × ß æ ÷ ø đ ŋ ɔ ə ɡ ʰ ˇ ˈ ˊ ˋ ˍ ː ˙ ˚ ˢ α β γ δ ε η θ ι κ λ μ ν ο π ρ ς σ τ υ φ χ ψ ω а б в г д е ж з и к л м н о п р с т у ф х ц ч ш ы ь я і ا ب ة ت د ر س ع ل م ن ه و ي ۩ ก ง น ม ย ร อ า เ ๑ ་ ღ ᄀ ᄁ ᄂ ᄃ ᄅ ᄆ ᄇ ᄈ ᄉ ᄋ ᄌ ᄎ ᄏ ᄐ ᄑ ᄒ ᅡ ᅢ ᅣ ᅥ ᅦ ᅧ ᅨ ᅩ ᅪ ᅬ ᅭ ᅮ ᅯ ᅲ ᅳ ᅴ ᅵ ᆨ ᆫ ᆯ ᆷ ᆸ ᆺ ᆻ ᆼ ᗜ ᵃ ᵉ ᵍ ᵏ ᵐ ᵒ ᵘ ‖ „ † • ‥ ‧   ‰ ′ ″ ‹ › ※ ‿ ⁄ ⁱ ⁺ ⁿ ₁ ₂ ₃ ₄ € ℃ № ™ ⅰ ⅱ ⅲ ⅳ ⅴ ← ↑ → ↓ ↔ ↗ ↘ ⇒ ∀ − ∕ ∙ √ ∞ ∟ ∠ ∣ ∥ ∩ ∮ ∶ ∼ ∽ ≈ ≒ ≡ ≤ ≥ ≦ ≧ ≪ ≫ ⊙ ⋅ ⋈ ⋯ ⌒ ① ② ③ ④ ⑤ ⑥ ⑦ ⑧ ⑨ ⑩ ⑴ ⑵ ⑶ ⑷ ⑸ ⒈ ⒉ ⒊ ⒋ ⓒ ⓔ ⓘ ─ ━ │ ┃ ┅ ┆ ┊ ┌ └ ├ ┣ ═ ║ ╚ ╞ ╠ ╭ ╮ ╯ ╰ ╱ ╳ ▂ ▃ ▅ ▇ █ ▉ ▋ ▌ ▍ ▎ ■ □ ▪ ▫ ▬ ▲ △ ▶ ► ▼ ▽ ◆ ◇ ○ ◎ ● ◕ ◠ ◢ ◤ ☀ ★ ☆ ☕ ☞ ☺ ☼ ♀ ♂ ♠ ♡ ♣ ♥ ♦ ♪ ♫ ♬ ✈ ✔ ✕ ✖ ✦ ✨ ✪ ✰ ✿ ❀ ❤ ➜ ➤ ⦿ 、。〃々〇〈〉《》「」『』【】〓〔〕〖〗〜〝〞ぁあぃいうぇえおかきくけこさしすせそたちっつてとなにぬねのはひふへほまみむめもゃやゅゆょよらりるれろわをん゜ゝァアィイゥウェエォオカキクケコサシスセソタチッツテトナニヌネノハヒフヘホマミムメモャヤュユョヨラリルレロワヲンヶ・ーヽㄅㄆㄇㄉㄋㄌㄍㄎㄏㄒㄚㄛㄞㄟㄢㄤㄥㄧㄨ ㆍ 
㈦㊣㎡㗎一丁七万丈三上下不与丐丑专且丕世丘丙业丛东丝丞丟両丢两严並丧丨个丫中丰串临丶丸丹为主丼丽举丿乂乃久么义之乌乍乎乏乐乒乓乔乖乗乘乙乜九乞也习乡书乩买乱乳乾亀亂了予争事二于亏云互五井亘亙亚些亜亞亟亡亢交亥亦产亨亩享京亭亮亲亳亵人亿什仁仃仄仅仆仇今介仍从仏仑仓仔仕他仗付仙仝仞仟代令以仨仪们仮仰仲件价任份仿企伉伊伍伎伏伐休伕众优伙会伝伞伟传伢伤伦伪伫伯估伴伶伸伺似伽佃但佇佈位低住佐佑体佔何佗佘余佚佛作佝佞佟你佢佣佤佥佩佬佯佰佳併佶佻佼使侃侄來侈例侍侏侑侖侗供依侠価侣侥侦侧侨侬侮侯侵侶侷便係促俄俊俎俏俐俑俗俘俚保俞俟俠信俨俩俪俬俭修俯俱俳俸俺俾倆倉個倌倍倏們倒倔倖倘候倚倜借倡値倦倩倪倫倬倭倶债值倾偃假偈偉偌偎偏偕做停健側偵偶偷偻偽偿傀傅傍傑傘備傚傢傣傥储傩催傭傲傳債傷傻傾僅働像僑僕僖僚僥僧僭僮僱僵價僻儀儂億儆儉儋儒儕儘償儡優儲儷儼儿兀允元兄充兆兇先光克兌免児兑兒兔兖党兜兢入內全兩八公六兮兰共兲关兴兵其具典兹养兼兽冀内円冇冈冉冊册再冏冒冕冗写军农冠冢冤冥冨冪冬冯冰冲决况冶冷冻冼冽冾净凄准凇凈凉凋凌凍减凑凛凜凝几凡凤処凪凭凯凰凱凳凶凸凹出击函凿刀刁刃分切刈刊刍刎刑划列刘则刚创初删判別刨利刪别刮到制刷券刹刺刻刽剁剂剃則剉削剋剌前剎剐剑剔剖剛剜剝剣剤剥剧剩剪副割創剷剽剿劃劇劈劉劊劍劏劑力劝办功加务劣动助努劫劭励劲劳労劵効劾势勁勃勇勉勋勐勒動勖勘務勛勝勞募勢勤勧勳勵勸勺勻勾勿匀包匆匈匍匐匕化北匙匝匠匡匣匪匮匯匱匹区医匾匿區十千卅升午卉半卍华协卑卒卓協单卖南単博卜卞卟占卡卢卤卦卧卫卮卯印危即却卵卷卸卻卿厂厄厅历厉压厌厕厘厚厝原厢厥厦厨厩厭厮厲厳去县叁参參又叉及友双反収发叔取受变叙叛叟叠叡叢口古句另叨叩只叫召叭叮可台叱史右叵叶号司叹叻叼叽吁吃各吆合吉吊吋同名后吏吐向吒吓吕吖吗君吝吞吟吠吡否吧吨吩含听吭吮启吱吳吴吵吶吸吹吻吼吽吾呀呂呃呆呈告呋呎呐呓呕呗员呛呜呢呤呦周呱呲味呵呷呸呻呼命咀咁咂咄咆咋和咎咏咐咒咔咕咖咗咘咙咚咛咣咤咦咧咨咩咪咫咬咭咯咱咲咳咸咻咽咿哀品哂哄哆哇哈哉哋哌响哎哏哐哑哒哔哗哟員哥哦哧哨哩哪哭哮哲哺哼哽唁唄唆唇唉唏唐唑唔唠唤唧唬售唯唰唱唳唷唸唾啃啄商啉啊問啓啕啖啜啞啟啡啤啥啦啧啪啫啬啮啰啱啲啵啶啷啸啻啼啾喀喂喃善喆喇喉喊喋喎喏喔喘喙喚喜喝喟喧喪喫喬單喰喱喲喳喵営喷喹喺喻喽嗅嗆嗇嗎嗑嗒嗓嗔嗖嗚嗜嗝嗟嗡嗣嗤嗦嗨嗪嗬嗯嗰嗲嗳嗶嗷嗽嘀嘅嘆嘈嘉嘌嘍嘎嘔嘖嘗嘘嘚嘛嘜嘞嘟嘢嘣嘤嘧嘩嘭嘮嘯嘰嘱嘲嘴嘶嘸嘹嘻嘿噁噌噎噓噔噗噙噜噠噢噤器噩噪噬噱噴噶噸噹噻噼嚀嚇嚎嚏嚐嚓嚕嚟嚣嚥嚨嚮嚴嚷嚼囂囉囊囍囑囔囗囚四囝回囟因囡团団囤囧囪囫园困囱囲図围囹固国图囿圃圄圆圈國圍圏園圓圖團圜土圣圧在圩圭地圳场圻圾址坂均坊坍坎坏坐坑块坚坛坝坞坟坠坡坤坦坨坪坯坳坵坷垂垃垄型垒垚垛垠垢垣垦垩垫垭垮垵埂埃埋城埔埕埗域埠埤埵執埸培基埼堀堂堃堅堆堇堑堕堙堡堤堪堯堰報場堵堺堿塊塌塑塔塗塘塚塞塢塩填塬塭塵塾墀境墅墉墊墒墓増墘墙墜增墟墨墩墮墳墻墾壁壅壆壇壊壑壓壕壘壞壟壢壤壩士壬壮壯声売壳壶壹壺壽处备変复夏夔夕外夙多夜够夠夢夥大天太夫夭央夯失头夷夸夹夺夾奂奄奇奈奉奋奎奏奐契奔奕奖套奘奚奠奢奥奧奪奬奮女奴奶奸她好如妃妄妆妇妈妊妍妒妓妖妘妙妝妞妣妤妥妨妩妪妮妲妳妹妻妾姆姉姊始姍姐姑姒姓委姗姚姜姝姣姥姦姨姪姫姬姹姻姿威娃娄娅娆娇娉娑娓娘娛娜娟娠娣娥娩娱娲娴娶娼婀婁婆婉婊婕婚婢婦婧婪婭婴婵婶婷婺婿媒媚媛媞媧媲媳媽媾嫁嫂嫉嫌嫑嫔嫖嫘嫚嫡嫣嫦嫩嫲嫵嫻嬅嬉嬌嬗嬛嬢嬤嬪嬰嬴嬷嬸嬿孀孃子孑孔孕孖字存孙孚孛孜孝孟孢季孤学孩孪孫孬孰孱孳孵學孺孽孿宁它宅宇守安宋完宏宓宕宗官宙定宛宜宝实実宠审客宣室宥宦宪宫宮宰害宴宵家宸容宽宾宿寂寄寅密寇富寐寒寓寛寝寞察寡寢寥實寧寨審寫寬寮寰寵寶寸对寺寻导対寿封専射将將專尉尊尋對導小少尔尕尖尘尚尝尤尧尬就尴尷尸尹尺尻尼尽尾尿局屁层屄居屆屈屉届屋屌屍屎屏屐屑展屜属屠屡屢層履屬屯山屹屿岀岁岂岌岐岑岔岖岗岘岙岚岛岡岩岫岬岭岱岳岷岸峇峋峒峙峡峤峥峦峨峪峭峯峰峴島峻峽崁崂崆崇崎崑崔崖崗崙崛崧崩崭崴崽嵇嵊嵋嵌嵐嵘嵩嵬嵯嶂嶄嶇嶋嶙嶺嶼嶽巅巍巒巔巖川州巡巢工左巧巨巩巫差己已巳巴巷巻巽巾巿币市布帅帆师希帐帑帕帖帘帚帛帜帝帥带帧師席帮帯帰帳帶帷常帼帽幀幂幄幅幌幔幕幟幡幢幣幫干平年并幸幹幺幻幼幽幾广庁広庄庆庇床序庐库应底庖店庙庚府庞废庠度座庫庭庵庶康庸庹庾廁廂廃廈廉廊廓廖廚廝廟廠廢廣廬廳延廷建廿开弁异弃弄弈弊弋式弑弒弓弔引弗弘弛弟张弥弦弧弩弭弯弱張強弹强弼弾彅彆彈彌彎归当录彗彙彝形彤彥彦彧彩彪彫彬彭彰影彷役彻彼彿往征径待徇很徉徊律後徐徑徒従徕得徘徙徜從徠御徨復循徬微徳徴徵德徹徼徽心必忆忌忍忏忐忑忒忖志忘忙応忠忡忤忧忪快忱念忻忽忿怀态怂怅怆怎怏怒怔怕怖怙怜思怠怡急怦性怨怪怯怵总怼恁恃恆恋恍恐恒恕恙恚恢恣恤恥恨恩恪恫恬恭息恰恳恵恶恸恺恻恼恿悄悅悉悌悍悔悖悚悟悠患悦您悩悪悬悯悱悲悴悵悶悸悻悼悽情惆惇惊惋惑惕惘惚
惜惟惠惡惦惧惨惩惫惬惭惮惯惰惱想惴惶惹惺愁愆愈愉愍意愕愚愛愜感愣愤愧愫愷愿慄慈態慌慎慑慕慘慚慟慢慣慧慨慫慮慰慳慵慶慷慾憂憊憋憎憐憑憔憚憤憧憨憩憫憬憲憶憾懂懇懈應懊懋懑懒懦懲懵懶懷懸懺懼懾懿戀戈戊戌戍戎戏成我戒戕或战戚戛戟戡戦截戬戮戰戲戳戴戶户戸戻戾房所扁扇扈扉手才扎扑扒打扔払托扛扣扦执扩扪扫扬扭扮扯扰扱扳扶批扼找承技抄抉把抑抒抓投抖抗折抚抛抜択抟抠抡抢护报抨披抬抱抵抹押抽抿拂拄担拆拇拈拉拋拌拍拎拐拒拓拔拖拗拘拙拚招拜拟拡拢拣拥拦拧拨择括拭拮拯拱拳拴拷拼拽拾拿持挂指挈按挎挑挖挙挚挛挝挞挟挠挡挣挤挥挨挪挫振挲挹挺挽挾捂捅捆捉捋捌捍捎捏捐捕捞损捡换捣捧捨捩据捱捲捶捷捺捻掀掂掃掇授掉掌掏掐排掖掘掙掛掠採探掣接控推掩措掬掰掲掳掴掷掸掺揀揃揄揆揉揍描提插揖揚換握揣揩揪揭揮援揶揸揹揽搀搁搂搅損搏搐搓搔搖搗搜搞搡搪搬搭搵搶携搽摀摁摄摆摇摈摊摒摔摘摞摟摧摩摯摳摸摹摺摻撂撃撅撇撈撐撑撒撓撕撚撞撤撥撩撫撬播撮撰撲撵撷撸撻撼撿擀擁擂擄擅擇擊擋操擎擒擔擘據擞擠擡擢擦擬擰擱擲擴擷擺擼擾攀攏攒攔攘攙攜攝攞攢攣攤攥攪攫攬支收攸改攻放政故效敌敍敎敏救敕敖敗敘教敛敝敞敢散敦敬数敲整敵敷數斂斃文斋斌斎斐斑斓斗料斛斜斟斡斤斥斧斩斫斬断斯新斷方於施旁旃旅旋旌旎族旖旗无既日旦旧旨早旬旭旮旱时旷旺旻昀昂昆昇昉昊昌明昏易昔昕昙星映春昧昨昭是昱昴昵昶昼显晁時晃晉晋晌晏晒晓晔晕晖晗晚晝晞晟晤晦晨晩普景晰晴晶晷智晾暂暄暇暈暉暌暐暑暖暗暝暢暧暨暫暮暱暴暸暹曄曆曇曉曖曙曜曝曠曦曬曰曲曳更書曹曼曾替最會月有朋服朐朔朕朗望朝期朦朧木未末本札朮术朱朴朵机朽杀杂权杆杈杉李杏材村杓杖杜杞束杠条来杨杭杯杰東杳杵杷杼松板极构枇枉枋析枕林枚果枝枢枣枪枫枭枯枰枱枳架枷枸柄柏某柑柒染柔柘柚柜柞柠柢查柩柬柯柱柳柴柵査柿栀栃栄栅标栈栉栋栎栏树栓栖栗校栩株样核根格栽栾桀桁桂桃桅框案桉桌桎桐桑桓桔桜桠桡桢档桥桦桧桨桩桶桿梁梅梆梏梓梗條梟梢梦梧梨梭梯械梳梵梶检棂棄棉棋棍棒棕棗棘棚棟棠棣棧森棱棲棵棹棺椁椅椋植椎椒検椪椭椰椹椽椿楂楊楓楔楚楝楞楠楣楨楫業楮極楷楸楹楼楽概榄榆榈榉榔榕榖榛榜榨榫榭榮榱榴榷榻槁槃構槌槍槎槐槓様槛槟槤槭槲槳槻槽槿樁樂樊樑樓標樞樟模樣権横樫樯樱樵樸樹樺樽樾橄橇橋橐橘橙機橡橢橫橱橹橼檀檄檎檐檔檗檜檢檬檯檳檸檻櫃櫚櫛櫥櫸櫻欄權欒欖欠次欢欣欧欲欸欺欽款歆歇歉歌歎歐歓歙歛歡止正此步武歧歩歪歯歲歳歴歷歸歹死歼殁殃殆殇殉殊残殒殓殖殘殞殡殤殭殯殲殴段殷殺殼殿毀毁毂毅毆毋母毎每毒毓比毕毗毘毙毛毡毫毯毽氈氏氐民氓气氖気氙氛氟氡氢氣氤氦氧氨氪氫氮氯氰氲水氷永氹氾汀汁求汆汇汉汎汐汕汗汙汛汝汞江池污汤汨汩汪汰汲汴汶汹決汽汾沁沂沃沅沈沉沌沏沐沒沓沖沙沛沟没沢沣沥沦沧沪沫沭沮沱河沸油治沼沽沾沿況泄泉泊泌泓法泗泛泞泠泡波泣泥注泪泫泮泯泰泱泳泵泷泸泻泼泽泾洁洄洋洒洗洙洛洞津洩洪洮洱洲洵洶洸洹活洼洽派流浃浄浅浆浇浊测济浏浑浒浓浔浙浚浜浣浦浩浪浬浮浯浴海浸涂涅涇消涉涌涎涓涔涕涙涛涝涞涟涠涡涣涤润涧涨涩涪涮涯液涵涸涼涿淀淄淅淆淇淋淌淑淒淖淘淙淚淞淡淤淦淨淩淪淫淬淮深淳淵混淹淺添淼清済渉渊渋渍渎渐渔渗渙渚減渝渠渡渣渤渥渦温測渭港渲渴游渺渾湃湄湊湍湖湘湛湟湧湫湮湯湳湾湿満溃溅溉溏源準溜溝溟溢溥溧溪溫溯溱溴溶溺溼滁滂滄滅滇滋滌滑滓滔滕滙滚滝滞滟满滢滤滥滦滨滩滬滯滲滴滷滸滾滿漁漂漆漉漏漓演漕漠漢漣漩漪漫漬漯漱漲漳漸漾漿潆潇潋潍潑潔潘潛潜潞潟潢潤潦潧潭潮潰潴潸潺潼澀澄澆澈澍澎澗澜澡澤澧澱澳澹激濁濂濃濑濒濕濘濛濟濠濡濤濫濬濮濯濱濺濾瀅瀆瀉瀋瀏瀑瀕瀘瀚瀛瀝瀞瀟瀧瀨瀬瀰瀾灌灏灑灘灝灞灣火灬灭灯灰灵灶灸灼災灾灿炀炁炅炉炊炎炒炔炕炖炙炜炫炬炭炮炯炳炷炸点為炼炽烁烂烃烈烊烏烘烙烛烟烤烦烧烨烩烫烬热烯烷烹烽焉焊焕焖焗焘焙焚焜無焦焯焰焱然焼煅煉煊煌煎煒煖煙煜煞煤煥煦照煨煩煮煲煸煽熄熊熏熒熔熙熟熠熨熬熱熵熹熾燁燃燄燈燉燊燎燒燔燕燙燜營燥燦燧燭燮燴燻燼燿爆爍爐爛爪爬爭爰爱爲爵父爷爸爹爺爻爽爾牆片版牌牍牒牙牛牝牟牠牡牢牦牧物牯牲牴牵特牺牽犀犁犄犊犍犒犢犧犬犯状犷犸犹狀狂狄狈狎狐狒狗狙狞狠狡狩独狭狮狰狱狸狹狼狽猎猕猖猗猙猛猜猝猥猩猪猫猬献猴猶猷猾猿獄獅獎獐獒獗獠獣獨獭獰獲獵獷獸獺獻獼獾玄率玉王玑玖玛玟玠玥玩玫玮环现玲玳玷玺玻珀珂珅珈珉珊珍珏珐珑珙珞珠珣珥珩珪班珮珲珺現球琅理琇琉琊琍琏琐琛琢琥琦琨琪琬琮琰琲琳琴琵琶琺琼瑀瑁瑄瑋瑕瑗瑙瑚瑛瑜瑞瑟瑠瑣瑤瑩瑪瑯瑰瑶瑾璀璁璃璇璉璋璎璐璜璞璟璧璨環璽璿瓊瓏瓒瓜瓢瓣瓤瓦瓮瓯瓴瓶瓷甄甌甕甘甙甚甜生產産甥甦用甩甫甬甭甯田由甲申电男甸町画甾畀畅界畏畑畔留畜畝畢略畦番畫異畲畳畴當畸畹畿疆疇疊疏疑疔疖疗疙疚疝疟疡疣疤疥疫疮疯疱疲疳疵疸疹疼疽疾痂病症痈痉痊痍痒痔痕痘痙痛痞痠痢痣痤痧痨痪痫痰痱痴痹痺痼痿瘀瘁瘋瘍瘓瘘瘙瘟瘠瘡瘢瘤瘦瘧瘩瘪瘫瘴瘸瘾療癇癌癒癖癜癞癡癢癣癥癫癬癮癱癲癸発登發白百皂的皆皇皈皋皎皑皓皖皙皚皮皰皱皴皺皿盂盃盅盆盈益盎盏盐监盒盔
盖盗盘盛盜盞盟盡監盤盥盧盪目盯盱盲直相盹盼盾省眈眉看県眙眞真眠眦眨眩眯眶眷眸眺眼眾着睁睇睏睐睑睛睜睞睡睢督睥睦睨睪睫睬睹睽睾睿瞄瞅瞇瞋瞌瞎瞑瞒瞓瞞瞟瞠瞥瞧瞩瞪瞬瞭瞰瞳瞻瞼瞿矇矍矗矚矛矜矢矣知矩矫短矮矯石矶矽矾矿码砂砌砍砒研砖砗砚砝砣砥砧砭砰砲破砷砸砺砼砾础硅硐硒硕硝硫硬确硯硼碁碇碉碌碍碎碑碓碗碘碚碛碟碣碧碩碰碱碳碴確碼碾磁磅磊磋磐磕磚磡磨磬磯磲磷磺礁礎礙礡礦礪礫礴示礼社祀祁祂祇祈祉祎祐祕祖祗祚祛祜祝神祟祠祢祥票祭祯祷祸祺祿禀禁禄禅禍禎福禛禦禧禪禮禱禹禺离禽禾禿秀私秃秆秉秋种科秒秘租秣秤秦秧秩秭积称秸移秽稀稅程稍税稔稗稚稜稞稟稠稣種稱稲稳稷稹稻稼稽稿穀穂穆穌積穎穗穢穩穫穴究穷穹空穿突窃窄窈窍窑窒窓窕窖窗窘窜窝窟窠窥窦窨窩窪窮窯窺窿竄竅竇竊立竖站竜竞竟章竣童竭端競竹竺竽竿笃笆笈笋笏笑笔笙笛笞笠符笨第笹笺笼筆等筊筋筍筏筐筑筒答策筛筝筠筱筲筵筷筹签简箇箋箍箏箐箔箕算箝管箩箫箭箱箴箸節篁範篆篇築篑篓篙篝篠篡篤篩篪篮篱篷簇簌簍簡簦簧簪簫簷簸簽簾簿籁籃籌籍籐籟籠籤籬籮籲米类籼籽粄粉粑粒粕粗粘粟粤粥粧粪粮粱粲粳粵粹粼粽精粿糅糊糍糕糖糗糙糜糞糟糠糧糬糯糰糸系糾紀紂約紅紉紊紋納紐紓純紗紘紙級紛紜素紡索紧紫紮累細紳紹紺終絃組絆経結絕絞絡絢給絨絮統絲絳絵絶絹綁綏綑經継続綜綠綢綦綫綬維綱網綴綵綸綺綻綽綾綿緊緋総緑緒緘線緝緞締緣編緩緬緯練緹緻縁縄縈縛縝縣縫縮縱縴縷總績繁繃繆繇繋織繕繚繞繡繩繪繫繭繳繹繼繽纂續纍纏纓纔纖纜纠红纣纤约级纨纪纫纬纭纯纰纱纲纳纵纶纷纸纹纺纽纾线绀练组绅细织终绊绍绎经绑绒结绔绕绘给绚绛络绝绞统绡绢绣绥绦继绩绪绫续绮绯绰绳维绵绶绷绸绻综绽绾绿缀缄缅缆缇缈缉缎缓缔缕编缘缙缚缜缝缠缢缤缥缨缩缪缭缮缰缱缴缸缺缽罂罄罌罐网罔罕罗罚罡罢罩罪置罰署罵罷罹羁羅羈羊羌美羔羚羞羟羡羣群羥羧羨義羯羲羸羹羽羿翁翅翊翌翎習翔翘翟翠翡翦翩翰翱翳翹翻翼耀老考耄者耆耋而耍耐耒耕耗耘耙耦耨耳耶耷耸耻耽耿聂聆聊聋职聒联聖聘聚聞聪聯聰聲聳聴聶職聽聾聿肃肄肅肆肇肉肋肌肏肓肖肘肚肛肝肠股肢肤肥肩肪肮肯肱育肴肺肽肾肿胀胁胃胄胆背胍胎胖胚胛胜胝胞胡胤胥胧胫胭胯胰胱胳胴胶胸胺能脂脅脆脇脈脉脊脍脏脐脑脓脖脘脚脛脣脩脫脯脱脲脳脸脹脾腆腈腊腋腌腎腐腑腓腔腕腥腦腩腫腭腮腰腱腳腴腸腹腺腻腼腾腿膀膈膊膏膑膘膚膛膜膝膠膦膨膩膳膺膻膽膾膿臀臂臃臆臉臊臍臓臘臟臣臥臧臨自臬臭至致臺臻臼臾舀舂舅舆與興舉舊舌舍舎舐舒舔舖舗舛舜舞舟航舫般舰舱舵舶舷舸船舺舾艇艋艘艙艦艮良艰艱色艳艷艹艺艾节芃芈芊芋芍芎芒芙芜芝芡芥芦芩芪芫芬芭芮芯花芳芷芸芹芻芽芾苁苄苇苋苍苏苑苒苓苔苕苗苛苜苞苟苡苣若苦苫苯英苷苹苻茁茂范茄茅茉茎茏茗茜茧茨茫茬茭茯茱茲茴茵茶茸茹茼荀荃荆草荊荏荐荒荔荖荘荚荞荟荠荡荣荤荥荧荨荪荫药荳荷荸荻荼荽莅莆莉莊莎莒莓莖莘莞莠莢莧莪莫莱莲莴获莹莺莽莿菀菁菅菇菈菊菌菏菓菖菘菜菟菠菡菩華菱菲菸菽萁萃萄萊萋萌萍萎萘萝萤营萦萧萨萩萬萱萵萸萼落葆葉著葚葛葡董葦葩葫葬葭葯葱葳葵葷葺蒂蒋蒐蒔蒙蒜蒞蒟蒡蒨蒲蒸蒹蒻蒼蒿蓁蓄蓆蓉蓋蓑蓓蓖蓝蓟蓦蓬蓮蓼蓿蔑蔓蔔蔗蔘蔚蔡蔣蔥蔫蔬蔭蔵蔷蔺蔻蔼蔽蕁蕃蕈蕉蕊蕎蕙蕤蕨蕩蕪蕭蕲蕴蕻蕾薄薅薇薈薊薏薑薔薙薛薦薨薩薪薬薯薰薹藉藍藏藐藓藕藜藝藤藥藩藹藻藿蘆蘇蘊蘋蘑蘚蘭蘸蘼蘿虎虏虐虑虔處虚虛虜虞號虢虧虫虬虱虹虻虽虾蚀蚁蚂蚊蚌蚓蚕蚜蚝蚣蚤蚩蚪蚯蚱蚵蛀蛆蛇蛊蛋蛎蛐蛔蛙蛛蛟蛤蛭蛮蛰蛳蛹蛻蛾蜀蜂蜃蜆蜇蜈蜊蜍蜒蜓蜕蜗蜘蜚蜜蜡蜢蜥蜱蜴蜷蜻蜿蝇蝈蝉蝌蝎蝕蝗蝙蝟蝠蝦蝨蝴蝶蝸蝼螂螃融螞螢螨螯螳螺蟀蟄蟆蟋蟎蟑蟒蟠蟬蟲蟹蟻蟾蠅蠍蠔蠕蠛蠟蠡蠢蠣蠱蠶蠹蠻血衄衅衆行衍術衔街衙衛衝衞衡衢衣补表衩衫衬衮衰衲衷衹衾衿袁袂袄袅袈袋袍袒袖袜袞袤袪被袭袱裁裂装裆裊裏裔裕裘裙補裝裟裡裤裨裱裳裴裸裹製裾褂複褐褒褓褔褚褥褪褫褲褶褻襁襄襟襠襪襬襯襲西要覃覆覇見規覓視覚覦覧親覬観覷覺覽觀见观规觅视览觉觊觎觐觑角觞解觥触觸言訂計訊討訓訕訖託記訛訝訟訣訥訪設許訳訴訶診註証詆詐詔評詛詞詠詡詢詣試詩詫詬詭詮詰話該詳詹詼誅誇誉誌認誓誕誘語誠誡誣誤誥誦誨說説読誰課誹誼調諄談請諏諒論諗諜諡諦諧諫諭諮諱諳諷諸諺諾謀謁謂謄謊謎謐謔謗謙講謝謠謨謬謹謾譁證譎譏識譙譚譜警譬譯議譲譴護譽讀變讓讚讞计订认讥讧讨让讪讫训议讯记讲讳讴讶讷许讹论讼讽设访诀证诃评诅识诈诉诊诋词诏译试诗诘诙诚诛话诞诟诠诡询诣诤该详诧诩诫诬语误诰诱诲说诵诶请诸诺读诽课诿谀谁调谄谅谆谈谊谋谌谍谎谏谐谑谒谓谔谕谗谘谙谚谛谜谟谢谣谤谥谦谧谨谩谪谬谭谯谱谲谴谶谷豁豆豇豈豉豊豌豎豐豔豚象豢豪豫豬豹豺貂貅貌貓貔貘貝貞負財貢貧貨販貪貫責貯貰貳貴貶買貸費貼貽貿賀賁賂賃賄資賈賊賑賓賜賞賠賡賢賣賤賦質賬賭賴賺購賽贅贈贊贍贏贓贖贛贝贞负贡财责贤败账货质贩贪贫贬购贮贯贰贱贲贴贵贷贸费贺贻贼贾贿赁赂赃资赅赈赊赋赌赎赏赐赓赔赖赘赚赛赝赞赠赡赢赣赤赦赧赫赭走赳赴赵赶起趁超越趋趕趙趟趣趨足趴趵趸趺趾跃跄跆跋跌跎跑跖跚跛距跟跡跤跨跩跪路跳践跷跹
跺跻踉踊踌踏踐踝踞踟踢踩踪踮踱踴踵踹蹂蹄蹇蹈蹉蹊蹋蹑蹒蹙蹟蹣蹤蹦蹩蹬蹭蹲蹴蹶蹺蹼蹿躁躇躉躊躋躍躏躪身躬躯躲躺軀車軋軌軍軒軟転軸軼軽軾較載輒輓輔輕輛輝輟輩輪輯輸輻輾輿轄轅轆轉轍轎轟车轧轨轩转轭轮软轰轲轴轶轻轼载轿较辄辅辆辇辈辉辊辍辐辑输辕辖辗辘辙辛辜辞辟辣辦辨辩辫辭辮辯辰辱農边辺辻込辽达迁迂迄迅过迈迎运近返还这进远违连迟迢迤迥迦迩迪迫迭述迴迷迸迹迺追退送适逃逅逆选逊逍透逐递途逕逗這通逛逝逞速造逢連逮週進逵逶逸逻逼逾遁遂遅遇遊運遍過遏遐遑遒道達違遗遙遛遜遞遠遢遣遥遨適遭遮遲遴遵遶遷選遺遼遽避邀邁邂邃還邇邈邊邋邏邑邓邕邛邝邢那邦邨邪邬邮邯邰邱邳邵邸邹邺邻郁郅郊郎郑郜郝郡郢郤郦郧部郫郭郴郵郷郸都鄂鄉鄒鄔鄙鄞鄢鄧鄭鄰鄱鄲鄺酉酊酋酌配酐酒酗酚酝酢酣酥酩酪酬酮酯酰酱酵酶酷酸酿醃醇醉醋醍醐醒醚醛醜醞醣醪醫醬醮醯醴醺釀釁采釉释釋里重野量釐金釗釘釜針釣釦釧釵鈀鈉鈍鈎鈔鈕鈞鈣鈦鈪鈴鈺鈾鉀鉄鉅鉉鉑鉗鉚鉛鉤鉴鉻銀銃銅銑銓銖銘銜銬銭銮銳銷銹鋁鋅鋒鋤鋪鋰鋸鋼錄錐錘錚錠錢錦錨錫錮錯録錳錶鍊鍋鍍鍛鍥鍰鍵鍺鍾鎂鎊鎌鎏鎔鎖鎗鎚鎧鎬鎮鎳鏈鏖鏗鏘鏞鏟鏡鏢鏤鏽鐘鐮鐲鐳鐵鐸鐺鑄鑊鑑鑒鑣鑫鑰鑲鑼鑽鑾鑿针钉钊钎钏钒钓钗钙钛钜钝钞钟钠钡钢钣钤钥钦钧钨钩钮钯钰钱钳钴钵钺钻钼钾钿铀铁铂铃铄铅铆铉铎铐铛铜铝铠铡铢铣铤铨铩铬铭铮铰铲铵银铸铺链铿销锁锂锄锅锆锈锉锋锌锏锐锑错锚锟锡锢锣锤锥锦锭键锯锰锲锵锹锺锻镀镁镂镇镉镌镍镐镑镕镖镗镛镜镣镭镯镰镳镶長长門閃閉開閎閏閑閒間閔閘閡関閣閥閨閩閱閲閹閻閾闆闇闊闌闍闔闕闖闘關闡闢门闪闫闭问闯闰闲间闵闷闸闹闺闻闽闾阀阁阂阅阆阇阈阉阎阐阑阔阕阖阙阚阜队阡阪阮阱防阳阴阵阶阻阿陀陂附际陆陇陈陋陌降限陕陛陝陞陟陡院陣除陨险陪陰陲陳陵陶陷陸険陽隅隆隈隊隋隍階随隐隔隕隘隙際障隠隣隧隨險隱隴隶隸隻隼隽难雀雁雄雅集雇雉雋雌雍雎雏雑雒雕雖雙雛雜雞離難雨雪雯雰雲雳零雷雹電雾需霁霄霆震霈霉霊霍霎霏霑霓霖霜霞霧霭霰露霸霹霽霾靂靄靈青靓靖静靚靛靜非靠靡面靥靦革靳靴靶靼鞅鞋鞍鞏鞑鞘鞠鞣鞦鞭韆韋韌韓韜韦韧韩韬韭音韵韶韻響頁頂頃項順須頌預頑頒頓頗領頜頡頤頫頭頰頷頸頹頻頼顆題額顎顏顔願顛類顧顫顯顱顴页顶顷项顺须顼顽顾顿颁颂预颅领颇颈颉颊颌颍颐频颓颔颖颗题颚颛颜额颞颠颡颢颤颦颧風颯颱颳颶颼飄飆风飒飓飕飘飙飚飛飞食飢飨飩飪飯飲飼飽飾餃餅餉養餌餐餒餓餘餚餛餞餡館餮餵餾饅饈饋饌饍饑饒饕饗饞饥饨饪饬饭饮饯饰饱饲饴饵饶饷饺饼饽饿馀馁馄馅馆馈馋馍馏馒馔首馗香馥馨馬馭馮馳馴駁駄駅駆駐駒駕駛駝駭駱駿騁騎騏験騙騨騰騷驀驅驊驍驒驕驗驚驛驟驢驥马驭驮驯驰驱驳驴驶驷驸驹驻驼驾驿骁骂骄骅骆骇骈骊骋验骏骐骑骗骚骛骜骞骠骡骤骥骧骨骯骰骶骷骸骼髂髅髋髏髒髓體髖高髦髪髮髯髻鬃鬆鬍鬓鬚鬟鬢鬣鬥鬧鬱鬼魁魂魄魅魇魍魏魔魘魚魯魷鮑鮨鮪鮭鮮鯉鯊鯖鯛鯨鯰鯽鰍鰓鰭鰲鰻鰾鱈鱉鱔鱗鱷鱸鱼鱿鲁鲈鲍鲑鲛鲜鲟鲢鲤鲨鲫鲱鲲鲶鲷鲸鳃鳄鳅鳌鳍鳕鳖鳗鳝鳞鳥鳩鳳鳴鳶鴉鴕鴛鴦鴨鴻鴿鵑鵜鵝鵡鵬鵰鵲鶘鶩鶯鶴鷗鷲鷹鷺鸚鸞鸟鸠鸡鸢鸣鸥鸦鸨鸪鸭鸯鸳鸵鸽鸾鸿鹂鹃鹄鹅鹈鹉鹊鹌鹏鹑鹕鹘鹜鹞鹤鹦鹧鹫鹭鹰鹳鹵鹹鹼鹽鹿麂麋麒麓麗麝麟麥麦麩麴麵麸麺麻麼麽麾黃黄黍黎黏黑黒黔默黛黜黝點黠黨黯黴鼋鼎鼐鼓鼠鼬鼹鼻鼾齁齊齋齐齒齡齢齣齦齿龄龅龈龊龋龌龍龐龔龕龙龚龛龜龟︰︱︶︿﹁﹂﹍﹏ ﹐ ﹑ ﹒ ﹔ ﹕ ﹖ ﹗ ﹙ ﹚ ﹝ ﹞ ﹡ ﹣ ！＂＃＄％＆＇（）＊＋，－．／０１２３４５６７８９：；＜＝＞？＠［＼］＾＿｀ａｂｃｄｅｆｇｈｉｊｋｌｍｎｏｐｑｒｓｔｕｖｗｘｙｚ｛｜｝～｡｢｣､･ｯｰｲｸｼｽﾄﾉﾌﾗﾙﾝﾞﾟ￣￥ 👍 🔥 😂 😎 ... 
yam 10 2017 12 11 2016 20 30 15 06 lofter ##s 2015 by 16 14 18 13 24 17 2014 21 ##0 22 19 25 23 com 100 00 05 2013 ##a 03 09 08 28 ##2 50 01 04 ##1 27 02 2012 ##3 26 ##e 07 ##8 ##5 ##6 ##4 ##9 ##7 29 2011 40 ##t 2010 ##o ##d ##i 2009 ##n app www the ##m 31 ##c ##l ##y ##r ##g 2008 60 http 200 qq ##p 80 ##f google pixnet 90 cookies tripadvisor 500 ##er ##k 35 ##h facebook 2007 2000 70 ##b of ##x ##u 45 300 iphone 32 1000 2006 48 ip 36 in 38 3d ##w ##ing 55 ctrip ##on ##v 33 ##の to 34 400 id 2005 it 37 windows llc top 99 42 39 000 led at ##an 41 51 52 46 49 43 53 44 ##z android 58 and 59 2004 56 vr ##か 5000 2003 47 blogthis twitter 54 ##le 150 ok 2018 57 75 cn no ios ##in ##mm ##00 800 on te 3000 65 2001 360 95 ig lv 120 ##ng ##を ##us ##に pc てす ── 600 ##te 85 2002 88 ##ed html ncc wifi email 64 blog is ##10 ##て mail online ##al dvd ##ic studio ##は ##℃ ##ia ##と line vip 72 ##q 98 ##ce ##en for ##is ##ra ##es ##j usb net cp 1999 asia 4g ##cm diy new 3c ##お ta 66 language vs apple tw 86 web ##ne ipad 62 you ##re 101 68 ##tion ps de bt pony atm ##2017 1998 67 ##ch ceo ##or go ##na av pro cafe 96 pinterest 97 63 pixstyleme3c ##ta more said ##2016 1997 mp3 700 ##ll nba jun ##20 92 tv 1995 pm 61 76 nbsp 250 ##ie linux ##ma cd 110 hd ##17 78 ##ion 77 6000 am ##th ##st 94 ##se ##et 69 180 gdp my 105 81 abc 89 flash 79 one 93 1990 1996 ##ck gps ##も ##ly web885 106 2020 91 ##ge 4000 1500 xd boss isbn 1994 org ##ry me love ##11 0fork 73 ##12 3g ##ter ##ar 71 82 ##la hotel 130 1970 pk 83 87 140 ie ##os ##30 ##el 74 ##50 seo cpu ##ml p2p 84 may ##る sun tue internet cc posted youtube ##at ##ン ##man ii ##ル ##15 abs nt pdf yahoo ago 1980 ##it news mac 104 ##てす ##me ##り java 1992 spa ##de ##nt hk all plus la 1993 ##mb ##16 ##ve west ##da 160 air ##い ##ps から ##to 1989 logo htc php https fi momo ##son sat ##ke ##80 ebd suv wi day apk ##88 ##um mv galaxy wiki or brake ##ス 1200 する this 1991 mon ##こ ❤2017 po ##ない javascript life home june ##ss system 900 ##ー ##０ pp 1988 world fb 4k br ##as 
ic ai leonardo safari ##60 live free xx wed win7 kiehl ##co lg o2o ##go us 235 1949 mm しい vfm kanye ##90 ##2015 ##id jr ##ey 123 rss ##sa ##ro ##am ##no thu fri 350 ##sh ##ki 103 comments name ##のて ##pe ##ine max 1987 8000 uber ##mi ##ton wordpress office 1986 1985 ##ment 107 bd win10 ##ld ##li gmail bb dior ##rs ##ri ##rd ##ます up cad ##® dr して read ##21 をお ##io ##99 url 1984 pvc paypal show policy ##40 ##ty ##18 with ##★ ##01 txt 102 ##ba dna from post mini ar taiwan john ##ga privacy agoda ##13 ##ny word ##24 ##22 ##by ##ur ##hz 1982 ##ang 265 cookie netscape 108 ##ka ##～ ##ad house share note ibm code hello nike sim survey ##016 1979 1950 wikia ##32 ##017 5g cbc ##tor ##kg 1983 ##rt ##14 campaign store 2500 os ##ct ##ts ##° 170 api ##ns 365 excel ##な ##ao ##ら ##し～～ ##nd university 163 には 518 ##70 ##ya ##il ##25 pierre ipo 0020 897 ##23 hotels ##ian のお 125 years 6606 ##ers ##26 high ##day time ##ay bug ##line ##く ##す ##be xp talk2yam yamservice 10000 coco ##dy sony ##ies 1978 microsoft david people ##ha 1960 instagram intel その ##ot iso 1981 ##va 115 ##mo ##land xxx man co ltxsw ##ation baby 220 ##pa ##ol 1945 7000 tag 450 ##ue msn ##31 oppo ##ト ##ca control ##om st chrome ##ure ##ん be ##き lol ##19 した ##bo 240 lady ##100 ##way ##から 4600 ##ko ##do ##un 4s corporation 168 ##ni herme ##28 ｃｐ 978 ##up ##06 ui ##ds ppt admin three します bbc re 128 ##48 ca ##015 ##35 hp ##ee tpp ##た ##ive ×× root ##cc ##ました ##ble ##ity adobe park 114 et oled city ##ex ##ler ##ap china ##book 20000 view ##ice global ##km your hong ##mg out ##ms ng ebay ##29 menu ubuntu ##cy rom ##view open ktv do server ##lo if english ##ね ##５ ##oo 1600 ##02 step1 kong club 135 july inc 1976 mr hi ##net touch ##ls ##ii michael lcd ##05 ##33 phone james step2 1300 ios9 ##box dc ##２ ##ley samsung 111 280 pokemon css ##ent ##les いいえ ##１ s8 atom play bmw ##said sa etf ctrl ♥yoyo♥ ##55 2025 ##2014 ##66 adidas amazon 1958 ##ber ##ner visa ##77 ##der 1800 connectivity ##hi firefox 109 118 hr so style mark pop ol 
skip 1975 as ##27 ##ir ##61 190 mba ##う ##ai le ##ver 1900 cafe2017 lte super 113 129 ##ron amd like ##☆ are ##ster we ##sk paul data international ##ft longchamp ssd good ##ート ##ti reply ##my ↓↓↓ apr star ##ker source 136 js 112 get force photo ##one 126 ##2013 ##ow link bbs 1972 goods ##lin python 119 ##ip game ##ics ##ません blue ##● 520 ##45 page itunes ##03 1955 260 1968 gt gif 618 ##ff ##47 group くたさい about bar ganji ##nce music lee not 1977 1971 1973 ##per an faq comment ##って days ##ock 116 ##bs 1974 1969 v1 player 1956 xbox sql fm f1 139 ##ah 210 ##lv ##mp ##000 melody 1957 ##３ 550 17life 199 1966 xml market ##au ##71 999 ##04 what gl ##95 ##age tips ##68 book ##ting mysql can 1959 230 ##ung wonderland watch 10℃ ##ction 9000 mar mobile 1946 1962 article ##db part ▲top party って 1967 1964 1948 ##07 ##ore ##op この dj ##78 ##38 010 main 225 1965 ##ong art 320 ad 134 020 ##73 117 pm2 japan 228 ##08 ts 1963 ##ica der sm ##36 2019 ##wa ct ##７ ##や ##64 1937 homemesh search ##85 ##れは ##tv ##di macbook ##９ ##くたさい service ##♥ type った 750 ##ier ##si ##75 ##います ##ok best ##ット goris lock ##った cf 3m big ##ut ftp carol ##vi １０ 1961 happy sd ##ac 122 anti pe cnn iii 1920 138 ##ラ 1940 esp jan tags ##98 ##51 august vol ##86 154 ##™ ##fs ##れ ##sion design ac ##ム press jordan ppp that key check ##６ ##tt ##㎡ 1080p ##lt power ##42 1952 ##bc vivi ##ック he 133 121 jpg ##rry 201 175 3500 1947 nb ##ted ##rn しています 1954 usd ##t00 master ##ンク 001 model ##58 al ##09 1953 ##34 ram goo ても ##ui 127 1930 red ##ary rpg item ##pm ##41 270 ##za project ##2012 hot td blogabstract ##ger ##62 650 ##44 gr2 ##します ##ｍ black electronic nfc year asus また html5 cindy ##hd m3 132 esc ##od booking ##53 fed tvb ##81 ##ina mit 165 ##いる chan 192 distribution next になる peter bios steam cm 1941 にも pk10 ##ix ##65 ##91 dec nasa ##ana icecat 00z b1 will ##46 li se ##ji ##み ##ard oct ##ain jp ##ze ##bi cio ##56 smart h5 ##39 ##port curve vpn ##nm ##dia utc ##あり 12345678910 ##52 rmvb chanel a4 miss ##and ##im media who 
##63 she girl 5s 124 vera ##して class vivo king ##フ ##ei national ab 1951 5cm 888 145 ipod ap 1100 5mm 211 ms 2756 ##69 mp4 msci ##po ##89 131 mg index 380 ##bit ##out ##zz ##97 ##67 158 apec ##８ photoshop opec ￥799 ては ##96 ##tes ##ast 2g ○○ ##ール￥2899 ##ling ##よ ##ory 1938 ##ical kitty content ##43 step3 ##cn win8 155 vc 1400 iphone7 robert ##した tcl 137 beauty ##87 en dollars ##ys ##oc step pay yy a1 ##2011 ##lly ##ks ##♪ 1939 188 download 1944 sep exe ph います school gb center pr street ##board uv ##37 ##lan winrar ##que ##ua ##com 1942 1936 480 gpu ##４ ettoday fu tom ##54 ##ren ##via 149 ##72 b2b 144 ##79 ##tch rose arm mb ##49 ##ial ##nn nvidia step4 mvp 00㎡ york 156 ##イ how cpi 591 2765 gov kg joe ##xx mandy pa ##ser copyright fashion 1935 don ##け ecu ##ist ##art erp wap have ##lm talk ##ek ##ning ##if ch ##ite video 1943 cs san iot look ##84 ##2010 ##ku october ##ux trump ##hs ##ide box 141 first ##ins april ##ight ##83 185 angel protected aa 151 162 x1 m2 ##fe ##× ##ho size 143 min ofo fun gomaji ex hdmi food dns march chris kevin ##のか ##lla ##pp ##ec ag ems 6s 720p ##rm ##ham off ##92 asp team fandom ed 299 ▌♥ ##ell info されています ##82 sina 4066 161 ##able ##ctor 330 399 315 dll rights ltd idc jul 3kg 1927 142 ma surface ##76 ##ク～～～ 304 mall eps 146 green ##59 map space donald v2 sodu ##light 1931 148 1700 まて 310 reserved htm ##han ##57 2d 178 mod ##ise ##tions 152 ti ##shi doc 1933 icp 055 wang ##ram shopping aug ##pi ##well now wam b2 からお ##hu 236 1928 ##gb 266 f2 ##93 153 mix ##ef ##uan bwl ##plus ##res core ##ess tea 5℃ hktvmall nhk ##ate list ##ese 301 feb 4m inn ての nov 159 12345 daniel ##ci pass ##bet ##nk coffee 202 ssl airbnb ##ute fbi woshipm skype ea cg sp ##fc ##www yes edge alt 007 ##94 fpga ##ght ##gs iso9001 さい ##ile ##wood ##uo image lin icon american ##em 1932 set says ##king ##tive blogger ##74 なと 256 147 ##ox ##zy ##red ##ium ##lf nokia claire ##リ ##ding november lohas ##500 ##tic ##マ ##cs ##ある ##che ##ire ##gy ##ult db january win ##カ 166 road 
ptt ##ま ##つ 198 ##fa ##mer anna pchome はい udn ef 420 ##time ##tte 2030 ##ア g20 white かかります 1929 308 garden eleven di ##おります chen 309b 777 172 young cosplay ちてない 4500 bat ##123 ##tra ##ては kindle npc steve etc ##ern ##｜ call xperia ces travel sk s7 ##ous 1934 ##int みいたたけます 183 edu file cho qr ##car ##our 186 ##ant ##ｄ eric 1914 rends ##jo ##する mastercard ##2000 kb ##min 290 ##ino vista ##ris ##ud jack 2400 ##set 169 pos 1912 ##her ##ou taipei しく 205 beta ##ませんか 232 ##fi express 255 body ##ill aphojoy user december meiki ##ick tweet richard ##av ##ᆫ iphone6 ##dd ちてすか views ##mark 321 pd ##００ times ##▲ level ##ash 10g point 5l ##ome 208 koreanmall ##ak george q2 206 wma tcp ##200 スタッフ full mlb ##lle ##watch tm run 179 911 smith business ##und 1919 color ##tal 222 171 ##less moon 4399 ##rl update pcb shop 499 157 little なし end ##mhz van dsp easy 660 ##house ##key history ##ｏ oh ##001 ##hy ##web oem let was ##2009 ##gg review ##wan 182 ##°c 203 uc title ##val united 233 2021 ##ons doi trivago overdope sbs ##ance ##ち grand special 573032185 imf 216 wx17house ##so ##ーム audi ##he london william ##rp ##ake science beach cfa amp ps4 880 ##800 ##link ##hp crm ferragamo bell make ##eng 195 under zh photos 2300 ##style ##ント via 176 da ##gi company i7 ##ray thomas 370 ufo i5 ##max plc ben back research 8g 173 mike ##pc ##ッフ september 189 ##ace vps february 167 pantos wp lisa 1921 ★★ jquery night long offer ##berg ##news 1911 ##いて ray fks wto せます over 164 340 ##all ##rus 1924 ##888 ##works blogtitle loftpermalink ##→ 187 martin test ling km ##め 15000 fda v3 ##ja ##ロｗedding かある outlet family ##ea をこ ##top story ##ness salvatore ##lu 204 swift 215 room している oracle ##ul 1925 sam b2c week pi rock ##のは ##ａ ##けと ##ean ##300 ##gle cctv after chinese ##back powered x2 ##tan 1918 ##nes ##イン canon only 181 ##zi ##las say ##oe 184 ##sd 221 ##bot ##world ##zo sky made top100 just 1926 pmi 802 234 gap ##vr 177 les 174 ▲topoct ball vogue vi ing ofweek cos ##list ##ort ▲topmay ##なら ##lon として 
last ##tc ##of ##bus ##gen real eva ##コ a3 nas ##lie ##ria ##coin ##bt ▲topapr his 212 cat nata vive health ⋯⋯ drive sir ▲topmar du cup ##カー ##ook ##よう ##sy alex msg tour しました 3ce ##word 193 ebooks r8 block 318 ##より 2200 nice pvp 207 months 1905 rewards ##ther 1917 0800 ##xi ##チ ##sc micro 850 gg blogfp op 1922 daily m1 264 true ##bb ml ##tar ##のお ##ky anthony 196 253 ##yo state 218 ##ara ##aa ##rc ##tz ##ston より gear ##eo ##ade ge see 1923 ##win ##ura ss heart ##den ##ita down ##sm el png 2100 610 rakuten whatsapp bay dream add ##use 680 311 pad gucci mpv ##ode ##fo island ▲topjun ##▼ 223 jason 214 chicago ##❤ しの ##hone io ##れる ##ことか sogo be2 ##ology 990 cloud vcd ##con 2～3 ##ford ##joy ##kb ##こさいます ##rade but ##ach docker ##ful rfid ul ##ase hit ford ##star 580 ##○ １１ a2 sdk reading edited ##are cmos ##mc 238 siri light ##ella ##ため bloomberg ##read pizza ##ison jimmy ##vm college node journal ba 18k ##play 245 ##cer ２０ magic ##yu 191 jump 288 tt ##ings asr ##lia 3200 step5 network ##cd mc いします 1234 pixstyleme 273 ##600 2800 money ★★★★★ 1280 １２ 430 bl みの act ##tus tokyo ##rial ##life emba ##ae saas tcs ##rk ##wang summer ##sp ko ##ving 390 premium ##その netflix ##ヒ uk mt ##lton right frank two 209 える ##ple ##cal 021 ##んな ##sen ##ville hold nexus dd ##ius てお ##mah ##なく tila zero 820 ce ##tin resort ##ws charles old p10 5d report ##360 ##ru ##には bus vans lt ##est pv ##レ links rebecca ##ツ ##dm azure ##365 きな limited bit 4gb ##mon 1910 moto ##eam 213 1913 var eos なとの 226 blogspot された 699 e3 dos dm fc ##ments ##ik ##kw boy ##bin ##ata 960 er ##せ 219 ##vin ##tu ##ula 194 ##∥ station ##ろ ##ature 835 files zara hdr top10 nature 950 magazine s6 marriott ##シ avira case ##っと tab ##ran tony ##home oculus im ##ral jean saint cry 307 rosie ##force ##ini ice ##bert のある ##nder ##mber pet 2600 ##◆ plurk ▲topdec ##sis 00kg ▲topnov 720 ##ence tim ##ω ##nc ##ても ##name log ips great ikea malaysia unix ##イト 3600 ##ncy ##nie 12000 akb48 ##ye ##oid 404 ##chi ##いた oa xuehai ##1000 ##orm 
##rf 275 さん ##ware ##リー 980 ho ##pro text ##era 560 bob 227 ##ub ##2008 8891 scp avi ##zen 2022 mi wu museum qvod apache lake jcb ▲topaug ★★★ ni ##hr hill 302 ne weibo 490 ruby ##ーシ ##ヶ ##row 4d ▲topjul iv ##ish github 306 mate 312 ##スト ##lot ##ane andrew のハイト ##tina t1 rf ed2k ##vel ##900 way final りの ns 5a 705 197 ##メ sweet bytes ##ene ▲topjan 231 ##cker ##2007 ##px 100g topapp 229 helpapp rs low 14k g4g care 630 ldquo あり ##fork leave rm edition ##gan ##zon ##qq ▲topsep ##google ##ism gold 224 explorer ##zer toyota category select visual ##labels restaurant ##md posts s1 ##ico もっと angelababy 123456 217 sports s3 mbc 1915 してくたさい shell x86 candy ##new kbs face xl 470 ##here 4a swissinfo v8 ▲topfeb dram ##ual ##vice 3a ##wer sport q1 ios10 public int card ##ｃ ep au rt ##れた 1080 bill ##mll kim ３０ 460 wan ##uk ##ミ x3 298 0t scott ##ming 239 e5 ##3d h7n9 worldcat brown ##あります ##vo ##led ##580 ##ax 249 410 ##ert paris ##～6 polo 925 ##lr 599 ##ナ capital ##hing bank cv 1g ##chat ##ｓ ##たい adc ##ule 2m ##ｅ digital hotmail 268 ##pad 870 bbq quot ##ring before wali ##まて mcu 2k 2b という costco 316 north 333 switch ##city ##ｐ philips ##mann management panasonic ##cl ##vd ##ping ##rge alice ##lk ##ましょう css3 ##ney vision alpha ##ular ##400 ##tter lz にお ##ありません mode gre 1916 pci ##tm 237 1～2 ##yan ##そについて ##let ##キ work war coach ah mary ##ᅵ huang ##pt a8 pt follow ##berry 1895 ##ew a5 ghost ##ション ##wn ##og south ##code girls ##rid action villa git r11 table games ##cket error ##anonymoussaid ##ag here ##ame ##gc qa ##■ ##lis gmp ##gin vmalife ##cher yu wedding ##tis demo dragon 530 soho social bye ##rant river orz acer 325 ##↑ ##ース ##ats 261 del ##ven 440 ups ##ように ##ター 305 value macd yougou ##dn 661 ##ano ll ##urt ##rent continue script ##wen ##ect paper 263 319 shift ##chel ##フト ##cat 258 x5 fox 243 ##さん car aaa ##blog loading ##yn ##tp kuso 799 si sns イカせるテンマヒンクテンマ3 rmb vdc forest central prime help ultra ##rmb ##ような 241 square 688 ##しいのないフロクに ##field ##reen ##ors ##ju c1 start 
510 ##air ##map cdn ##wo cba stephen m8 100km ##get opera ##base ##ood vsa com™ ##aw ##ail 251 なのて count t2 ##ᅡ ##een 2700 hop ##gp vsc tree ##eg ##ose 816 285 ##ories ##shop alphago v4 1909 simon ##ᆼ fluke62max zip スホンサー ##sta louis cr bas ##～10 bc ##yer hadoop ##ube ##wi 1906 0755 hola ##low place centre 5v d3 ##fer 252 ##750 ##media 281 540 0l exchange 262 series ##ハー ##san eb ##bank ##ｋ q3 ##nge ##mail take ##lp 259 1888 client east cache event vincent ##ールをきを ##nse sui 855 adchoice ##и ##stry ##なたの 246 ##zone ga apps sea ##ab 248 cisco ##タ ##rner kymco ##care dha ##pu ##yi minkoff royal p1 への annie 269 collection kpi playstation 257 になります 866 bh ##bar queen 505 radio 1904 andy armani ##xy manager iherb ##ery ##share spring raid johnson 1908 ##ob volvo hall ##ball v6 our taylor ##hk bi 242 ##cp kate bo water technology ##rie サイトは 277 ##ona ##sl hpv 303 gtx hip rdquo jayz stone ##lex ##rum namespace ##やり 620 ##ale ##atic des ##erson ##ql ##ves ##type enter ##この ##てきます d2 ##168 ##mix ##bian との a9 jj ky ##lc access movie ##hc リストに tower ##ration ##mit ます ##nch ua tel prefix ##o2 1907 ##point 1901 ott ～10 ##http ##ury baidu ##ink member ##logy bigbang nownews ##js ##shot ##tb ##こと 247 eba ##tics ##lus ける v5 spark ##ama there ##ions god ##lls ##down hiv ##ress burberry day2 ##kv ◆◆ jeff related film edit joseph 283 ##ark cx 32gb order g9 30000 ##ans ##tty s5 ##bee かあります thread xr buy sh 005 land spotify mx ##ari 276 ##verse ×email sf why ##ことて 244 7headlines nego sunny dom exo 401 666 positioning fit rgb ##tton 278 kiss alexa adam lp みリストを ##ｇ mp ##ties ##llow amy ##du np 002 institute 271 ##rth ##lar 2345 590 ##des sidebar １５ imax site ##cky ##kit ##ime ##009 season 323 ##fun ##ンター ##ひ gogoro a7 pu lily fire twd600 ##ッセーシをいて ##vis 30ml ##cture ##をお information ##オ close friday ##くれる yi nick てすか ##tta ##tel 6500 ##lock cbd economy 254 かお 267 tinker double 375 8gb voice ##app oops channel today 985 ##right raw xyz ##＋ jim edm ##cent 7500 supreme 814 ds ##its ##asia 
dropbox ##てすか ##tti books 272 100ml ##tle ##ller ##ken ##more ##boy sex 309 ##dom t3 ##ider ##なります ##unch 1903 810 feel 5500 ##かった ##put により s2 mo ##gh men ka amoled div ##tr ##n1 port howard ##tags ken dnf ##nus adsense ##а ide ##へ buff thunder ##town ##ique has ##body auto pin ##erry tee てした 295 number ##the ##013 object psp cool udnbkk 16gb ##mic miui ##tro most r2 ##alk ##nity 1880 ±0 ##いました 428 s4 law version ##oa n1 sgs docomo ##tf ##ack henry fc2 ##ded ##sco ##014 ##rite 286 0mm linkedin ##ada ##now wii ##ndy ucbug ##◎ sputniknews legalminer ##ika ##xp 2gb ##bu q10 oo b6 come ##rman cheese ming maker ##gm nikon ##fig ppi kelly ##ります jchere てきます ted md 003 fgo tech ##tto dan soc ##gl ##len hair earth 640 521 img ##pper ##a1 ##てきる ##ロク acca ##ition ##ference suite ##ig outlook ##mond ##cation 398 ##pr 279 101vip 358 ##999 282 64gb 3800 345 airport ##over 284 ##おり jones ##ith lab ##su ##いるのて co2 town piece ##llo no1 vmware 24h ##qi focus reader ##admin ##ora tb false ##log 1898 know lan 838 ##ces f4 ##ume motel stop ##oper na flickr netcomponents ##af ##─ pose williams local ##ound ##cg ##site ##iko いお 274 5m gsm con ##ath 1902 friends ##hip cell 317 ##rey 780 cream ##cks 012 ##dp facebooktwitterpinterestgoogle sso 324 shtml song swiss ##mw ##キンク lumia xdd string tiffany 522 marc られた insee russell sc dell ##ations ｏｋ camera 289 ##vs ##flow ##late classic 287 ##nter stay g1 mtv 512 ##ever ##lab ##nger qe sata ryan d1 50ml cms ##cing su 292 3300 editor 296 ##nap security sunday association ##ens ##700 ##bra acg ##かり sofascore とは mkv ##ign jonathan gary build labels ##oto tesla moba qi gohappy general ajax 1024 ##かるサイト society ##test ##urs wps fedora ##ich mozilla 328 ##480 ##dr usa urn ##lina ##ｒ grace ##die ##try ##ader 1250 ##なり elle 570 ##chen ##ᆯ price ##ten uhz ##ough eq ##hen states push session balance wow 506 ##cus ##py when ##ward ##ep 34e wong library prada ##サイト ##cle running ##ree 313 ck date q4 ##ctive ##ool ##＞ mk ##ira ##163 388 die secret rq dota 
buffet は１ヶ e6 ##ez pan 368 ha ##card ##cha 2a ##さ alan day3 eye f3 ##end france keep adi rna tvbs ##ala solo nova ##え ##tail ##ょう support ##ries ##なる ##ved base copy iis fps ##ways hero hgih profile fish mu ssh entertainment chang ##wd click cake ##ond pre ##tom kic pixel ##ov ##fl product 6a ##pd dear ##gate es yumi audio ##² ##sky echo bin where ##ture 329 ##ape find sap isis ##なと nand ##101 ##load ##ream band a6 525 never ##post festival 50cm ##we 555 guide 314 zenfone ##ike 335 gd forum jessica strong alexander ##ould software allen ##ious program 360° else lohasthree ##gar することかてきます please ##れます rc ##ggle ##ric bim 50000 ##own eclipse 355 brian 3ds ##side 061 361 ##other ##ける ##tech ##ator 485 engine ##ged ##ｔ plaza ##fit cia ngo westbrook shi tbs 50mm ##みませんか sci 291 reuters ##ily contextlink ##hn af ##cil bridge very ##cel 1890 cambridge ##ize 15g ##aid ##data 790 frm ##head award butler ##sun meta ##mar america ps3 puma pmid ##すか lc 670 kitchen ##lic オーフン5 きなしソフトサーヒスそして day1 future ★★★★ ##text ##page ##rris pm1 ##ket fans ##っています 1001 christian bot kids trackback ##hai c3 display ##hl n2 1896 idea さんも ##sent airmail ##ug ##men pwm けます 028 ##lution 369 852 awards schemas 354 asics wikipedia font ##tional ##vy c2 293 ##れている ##dget ##ein っている contact pepper スキル 339 ##～5 294 ##uel ##ument 730 ##hang みてす q5 ##sue rain ##ndi wei swatch ##cept わせ 331 popular ##ste ##tag p2 501 trc 1899 ##west ##live justin honda ping messenger ##rap v9 543 ##とは unity appqq はすへて 025 leo ##tone ##テ ##ass uniqlo ##010 502 her jane memory moneydj ##tical human 12306 していると ##m2 coc miacare ##mn tmt ##core vim kk ##may fan target use too 338 435 2050 867 737 fast ##2c services ##ope omega energy ##わ pinkoi 1a ##なから ##rain jackson ##ement ##シャンルの 374 366 そんな p9 rd ##ᆨ 1111 ##tier ##vic zone ##│ 385 690 dl isofix cpa m4 322 kimi めて davis ##lay lulu ##uck 050 weeks qs ##hop 920 ##ｎ ae ##ear ～5 eia 405 ##fly korea jpeg boost ##ship small ##リア 1860 eur 297 425 valley ##iel simple ##ude rn k2 
##ena されます non patrick しているから ##ナー feed 5757 30g process well qqmei ##thing they aws lu pink ##ters ##kin または board ##vertisement wine ##ien unicode ##dge r1 359 ##tant いを ##twitter ##3c cool1 される ##れて ##ｌ isp ##012 standard 45㎡2 402 ##150 matt ##fu 326 ##iner googlemsn pixnetfacebookyahoo ##ラン x7 886 ##uce メーカー sao ##ev ##きました ##file 9678 403 xddd shirt 6l ##rio ##hat 3mm givenchy ya bang ##lio monday crystal ロクイン ##abc 336 head 890 ubuntuforumwikilinuxpastechat ##vc ##～20 ##rity cnc 7866 ipv6 null 1897 ##ost yang imsean tiger ##fet ##ンス 352 ##＝ dji 327 ji maria ##come ##んて foundation 3100 ##beth ##なった 1m 601 active ##aft ##don 3p sr 349 emma ##khz living 415 353 1889 341 709 457 sas x6 ##face pptv x4 ##mate han sophie ##jing 337 fifa ##mand other sale inwedding ##gn てきちゃいます ##mmy ##pmlast bad nana nbc してみてくたさいねなとはお ##wu ##かあります ##あ note7 single ##340 せからこしてくたさい♪このしにはとんとんワークケートをするとあなたにもっとマッチしたならワークケートへもみつかっちゃうかもワークケートの ##bel window ##dio ##ht union age 382 １４ ##ivity ##ｙコメント domain neo ##isa ##lter 5k f5 steven ##cts powerpoint tft self g2 ft ##テル zol ##act mwc 381 343 もう nbapop 408 てある eds ace ##room previous author tomtom il ##ets hu financial ☆☆☆ っています bp 5t chi 1gb ##hg fairmont cross 008 gay h2 function ##けて 356 also 1b 625 ##ータ ##raph 1894 3～5 ##ils i3 334 avenue ##host による ##bon ##tsu message navigation 50g fintech h6 ##ことを 8cm ##ject ##vas ##firm credit ##wf xxxx form ##nor ##space huawei plan json sbl ##dc machine 921 392 wish ##120 ##sol windows7 edward ##ために development washington ##nsis lo 818 ##sio ##ym ##bor planet ##～8 ##wt ieee gpa ##めて camp ann gm ##tw ##oka connect ##rss ##work ##atus wall chicken soul 2mm ##times fa ##ather ##cord 009 ##eep hitachi gui harry ##pan e1 disney ##press ##ーション wind 386 frigidaire ##tl liu hsu 332 basic von ev いたてきるスホンサーサイト learning ##ull expedia archives change ##wei santa cut ins 6gb turbo brand cf1 508 004 return 747 ##rip h1 ##nis ##をこ 128gb ##にお 3t application しており emc rx ##oon 384 quick 412 15058 wilson wing 
chapter ##bug beyond ##cms ##dar ##oh zoom e2 trip sb ##nba rcep 342 aspx ci 080 gc gnu める ##count advanced dance dv ##url ##ging 367 8591 am09 shadow battle 346 ##ｉ ##cia ##という emily ##のてす ##tation host ff techorz sars ##mini ##mporary ##ering nc 4200 798 ##next cma ##mbps ##gas ##ift ##dot ##ィ 455 ##～17 amana ##りの 426 ##ros ir 00㎡1 ##eet ##ible ##↓ 710 ˋ▽ˊ ##aka dcs iq ##ｖ l1 ##lor maggie ##011 ##iu 588 ##～1 830 ##gt 1tb articles create ##burg ##iki database fantasy ##rex ##cam dlc dean ##you hard path gaming victoria maps cb ##lee ##itor overchicstoretvhome systems ##xt 416 p3 sarah 760 ##nan 407 486 x9 install second 626 ##ann ##ph ##rcle ##nic 860 ##nar ec ##とう 768 metro chocolate ##rian ～4 ##table ##しています skin ##sn 395 mountain ##0mm inparadise 6m 7x24 ib 4800 ##jia eeworld creative g5 g3 357 parker ecfa village からの 18000 sylvia サーヒス hbl ##ques ##onsored ##x2 ##きます ##v4 ##tein ie6 383 ##stack 389 ver ##ads ##baby sound bbe ##110 ##lone ##uid ads 022 gundam 351 thinkpad 006 scrum match ##ave mems ##470 ##oy ##なりました ##talk glass lamigo span ##eme job ##a5 jay wade kde 498 ##lace ocean tvg ##covery ##r3 ##ners ##rea junior think ##aine cover ##ision ##sia ↓↓ ##bow msi 413 458 406 ##love 711 801 soft z2 ##pl 456 1840 mobil mind ##uy 427 nginx ##oi めた ##rr 6221 ##mple ##sson ##ーシてす 371 ##nts 91tv comhd crv3000 ##uard 1868 397 deep lost field gallery ##bia rate spf redis traction 930 icloud 011 なら fe jose 372 ##tory into sohu fx 899 379 kicstart2 ##hia すく ##～3 ##sit ra ２４ ##walk ##xure 500g ##pact pacific xa natural carlo ##250 ##walker 1850 ##can cto gigi 516 ##サー pen ##hoo ob matlab ##ｂ ##yy 13913459 ##iti mango ##bbs sense c5 oxford ##ニア walker jennifer ##ola course ##bre 701 ##pus ##rder lucky 075 ##ぁ ivy なお ##nia sotheby side ##ugh joy ##orage ##ush ##bat ##dt 364 r9 ##2d ##gio 511 country wear ##lax ##～7 ##moon 393 seven study 411 348 lonzo 8k ##ェ evolution ##イフ ##kk gs kd ##レス arduino 344 b12 ##lux arpg ##rdon cook ##x5 dark five ##als ##ida とても sign 362 
##ちの something 20mm ##nda 387 ##posted fresh tf 1870 422 cam ##mine ##skip ##form ##ssion education 394 ##tee dyson stage ##jie want ##night epson pack あります ##ppy テリヘル ##█ wd ##eh ##rence left ##lvin golden mhz discovery ##trix ##n2 loft ##uch ##dra ##sse speed ～1 1mdb sorry welcome ##urn wave gaga ##lmer teddy ##160 トラックハックせよ 611 ##f2016 378 rp ##sha rar ##あなたに ##きた 840 holiday ##ュー 373 074 ##vg ##nos ##rail gartner gi 6p ##dium kit 488 b3 eco ##ろう 20g sean ##stone autocad nu ##np f16 write 029 m5 ##ias images atp ##dk fsm 504 1350 ve 52kb ##xxx ##のに ##cake 414 unit lim ru 1v ##ification published angela 16g analytics ak ##ｑ ##nel gmt ##icon again ##₂ ##bby ios11 445 かこさいます waze いてす ##ハ 9985 ##ust ##ティー framework ##007 iptv delete 52sykb cl wwdc 027 30cm ##fw ##ての 1389 ##xon brandt ##ses ##dragon tc vetements anne monte modern official ##へて ##ere ##nne ##oud もちろん５０ etnews ##a2 ##graphy 421 863 ##ちゃん 444 ##rtex ##てお l2 ##gma mount ccd たと archive morning tan ddos e7 ##ホ day4 ##ウ gis 453 its 495 factory bruce pg ##ito ってくたさい guest cdma ##lling 536 n3 しかし 3～4 mega eyes ro １３ women dac church ##jun singapore ##facebook 6991 starbucks ##tos ##stin ##shine zen ##mu tina 20℃ 1893 ##たけて 503 465 request ##gence qt ##っ 1886 347 363 q7 ##zzi diary ##tore 409 ##ead 468 cst ##osa canada agent va ##jiang ##ちは ##ーク ##lam sg ##nix ##sday ##よって g6 ##master bing ##zl charlie １６ 8mm nb40 ##ーン thai ##ルフ ln284ct ##itz ##2f bonnie ##food ##lent originals ##stro ##lts 418 ∟∣ ##bscribe children ntd yesstyle ##かも hmv ##tment d5 2cm arts sms ##pn ##я ##いい topios9 539 lifestyle virtual ##ague xz ##deo muji 024 unt ##nnis ##ᅩ faq1 1884 396 ##ette fly 64㎡はしめまして 441 curry ##pop のこ release ##← ##◆◆ ##cast 073 ありな 500ml ##ews 5c ##stle ios7 ##ima 787 dog lenovo ##r4 roger 013 cbs vornado 100m 417 ##desk ##クok ##ald 1867 9595 2900 ##van oil ##ｘ some break common ##jy ##lines g7 twice 419 ella nano belle にこ ##mes ##self ##note jb ##ことかてきます benz ##との ##ova 451 save ##wing ##ますのて kai りは ##hua ##rect 
rainer ##unge 448 ##0m adsl ##かな guestname ##uma ##kins ##zu tokichoi ##price county ##med ##mus rmk 391 address vm えて openload ##group ##hin ##iginal amg urban ##oz jobs emi ##public beautiful ##sch album ##dden ##bell jerry works hostel miller ##drive ##rmin ##１０ 376 boot 828 ##370 ##fx ##cm～ 1885 ##nome ##ctionary ##oman ##lish ##cr ##hm 433 ##how 432 francis xi c919 b5 evernote ##uc vga ##3000 coupe ##urg ##cca ##uality 019 6g れる multi ##また ##ett em hey ##ani ##tax ##rma inside than 740 leonnhurt ##jin ict れた bird notes 200mm くの ##dical ##lli result 442 iu ee 438 smap gopro ##last yin pure 998 32g けた 5kg ##dan ##rame mama ##oot bean marketing ##hur 2l bella sync xuite ##ground 515 discuz ##getrelax ##ince ##bay ##5s cj ##イス gmat apt ##pass jing ##rix c4 rich ##とても niusnews ##ello bag 770 ##eting ##mobile １８ culture 015 ##のてすか 377 1020 area ##ience 616 details gp universal silver dit はお private ddd u11 kanshu ##ified fung ##nny dx ##520 tai 475 023 ##fr ##lean 3s ##pin 429 ##rin 25000 ly rick ##bility usb3 banner ##baru ##gion metal dt vdf 1871 karl qualcomm bear 1010 oldid ian jo ##tors population ##ernel 1882 mmorpg ##mv ##bike 603 ##© ww friend ##ager exhibition ##del ##pods fpx structure ##free ##tings kl ##rley ##copyright ##mma california 3400 orange yoga 4l canmake honey ##anda ##コメント 595 nikkie ##ルハイト dhl publishing ##mall ##gnet 20cm 513 ##クセス ##┅ e88 970 ##dog fishbase ##! ##" ### ##$ ##% ##& ##' ##( ##) ##* ##+ ##, ##- ##. ##/ ##: ##; ##< ##= ##> ##? 
##@ ##[ ##\ ##] ##^ ##_ ##{ ##| ##} ##~ ##£ ##¤ ##¥ ##§ ##« ##± ##³ ##µ ##· ##¹ ##º ##» ##¼ ##ß ##æ ##÷ ##ø ##đ ##ŋ ##ɔ ##ə ##ɡ ##ʰ ##ˇ ##ˈ ##ˊ ##ˋ ##ˍ ##ː ##˙ ##˚ ##ˢ ##α ##β ##γ ##δ ##ε ##η ##θ ##ι ##κ ##λ ##μ ##ν ##ο ##π ##ρ ##ς ##σ ##τ ##υ ##φ ##χ ##ψ ##б ##в ##г ##д ##е ##ж ##з ##к ##л ##м ##н ##о ##п ##р ##с ##т ##у ##ф ##х ##ц ##ч ##ш ##ы ##ь ##і ##ا ##ب ##ة ##ت ##د ##ر ##س ##ع ##ل ##م ##ن ##ه ##و ##ي ##۩ ##ก ##ง ##น ##ม ##ย ##ร ##อ ##า ##เ ##๑ ##་ ##ღ ##ᄀ ##ᄁ ##ᄂ ##ᄃ ##ᄅ ##ᄆ ##ᄇ ##ᄈ ##ᄉ ##ᄋ ##ᄌ ##ᄎ ##ᄏ ##ᄐ ##ᄑ ##ᄒ ##ᅢ ##ᅣ ##ᅥ ##ᅦ ##ᅧ ##ᅨ ##ᅪ ##ᅬ ##ᅭ ##ᅮ ##ᅯ ##ᅲ ##ᅳ ##ᅴ ##ᆷ ##ᆸ ##ᆺ ##ᆻ ##ᗜ ##ᵃ ##ᵉ ##ᵍ ##ᵏ ##ᵐ ##ᵒ ##ᵘ ##‖ ##„ ##† ##• ##‥ ##‧ ##  ##‰ ##′ ##″ ##‹ ##› ##※ ##‿ ##⁄ ##ⁱ ##⁺ ##ⁿ ##₁ ##₃ ##₄ ##€ ##№ ##ⅰ ##ⅱ ##ⅲ ##ⅳ ##ⅴ ##↔ ##↗ ##↘ ##⇒ ##∀ ##− ##∕ ##∙ ##√ ##∞ ##∟ ##∠ ##∣ ##∩ ##∮ ##∶ ##∼ ##∽ ##≈ ##≒ ##≡ ##≤ ##≥ ##≦ ##≧ ##≪ ##≫ ##⊙ ##⋅ ##⋈ ##⋯ ##⌒ ##① ##② ##③ ##④ ##⑤ ##⑥ ##⑦ ##⑧ ##⑨ ##⑩ ##⑴ ##⑵ ##⑶ ##⑷ ##⑸ ##⒈ ##⒉ ##⒊ ##⒋ ##ⓒ ##ⓔ ##ⓘ ##━ ##┃ ##┆ ##┊ ##┌ ##└ ##├ ##┣ ##═ ##║ ##╚ ##╞ ##╠ ##╭ ##╮ ##╯ ##╰ ##╱ ##╳ ##▂ ##▃ ##▅ ##▇ ##▉ ##▋ ##▌ ##▍ ##▎ ##□ ##▪ ##▫ ##▬ ##△ ##▶ ##► ##▽ ##◇ ##◕ ##◠ ##◢ ##◤ ##☀ ##☕ ##☞ ##☺ ##☼ ##♀ ##♂ ##♠ ##♡ ##♣ ##♦ ##♫ ##♬ ##✈ ##✔ ##✕ ##✖ ##✦ ##✨ ##✪ ##✰ ##✿ ##❀ ##➜ ##➤ ##⦿ ##、 ##。 ##〃 ##々 ##〇 ##〈 ##〉 ##《 ##》 ##「 ##」 ##『 ##』 ##【 ##】 ##〓 ##〔 ##〕 ##〖 ##〗 ##〜 ##〝 ##〞 ##ぃ ##ぇ ##ぬ ##ふ ##ほ ##む ##ゃ ##ゅ ##ゆ ##ょ ##゜ ##ゝ ##ァ ##ゥ ##エ ##ォ ##ケ ##サ ##セ ##ソ ##ッ ##ニ ##ヌ ##ネ ##ノ ##ヘ ##モ ##ャ ##ヤ ##ュ ##ユ ##ョ ##ヨ ##ワ ##ヲ ##・ ##ヽ ##ㄅ ##ㄆ ##ㄇ ##ㄉ ##ㄋ ##ㄌ ##ㄍ ##ㄎ ##ㄏ ##ㄒ ##ㄚ ##ㄛ ##ㄞ ##ㄟ ##ㄢ ##ㄤ ##ㄥ ##ㄧ ##ㄨ ##ㆍ ##㈦ ##㊣ ##㗎 ##一 ##丁 ##七 ##万 ##丈 ##三 ##上 ##下 ##不 ##与 ##丐 ##丑 ##专 ##且 ##丕 ##世 ##丘 ##丙 ##业 ##丛 ##东 ##丝 ##丞 ##丟 ##両 ##丢 ##两 ##严 ##並 ##丧 ##丨 ##个 ##丫 ##中 ##丰 ##串 ##临 ##丶 ##丸 ##丹 ##为 ##主 ##丼 ##丽 ##举 ##丿 ##乂 ##乃 ##久 ##么 ##义 ##之 ##乌 ##乍 ##乎 ##乏 ##乐 ##乒 ##乓 ##乔 ##乖 ##乗 ##乘 ##乙 ##乜 ##九 ##乞 ##也 ##习 ##乡 ##书 ##乩 ##买 ##乱 ##乳 ##乾 ##亀 ##亂 ##了 ##予 ##争 ##事 ##二 ##于 ##亏 ##云 ##互 ##五 ##井 ##亘 ##亙 ##亚 ##些 ##亜 ##亞 ##亟 ##亡 ##亢 ##交 ##亥 ##亦 ##产 ##亨 ##亩 ##享 ##京 ##亭 ##亮 ##亲 ##亳 
##亵 ##人 ##亿 ##什 ##仁 ##仃 ##仄 ##仅 ##仆 ##仇 ##今 ##介 ##仍 ##从 ##仏 ##仑 ##仓 ##仔 ##仕 ##他 ##仗 ##付 ##仙 ##仝 ##仞 ##仟 ##代 ##令 ##以 ##仨 ##仪 ##们 ##仮 ##仰 ##仲 ##件 ##价 ##任 ##份 ##仿 ##企 ##伉 ##伊 ##伍 ##伎 ##伏 ##伐 ##休 ##伕 ##众 ##优 ##伙 ##会 ##伝 ##伞 ##伟 ##传 ##伢 ##伤 ##伦 ##伪 ##伫 ##伯 ##估 ##伴 ##伶 ##伸 ##伺 ##似 ##伽 ##佃 ##但 ##佇 ##佈 ##位 ##低 ##住 ##佐 ##佑 ##体 ##佔 ##何 ##佗 ##佘 ##余 ##佚 ##佛 ##作 ##佝 ##佞 ##佟 ##你 ##佢 ##佣 ##佤 ##佥 ##佩 ##佬 ##佯 ##佰 ##佳 ##併 ##佶 ##佻 ##佼 ##使 ##侃 ##侄 ##來 ##侈 ##例 ##侍 ##侏 ##侑 ##侖 ##侗 ##供 ##依 ##侠 ##価 ##侣 ##侥 ##侦 ##侧 ##侨 ##侬 ##侮 ##侯 ##侵 ##侶 ##侷 ##便 ##係 ##促 ##俄 ##俊 ##俎 ##俏 ##俐 ##俑 ##俗 ##俘 ##俚 ##保 ##俞 ##俟 ##俠 ##信 ##俨 ##俩 ##俪 ##俬 ##俭 ##修 ##俯 ##俱 ##俳 ##俸 ##俺 ##俾 ##倆 ##倉 ##個 ##倌 ##倍 ##倏 ##們 ##倒 ##倔 ##倖 ##倘 ##候 ##倚 ##倜 ##借 ##倡 ##値 ##倦 ##倩 ##倪 ##倫 ##倬 ##倭 ##倶 ##债 ##值 ##倾 ##偃 ##假 ##偈 ##偉 ##偌 ##偎 ##偏 ##偕 ##做 ##停 ##健 ##側 ##偵 ##偶 ##偷 ##偻 ##偽 ##偿 ##傀 ##傅 ##傍 ##傑 ##傘 ##備 ##傚 ##傢 ##傣 ##傥 ##储 ##傩 ##催 ##傭 ##傲 ##傳 ##債 ##傷 ##傻 ##傾 ##僅 ##働 ##像 ##僑 ##僕 ##僖 ##僚 ##僥 ##僧 ##僭 ##僮 ##僱 ##僵 ##價 ##僻 ##儀 ##儂 ##億 ##儆 ##儉 ##儋 ##儒 ##儕 ##儘 ##償 ##儡 ##優 ##儲 ##儷 ##儼 ##儿 ##兀 ##允 ##元 ##兄 ##充 ##兆 ##兇 ##先 ##光 ##克 ##兌 ##免 ##児 ##兑 ##兒 ##兔 ##兖 ##党 ##兜 ##兢 ##入 ##內 ##全 ##兩 ##八 ##公 ##六 ##兮 ##兰 ##共 ##兲 ##关 ##兴 ##兵 ##其 ##具 ##典 ##兹 ##养 ##兼 ##兽 ##冀 ##内 ##円 ##冇 ##冈 ##冉 ##冊 ##册 ##再 ##冏 ##冒 ##冕 ##冗 ##写 ##军 ##农 ##冠 ##冢 ##冤 ##冥 ##冨 ##冪 ##冬 ##冯 ##冰 ##冲 ##决 ##况 ##冶 ##冷 ##冻 ##冼 ##冽 ##冾 ##净 ##凄 ##准 ##凇 ##凈 ##凉 ##凋 ##凌 ##凍 ##减 ##凑 ##凛 ##凜 ##凝 ##几 ##凡 ##凤 ##処 ##凪 ##凭 ##凯 ##凰 ##凱 ##凳 ##凶 ##凸 ##凹 ##出 ##击 ##函 ##凿 ##刀 ##刁 ##刃 ##分 ##切 ##刈 ##刊 ##刍 ##刎 ##刑 ##划 ##列 ##刘 ##则 ##刚 ##创 ##初 ##删 ##判 ##別 ##刨 ##利 ##刪 ##别 ##刮 ##到 ##制 ##刷 ##券 ##刹 ##刺 ##刻 ##刽 ##剁 ##剂 ##剃 ##則 ##剉 ##削 ##剋 ##剌 ##前 ##剎 ##剐 ##剑 ##剔 ##剖 ##剛 ##剜 ##剝 ##剣 ##剤 ##剥 ##剧 ##剩 ##剪 ##副 ##割 ##創 ##剷 ##剽 ##剿 ##劃 ##劇 ##劈 ##劉 ##劊 ##劍 ##劏 ##劑 ##力 ##劝 ##办 ##功 ##加 ##务 ##劣 ##动 ##助 ##努 ##劫 ##劭 ##励 ##劲 ##劳 ##労 ##劵 ##効 ##劾 ##势 ##勁 ##勃 ##勇 ##勉 ##勋 ##勐 ##勒 ##動 ##勖 ##勘 ##務 ##勛 ##勝 ##勞 ##募 ##勢 ##勤 ##勧 ##勳 ##勵 ##勸 ##勺 ##勻 ##勾 ##勿 ##匀 ##包 ##匆 ##匈 ##匍 ##匐 ##匕 ##化 ##北 ##匙 ##匝 ##匠 ##匡 ##匣 ##匪 ##匮 ##匯 ##匱 ##匹 ##区 ##医 ##匾 ##匿 
##區 ##十 ##千 ##卅 ##升 ##午 ##卉 ##半 ##卍 ##华 ##协 ##卑 ##卒 ##卓 ##協 ##单 ##卖 ##南 ##単 ##博 ##卜 ##卞 ##卟 ##占 ##卡 ##卢 ##卤 ##卦 ##卧 ##卫 ##卮 ##卯 ##印 ##危 ##即 ##却 ##卵 ##卷 ##卸 ##卻 ##卿 ##厂 ##厄 ##厅 ##历 ##厉 ##压 ##厌 ##厕 ##厘 ##厚 ##厝 ##原 ##厢 ##厥 ##厦 ##厨 ##厩 ##厭 ##厮 ##厲 ##厳 ##去 ##县 ##叁 ##参 ##參 ##又 ##叉 ##及 ##友 ##双 ##反 ##収 ##发 ##叔 ##取 ##受 ##变 ##叙 ##叛 ##叟 ##叠 ##叡 ##叢 ##口 ##古 ##句 ##另 ##叨 ##叩 ##只 ##叫 ##召 ##叭 ##叮 ##可 ##台 ##叱 ##史 ##右 ##叵 ##叶 ##号 ##司 ##叹 ##叻 ##叼 ##叽 ##吁 ##吃 ##各 ##吆 ##合 ##吉 ##吊 ##吋 ##同 ##名 ##后 ##吏 ##吐 ##向 ##吒 ##吓 ##吕 ##吖 ##吗 ##君 ##吝 ##吞 ##吟 ##吠 ##吡 ##否 ##吧 ##吨 ##吩 ##含 ##听 ##吭 ##吮 ##启 ##吱 ##吳 ##吴 ##吵 ##吶 ##吸 ##吹 ##吻 ##吼 ##吽 ##吾 ##呀 ##呂 ##呃 ##呆 ##呈 ##告 ##呋 ##呎 ##呐 ##呓 ##呕 ##呗 ##员 ##呛 ##呜 ##呢 ##呤 ##呦 ##周 ##呱 ##呲 ##味 ##呵 ##呷 ##呸 ##呻 ##呼 ##命 ##咀 ##咁 ##咂 ##咄 ##咆 ##咋 ##和 ##咎 ##咏 ##咐 ##咒 ##咔 ##咕 ##咖 ##咗 ##咘 ##咙 ##咚 ##咛 ##咣 ##咤 ##咦 ##咧 ##咨 ##咩 ##咪 ##咫 ##咬 ##咭 ##咯 ##咱 ##咲 ##咳 ##咸 ##咻 ##咽 ##咿 ##哀 ##品 ##哂 ##哄 ##哆 ##哇 ##哈 ##哉 ##哋 ##哌 ##响 ##哎 ##哏 ##哐 ##哑 ##哒 ##哔 ##哗 ##哟 ##員 ##哥 ##哦 ##哧 ##哨 ##哩 ##哪 ##哭 ##哮 ##哲 ##哺 ##哼 ##哽 ##唁 ##唄 ##唆 ##唇 ##唉 ##唏 ##唐 ##唑 ##唔 ##唠 ##唤 ##唧 ##唬 ##售 ##唯 ##唰 ##唱 ##唳 ##唷 ##唸 ##唾 ##啃 ##啄 ##商 ##啉 ##啊 ##問 ##啓 ##啕 ##啖 ##啜 ##啞 ##啟 ##啡 ##啤 ##啥 ##啦 ##啧 ##啪 ##啫 ##啬 ##啮 ##啰 ##啱 ##啲 ##啵 ##啶 ##啷 ##啸 ##啻 ##啼 ##啾 ##喀 ##喂 ##喃 ##善 ##喆 ##喇 ##喉 ##喊 ##喋 ##喎 ##喏 ##喔 ##喘 ##喙 ##喚 ##喜 ##喝 ##喟 ##喧 ##喪 ##喫 ##喬 ##單 ##喰 ##喱 ##喲 ##喳 ##喵 ##営 ##喷 ##喹 ##喺 ##喻 ##喽 ##嗅 ##嗆 ##嗇 ##嗎 ##嗑 ##嗒 ##嗓 ##嗔 ##嗖 ##嗚 ##嗜 ##嗝 ##嗟 ##嗡 ##嗣 ##嗤 ##嗦 ##嗨 ##嗪 ##嗬 ##嗯 ##嗰 ##嗲 ##嗳 ##嗶 ##嗷 ##嗽 ##嘀 ##嘅 ##嘆 ##嘈 ##嘉 ##嘌 ##嘍 ##嘎 ##嘔 ##嘖 ##嘗 ##嘘 ##嘚 ##嘛 ##嘜 ##嘞 ##嘟 ##嘢 ##嘣 ##嘤 ##嘧 ##嘩 ##嘭 ##嘮 ##嘯 ##嘰 ##嘱 ##嘲 ##嘴 ##嘶 ##嘸 ##嘹 ##嘻 ##嘿 ##噁 ##噌 ##噎 ##噓 ##噔 ##噗 ##噙 ##噜 ##噠 ##噢 ##噤 ##器 ##噩 ##噪 ##噬 ##噱 ##噴 ##噶 ##噸 ##噹 ##噻 ##噼 ##嚀 ##嚇 ##嚎 ##嚏 ##嚐 ##嚓 ##嚕 ##嚟 ##嚣 ##嚥 ##嚨 ##嚮 ##嚴 ##嚷 ##嚼 ##囂 ##囉 ##囊 ##囍 ##囑 ##囔 ##囗 ##囚 ##四 ##囝 ##回 ##囟 ##因 ##囡 ##团 ##団 ##囤 ##囧 ##囪 ##囫 ##园 ##困 ##囱 ##囲 ##図 ##围 ##囹 ##固 ##国 ##图 ##囿 ##圃 ##圄 ##圆 ##圈 ##國 ##圍 ##圏 ##園 ##圓 ##圖 ##團 ##圜 ##土 ##圣 ##圧 ##在 ##圩 ##圭 ##地 ##圳 ##场 ##圻 ##圾 ##址 ##坂 ##均 ##坊 ##坍 ##坎 ##坏 ##坐 ##坑 ##块 ##坚 
##坛 ##坝 ##坞 ##坟 ##坠 ##坡 ##坤 ##坦 ##坨 ##坪 ##坯 ##坳 ##坵 ##坷 ##垂 ##垃 ##垄 ##型 ##垒 ##垚 ##垛 ##垠 ##垢 ##垣 ##垦 ##垩 ##垫 ##垭 ##垮 ##垵 ##埂 ##埃 ##埋 ##城 ##埔 ##埕 ##埗 ##域 ##埠 ##埤 ##埵 ##執 ##埸 ##培 ##基 ##埼 ##堀 ##堂 ##堃 ##堅 ##堆 ##堇 ##堑 ##堕 ##堙 ##堡 ##堤 ##堪 ##堯 ##堰 ##報 ##場 ##堵 ##堺 ##堿 ##塊 ##塌 ##塑 ##塔 ##塗 ##塘 ##塚 ##塞 ##塢 ##塩 ##填 ##塬 ##塭 ##塵 ##塾 ##墀 ##境 ##墅 ##墉 ##墊 ##墒 ##墓 ##増 ##墘 ##墙 ##墜 ##增 ##墟 ##墨 ##墩 ##墮 ##墳 ##墻 ##墾 ##壁 ##壅 ##壆 ##壇 ##壊 ##壑 ##壓 ##壕 ##壘 ##壞 ##壟 ##壢 ##壤 ##壩 ##士 ##壬 ##壮 ##壯 ##声 ##売 ##壳 ##壶 ##壹 ##壺 ##壽 ##处 ##备 ##変 ##复 ##夏 ##夔 ##夕 ##外 ##夙 ##多 ##夜 ##够 ##夠 ##夢 ##夥 ##大 ##天 ##太 ##夫 ##夭 ##央 ##夯 ##失 ##头 ##夷 ##夸 ##夹 ##夺 ##夾 ##奂 ##奄 ##奇 ##奈 ##奉 ##奋 ##奎 ##奏 ##奐 ##契 ##奔 ##奕 ##奖 ##套 ##奘 ##奚 ##奠 ##奢 ##奥 ##奧 ##奪 ##奬 ##奮 ##女 ##奴 ##奶 ##奸 ##她 ##好 ##如 ##妃 ##妄 ##妆 ##妇 ##妈 ##妊 ##妍 ##妒 ##妓 ##妖 ##妘 ##妙 ##妝 ##妞 ##妣 ##妤 ##妥 ##妨 ##妩 ##妪 ##妮 ##妲 ##妳 ##妹 ##妻 ##妾 ##姆 ##姉 ##姊 ##始 ##姍 ##姐 ##姑 ##姒 ##姓 ##委 ##姗 ##姚 ##姜 ##姝 ##姣 ##姥 ##姦 ##姨 ##姪 ##姫 ##姬 ##姹 ##姻 ##姿 ##威 ##娃 ##娄 ##娅 ##娆 ##娇 ##娉 ##娑 ##娓 ##娘 ##娛 ##娜 ##娟 ##娠 ##娣 ##娥 ##娩 ##娱 ##娲 ##娴 ##娶 ##娼 ##婀 ##婁 ##婆 ##婉 ##婊 ##婕 ##婚 ##婢 ##婦 ##婧 ##婪 ##婭 ##婴 ##婵 ##婶 ##婷 ##婺 ##婿 ##媒 ##媚 ##媛 ##媞 ##媧 ##媲 ##媳 ##媽 ##媾 ##嫁 ##嫂 ##嫉 ##嫌 ##嫑 ##嫔 ##嫖 ##嫘 ##嫚 ##嫡 ##嫣 ##嫦 ##嫩 ##嫲 ##嫵 ##嫻 ##嬅 ##嬉 ##嬌 ##嬗 ##嬛 ##嬢 ##嬤 ##嬪 ##嬰 ##嬴 ##嬷 ##嬸 ##嬿 ##孀 ##孃 ##子 ##孑 ##孔 ##孕 ##孖 ##字 ##存 ##孙 ##孚 ##孛 ##孜 ##孝 ##孟 ##孢 ##季 ##孤 ##学 ##孩 ##孪 ##孫 ##孬 ##孰 ##孱 ##孳 ##孵 ##學 ##孺 ##孽 ##孿 ##宁 ##它 ##宅 ##宇 ##守 ##安 ##宋 ##完 ##宏 ##宓 ##宕 ##宗 ##官 ##宙 ##定 ##宛 ##宜 ##宝 ##实 ##実 ##宠 ##审 ##客 ##宣 ##室 ##宥 ##宦 ##宪 ##宫 ##宮 ##宰 ##害 ##宴 ##宵 ##家 ##宸 ##容 ##宽 ##宾 ##宿 ##寂 ##寄 ##寅 ##密 ##寇 ##富 ##寐 ##寒 ##寓 ##寛 ##寝 ##寞 ##察 ##寡 ##寢 ##寥 ##實 ##寧 ##寨 ##審 ##寫 ##寬 ##寮 ##寰 ##寵 ##寶 ##寸 ##对 ##寺 ##寻 ##导 ##対 ##寿 ##封 ##専 ##射 ##将 ##將 ##專 ##尉 ##尊 ##尋 ##對 ##導 ##小 ##少 ##尔 ##尕 ##尖 ##尘 ##尚 ##尝 ##尤 ##尧 ##尬 ##就 ##尴 ##尷 ##尸 ##尹 ##尺 ##尻 ##尼 ##尽 ##尾 ##尿 ##局 ##屁 ##层 ##屄 ##居 ##屆 ##屈 ##屉 ##届 ##屋 ##屌 ##屍 ##屎 ##屏 ##屐 ##屑 ##展 ##屜 ##属 ##屠 ##屡 ##屢 ##層 ##履 ##屬 ##屯 ##山 ##屹 ##屿 ##岀 ##岁 ##岂 ##岌 ##岐 ##岑 ##岔 ##岖 ##岗 ##岘 ##岙 ##岚 ##岛 ##岡 ##岩 ##岫 ##岬 ##岭 ##岱 ##岳 ##岷 ##岸 ##峇 
##峋 ##峒 ##峙 ##峡 ##峤 ##峥 ##峦 ##峨 ##峪 ##峭 ##峯 ##峰 ##峴 ##島 ##峻 ##峽 ##崁 ##崂 ##崆 ##崇 ##崎 ##崑 ##崔 ##崖 ##崗 ##崙 ##崛 ##崧 ##崩 ##崭 ##崴 ##崽 ##嵇 ##嵊 ##嵋 ##嵌 ##嵐 ##嵘 ##嵩 ##嵬 ##嵯 ##嶂 ##嶄 ##嶇 ##嶋 ##嶙 ##嶺 ##嶼 ##嶽 ##巅 ##巍 ##巒 ##巔 ##巖 ##川 ##州 ##巡 ##巢 ##工 ##左 ##巧 ##巨 ##巩 ##巫 ##差 ##己 ##已 ##巳 ##巴 ##巷 ##巻 ##巽 ##巾 ##巿 ##币 ##市 ##布 ##帅 ##帆 ##师 ##希 ##帐 ##帑 ##帕 ##帖 ##帘 ##帚 ##帛 ##帜 ##帝 ##帥 ##带 ##帧 ##師 ##席 ##帮 ##帯 ##帰 ##帳 ##帶 ##帷 ##常 ##帼 ##帽 ##幀 ##幂 ##幄 ##幅 ##幌 ##幔 ##幕 ##幟 ##幡 ##幢 ##幣 ##幫 ##干 ##平 ##年 ##并 ##幸 ##幹 ##幺 ##幻 ##幼 ##幽 ##幾 ##广 ##庁 ##広 ##庄 ##庆 ##庇 ##床 ##序 ##庐 ##库 ##应 ##底 ##庖 ##店 ##庙 ##庚 ##府 ##庞 ##废 ##庠 ##度 ##座 ##庫 ##庭 ##庵 ##庶 ##康 ##庸 ##庹 ##庾 ##廁 ##廂 ##廃 ##廈 ##廉 ##廊 ##廓 ##廖 ##廚 ##廝 ##廟 ##廠 ##廢 ##廣 ##廬 ##廳 ##延 ##廷 ##建 ##廿 ##开 ##弁 ##异 ##弃 ##弄 ##弈 ##弊 ##弋 ##式 ##弑 ##弒 ##弓 ##弔 ##引 ##弗 ##弘 ##弛 ##弟 ##张 ##弥 ##弦 ##弧 ##弩 ##弭 ##弯 ##弱 ##張 ##強 ##弹 ##强 ##弼 ##弾 ##彅 ##彆 ##彈 ##彌 ##彎 ##归 ##当 ##录 ##彗 ##彙 ##彝 ##形 ##彤 ##彥 ##彦 ##彧 ##彩 ##彪 ##彫 ##彬 ##彭 ##彰 ##影 ##彷 ##役 ##彻 ##彼 ##彿 ##往 ##征 ##径 ##待 ##徇 ##很 ##徉 ##徊 ##律 ##後 ##徐 ##徑 ##徒 ##従 ##徕 ##得 ##徘 ##徙 ##徜 ##從 ##徠 ##御 ##徨 ##復 ##循 ##徬 ##微 ##徳 ##徴 ##徵 ##德 ##徹 ##徼 ##徽 ##心 ##必 ##忆 ##忌 ##忍 ##忏 ##忐 ##忑 ##忒 ##忖 ##志 ##忘 ##忙 ##応 ##忠 ##忡 ##忤 ##忧 ##忪 ##快 ##忱 ##念 ##忻 ##忽 ##忿 ##怀 ##态 ##怂 ##怅 ##怆 ##怎 ##怏 ##怒 ##怔 ##怕 ##怖 ##怙 ##怜 ##思 ##怠 ##怡 ##急 ##怦 ##性 ##怨 ##怪 ##怯 ##怵 ##总 ##怼 ##恁 ##恃 ##恆 ##恋 ##恍 ##恐 ##恒 ##恕 ##恙 ##恚 ##恢 ##恣 ##恤 ##恥 ##恨 ##恩 ##恪 ##恫 ##恬 ##恭 ##息 ##恰 ##恳 ##恵 ##恶 ##恸 ##恺 ##恻 ##恼 ##恿 ##悄 ##悅 ##悉 ##悌 ##悍 ##悔 ##悖 ##悚 ##悟 ##悠 ##患 ##悦 ##您 ##悩 ##悪 ##悬 ##悯 ##悱 ##悲 ##悴 ##悵 ##悶 ##悸 ##悻 ##悼 ##悽 ##情 ##惆 ##惇 ##惊 ##惋 ##惑 ##惕 ##惘 ##惚 ##惜 ##惟 ##惠 ##惡 ##惦 ##惧 ##惨 ##惩 ##惫 ##惬 ##惭 ##惮 ##惯 ##惰 ##惱 ##想 ##惴 ##惶 ##惹 ##惺 ##愁 ##愆 ##愈 ##愉 ##愍 ##意 ##愕 ##愚 ##愛 ##愜 ##感 ##愣 ##愤 ##愧 ##愫 ##愷 ##愿 ##慄 ##慈 ##態 ##慌 ##慎 ##慑 ##慕 ##慘 ##慚 ##慟 ##慢 ##慣 ##慧 ##慨 ##慫 ##慮 ##慰 ##慳 ##慵 ##慶 ##慷 ##慾 ##憂 ##憊 ##憋 ##憎 ##憐 ##憑 ##憔 ##憚 ##憤 ##憧 ##憨 ##憩 ##憫 ##憬 ##憲 ##憶 ##憾 ##懂 ##懇 ##懈 ##應 ##懊 ##懋 ##懑 ##懒 ##懦 ##懲 ##懵 ##懶 ##懷 ##懸 ##懺 ##懼 ##懾 ##懿 ##戀 ##戈 ##戊 ##戌 ##戍 ##戎 ##戏 ##成 ##我 ##戒 ##戕 ##或 ##战 ##戚 ##戛 ##戟 ##戡 ##戦 ##截 ##戬 
##戮 ##戰 ##戲 ##戳 ##戴 ##戶 ##户 ##戸 ##戻 ##戾 ##房 ##所 ##扁 ##扇 ##扈 ##扉 ##手 ##才 ##扎 ##扑 ##扒 ##打 ##扔 ##払 ##托 ##扛 ##扣 ##扦 ##执 ##扩 ##扪 ##扫 ##扬 ##扭 ##扮 ##扯 ##扰 ##扱 ##扳 ##扶 ##批 ##扼 ##找 ##承 ##技 ##抄 ##抉 ##把 ##抑 ##抒 ##抓 ##投 ##抖 ##抗 ##折 ##抚 ##抛 ##抜 ##択 ##抟 ##抠 ##抡 ##抢 ##护 ##报 ##抨 ##披 ##抬 ##抱 ##抵 ##抹 ##押 ##抽 ##抿 ##拂 ##拄 ##担 ##拆 ##拇 ##拈 ##拉 ##拋 ##拌 ##拍 ##拎 ##拐 ##拒 ##拓 ##拔 ##拖 ##拗 ##拘 ##拙 ##拚 ##招 ##拜 ##拟 ##拡 ##拢 ##拣 ##拥 ##拦 ##拧 ##拨 ##择 ##括 ##拭 ##拮 ##拯 ##拱 ##拳 ##拴 ##拷 ##拼 ##拽 ##拾 ##拿 ##持 ##挂 ##指 ##挈 ##按 ##挎 ##挑 ##挖 ##挙 ##挚 ##挛 ##挝 ##挞 ##挟 ##挠 ##挡 ##挣 ##挤 ##挥 ##挨 ##挪 ##挫 ##振 ##挲 ##挹 ##挺 ##挽 ##挾 ##捂 ##捅 ##捆 ##捉 ##捋 ##捌 ##捍 ##捎 ##捏 ##捐 ##捕 ##捞 ##损 ##捡 ##换 ##捣 ##捧 ##捨 ##捩 ##据 ##捱 ##捲 ##捶 ##捷 ##捺 ##捻 ##掀 ##掂 ##掃 ##掇 ##授 ##掉 ##掌 ##掏 ##掐 ##排 ##掖 ##掘 ##掙 ##掛 ##掠 ##採 ##探 ##掣 ##接 ##控 ##推 ##掩 ##措 ##掬 ##掰 ##掲 ##掳 ##掴 ##掷 ##掸 ##掺 ##揀 ##揃 ##揄 ##揆 ##揉 ##揍 ##描 ##提 ##插 ##揖 ##揚 ##換 ##握 ##揣 ##揩 ##揪 ##揭 ##揮 ##援 ##揶 ##揸 ##揹 ##揽 ##搀 ##搁 ##搂 ##搅 ##損 ##搏 ##搐 ##搓 ##搔 ##搖 ##搗 ##搜 ##搞 ##搡 ##搪 ##搬 ##搭 ##搵 ##搶 ##携 ##搽 ##摀 ##摁 ##摄 ##摆 ##摇 ##摈 ##摊 ##摒 ##摔 ##摘 ##摞 ##摟 ##摧 ##摩 ##摯 ##摳 ##摸 ##摹 ##摺 ##摻 ##撂 ##撃 ##撅 ##撇 ##撈 ##撐 ##撑 ##撒 ##撓 ##撕 ##撚 ##撞 ##撤 ##撥 ##撩 ##撫 ##撬 ##播 ##撮 ##撰 ##撲 ##撵 ##撷 ##撸 ##撻 ##撼 ##撿 ##擀 ##擁 ##擂 ##擄 ##擅 ##擇 ##擊 ##擋 ##操 ##擎 ##擒 ##擔 ##擘 ##據 ##擞 ##擠 ##擡 ##擢 ##擦 ##擬 ##擰 ##擱 ##擲 ##擴 ##擷 ##擺 ##擼 ##擾 ##攀 ##攏 ##攒 ##攔 ##攘 ##攙 ##攜 ##攝 ##攞 ##攢 ##攣 ##攤 ##攥 ##攪 ##攫 ##攬 ##支 ##收 ##攸 ##改 ##攻 ##放 ##政 ##故 ##效 ##敌 ##敍 ##敎 ##敏 ##救 ##敕 ##敖 ##敗 ##敘 ##教 ##敛 ##敝 ##敞 ##敢 ##散 ##敦 ##敬 ##数 ##敲 ##整 ##敵 ##敷 ##數 ##斂 ##斃 ##文 ##斋 ##斌 ##斎 ##斐 ##斑 ##斓 ##斗 ##料 ##斛 ##斜 ##斟 ##斡 ##斤 ##斥 ##斧 ##斩 ##斫 ##斬 ##断 ##斯 ##新 ##斷 ##方 ##於 ##施 ##旁 ##旃 ##旅 ##旋 ##旌 ##旎 ##族 ##旖 ##旗 ##无 ##既 ##日 ##旦 ##旧 ##旨 ##早 ##旬 ##旭 ##旮 ##旱 ##时 ##旷 ##旺 ##旻 ##昀 ##昂 ##昆 ##昇 ##昉 ##昊 ##昌 ##明 ##昏 ##易 ##昔 ##昕 ##昙 ##星 ##映 ##春 ##昧 ##昨 ##昭 ##是 ##昱 ##昴 ##昵 ##昶 ##昼 ##显 ##晁 ##時 ##晃 ##晉 ##晋 ##晌 ##晏 ##晒 ##晓 ##晔 ##晕 ##晖 ##晗 ##晚 ##晝 ##晞 ##晟 ##晤 ##晦 ##晨 ##晩 ##普 ##景 ##晰 ##晴 ##晶 ##晷 ##智 ##晾 ##暂 ##暄 ##暇 ##暈 ##暉 ##暌 ##暐 ##暑 ##暖 ##暗 ##暝 ##暢 ##暧 ##暨 ##暫 ##暮 ##暱 ##暴 ##暸 ##暹 ##曄 ##曆 ##曇 ##曉 
##曖 ##曙 ##曜 ##曝 ##曠 ##曦 ##曬 ##曰 ##曲 ##曳 ##更 ##書 ##曹 ##曼 ##曾 ##替 ##最 ##會 ##月 ##有 ##朋 ##服 ##朐 ##朔 ##朕 ##朗 ##望 ##朝 ##期 ##朦 ##朧 ##木 ##未 ##末 ##本 ##札 ##朮 ##术 ##朱 ##朴 ##朵 ##机 ##朽 ##杀 ##杂 ##权 ##杆 ##杈 ##杉 ##李 ##杏 ##材 ##村 ##杓 ##杖 ##杜 ##杞 ##束 ##杠 ##条 ##来 ##杨 ##杭 ##杯 ##杰 ##東 ##杳 ##杵 ##杷 ##杼 ##松 ##板 ##极 ##构 ##枇 ##枉 ##枋 ##析 ##枕 ##林 ##枚 ##果 ##枝 ##枢 ##枣 ##枪 ##枫 ##枭 ##枯 ##枰 ##枱 ##枳 ##架 ##枷 ##枸 ##柄 ##柏 ##某 ##柑 ##柒 ##染 ##柔 ##柘 ##柚 ##柜 ##柞 ##柠 ##柢 ##查 ##柩 ##柬 ##柯 ##柱 ##柳 ##柴 ##柵 ##査 ##柿 ##栀 ##栃 ##栄 ##栅 ##标 ##栈 ##栉 ##栋 ##栎 ##栏 ##树 ##栓 ##栖 ##栗 ##校 ##栩 ##株 ##样 ##核 ##根 ##格 ##栽 ##栾 ##桀 ##桁 ##桂 ##桃 ##桅 ##框 ##案 ##桉 ##桌 ##桎 ##桐 ##桑 ##桓 ##桔 ##桜 ##桠 ##桡 ##桢 ##档 ##桥 ##桦 ##桧 ##桨 ##桩 ##桶 ##桿 ##梁 ##梅 ##梆 ##梏 ##梓 ##梗 ##條 ##梟 ##梢 ##梦 ##梧 ##梨 ##梭 ##梯 ##械 ##梳 ##梵 ##梶 ##检 ##棂 ##棄 ##棉 ##棋 ##棍 ##棒 ##棕 ##棗 ##棘 ##棚 ##棟 ##棠 ##棣 ##棧 ##森 ##棱 ##棲 ##棵 ##棹 ##棺 ##椁 ##椅 ##椋 ##植 ##椎 ##椒 ##検 ##椪 ##椭 ##椰 ##椹 ##椽 ##椿 ##楂 ##楊 ##楓 ##楔 ##楚 ##楝 ##楞 ##楠 ##楣 ##楨 ##楫 ##業 ##楮 ##極 ##楷 ##楸 ##楹 ##楼 ##楽 ##概 ##榄 ##榆 ##榈 ##榉 ##榔 ##榕 ##榖 ##榛 ##榜 ##榨 ##榫 ##榭 ##榮 ##榱 ##榴 ##榷 ##榻 ##槁 ##槃 ##構 ##槌 ##槍 ##槎 ##槐 ##槓 ##様 ##槛 ##槟 ##槤 ##槭 ##槲 ##槳 ##槻 ##槽 ##槿 ##樁 ##樂 ##樊 ##樑 ##樓 ##標 ##樞 ##樟 ##模 ##樣 ##権 ##横 ##樫 ##樯 ##樱 ##樵 ##樸 ##樹 ##樺 ##樽 ##樾 ##橄 ##橇 ##橋 ##橐 ##橘 ##橙 ##機 ##橡 ##橢 ##橫 ##橱 ##橹 ##橼 ##檀 ##檄 ##檎 ##檐 ##檔 ##檗 ##檜 ##檢 ##檬 ##檯 ##檳 ##檸 ##檻 ##櫃 ##櫚 ##櫛 ##櫥 ##櫸 ##櫻 ##欄 ##權 ##欒 ##欖 ##欠 ##次 ##欢 ##欣 ##欧 ##欲 ##欸 ##欺 ##欽 ##款 ##歆 ##歇 ##歉 ##歌 ##歎 ##歐 ##歓 ##歙 ##歛 ##歡 ##止 ##正 ##此 ##步 ##武 ##歧 ##歩 ##歪 ##歯 ##歲 ##歳 ##歴 ##歷 ##歸 ##歹 ##死 ##歼 ##殁 ##殃 ##殆 ##殇 ##殉 ##殊 ##残 ##殒 ##殓 ##殖 ##殘 ##殞 ##殡 ##殤 ##殭 ##殯 ##殲 ##殴 ##段 ##殷 ##殺 ##殼 ##殿 ##毀 ##毁 ##毂 ##毅 ##毆 ##毋 ##母 ##毎 ##每 ##毒 ##毓 ##比 ##毕 ##毗 ##毘 ##毙 ##毛 ##毡 ##毫 ##毯 ##毽 ##氈 ##氏 ##氐 ##民 ##氓 ##气 ##氖 ##気 ##氙 ##氛 ##氟 ##氡 ##氢 ##氣 ##氤 ##氦 ##氧 ##氨 ##氪 ##氫 ##氮 ##氯 ##氰 ##氲 ##水 ##氷 ##永 ##氹 ##氾 ##汀 ##汁 ##求 ##汆 ##汇 ##汉 ##汎 ##汐 ##汕 ##汗 ##汙 ##汛 ##汝 ##汞 ##江 ##池 ##污 ##汤 ##汨 ##汩 ##汪 ##汰 ##汲 ##汴 ##汶 ##汹 ##決 ##汽 ##汾 ##沁 ##沂 ##沃 ##沅 ##沈 ##沉 ##沌 ##沏 ##沐 ##沒 ##沓 ##沖 ##沙 ##沛 ##沟 ##没 ##沢 ##沣 ##沥 ##沦 ##沧 ##沪 ##沫 ##沭 ##沮 ##沱 ##河 ##沸 ##油 ##治 
##沼 ##沽 ##沾 ##沿 ##況 ##泄 ##泉 ##泊 ##泌 ##泓 ##法 ##泗 ##泛 ##泞 ##泠 ##泡 ##波 ##泣 ##泥 ##注 ##泪 ##泫 ##泮 ##泯 ##泰 ##泱 ##泳 ##泵 ##泷 ##泸 ##泻 ##泼 ##泽 ##泾 ##洁 ##洄 ##洋 ##洒 ##洗 ##洙 ##洛 ##洞 ##津 ##洩 ##洪 ##洮 ##洱 ##洲 ##洵 ##洶 ##洸 ##洹 ##活 ##洼 ##洽 ##派 ##流 ##浃 ##浄 ##浅 ##浆 ##浇 ##浊 ##测 ##济 ##浏 ##浑 ##浒 ##浓 ##浔 ##浙 ##浚 ##浜 ##浣 ##浦 ##浩 ##浪 ##浬 ##浮 ##浯 ##浴 ##海 ##浸 ##涂 ##涅 ##涇 ##消 ##涉 ##涌 ##涎 ##涓 ##涔 ##涕 ##涙 ##涛 ##涝 ##涞 ##涟 ##涠 ##涡 ##涣 ##涤 ##润 ##涧 ##涨 ##涩 ##涪 ##涮 ##涯 ##液 ##涵 ##涸 ##涼 ##涿 ##淀 ##淄 ##淅 ##淆 ##淇 ##淋 ##淌 ##淑 ##淒 ##淖 ##淘 ##淙 ##淚 ##淞 ##淡 ##淤 ##淦 ##淨 ##淩 ##淪 ##淫 ##淬 ##淮 ##深 ##淳 ##淵 ##混 ##淹 ##淺 ##添 ##淼 ##清 ##済 ##渉 ##渊 ##渋 ##渍 ##渎 ##渐 ##渔 ##渗 ##渙 ##渚 ##減 ##渝 ##渠 ##渡 ##渣 ##渤 ##渥 ##渦 ##温 ##測 ##渭 ##港 ##渲 ##渴 ##游 ##渺 ##渾 ##湃 ##湄 ##湊 ##湍 ##湖 ##湘 ##湛 ##湟 ##湧 ##湫 ##湮 ##湯 ##湳 ##湾 ##湿 ##満 ##溃 ##溅 ##溉 ##溏 ##源 ##準 ##溜 ##溝 ##溟 ##溢 ##溥 ##溧 ##溪 ##溫 ##溯 ##溱 ##溴 ##溶 ##溺 ##溼 ##滁 ##滂 ##滄 ##滅 ##滇 ##滋 ##滌 ##滑 ##滓 ##滔 ##滕 ##滙 ##滚 ##滝 ##滞 ##滟 ##满 ##滢 ##滤 ##滥 ##滦 ##滨 ##滩 ##滬 ##滯 ##滲 ##滴 ##滷 ##滸 ##滾 ##滿 ##漁 ##漂 ##漆 ##漉 ##漏 ##漓 ##演 ##漕 ##漠 ##漢 ##漣 ##漩 ##漪 ##漫 ##漬 ##漯 ##漱 ##漲 ##漳 ##漸 ##漾 ##漿 ##潆 ##潇 ##潋 ##潍 ##潑 ##潔 ##潘 ##潛 ##潜 ##潞 ##潟 ##潢 ##潤 ##潦 ##潧 ##潭 ##潮 ##潰 ##潴 ##潸 ##潺 ##潼 ##澀 ##澄 ##澆 ##澈 ##澍 ##澎 ##澗 ##澜 ##澡 ##澤 ##澧 ##澱 ##澳 ##澹 ##激 ##濁 ##濂 ##濃 ##濑 ##濒 ##濕 ##濘 ##濛 ##濟 ##濠 ##濡 ##濤 ##濫 ##濬 ##濮 ##濯 ##濱 ##濺 ##濾 ##瀅 ##瀆 ##瀉 ##瀋 ##瀏 ##瀑 ##瀕 ##瀘 ##瀚 ##瀛 ##瀝 ##瀞 ##瀟 ##瀧 ##瀨 ##瀬 ##瀰 ##瀾 ##灌 ##灏 ##灑 ##灘 ##灝 ##灞 ##灣 ##火 ##灬 ##灭 ##灯 ##灰 ##灵 ##灶 ##灸 ##灼 ##災 ##灾 ##灿 ##炀 ##炁 ##炅 ##炉 ##炊 ##炎 ##炒 ##炔 ##炕 ##炖 ##炙 ##炜 ##炫 ##炬 ##炭 ##炮 ##炯 ##炳 ##炷 ##炸 ##点 ##為 ##炼 ##炽 ##烁 ##烂 ##烃 ##烈 ##烊 ##烏 ##烘 ##烙 ##烛 ##烟 ##烤 ##烦 ##烧 ##烨 ##烩 ##烫 ##烬 ##热 ##烯 ##烷 ##烹 ##烽 ##焉 ##焊 ##焕 ##焖 ##焗 ##焘 ##焙 ##焚 ##焜 ##無 ##焦 ##焯 ##焰 ##焱 ##然 ##焼 ##煅 ##煉 ##煊 ##煌 ##煎 ##煒 ##煖 ##煙 ##煜 ##煞 ##煤 ##煥 ##煦 ##照 ##煨 ##煩 ##煮 ##煲 ##煸 ##煽 ##熄 ##熊 ##熏 ##熒 ##熔 ##熙 ##熟 ##熠 ##熨 ##熬 ##熱 ##熵 ##熹 ##熾 ##燁 ##燃 ##燄 ##燈 ##燉 ##燊 ##燎 ##燒 ##燔 ##燕 ##燙 ##燜 ##營 ##燥 ##燦 ##燧 ##燭 ##燮 ##燴 ##燻 ##燼 ##燿 ##爆 ##爍 ##爐 ##爛 ##爪 ##爬 ##爭 ##爰 ##爱 ##爲 ##爵 ##父 ##爷 ##爸 ##爹 ##爺 ##爻 ##爽 ##爾 ##牆 ##片 ##版 ##牌 ##牍 ##牒 ##牙 
##牛 ##牝 ##牟 ##牠 ##牡 ##牢 ##牦 ##牧 ##物 ##牯 ##牲 ##牴 ##牵 ##特 ##牺 ##牽 ##犀 ##犁 ##犄 ##犊 ##犍 ##犒 ##犢 ##犧 ##犬 ##犯 ##状 ##犷 ##犸 ##犹 ##狀 ##狂 ##狄 ##狈 ##狎 ##狐 ##狒 ##狗 ##狙 ##狞 ##狠 ##狡 ##狩 ##独 ##狭 ##狮 ##狰 ##狱 ##狸 ##狹 ##狼 ##狽 ##猎 ##猕 ##猖 ##猗 ##猙 ##猛 ##猜 ##猝 ##猥 ##猩 ##猪 ##猫 ##猬 ##献 ##猴 ##猶 ##猷 ##猾 ##猿 ##獄 ##獅 ##獎 ##獐 ##獒 ##獗 ##獠 ##獣 ##獨 ##獭 ##獰 ##獲 ##獵 ##獷 ##獸 ##獺 ##獻 ##獼 ##獾 ##玄 ##率 ##玉 ##王 ##玑 ##玖 ##玛 ##玟 ##玠 ##玥 ##玩 ##玫 ##玮 ##环 ##现 ##玲 ##玳 ##玷 ##玺 ##玻 ##珀 ##珂 ##珅 ##珈 ##珉 ##珊 ##珍 ##珏 ##珐 ##珑 ##珙 ##珞 ##珠 ##珣 ##珥 ##珩 ##珪 ##班 ##珮 ##珲 ##珺 ##現 ##球 ##琅 ##理 ##琇 ##琉 ##琊 ##琍 ##琏 ##琐 ##琛 ##琢 ##琥 ##琦 ##琨 ##琪 ##琬 ##琮 ##琰 ##琲 ##琳 ##琴 ##琵 ##琶 ##琺 ##琼 ##瑀 ##瑁 ##瑄 ##瑋 ##瑕 ##瑗 ##瑙 ##瑚 ##瑛 ##瑜 ##瑞 ##瑟 ##瑠 ##瑣 ##瑤 ##瑩 ##瑪 ##瑯 ##瑰 ##瑶 ##瑾 ##璀 ##璁 ##璃 ##璇 ##璉 ##璋 ##璎 ##璐 ##璜 ##璞 ##璟 ##璧 ##璨 ##環 ##璽 ##璿 ##瓊 ##瓏 ##瓒 ##瓜 ##瓢 ##瓣 ##瓤 ##瓦 ##瓮 ##瓯 ##瓴 ##瓶 ##瓷 ##甄 ##甌 ##甕 ##甘 ##甙 ##甚 ##甜 ##生 ##產 ##産 ##甥 ##甦 ##用 ##甩 ##甫 ##甬 ##甭 ##甯 ##田 ##由 ##甲 ##申 ##电 ##男 ##甸 ##町 ##画 ##甾 ##畀 ##畅 ##界 ##畏 ##畑 ##畔 ##留 ##畜 ##畝 ##畢 ##略 ##畦 ##番 ##畫 ##異 ##畲 ##畳 ##畴 ##當 ##畸 ##畹 ##畿 ##疆 ##疇 ##疊 ##疏 ##疑 ##疔 ##疖 ##疗 ##疙 ##疚 ##疝 ##疟 ##疡 ##疣 ##疤 ##疥 ##疫 ##疮 ##疯 ##疱 ##疲 ##疳 ##疵 ##疸 ##疹 ##疼 ##疽 ##疾 ##痂 ##病 ##症 ##痈 ##痉 ##痊 ##痍 ##痒 ##痔 ##痕 ##痘 ##痙 ##痛 ##痞 ##痠 ##痢 ##痣 ##痤 ##痧 ##痨 ##痪 ##痫 ##痰 ##痱 ##痴 ##痹 ##痺 ##痼 ##痿 ##瘀 ##瘁 ##瘋 ##瘍 ##瘓 ##瘘 ##瘙 ##瘟 ##瘠 ##瘡 ##瘢 ##瘤 ##瘦 ##瘧 ##瘩 ##瘪 ##瘫 ##瘴 ##瘸 ##瘾 ##療 ##癇 ##癌 ##癒 ##癖 ##癜 ##癞 ##癡 ##癢 ##癣 ##癥 ##癫 ##癬 ##癮 ##癱 ##癲 ##癸 ##発 ##登 ##發 ##白 ##百 ##皂 ##的 ##皆 ##皇 ##皈 ##皋 ##皎 ##皑 ##皓 ##皖 ##皙 ##皚 ##皮 ##皰 ##皱 ##皴 ##皺 ##皿 ##盂 ##盃 ##盅 ##盆 ##盈 ##益 ##盎 ##盏 ##盐 ##监 ##盒 ##盔 ##盖 ##盗 ##盘 ##盛 ##盜 ##盞 ##盟 ##盡 ##監 ##盤 ##盥 ##盧 ##盪 ##目 ##盯 ##盱 ##盲 ##直 ##相 ##盹 ##盼 ##盾 ##省 ##眈 ##眉 ##看 ##県 ##眙 ##眞 ##真 ##眠 ##眦 ##眨 ##眩 ##眯 ##眶 ##眷 ##眸 ##眺 ##眼 ##眾 ##着 ##睁 ##睇 ##睏 ##睐 ##睑 ##睛 ##睜 ##睞 ##睡 ##睢 ##督 ##睥 ##睦 ##睨 ##睪 ##睫 ##睬 ##睹 ##睽 ##睾 ##睿 ##瞄 ##瞅 ##瞇 ##瞋 ##瞌 ##瞎 ##瞑 ##瞒 ##瞓 ##瞞 ##瞟 ##瞠 ##瞥 ##瞧 ##瞩 ##瞪 ##瞬 ##瞭 ##瞰 ##瞳 ##瞻 ##瞼 ##瞿 ##矇 ##矍 ##矗 ##矚 ##矛 ##矜 ##矢 ##矣 ##知 ##矩 ##矫 ##短 ##矮 ##矯 ##石 ##矶 ##矽 ##矾 ##矿 ##码 ##砂 ##砌 ##砍 ##砒 ##研 ##砖 ##砗 ##砚 
##砝 ##砣 ##砥 ##砧 ##砭 ##砰 ##砲 ##破 ##砷 ##砸 ##砺 ##砼 ##砾 ##础 ##硅 ##硐 ##硒 ##硕 ##硝 ##硫 ##硬 ##确 ##硯 ##硼 ##碁 ##碇 ##碉 ##碌 ##碍 ##碎 ##碑 ##碓 ##碗 ##碘 ##碚 ##碛 ##碟 ##碣 ##碧 ##碩 ##碰 ##碱 ##碳 ##碴 ##確 ##碼 ##碾 ##磁 ##磅 ##磊 ##磋 ##磐 ##磕 ##磚 ##磡 ##磨 ##磬 ##磯 ##磲 ##磷 ##磺 ##礁 ##礎 ##礙 ##礡 ##礦 ##礪 ##礫 ##礴 ##示 ##礼 ##社 ##祀 ##祁 ##祂 ##祇 ##祈 ##祉 ##祎 ##祐 ##祕 ##祖 ##祗 ##祚 ##祛 ##祜 ##祝 ##神 ##祟 ##祠 ##祢 ##祥 ##票 ##祭 ##祯 ##祷 ##祸 ##祺 ##祿 ##禀 ##禁 ##禄 ##禅 ##禍 ##禎 ##福 ##禛 ##禦 ##禧 ##禪 ##禮 ##禱 ##禹 ##禺 ##离 ##禽 ##禾 ##禿 ##秀 ##私 ##秃 ##秆 ##秉 ##秋 ##种 ##科 ##秒 ##秘 ##租 ##秣 ##秤 ##秦 ##秧 ##秩 ##秭 ##积 ##称 ##秸 ##移 ##秽 ##稀 ##稅 ##程 ##稍 ##税 ##稔 ##稗 ##稚 ##稜 ##稞 ##稟 ##稠 ##稣 ##種 ##稱 ##稲 ##稳 ##稷 ##稹 ##稻 ##稼 ##稽 ##稿 ##穀 ##穂 ##穆 ##穌 ##積 ##穎 ##穗 ##穢 ##穩 ##穫 ##穴 ##究 ##穷 ##穹 ##空 ##穿 ##突 ##窃 ##窄 ##窈 ##窍 ##窑 ##窒 ##窓 ##窕 ##窖 ##窗 ##窘 ##窜 ##窝 ##窟 ##窠 ##窥 ##窦 ##窨 ##窩 ##窪 ##窮 ##窯 ##窺 ##窿 ##竄 ##竅 ##竇 ##竊 ##立 ##竖 ##站 ##竜 ##竞 ##竟 ##章 ##竣 ##童 ##竭 ##端 ##競 ##竹 ##竺 ##竽 ##竿 ##笃 ##笆 ##笈 ##笋 ##笏 ##笑 ##笔 ##笙 ##笛 ##笞 ##笠 ##符 ##笨 ##第 ##笹 ##笺 ##笼 ##筆 ##等 ##筊 ##筋 ##筍 ##筏 ##筐 ##筑 ##筒 ##答 ##策 ##筛 ##筝 ##筠 ##筱 ##筲 ##筵 ##筷 ##筹 ##签 ##简 ##箇 ##箋 ##箍 ##箏 ##箐 ##箔 ##箕 ##算 ##箝 ##管 ##箩 ##箫 ##箭 ##箱 ##箴 ##箸 ##節 ##篁 ##範 ##篆 ##篇 ##築 ##篑 ##篓 ##篙 ##篝 ##篠 ##篡 ##篤 ##篩 ##篪 ##篮 ##篱 ##篷 ##簇 ##簌 ##簍 ##簡 ##簦 ##簧 ##簪 ##簫 ##簷 ##簸 ##簽 ##簾 ##簿 ##籁 ##籃 ##籌 ##籍 ##籐 ##籟 ##籠 ##籤 ##籬 ##籮 ##籲 ##米 ##类 ##籼 ##籽 ##粄 ##粉 ##粑 ##粒 ##粕 ##粗 ##粘 ##粟 ##粤 ##粥 ##粧 ##粪 ##粮 ##粱 ##粲 ##粳 ##粵 ##粹 ##粼 ##粽 ##精 ##粿 ##糅 ##糊 ##糍 ##糕 ##糖 ##糗 ##糙 ##糜 ##糞 ##糟 ##糠 ##糧 ##糬 ##糯 ##糰 ##糸 ##系 ##糾 ##紀 ##紂 ##約 ##紅 ##紉 ##紊 ##紋 ##納 ##紐 ##紓 ##純 ##紗 ##紘 ##紙 ##級 ##紛 ##紜 ##素 ##紡 ##索 ##紧 ##紫 ##紮 ##累 ##細 ##紳 ##紹 ##紺 ##終 ##絃 ##組 ##絆 ##経 ##結 ##絕 ##絞 ##絡 ##絢 ##給 ##絨 ##絮 ##統 ##絲 ##絳 ##絵 ##絶 ##絹 ##綁 ##綏 ##綑 ##經 ##継 ##続 ##綜 ##綠 ##綢 ##綦 ##綫 ##綬 ##維 ##綱 ##網 ##綴 ##綵 ##綸 ##綺 ##綻 ##綽 ##綾 ##綿 ##緊 ##緋 ##総 ##緑 ##緒 ##緘 ##線 ##緝 ##緞 ##締 ##緣 ##編 ##緩 ##緬 ##緯 ##練 ##緹 ##緻 ##縁 ##縄 ##縈 ##縛 ##縝 ##縣 ##縫 ##縮 ##縱 ##縴 ##縷 ##總 ##績 ##繁 ##繃 ##繆 ##繇 ##繋 ##織 ##繕 ##繚 ##繞 ##繡 ##繩 ##繪 ##繫 ##繭 ##繳 ##繹 ##繼 ##繽 ##纂 ##續 ##纍 ##纏 ##纓 ##纔 ##纖 ##纜 ##纠 ##红 ##纣 ##纤 ##约 ##级 ##纨 ##纪 ##纫 
##纬 ##纭 ##纯 ##纰 ##纱 ##纲 ##纳 ##纵 ##纶 ##纷 ##纸 ##纹 ##纺 ##纽 ##纾 ##线 ##绀 ##练 ##组 ##绅 ##细 ##织 ##终 ##绊 ##绍 ##绎 ##经 ##绑 ##绒 ##结 ##绔 ##绕 ##绘 ##给 ##绚 ##绛 ##络 ##绝 ##绞 ##统 ##绡 ##绢 ##绣 ##绥 ##绦 ##继 ##绩 ##绪 ##绫 ##续 ##绮 ##绯 ##绰 ##绳 ##维 ##绵 ##绶 ##绷 ##绸 ##绻 ##综 ##绽 ##绾 ##绿 ##缀 ##缄 ##缅 ##缆 ##缇 ##缈 ##缉 ##缎 ##缓 ##缔 ##缕 ##编 ##缘 ##缙 ##缚 ##缜 ##缝 ##缠 ##缢 ##缤 ##缥 ##缨 ##缩 ##缪 ##缭 ##缮 ##缰 ##缱 ##缴 ##缸 ##缺 ##缽 ##罂 ##罄 ##罌 ##罐 ##网 ##罔 ##罕 ##罗 ##罚 ##罡 ##罢 ##罩 ##罪 ##置 ##罰 ##署 ##罵 ##罷 ##罹 ##羁 ##羅 ##羈 ##羊 ##羌 ##美 ##羔 ##羚 ##羞 ##羟 ##羡 ##羣 ##群 ##羥 ##羧 ##羨 ##義 ##羯 ##羲 ##羸 ##羹 ##羽 ##羿 ##翁 ##翅 ##翊 ##翌 ##翎 ##習 ##翔 ##翘 ##翟 ##翠 ##翡 ##翦 ##翩 ##翰 ##翱 ##翳 ##翹 ##翻 ##翼 ##耀 ##老 ##考 ##耄 ##者 ##耆 ##耋 ##而 ##耍 ##耐 ##耒 ##耕 ##耗 ##耘 ##耙 ##耦 ##耨 ##耳 ##耶 ##耷 ##耸 ##耻 ##耽 ##耿 ##聂 ##聆 ##聊 ##聋 ##职 ##聒 ##联 ##聖 ##聘 ##聚 ##聞 ##聪 ##聯 ##聰 ##聲 ##聳 ##聴 ##聶 ##職 ##聽 ##聾 ##聿 ##肃 ##肄 ##肅 ##肆 ##肇 ##肉 ##肋 ##肌 ##肏 ##肓 ##肖 ##肘 ##肚 ##肛 ##肝 ##肠 ##股 ##肢 ##肤 ##肥 ##肩 ##肪 ##肮 ##肯 ##肱 ##育 ##肴 ##肺 ##肽 ##肾 ##肿 ##胀 ##胁 ##胃 ##胄 ##胆 ##背 ##胍 ##胎 ##胖 ##胚 ##胛 ##胜 ##胝 ##胞 ##胡 ##胤 ##胥 ##胧 ##胫 ##胭 ##胯 ##胰 ##胱 ##胳 ##胴 ##胶 ##胸 ##胺 ##能 ##脂 ##脅 ##脆 ##脇 ##脈 ##脉 ##脊 ##脍 ##脏 ##脐 ##脑 ##脓 ##脖 ##脘 ##脚 ##脛 ##脣 ##脩 ##脫 ##脯 ##脱 ##脲 ##脳 ##脸 ##脹 ##脾 ##腆 ##腈 ##腊 ##腋 ##腌 ##腎 ##腐 ##腑 ##腓 ##腔 ##腕 ##腥 ##腦 ##腩 ##腫 ##腭 ##腮 ##腰 ##腱 ##腳 ##腴 ##腸 ##腹 ##腺 ##腻 ##腼 ##腾 ##腿 ##膀 ##膈 ##膊 ##膏 ##膑 ##膘 ##膚 ##膛 ##膜 ##膝 ##膠 ##膦 ##膨 ##膩 ##膳 ##膺 ##膻 ##膽 ##膾 ##膿 ##臀 ##臂 ##臃 ##臆 ##臉 ##臊 ##臍 ##臓 ##臘 ##臟 ##臣 ##臥 ##臧 ##臨 ##自 ##臬 ##臭 ##至 ##致 ##臺 ##臻 ##臼 ##臾 ##舀 ##舂 ##舅 ##舆 ##與 ##興 ##舉 ##舊 ##舌 ##舍 ##舎 ##舐 ##舒 ##舔 ##舖 ##舗 ##舛 ##舜 ##舞 ##舟 ##航 ##舫 ##般 ##舰 ##舱 ##舵 ##舶 ##舷 ##舸 ##船 ##舺 ##舾 ##艇 ##艋 ##艘 ##艙 ##艦 ##艮 ##良 ##艰 ##艱 ##色 ##艳 ##艷 ##艹 ##艺 ##艾 ##节 ##芃 ##芈 ##芊 ##芋 ##芍 ##芎 ##芒 ##芙 ##芜 ##芝 ##芡 ##芥 ##芦 ##芩 ##芪 ##芫 ##芬 ##芭 ##芮 ##芯 ##花 ##芳 ##芷 ##芸 ##芹 ##芻 ##芽 ##芾 ##苁 ##苄 ##苇 ##苋 ##苍 ##苏 ##苑 ##苒 ##苓 ##苔 ##苕 ##苗 ##苛 ##苜 ##苞 ##苟 ##苡 ##苣 ##若 ##苦 ##苫 ##苯 ##英 ##苷 ##苹 ##苻 ##茁 ##茂 ##范 ##茄 ##茅 ##茉 ##茎 ##茏 ##茗 ##茜 ##茧 ##茨 ##茫 ##茬 ##茭 ##茯 ##茱 ##茲 ##茴 ##茵 ##茶 ##茸 ##茹 ##茼 ##荀 ##荃 ##荆 ##草 ##荊 ##荏 ##荐 ##荒 ##荔 ##荖 ##荘 ##荚 ##荞 ##荟 
##荠 ##荡 ##荣 ##荤 ##荥 ##荧 ##荨 ##荪 ##荫 ##药 ##荳 ##荷 ##荸 ##荻 ##荼 ##荽 ##莅 ##莆 ##莉 ##莊 ##莎 ##莒 ##莓 ##莖 ##莘 ##莞 ##莠 ##莢 ##莧 ##莪 ##莫 ##莱 ##莲 ##莴 ##获 ##莹 ##莺 ##莽 ##莿 ##菀 ##菁 ##菅 ##菇 ##菈 ##菊 ##菌 ##菏 ##菓 ##菖 ##菘 ##菜 ##菟 ##菠 ##菡 ##菩 ##華 ##菱 ##菲 ##菸 ##菽 ##萁 ##萃 ##萄 ##萊 ##萋 ##萌 ##萍 ##萎 ##萘 ##萝 ##萤 ##营 ##萦 ##萧 ##萨 ##萩 ##萬 ##萱 ##萵 ##萸 ##萼 ##落 ##葆 ##葉 ##著 ##葚 ##葛 ##葡 ##董 ##葦 ##葩 ##葫 ##葬 ##葭 ##葯 ##葱 ##葳 ##葵 ##葷 ##葺 ##蒂 ##蒋 ##蒐 ##蒔 ##蒙 ##蒜 ##蒞 ##蒟 ##蒡 ##蒨 ##蒲 ##蒸 ##蒹 ##蒻 ##蒼 ##蒿 ##蓁 ##蓄 ##蓆 ##蓉 ##蓋 ##蓑 ##蓓 ##蓖 ##蓝 ##蓟 ##蓦 ##蓬 ##蓮 ##蓼 ##蓿 ##蔑 ##蔓 ##蔔 ##蔗 ##蔘 ##蔚 ##蔡 ##蔣 ##蔥 ##蔫 ##蔬 ##蔭 ##蔵 ##蔷 ##蔺 ##蔻 ##蔼 ##蔽 ##蕁 ##蕃 ##蕈 ##蕉 ##蕊 ##蕎 ##蕙 ##蕤 ##蕨 ##蕩 ##蕪 ##蕭 ##蕲 ##蕴 ##蕻 ##蕾 ##薄 ##薅 ##薇 ##薈 ##薊 ##薏 ##薑 ##薔 ##薙 ##薛 ##薦 ##薨 ##薩 ##薪 ##薬 ##薯 ##薰 ##薹 ##藉 ##藍 ##藏 ##藐 ##藓 ##藕 ##藜 ##藝 ##藤 ##藥 ##藩 ##藹 ##藻 ##藿 ##蘆 ##蘇 ##蘊 ##蘋 ##蘑 ##蘚 ##蘭 ##蘸 ##蘼 ##蘿 ##虎 ##虏 ##虐 ##虑 ##虔 ##處 ##虚 ##虛 ##虜 ##虞 ##號 ##虢 ##虧 ##虫 ##虬 ##虱 ##虹 ##虻 ##虽 ##虾 ##蚀 ##蚁 ##蚂 ##蚊 ##蚌 ##蚓 ##蚕 ##蚜 ##蚝 ##蚣 ##蚤 ##蚩 ##蚪 ##蚯 ##蚱 ##蚵 ##蛀 ##蛆 ##蛇 ##蛊 ##蛋 ##蛎 ##蛐 ##蛔 ##蛙 ##蛛 ##蛟 ##蛤 ##蛭 ##蛮 ##蛰 ##蛳 ##蛹 ##蛻 ##蛾 ##蜀 ##蜂 ##蜃 ##蜆 ##蜇 ##蜈 ##蜊 ##蜍 ##蜒 ##蜓 ##蜕 ##蜗 ##蜘 ##蜚 ##蜜 ##蜡 ##蜢 ##蜥 ##蜱 ##蜴 ##蜷 ##蜻 ##蜿 ##蝇 ##蝈 ##蝉 ##蝌 ##蝎 ##蝕 ##蝗 ##蝙 ##蝟 ##蝠 ##蝦 ##蝨 ##蝴 ##蝶 ##蝸 ##蝼 ##螂 ##螃 ##融 ##螞 ##螢 ##螨 ##螯 ##螳 ##螺 ##蟀 ##蟄 ##蟆 ##蟋 ##蟎 ##蟑 ##蟒 ##蟠 ##蟬 ##蟲 ##蟹 ##蟻 ##蟾 ##蠅 ##蠍 ##蠔 ##蠕 ##蠛 ##蠟 ##蠡 ##蠢 ##蠣 ##蠱 ##蠶 ##蠹 ##蠻 ##血 ##衄 ##衅 ##衆 ##行 ##衍 ##術 ##衔 ##街 ##衙 ##衛 ##衝 ##衞 ##衡 ##衢 ##衣 ##补 ##表 ##衩 ##衫 ##衬 ##衮 ##衰 ##衲 ##衷 ##衹 ##衾 ##衿 ##袁 ##袂 ##袄 ##袅 ##袈 ##袋 ##袍 ##袒 ##袖 ##袜 ##袞 ##袤 ##袪 ##被 ##袭 ##袱 ##裁 ##裂 ##装 ##裆 ##裊 ##裏 ##裔 ##裕 ##裘 ##裙 ##補 ##裝 ##裟 ##裡 ##裤 ##裨 ##裱 ##裳 ##裴 ##裸 ##裹 ##製 ##裾 ##褂 ##複 ##褐 ##褒 ##褓 ##褔 ##褚 ##褥 ##褪 ##褫 ##褲 ##褶 ##褻 ##襁 ##襄 ##襟 ##襠 ##襪 ##襬 ##襯 ##襲 ##西 ##要 ##覃 ##覆 ##覇 ##見 ##規 ##覓 ##視 ##覚 ##覦 ##覧 ##親 ##覬 ##観 ##覷 ##覺 ##覽 ##觀 ##见 ##观 ##规 ##觅 ##视 ##览 ##觉 ##觊 ##觎 ##觐 ##觑 ##角 ##觞 ##解 ##觥 ##触 ##觸 ##言 ##訂 ##計 ##訊 ##討 ##訓 ##訕 ##訖 ##託 ##記 ##訛 ##訝 ##訟 ##訣 ##訥 ##訪 ##設 ##許 ##訳 ##訴 ##訶 ##診 ##註 ##証 ##詆 ##詐 ##詔 ##評 ##詛 ##詞 ##詠 ##詡 ##詢 ##詣 ##試 ##詩 ##詫 ##詬 ##詭 ##詮 
##詰 ##話 ##該 ##詳 ##詹 ##詼 ##誅 ##誇 ##誉 ##誌 ##認 ##誓 ##誕 ##誘 ##語 ##誠 ##誡 ##誣 ##誤 ##誥 ##誦 ##誨 ##說 ##説 ##読 ##誰 ##課 ##誹 ##誼 ##調 ##諄 ##談 ##請 ##諏 ##諒 ##論 ##諗 ##諜 ##諡 ##諦 ##諧 ##諫 ##諭 ##諮 ##諱 ##諳 ##諷 ##諸 ##諺 ##諾 ##謀 ##謁 ##謂 ##謄 ##謊 ##謎 ##謐 ##謔 ##謗 ##謙 ##講 ##謝 ##謠 ##謨 ##謬 ##謹 ##謾 ##譁 ##證 ##譎 ##譏 ##識 ##譙 ##譚 ##譜 ##警 ##譬 ##譯 ##議 ##譲 ##譴 ##護 ##譽 ##讀 ##變 ##讓 ##讚 ##讞 ##计 ##订 ##认 ##讥 ##讧 ##讨 ##让 ##讪 ##讫 ##训 ##议 ##讯 ##记 ##讲 ##讳 ##讴 ##讶 ##讷 ##许 ##讹 ##论 ##讼 ##讽 ##设 ##访 ##诀 ##证 ##诃 ##评 ##诅 ##识 ##诈 ##诉 ##诊 ##诋 ##词 ##诏 ##译 ##试 ##诗 ##诘 ##诙 ##诚 ##诛 ##话 ##诞 ##诟 ##诠 ##诡 ##询 ##诣 ##诤 ##该 ##详 ##诧 ##诩 ##诫 ##诬 ##语 ##误 ##诰 ##诱 ##诲 ##说 ##诵 ##诶 ##请 ##诸 ##诺 ##读 ##诽 ##课 ##诿 ##谀 ##谁 ##调 ##谄 ##谅 ##谆 ##谈 ##谊 ##谋 ##谌 ##谍 ##谎 ##谏 ##谐 ##谑 ##谒 ##谓 ##谔 ##谕 ##谗 ##谘 ##谙 ##谚 ##谛 ##谜 ##谟 ##谢 ##谣 ##谤 ##谥 ##谦 ##谧 ##谨 ##谩 ##谪 ##谬 ##谭 ##谯 ##谱 ##谲 ##谴 ##谶 ##谷 ##豁 ##豆 ##豇 ##豈 ##豉 ##豊 ##豌 ##豎 ##豐 ##豔 ##豚 ##象 ##豢 ##豪 ##豫 ##豬 ##豹 ##豺 ##貂 ##貅 ##貌 ##貓 ##貔 ##貘 ##貝 ##貞 ##負 ##財 ##貢 ##貧 ##貨 ##販 ##貪 ##貫 ##責 ##貯 ##貰 ##貳 ##貴 ##貶 ##買 ##貸 ##費 ##貼 ##貽 ##貿 ##賀 ##賁 ##賂 ##賃 ##賄 ##資 ##賈 ##賊 ##賑 ##賓 ##賜 ##賞 ##賠 ##賡 ##賢 ##賣 ##賤 ##賦 ##質 ##賬 ##賭 ##賴 ##賺 ##購 ##賽 ##贅 ##贈 ##贊 ##贍 ##贏 ##贓 ##贖 ##贛 ##贝 ##贞 ##负 ##贡 ##财 ##责 ##贤 ##败 ##账 ##货 ##质 ##贩 ##贪 ##贫 ##贬 ##购 ##贮 ##贯 ##贰 ##贱 ##贲 ##贴 ##贵 ##贷 ##贸 ##费 ##贺 ##贻 ##贼 ##贾 ##贿 ##赁 ##赂 ##赃 ##资 ##赅 ##赈 ##赊 ##赋 ##赌 ##赎 ##赏 ##赐 ##赓 ##赔 ##赖 ##赘 ##赚 ##赛 ##赝 ##赞 ##赠 ##赡 ##赢 ##赣 ##赤 ##赦 ##赧 ##赫 ##赭 ##走 ##赳 ##赴 ##赵 ##赶 ##起 ##趁 ##超 ##越 ##趋 ##趕 ##趙 ##趟 ##趣 ##趨 ##足 ##趴 ##趵 ##趸 ##趺 ##趾 ##跃 ##跄 ##跆 ##跋 ##跌 ##跎 ##跑 ##跖 ##跚 ##跛 ##距 ##跟 ##跡 ##跤 ##跨 ##跩 ##跪 ##路 ##跳 ##践 ##跷 ##跹 ##跺 ##跻 ##踉 ##踊 ##踌 ##踏 ##踐 ##踝 ##踞 ##踟 ##踢 ##踩 ##踪 ##踮 ##踱 ##踴 ##踵 ##踹 ##蹂 ##蹄 ##蹇 ##蹈 ##蹉 ##蹊 ##蹋 ##蹑 ##蹒 ##蹙 ##蹟 ##蹣 ##蹤 ##蹦 ##蹩 ##蹬 ##蹭 ##蹲 ##蹴 ##蹶 ##蹺 ##蹼 ##蹿 ##躁 ##躇 ##躉 ##躊 ##躋 ##躍 ##躏 ##躪 ##身 ##躬 ##躯 ##躲 ##躺 ##軀 ##車 ##軋 ##軌 ##軍 ##軒 ##軟 ##転 ##軸 ##軼 ##軽 ##軾 ##較 ##載 ##輒 ##輓 ##輔 ##輕 ##輛 ##輝 ##輟 ##輩 ##輪 ##輯 ##輸 ##輻 ##輾 ##輿 ##轄 ##轅 ##轆 ##轉 ##轍 ##轎 ##轟 ##车 ##轧 ##轨 ##轩 ##转 ##轭 ##轮 ##软 ##轰 ##轲 ##轴 ##轶 ##轻 ##轼 ##载 ##轿 ##较 ##辄 ##辅 ##辆 ##辇 ##辈 ##辉 ##辊 ##辍 
##辐 ##辑 ##输 ##辕 ##辖 ##辗 ##辘 ##辙 ##辛 ##辜 ##辞 ##辟 ##辣 ##辦 ##辨 ##辩 ##辫 ##辭 ##辮 ##辯 ##辰 ##辱 ##農 ##边 ##辺 ##辻 ##込 ##辽 ##达 ##迁 ##迂 ##迄 ##迅 ##过 ##迈 ##迎 ##运 ##近 ##返 ##还 ##这 ##进 ##远 ##违 ##连 ##迟 ##迢 ##迤 ##迥 ##迦 ##迩 ##迪 ##迫 ##迭 ##述 ##迴 ##迷 ##迸 ##迹 ##迺 ##追 ##退 ##送 ##适 ##逃 ##逅 ##逆 ##选 ##逊 ##逍 ##透 ##逐 ##递 ##途 ##逕 ##逗 ##這 ##通 ##逛 ##逝 ##逞 ##速 ##造 ##逢 ##連 ##逮 ##週 ##進 ##逵 ##逶 ##逸 ##逻 ##逼 ##逾 ##遁 ##遂 ##遅 ##遇 ##遊 ##運 ##遍 ##過 ##遏 ##遐 ##遑 ##遒 ##道 ##達 ##違 ##遗 ##遙 ##遛 ##遜 ##遞 ##遠 ##遢 ##遣 ##遥 ##遨 ##適 ##遭 ##遮 ##遲 ##遴 ##遵 ##遶 ##遷 ##選 ##遺 ##遼 ##遽 ##避 ##邀 ##邁 ##邂 ##邃 ##還 ##邇 ##邈 ##邊 ##邋 ##邏 ##邑 ##邓 ##邕 ##邛 ##邝 ##邢 ##那 ##邦 ##邨 ##邪 ##邬 ##邮 ##邯 ##邰 ##邱 ##邳 ##邵 ##邸 ##邹 ##邺 ##邻 ##郁 ##郅 ##郊 ##郎 ##郑 ##郜 ##郝 ##郡 ##郢 ##郤 ##郦 ##郧 ##部 ##郫 ##郭 ##郴 ##郵 ##郷 ##郸 ##都 ##鄂 ##鄉 ##鄒 ##鄔 ##鄙 ##鄞 ##鄢 ##鄧 ##鄭 ##鄰 ##鄱 ##鄲 ##鄺 ##酉 ##酊 ##酋 ##酌 ##配 ##酐 ##酒 ##酗 ##酚 ##酝 ##酢 ##酣 ##酥 ##酩 ##酪 ##酬 ##酮 ##酯 ##酰 ##酱 ##酵 ##酶 ##酷 ##酸 ##酿 ##醃 ##醇 ##醉 ##醋 ##醍 ##醐 ##醒 ##醚 ##醛 ##醜 ##醞 ##醣 ##醪 ##醫 ##醬 ##醮 ##醯 ##醴 ##醺 ##釀 ##釁 ##采 ##釉 ##释 ##釋 ##里 ##重 ##野 ##量 ##釐 ##金 ##釗 ##釘 ##釜 ##針 ##釣 ##釦 ##釧 ##釵 ##鈀 ##鈉 ##鈍 ##鈎 ##鈔 ##鈕 ##鈞 ##鈣 ##鈦 ##鈪 ##鈴 ##鈺 ##鈾 ##鉀 ##鉄 ##鉅 ##鉉 ##鉑 ##鉗 ##鉚 ##鉛 ##鉤 ##鉴 ##鉻 ##銀 ##銃 ##銅 ##銑 ##銓 ##銖 ##銘 ##銜 ##銬 ##銭 ##銮 ##銳 ##銷 ##銹 ##鋁 ##鋅 ##鋒 ##鋤 ##鋪 ##鋰 ##鋸 ##鋼 ##錄 ##錐 ##錘 ##錚 ##錠 ##錢 ##錦 ##錨 ##錫 ##錮 ##錯 ##録 ##錳 ##錶 ##鍊 ##鍋 ##鍍 ##鍛 ##鍥 ##鍰 ##鍵 ##鍺 ##鍾 ##鎂 ##鎊 ##鎌 ##鎏 ##鎔 ##鎖 ##鎗 ##鎚 ##鎧 ##鎬 ##鎮 ##鎳 ##鏈 ##鏖 ##鏗 ##鏘 ##鏞 ##鏟 ##鏡 ##鏢 ##鏤 ##鏽 ##鐘 ##鐮 ##鐲 ##鐳 ##鐵 ##鐸 ##鐺 ##鑄 ##鑊 ##鑑 ##鑒 ##鑣 ##鑫 ##鑰 ##鑲 ##鑼 ##鑽 ##鑾 ##鑿 ##针 ##钉 ##钊 ##钎 ##钏 ##钒 ##钓 ##钗 ##钙 ##钛 ##钜 ##钝 ##钞 ##钟 ##钠 ##钡 ##钢 ##钣 ##钤 ##钥 ##钦 ##钧 ##钨 ##钩 ##钮 ##钯 ##钰 ##钱 ##钳 ##钴 ##钵 ##钺 ##钻 ##钼 ##钾 ##钿 ##铀 ##铁 ##铂 ##铃 ##铄 ##铅 ##铆 ##铉 ##铎 ##铐 ##铛 ##铜 ##铝 ##铠 ##铡 ##铢 ##铣 ##铤 ##铨 ##铩 ##铬 ##铭 ##铮 ##铰 ##铲 ##铵 ##银 ##铸 ##铺 ##链 ##铿 ##销 ##锁 ##锂 ##锄 ##锅 ##锆 ##锈 ##锉 ##锋 ##锌 ##锏 ##锐 ##锑 ##错 ##锚 ##锟 ##锡 ##锢 ##锣 ##锤 ##锥 ##锦 ##锭 ##键 ##锯 ##锰 ##锲 ##锵 ##锹 ##锺 ##锻 ##镀 ##镁 ##镂 ##镇 ##镉 ##镌 ##镍 ##镐 ##镑 ##镕 ##镖 ##镗 ##镛 ##镜 ##镣 ##镭 ##镯 ##镰 ##镳 ##镶 ##長 ##长 ##門 ##閃 ##閉 ##開 ##閎 ##閏 ##閑 ##閒 ##間 ##閔 
##閘 ##閡 ##関 ##閣 ##閥 ##閨 ##閩 ##閱 ##閲 ##閹 ##閻 ##閾 ##闆 ##闇 ##闊 ##闌 ##闍 ##闔 ##闕 ##闖 ##闘 ##關 ##闡 ##闢 ##门 ##闪 ##闫 ##闭 ##问 ##闯 ##闰 ##闲 ##间 ##闵 ##闷 ##闸 ##闹 ##闺 ##闻 ##闽 ##闾 ##阀 ##阁 ##阂 ##阅 ##阆 ##阇 ##阈 ##阉 ##阎 ##阐 ##阑 ##阔 ##阕 ##阖 ##阙 ##阚 ##阜 ##队 ##阡 ##阪 ##阮 ##阱 ##防 ##阳 ##阴 ##阵 ##阶 ##阻 ##阿 ##陀 ##陂 ##附 ##际 ##陆 ##陇 ##陈 ##陋 ##陌 ##降 ##限 ##陕 ##陛 ##陝 ##陞 ##陟 ##陡 ##院 ##陣 ##除 ##陨 ##险 ##陪 ##陰 ##陲 ##陳 ##陵 ##陶 ##陷 ##陸 ##険 ##陽 ##隅 ##隆 ##隈 ##隊 ##隋 ##隍 ##階 ##随 ##隐 ##隔 ##隕 ##隘 ##隙 ##際 ##障 ##隠 ##隣 ##隧 ##隨 ##險 ##隱 ##隴 ##隶 ##隸 ##隻 ##隼 ##隽 ##难 ##雀 ##雁 ##雄 ##雅 ##集 ##雇 ##雉 ##雋 ##雌 ##雍 ##雎 ##雏 ##雑 ##雒 ##雕 ##雖 ##雙 ##雛 ##雜 ##雞 ##離 ##難 ##雨 ##雪 ##雯 ##雰 ##雲 ##雳 ##零 ##雷 ##雹 ##電 ##雾 ##需 ##霁 ##霄 ##霆 ##震 ##霈 ##霉 ##霊 ##霍 ##霎 ##霏 ##霑 ##霓 ##霖 ##霜 ##霞 ##霧 ##霭 ##霰 ##露 ##霸 ##霹 ##霽 ##霾 ##靂 ##靄 ##靈 ##青 ##靓 ##靖 ##静 ##靚 ##靛 ##靜 ##非 ##靠 ##靡 ##面 ##靥 ##靦 ##革 ##靳 ##靴 ##靶 ##靼 ##鞅 ##鞋 ##鞍 ##鞏 ##鞑 ##鞘 ##鞠 ##鞣 ##鞦 ##鞭 ##韆 ##韋 ##韌 ##韓 ##韜 ##韦 ##韧 ##韩 ##韬 ##韭 ##音 ##韵 ##韶 ##韻 ##響 ##頁 ##頂 ##頃 ##項 ##順 ##須 ##頌 ##預 ##頑 ##頒 ##頓 ##頗 ##領 ##頜 ##頡 ##頤 ##頫 ##頭 ##頰 ##頷 ##頸 ##頹 ##頻 ##頼 ##顆 ##題 ##額 ##顎 ##顏 ##顔 ##願 ##顛 ##類 ##顧 ##顫 ##顯 ##顱 ##顴 ##页 ##顶 ##顷 ##项 ##顺 ##须 ##顼 ##顽 ##顾 ##顿 ##颁 ##颂 ##预 ##颅 ##领 ##颇 ##颈 ##颉 ##颊 ##颌 ##颍 ##颐 ##频 ##颓 ##颔 ##颖 ##颗 ##题 ##颚 ##颛 ##颜 ##额 ##颞 ##颠 ##颡 ##颢 ##颤 ##颦 ##颧 ##風 ##颯 ##颱 ##颳 ##颶 ##颼 ##飄 ##飆 ##风 ##飒 ##飓 ##飕 ##飘 ##飙 ##飚 ##飛 ##飞 ##食 ##飢 ##飨 ##飩 ##飪 ##飯 ##飲 ##飼 ##飽 ##飾 ##餃 ##餅 ##餉 ##養 ##餌 ##餐 ##餒 ##餓 ##餘 ##餚 ##餛 ##餞 ##餡 ##館 ##餮 ##餵 ##餾 ##饅 ##饈 ##饋 ##饌 ##饍 ##饑 ##饒 ##饕 ##饗 ##饞 ##饥 ##饨 ##饪 ##饬 ##饭 ##饮 ##饯 ##饰 ##饱 ##饲 ##饴 ##饵 ##饶 ##饷 ##饺 ##饼 ##饽 ##饿 ##馀 ##馁 ##馄 ##馅 ##馆 ##馈 ##馋 ##馍 ##馏 ##馒 ##馔 ##首 ##馗 ##香 ##馥 ##馨 ##馬 ##馭 ##馮 ##馳 ##馴 ##駁 ##駄 ##駅 ##駆 ##駐 ##駒 ##駕 ##駛 ##駝 ##駭 ##駱 ##駿 ##騁 ##騎 ##騏 ##験 ##騙 ##騨 ##騰 ##騷 ##驀 ##驅 ##驊 ##驍 ##驒 ##驕 ##驗 ##驚 ##驛 ##驟 ##驢 ##驥 ##马 ##驭 ##驮 ##驯 ##驰 ##驱 ##驳 ##驴 ##驶 ##驷 ##驸 ##驹 ##驻 ##驼 ##驾 ##驿 ##骁 ##骂 ##骄 ##骅 ##骆 ##骇 ##骈 ##骊 ##骋 ##验 ##骏 ##骐 ##骑 ##骗 ##骚 ##骛 ##骜 ##骞 ##骠 ##骡 ##骤 ##骥 ##骧 ##骨 ##骯 ##骰 ##骶 ##骷 ##骸 ##骼 ##髂 ##髅 ##髋 ##髏 ##髒 ##髓 ##體 ##髖 ##高 ##髦 ##髪 ##髮 ##髯 ##髻 ##鬃 ##鬆 ##鬍 ##鬓 ##鬚 
##鬟 ##鬢 ##鬣 ##鬥 ##鬧 ##鬱 ##鬼 ##魁 ##魂 ##魄 ##魅 ##魇 ##魍 ##魏 ##魔 ##魘 ##魚 ##魯 ##魷 ##鮑 ##鮨 ##鮪 ##鮭 ##鮮 ##鯉 ##鯊 ##鯖 ##鯛 ##鯨 ##鯰 ##鯽 ##鰍 ##鰓 ##鰭 ##鰲 ##鰻 ##鰾 ##鱈 ##鱉 ##鱔 ##鱗 ##鱷 ##鱸 ##鱼 ##鱿 ##鲁 ##鲈 ##鲍 ##鲑 ##鲛 ##鲜 ##鲟 ##鲢 ##鲤 ##鲨 ##鲫 ##鲱 ##鲲 ##鲶 ##鲷 ##鲸 ##鳃 ##鳄 ##鳅 ##鳌 ##鳍 ##鳕 ##鳖 ##鳗 ##鳝 ##鳞 ##鳥 ##鳩 ##鳳 ##鳴 ##鳶 ##鴉 ##鴕 ##鴛 ##鴦 ##鴨 ##鴻 ##鴿 ##鵑 ##鵜 ##鵝 ##鵡 ##鵬 ##鵰 ##鵲 ##鶘 ##鶩 ##鶯 ##鶴 ##鷗 ##鷲 ##鷹 ##鷺 ##鸚 ##鸞 ##鸟 ##鸠 ##鸡 ##鸢 ##鸣 ##鸥 ##鸦 ##鸨 ##鸪 ##鸭 ##鸯 ##鸳 ##鸵 ##鸽 ##鸾 ##鸿 ##鹂 ##鹃 ##鹄 ##鹅 ##鹈 ##鹉 ##鹊 ##鹌 ##鹏 ##鹑 ##鹕 ##鹘 ##鹜 ##鹞 ##鹤 ##鹦 ##鹧 ##鹫 ##鹭 ##鹰 ##鹳 ##鹵 ##鹹 ##鹼 ##鹽 ##鹿 ##麂 ##麋 ##麒 ##麓 ##麗 ##麝 ##麟 ##麥 ##麦 ##麩 ##麴 ##麵 ##麸 ##麺 ##麻 ##麼 ##麽 ##麾 ##黃 ##黄 ##黍 ##黎 ##黏 ##黑 ##黒 ##黔 ##默 ##黛 ##黜 ##黝 ##點 ##黠 ##黨 ##黯 ##黴 ##鼋 ##鼎 ##鼐 ##鼓 ##鼠 ##鼬 ##鼹 ##鼻 ##鼾 ##齁 ##齊 ##齋 ##齐 ##齒 ##齡 ##齢 ##齣 ##齦 ##齿 ##龄 ##龅 ##龈 ##龊 ##龋 ##龌 ##龍 ##龐 ##龔 ##龕 ##龙 ##龚 ##龛 ##龜 ##龟 ##︰ ##︱ ##︶ ##︿ ##﹁ ##﹂ ##﹍ ##﹏ ##﹐ ##﹑ ##﹒ ##﹔ ##﹕ ##﹖ ##﹗ ##﹙ ##﹚ ##﹝ ##﹞ ##﹡ ##﹣ ##！ ##＂ ##＃ ##＄ ##％ ##＆ ##＇ ##（ ##） ##＊ ##， ##－ ##． ##／ ##： ##； ##＜ ##？ ##＠ ##［ ##＼ ##］ ##＾ ##＿ ##｀ ##ｆ ##ｈ ##ｊ ##ｕ ##ｗ ##ｚ ##｛ ##｝ ##｡ ##｢ ##｣ ##､ ##･ ##ｯ ##ｰ ##ｲ ##ｸ ##ｼ ##ｽ ##ﾄ ##ﾉ ##ﾌ ##ﾗ ##ﾙ ##ﾝ ##ﾞ ##ﾟ ##￣ ##￥ ##👍 ##🔥 ##😂 ##😎

================================================
FILE: run_classifier.py
================================================
# coding=utf-8
# Copyright 2018 The Google AI Language Team Authors.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""BERT finetuning runner."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections
import csv
import os
import modeling
import optimization_finetuning as optimization
import tokenization
import tensorflow as tf
# from loss import bi_tempered_logistic_loss

flags = tf.flags

FLAGS = flags.FLAGS

## Required parameters
flags.DEFINE_string(
    "data_dir", None,
    "The input data dir. Should contain the .tsv files (or other data files) "
    "for the task.")

flags.DEFINE_string(
    "bert_config_file", None,
    "The config json file corresponding to the pre-trained BERT model. "
    "This specifies the model architecture.")

flags.DEFINE_string("task_name", None, "The name of the task to train.")

flags.DEFINE_string("vocab_file", None,
                    "The vocabulary file that the BERT model was trained on.")

flags.DEFINE_string(
    "output_dir", None,
    "The output directory where the model checkpoints will be written.")

## Other parameters

flags.DEFINE_string(
    "init_checkpoint", None,
    "Initial checkpoint (usually from a pre-trained BERT model).")

flags.DEFINE_bool(
    "do_lower_case", True,
    "Whether to lower case the input text. Should be True for uncased "
    "models and False for cased models.")

flags.DEFINE_integer(
    "max_seq_length", 128,
    "The maximum total input sequence length after WordPiece tokenization. "
    "Sequences longer than this will be truncated, and sequences shorter "
    "than this will be padded.")

flags.DEFINE_bool("do_train", False, "Whether to run training.")

flags.DEFINE_bool("do_eval", False, "Whether to run eval on the dev set.")

flags.DEFINE_bool(
    "do_predict", False,
    "Whether to run the model in inference mode on the test set.")

flags.DEFINE_integer("train_batch_size", 32, "Total batch size for training.")

flags.DEFINE_integer("eval_batch_size", 8, "Total batch size for eval.")

flags.DEFINE_integer("predict_batch_size", 8, "Total batch size for predict.")

flags.DEFINE_float("learning_rate", 5e-5, "The initial learning rate for Adam.")

flags.DEFINE_float("num_train_epochs", 3.0,
                   "Total number of training epochs to perform.")

flags.DEFINE_float(
    "warmup_proportion", 0.1,
    "Proportion of training to perform linear learning rate warmup for. "
    "E.g., 0.1 = 10% of training.")

flags.DEFINE_integer("save_checkpoints_steps", 1000,
                     "How often to save the model checkpoint.")

flags.DEFINE_integer("iterations_per_loop", 1000,
                     "How many steps to make in each estimator call.")

flags.DEFINE_bool("use_tpu", False, "Whether to use TPU or GPU/CPU.")

tf.flags.DEFINE_string(
    "tpu_name", None,
    "The Cloud TPU to use for training. This should be either the name "
    "used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 "
    "url.")

tf.flags.DEFINE_string(
    "tpu_zone", None,
    "[Optional] GCE zone where the Cloud TPU is located in. If not "
    "specified, we will attempt to automatically detect the GCE project from "
    "metadata.")

tf.flags.DEFINE_string(
    "gcp_project", None,
    "[Optional] Project name for the Cloud TPU-enabled project. If not "
    "specified, we will attempt to automatically detect the GCE project from "
    "metadata.")

tf.flags.DEFINE_string("master", None, "[Optional] TensorFlow master URL.")

flags.DEFINE_integer(
    "num_tpu_cores", 8,
    "Only used if `use_tpu` is True. Total number of TPU cores to use.")


class InputExample(object):
  """A single training/test example for simple sequence classification."""

  def __init__(self, guid, text_a, text_b=None, label=None):
    """Constructs an InputExample.

    Args:
      guid: Unique id for the example.
      text_a: string. The untokenized text of the first sequence. For single
        sequence tasks, only this sequence must be specified.
      text_b: (Optional) string. The untokenized text of the second sequence.
        Only must be specified for sequence pair tasks.
      label: (Optional) string. The label of the example. This should be
        specified for train and dev examples, but not for test examples.
    """
    self.guid = guid
    self.text_a = text_a
    self.text_b = text_b
    self.label = label


class PaddingInputExample(object):
  """Fake example so the num input examples is a multiple of the batch size.

  When running eval/predict on the TPU, we need to pad the number of examples
  to be a multiple of the batch size, because the TPU requires a fixed batch
  size. The alternative is to drop the last batch, which is bad because it means
  the entire output data won't be generated.

  We use this class instead of `None` because treating `None` as padding
  batches could cause silent errors.
  """


class InputFeatures(object):
  """A single set of features of data."""

  def __init__(self,
               input_ids,
               input_mask,
               segment_ids,
               label_id,
               is_real_example=True):
    self.input_ids = input_ids
    self.input_mask = input_mask
    self.segment_ids = segment_ids
    self.label_id = label_id
    self.is_real_example = is_real_example


class DataProcessor(object):
  """Base class for data converters for sequence classification data sets."""

  def get_train_examples(self, data_dir):
    """Gets a collection of `InputExample`s for the train set."""
    raise NotImplementedError()

  def get_dev_examples(self, data_dir):
    """Gets a collection of `InputExample`s for the dev set."""
    raise NotImplementedError()

  def get_test_examples(self, data_dir):
    """Gets a collection of `InputExample`s for prediction."""
    raise NotImplementedError()

  def get_labels(self):
    """Gets the list of labels for this data set."""
    raise NotImplementedError()

  @classmethod
  def _read_tsv(cls, input_file, quotechar=None):
    """Reads a tab separated value file."""
    with tf.gfile.Open(input_file, "r") as f:
      reader = csv.reader(f, delimiter="\t", quotechar=quotechar)
      lines = []
      for line in reader:
        lines.append(line)
      return lines


class LCQMCPairClassificationProcessor(DataProcessor): # TODO NEED CHANGE2
  """Processor for the internal data set.
  sentence pair classification"""

  def __init__(self):
    self.language = "zh"

  def get_train_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "train.txt")), "train")  # dev_0827.tsv

  def get_dev_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "dev.txt")), "dev")

  def get_test_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "test.txt")), "test")

  def get_labels(self):
    """See base class."""
    return ["0", "1"]

  def _create_examples(self, lines, set_type):
    """Creates examples for the training and dev sets."""
    examples = []
    print("length of lines:", len(lines))
    for (i, line) in enumerate(lines):
      #print('#i:',i,line)
      if i == 0:
        continue
      guid = "%s-%s" % (set_type, i)
      try:
        label = tokenization.convert_to_unicode(line[2])
        text_a = tokenization.convert_to_unicode(line[0])
        text_b = tokenization.convert_to_unicode(line[1])
        examples.append(
            InputExample(guid=guid, text_a=text_a, text_b=text_b, label=label))
      except Exception:
        print('###error.i:', i, line)
    return examples


def convert_single_example(ex_index, example, label_list, max_seq_length,
                           tokenizer):
  """Converts a single `InputExample` into a single `InputFeatures`."""

  if isinstance(example, PaddingInputExample):
    return InputFeatures(
        input_ids=[0] * max_seq_length,
        input_mask=[0] * max_seq_length,
        segment_ids=[0] * max_seq_length,
        label_id=0,
        is_real_example=False)

  label_map = {}
  for (i, label) in enumerate(label_list):
    label_map[label] = i

  tokens_a = tokenizer.tokenize(example.text_a)
  tokens_b = None
  if example.text_b:
    tokens_b = tokenizer.tokenize(example.text_b)

  if tokens_b:
    # Modifies `tokens_a` and `tokens_b` in place so that the total
    # length is less than the specified length.
    # Account for [CLS], [SEP], [SEP] with "- 3"
    _truncate_seq_pair(tokens_a, tokens_b, max_seq_length - 3)
  else:
    # Account for [CLS] and [SEP] with "- 2"
    if len(tokens_a) > max_seq_length - 2:
      tokens_a = tokens_a[0:(max_seq_length - 2)]

  # The convention in BERT is:
  # (a) For sequence pairs:
  #  tokens:   [CLS] is this jack ##son ##ville ? [SEP] no it is not . [SEP]
  #  type_ids: 0     0  0    0    0     0       0 0     1  1  1  1   1 1
  # (b) For single sequences:
  #  tokens:   [CLS] the dog is hairy . [SEP]
  #  type_ids: 0     0   0   0  0     0 0
  #
  # Where "type_ids" are used to indicate whether this is the first
  # sequence or the second sequence. The embedding vectors for `type=0` and
  # `type=1` were learned during pre-training and are added to the wordpiece
  # embedding vector (and position vector). This is not *strictly* necessary
  # since the [SEP] token unambiguously separates the sequences, but it makes
  # it easier for the model to learn the concept of sequences.
  #
  # For classification tasks, the first vector (corresponding to [CLS]) is
  # used as the "sentence vector". Note that this only makes sense because
  # the entire model is fine-tuned.
  tokens = []
  segment_ids = []
  tokens.append("[CLS]")
  segment_ids.append(0)
  for token in tokens_a:
    tokens.append(token)
    segment_ids.append(0)
  tokens.append("[SEP]")
  segment_ids.append(0)

  if tokens_b:
    for token in tokens_b:
      tokens.append(token)
      segment_ids.append(1)
    tokens.append("[SEP]")
    segment_ids.append(1)

  input_ids = tokenizer.convert_tokens_to_ids(tokens)

  # The mask has 1 for real tokens and 0 for padding tokens. Only real
  # tokens are attended to.
  input_mask = [1] * len(input_ids)

  # Zero-pad up to the sequence length.
  while len(input_ids) < max_seq_length:
    input_ids.append(0)
    input_mask.append(0)
    segment_ids.append(0)

  assert len(input_ids) == max_seq_length
  assert len(input_mask) == max_seq_length
  assert len(segment_ids) == max_seq_length

  label_id = label_map[example.label]
  if ex_index < 5:
    tf.logging.info("*** Example ***")
    tf.logging.info("guid: %s" % (example.guid))
    tf.logging.info("tokens: %s" % " ".join(
        [tokenization.printable_text(x) for x in tokens]))
    tf.logging.info("input_ids: %s" % " ".join([str(x) for x in input_ids]))
    tf.logging.info("input_mask: %s" % " ".join([str(x) for x in input_mask]))
    tf.logging.info("segment_ids: %s" % " ".join([str(x) for x in segment_ids]))
    tf.logging.info("label: %s (id = %d)" % (example.label, label_id))

  feature = InputFeatures(
      input_ids=input_ids,
      input_mask=input_mask,
      segment_ids=segment_ids,
      label_id=label_id,
      is_real_example=True)
  return feature


def file_based_convert_examples_to_features(
    examples, label_list, max_seq_length, tokenizer, output_file):
  """Convert a set of `InputExample`s to a TFRecord file."""

  writer = tf.python_io.TFRecordWriter(output_file)

  for (ex_index, example) in enumerate(examples):
    if ex_index % 10000 == 0:
      tf.logging.info("Writing example %d of %d" % (ex_index, len(examples)))

    feature = convert_single_example(ex_index, example, label_list,
                                     max_seq_length, tokenizer)

    def create_int_feature(values):
      f = tf.train.Feature(int64_list=tf.train.Int64List(value=list(values)))
      return f

    features = collections.OrderedDict()
    features["input_ids"] = create_int_feature(feature.input_ids)
    features["input_mask"] = create_int_feature(feature.input_mask)
    features["segment_ids"] = create_int_feature(feature.segment_ids)
    features["label_ids"] = create_int_feature([feature.label_id])
    features["is_real_example"] = create_int_feature(
        [int(feature.is_real_example)])

    tf_example = tf.train.Example(features=tf.train.Features(feature=features))
    writer.write(tf_example.SerializeToString())
  writer.close()


def file_based_input_fn_builder(input_file, seq_length, is_training,
                                drop_remainder):
  """Creates an `input_fn` closure to be passed to TPUEstimator."""

  name_to_features = {
      "input_ids": tf.FixedLenFeature([seq_length], tf.int64),
      "input_mask": tf.FixedLenFeature([seq_length], tf.int64),
      "segment_ids": tf.FixedLenFeature([seq_length], tf.int64),
      "label_ids": tf.FixedLenFeature([], tf.int64),
      "is_real_example": tf.FixedLenFeature([], tf.int64),
  }

  def _decode_record(record, name_to_features):
    """Decodes a record to a TensorFlow example."""
    example = tf.parse_single_example(record, name_to_features)

    # tf.Example only supports tf.int64, but the TPU only supports tf.int32.
    # So cast all int64 to int32.
    for name in list(example.keys()):
      t = example[name]
      if t.dtype == tf.int64:
        t = tf.to_int32(t)
      example[name] = t

    return example

  def input_fn(params):
    """The actual input function."""
    batch_size = params["batch_size"]

    # For training, we want a lot of parallel reading and shuffling.
    # For eval, we want no shuffling and parallel reading doesn't matter.
    d = tf.data.TFRecordDataset(input_file)
    if is_training:
      d = d.repeat()
      d = d.shuffle(buffer_size=100)

    d = d.apply(
        tf.contrib.data.map_and_batch(
            lambda record: _decode_record(record, name_to_features),
            batch_size=batch_size,
            drop_remainder=drop_remainder))

    return d

  return input_fn


def _truncate_seq_pair(tokens_a, tokens_b, max_length):
  """Truncates a sequence pair in place to the maximum length."""

  # This is a simple heuristic which will always truncate the longer sequence
  # one token at a time. This makes more sense than truncating an equal percent
  # of tokens from each, since if one sequence is very short then each token
  # that's truncated likely contains more information than a longer sequence.
  while True:
    total_length = len(tokens_a) + len(tokens_b)
    if total_length <= max_length:
      break
    if len(tokens_a) > len(tokens_b):
      tokens_a.pop()
    else:
      tokens_b.pop()


def create_model(bert_config, is_training, input_ids, input_mask, segment_ids,
                 labels, num_labels, use_one_hot_embeddings):
  """Creates a classification model."""
  model = modeling.BertModel(
      config=bert_config,
      is_training=is_training,
      input_ids=input_ids,
      input_mask=input_mask,
      token_type_ids=segment_ids,
      use_one_hot_embeddings=use_one_hot_embeddings)

  # In the demo, we are doing a simple classification task on the entire
  # segment.
  #
  # If you want to use the token-level output, use model.get_sequence_output()
  # instead.
  output_layer = model.get_pooled_output()

  hidden_size = output_layer.shape[-1].value

  output_weights = tf.get_variable(
      "output_weights", [num_labels, hidden_size],
      initializer=tf.truncated_normal_initializer(stddev=0.02))

  output_bias = tf.get_variable(
      "output_bias", [num_labels], initializer=tf.zeros_initializer())

  with tf.variable_scope("loss"):
    if is_training:
      # I.e., 0.1 dropout
      output_layer = tf.nn.dropout(output_layer, keep_prob=0.9)

    logits = tf.matmul(output_layer, output_weights, transpose_b=True)
    logits = tf.nn.bias_add(logits, output_bias)
    probabilities = tf.nn.softmax(logits, axis=-1)
    log_probs = tf.nn.log_softmax(logits, axis=-1)

    one_hot_labels = tf.one_hot(labels, depth=num_labels, dtype=tf.float32)

    per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1)
    # todo 08-29 try temp-loss
    ###############bi_tempered_logistic_loss############################################################################
    # print("##cross entropy loss is used...."); tf.logging.info("##cross entropy loss is used....")
    # t1=0.9 #t1=0.90
    # t2=1.05 #t2=1.05
    # per_example_loss=bi_tempered_logistic_loss(log_probs,one_hot_labels,t1,t2,label_smoothing=0.1,num_iters=5) # TODO label_smoothing=0.0
    #tf.logging.info("per_example_loss:"+str(per_example_loss.shape))
    ##############bi_tempered_logistic_loss#############################################################################

    loss = tf.reduce_mean(per_example_loss)

    return (loss, per_example_loss, logits, probabilities)


def model_fn_builder(bert_config, num_labels, init_checkpoint, learning_rate,
                     num_train_steps, num_warmup_steps, use_tpu,
                     use_one_hot_embeddings):
  """Returns `model_fn` closure for TPUEstimator."""

  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    tf.logging.info("*** Features ***")
    for name in sorted(features.keys()):
      tf.logging.info(" name = %s, shape = %s" % (name, features[name].shape))

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    label_ids = features["label_ids"]
    is_real_example = None
    if "is_real_example" in features:
      is_real_example = tf.cast(features["is_real_example"], dtype=tf.float32)
    else:
      is_real_example = tf.ones(tf.shape(label_ids), dtype=tf.float32)

    is_training = (mode == tf.estimator.ModeKeys.TRAIN)

    (total_loss, per_example_loss, logits, probabilities) = create_model(
        bert_config, is_training, input_ids, input_mask, segment_ids, label_ids,
        num_labels, use_one_hot_embeddings)

    tvars = tf.trainable_variables()
    initialized_variable_names = {}
    scaffold_fn = None
    if init_checkpoint:
      (assignment_map, initialized_variable_names
      ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
      if use_tpu:

        def tpu_scaffold():
          tf.train.init_from_checkpoint(init_checkpoint, assignment_map)
          return tf.train.Scaffold()

        scaffold_fn = tpu_scaffold
      else:
        tf.train.init_from_checkpoint(init_checkpoint, assignment_map)

    tf.logging.info("**** Trainable Variables ****")
    for var in tvars:
      init_string = ""
      if var.name in initialized_variable_names:
        init_string = ", *INIT_FROM_CKPT*"
      tf.logging.info(" name = %s, shape = %s%s", var.name, var.shape,
                      init_string)

    output_spec = None
    if mode == tf.estimator.ModeKeys.TRAIN:

      train_op = optimization.create_optimizer(
          total_loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu)

      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          loss=total_loss,
          train_op=train_op,
          scaffold_fn=scaffold_fn)
    elif mode == tf.estimator.ModeKeys.EVAL:

      def metric_fn(per_example_loss, label_ids, logits, is_real_example):
        predictions = tf.argmax(logits, axis=-1, output_type=tf.int32)
        accuracy = tf.metrics.accuracy(
            labels=label_ids, predictions=predictions, weights=is_real_example)
        loss = tf.metrics.mean(values=per_example_loss, weights=is_real_example)
        return {
            "eval_accuracy": accuracy,
            "eval_loss": loss,
        }

      eval_metrics = (metric_fn,
                      [per_example_loss, label_ids, logits, is_real_example])
      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          loss=total_loss,
          eval_metrics=eval_metrics,
          scaffold_fn=scaffold_fn)
    else:
      output_spec = tf.contrib.tpu.TPUEstimatorSpec(
          mode=mode,
          predictions={"probabilities": probabilities},
          scaffold_fn=scaffold_fn)
    return output_spec

  return model_fn


# This function is not used by this file but is still used by the Colab and
# people who depend on it.
def input_fn_builder(features, seq_length, is_training, drop_remainder):
  """Creates an `input_fn` closure to be passed to TPUEstimator."""

  all_input_ids = []
  all_input_mask = []
  all_segment_ids = []
  all_label_ids = []

  for feature in features:
    all_input_ids.append(feature.input_ids)
    all_input_mask.append(feature.input_mask)
    all_segment_ids.append(feature.segment_ids)
    all_label_ids.append(feature.label_id)

  def input_fn(params):
    """The actual input function."""
    batch_size = params["batch_size"]

    num_examples = len(features)

    # This is for demo purposes and does NOT scale to large data sets. We do
    # not use Dataset.from_generator() because that uses tf.py_func which is
    # not TPU compatible. The right way to load data is with TFRecordReader.
    d = tf.data.Dataset.from_tensor_slices({
        "input_ids":
            tf.constant(
                all_input_ids, shape=[num_examples, seq_length],
                dtype=tf.int32),
        "input_mask":
            tf.constant(
                all_input_mask,
                shape=[num_examples, seq_length],
                dtype=tf.int32),
        "segment_ids":
            tf.constant(
                all_segment_ids,
                shape=[num_examples, seq_length],
                dtype=tf.int32),
        "label_ids":
            tf.constant(all_label_ids, shape=[num_examples], dtype=tf.int32),
    })

    if is_training:
      d = d.repeat()
      d = d.shuffle(buffer_size=100)

    d = d.batch(batch_size=batch_size, drop_remainder=drop_remainder)
    return d

  return input_fn


class LCQMCPairClassificationProcessor(DataProcessor): # TODO NEED CHANGE2
  """Processor for the internal data set. sentence pair classification"""

  def __init__(self):
    self.language = "zh"

  def get_train_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "train.txt")), "train")  # dev_0827.tsv

  def get_dev_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "test.txt")), "dev")  # todo change temp for test purpose

  def get_test_examples(self, data_dir):
    """See base class."""
    return self._create_examples(
        self._read_tsv(os.path.join(data_dir, "test.txt")), "test")

  def get_labels(self):
    """See base class."""
    return ["0", "1"]  #return ["-1","0", "1"]

  def _create_examples(self, lines, set_type):
    """Creates examples for the training and dev sets."""
    examples = []
    print("length of lines:", len(lines))
    for (i, line) in enumerate(lines):
      #print('#i:',i,line)
      if i == 0:
        continue
      guid = "%s-%s" % (set_type, i)
      try:
        label = tokenization.convert_to_unicode(line[2])
        text_a = tokenization.convert_to_unicode(line[0])
        text_b = tokenization.convert_to_unicode(line[1])
        examples.append(
            InputExample(guid=guid, text_a=text_a, text_b=text_b, label=label))
      except Exception:
        print('###error.i:', i, line)
    return examples


class SentencePairClassificationProcessor(DataProcessor):
  """Processor for the internal data set.
sentence pair classification""" def __init__(self): self.language = "zh" def get_train_examples(self, data_dir): """See base class.""" return self._create_examples( self._read_tsv(os.path.join(data_dir, "train_0827.tsv")), "train") # dev_0827.tsv def get_dev_examples(self, data_dir): """See base class.""" return self._create_examples( self._read_tsv(os.path.join(data_dir, "dev_0827.tsv")), "dev") def get_test_examples(self, data_dir): """See base class.""" return self._create_examples( self._read_tsv(os.path.join(data_dir, "test_0827.tsv")), "test") def get_labels(self): """See base class.""" return ["0", "1"] #return ["-1","0", "1"] def _create_examples(self, lines, set_type): """Creates examples for the training and dev sets.""" examples = [] print("length of lines:",len(lines)) for (i, line) in enumerate(lines): #print('#i:',i,line) if i == 0: continue guid = "%s-%s" % (set_type, i) try: label = tokenization.convert_to_unicode(line[0]) text_a = tokenization.convert_to_unicode(line[1]) text_b = tokenization.convert_to_unicode(line[2]) examples.append( InputExample(guid=guid, text_a=text_a, text_b=text_b, label=label)) except Exception: print('###error.i:', i, line) return examples # This function is not used by this file but is still used by the Colab and # people who depend on it. 
def convert_examples_to_features(examples, label_list, max_seq_length,
                                 tokenizer):
  """Convert a set of `InputExample`s to a list of `InputFeatures`."""

  features = []
  for (ex_index, example) in enumerate(examples):
    if ex_index % 10000 == 0:
      tf.logging.info("Writing example %d of %d" % (ex_index, len(examples)))

    feature = convert_single_example(ex_index, example, label_list,
                                     max_seq_length, tokenizer)

    features.append(feature)
  return features


def main(_):
  tf.logging.set_verbosity(tf.logging.INFO)

  # Maps the --task_name flag to the processor that reads that data set.
  processors = {
      "sentence_pair": SentencePairClassificationProcessor,
      "lcqmc_pair": LCQMCPairClassificationProcessor,
  }

  tokenization.validate_case_matches_checkpoint(FLAGS.do_lower_case,
                                                FLAGS.init_checkpoint)

  if not FLAGS.do_train and not FLAGS.do_eval and not FLAGS.do_predict:
    raise ValueError(
        "At least one of `do_train`, `do_eval` or `do_predict` must be True.")

  bert_config = modeling.BertConfig.from_json_file(FLAGS.bert_config_file)

  if FLAGS.max_seq_length > bert_config.max_position_embeddings:
    raise ValueError(
        "Cannot use sequence length %d because the BERT model "
        "was only trained up to sequence length %d" %
        (FLAGS.max_seq_length, bert_config.max_position_embeddings))

  tf.gfile.MakeDirs(FLAGS.output_dir)

  task_name = FLAGS.task_name.lower()

  if task_name not in processors:
    raise ValueError("Task not found: %s" % (task_name))

  processor = processors[task_name]()

  label_list = processor.get_labels()

  tokenizer = tokenization.FullTokenizer(
      vocab_file=FLAGS.vocab_file, do_lower_case=FLAGS.do_lower_case)

  tpu_cluster_resolver = None
  if FLAGS.use_tpu and FLAGS.tpu_name:
    tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(
        FLAGS.tpu_name, zone=FLAGS.tpu_zone, project=FLAGS.gcp_project)

  is_per_host = tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2
  # Cloud TPU: Invalid TPU configuration, ensure ClusterResolver is passed to tpu.
print("###tpu_cluster_resolver:",tpu_cluster_resolver) run_config = tf.contrib.tpu.RunConfig( cluster=tpu_cluster_resolver, master=FLAGS.master, model_dir=FLAGS.output_dir, save_checkpoints_steps=FLAGS.save_checkpoints_steps, tpu_config=tf.contrib.tpu.TPUConfig( iterations_per_loop=FLAGS.iterations_per_loop, num_shards=FLAGS.num_tpu_cores, per_host_input_for_training=is_per_host)) train_examples = None num_train_steps = None num_warmup_steps = None if FLAGS.do_train: train_examples =processor.get_train_examples(FLAGS.data_dir) # TODO print("###length of total train_examples:",len(train_examples)) num_train_steps = int(len(train_examples)/ FLAGS.train_batch_size * FLAGS.num_train_epochs) num_warmup_steps = int(num_train_steps * FLAGS.warmup_proportion) model_fn = model_fn_builder( bert_config=bert_config, num_labels=len(label_list), init_checkpoint=FLAGS.init_checkpoint, learning_rate=FLAGS.learning_rate, num_train_steps=num_train_steps, num_warmup_steps=num_warmup_steps, use_tpu=FLAGS.use_tpu, use_one_hot_embeddings=FLAGS.use_tpu) # If TPU is not available, this will fall back to normal Estimator on CPU # or GPU. estimator = tf.contrib.tpu.TPUEstimator( use_tpu=FLAGS.use_tpu, model_fn=model_fn, config=run_config, train_batch_size=FLAGS.train_batch_size, eval_batch_size=FLAGS.eval_batch_size, predict_batch_size=FLAGS.predict_batch_size) if FLAGS.do_train: train_file = os.path.join(FLAGS.output_dir, "train.tf_record") train_file_exists=os.path.exists(train_file) print("###train_file_exists:", train_file_exists," ;train_file:",train_file) if not train_file_exists: # if tf_record file not exist, convert from raw text file. 
# TODO file_based_convert_examples_to_features(train_examples, label_list, FLAGS.max_seq_length, tokenizer, train_file) tf.logging.info("***** Running training *****") tf.logging.info(" Num examples = %d", len(train_examples)) tf.logging.info(" Batch size = %d", FLAGS.train_batch_size) tf.logging.info(" Num steps = %d", num_train_steps) train_input_fn = file_based_input_fn_builder( input_file=train_file, seq_length=FLAGS.max_seq_length, is_training=True, drop_remainder=True) estimator.train(input_fn=train_input_fn, max_steps=num_train_steps) if FLAGS.do_eval: eval_examples = processor.get_dev_examples(FLAGS.data_dir) num_actual_eval_examples = len(eval_examples) if FLAGS.use_tpu: # TPU requires a fixed batch size for all batches, therefore the number # of examples must be a multiple of the batch size, or else examples # will get dropped. So we pad with fake examples which are ignored # later on. These do NOT count towards the metric (all tf.metrics # support a per-instance weight, and these get a weight of 0.0). while len(eval_examples) % FLAGS.eval_batch_size != 0: eval_examples.append(PaddingInputExample()) eval_file = os.path.join(FLAGS.output_dir, "eval.tf_record") file_based_convert_examples_to_features( eval_examples, label_list, FLAGS.max_seq_length, tokenizer, eval_file) tf.logging.info("***** Running evaluation *****") tf.logging.info(" Num examples = %d (%d actual, %d padding)", len(eval_examples), num_actual_eval_examples, len(eval_examples) - num_actual_eval_examples) tf.logging.info(" Batch size = %d", FLAGS.eval_batch_size) # This tells the estimator to run through the entire set. eval_steps = None # However, if running eval on the TPU, you will need to specify the # number of steps. 
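    # Worked example of the padding arithmetic (illustrative numbers only, not
    # from an actual run): with 1003 dev examples and an eval batch size of 8,
    # the padding loop above appends 5 PaddingInputExample entries, giving
    # 1008 examples, so eval_steps = 1008 // 8 = 126 on TPU. Off TPU nothing is
    # padded and eval_steps stays None, which runs through the entire set.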
    if FLAGS.use_tpu:
      assert len(eval_examples) % FLAGS.eval_batch_size == 0
      eval_steps = int(len(eval_examples) // FLAGS.eval_batch_size)

    eval_drop_remainder = True if FLAGS.use_tpu else False
    eval_input_fn = file_based_input_fn_builder(
        input_file=eval_file,
        seq_length=FLAGS.max_seq_length,
        is_training=False,
        drop_remainder=eval_drop_remainder)

    ##########################################################################
    # Evaluate every checkpoint found in output_dir, in global-step order.
    steps_and_files = []
    filenames = tf.gfile.ListDirectory(FLAGS.output_dir)
    for filename in filenames:
      if filename.endswith(".index"):
        # Strip the ".index" suffix to get the checkpoint prefix.
        ckpt_name = filename[:-6]
        cur_filename = os.path.join(FLAGS.output_dir, ckpt_name)
        global_step = int(cur_filename.split("-")[-1])
        tf.logging.info("Add {} to eval list.".format(cur_filename))
        steps_and_files.append([global_step, cur_filename])
    steps_and_files = sorted(steps_and_files, key=lambda x: x[0])

    # Note: results are written to data_dir (not output_dir) under a
    # hard-coded file name.
    output_eval_file = os.path.join(FLAGS.data_dir, "eval_results16-layer24-4million-2.txt")  # finetuning-layer24-4million
    print("output_eval_file:", output_eval_file)
    tf.logging.info("output_eval_file:" + output_eval_file)
    with tf.gfile.GFile(output_eval_file, "w") as writer:
      for global_step, filename in sorted(steps_and_files, key=lambda x: x[0]):
        result = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps, checkpoint_path=filename)

        tf.logging.info("***** Eval results %s *****" % (filename))
        writer.write("***** Eval results %s *****\n" % (filename))
        for key in sorted(result.keys()):
          tf.logging.info(" %s = %s", key, str(result[key]))
          writer.write("%s = %s\n" % (key, str(result[key])))
    ##########################################################################

    # Previous behavior (evaluate only the latest checkpoint), kept for
    # reference:
    #result = estimator.evaluate(input_fn=eval_input_fn, steps=eval_steps)
    #
    #output_eval_file = os.path.join(FLAGS.output_dir, "eval_results.txt")
    #with tf.gfile.GFile(output_eval_file, "w") as writer:
    #  tf.logging.info("***** Eval results *****")
    #  for key in sorted(result.keys()):
    #    tf.logging.info(" %s = %s", key, str(result[key]))
    #    writer.write("%s = %s\n" % (key, str(result[key])))

  if FLAGS.do_predict:
    predict_examples = processor.get_test_examples(FLAGS.data_dir)
    num_actual_predict_examples = len(predict_examples)
    if FLAGS.use_tpu:
      # TPU requires a fixed batch size for all batches, therefore the number
      # of examples must be a multiple of the batch size, or else examples
      # will get dropped. So we pad with fake examples which are ignored
      # later on.
      while len(predict_examples) % FLAGS.predict_batch_size != 0:
        predict_examples.append(PaddingInputExample())

    predict_file = os.path.join(FLAGS.output_dir, "predict.tf_record")
    file_based_convert_examples_to_features(predict_examples, label_list,
                                            FLAGS.max_seq_length, tokenizer,
                                            predict_file)

    tf.logging.info("***** Running prediction *****")
    tf.logging.info(" Num examples = %d (%d actual, %d padding)",
                    len(predict_examples), num_actual_predict_examples,
                    len(predict_examples) - num_actual_predict_examples)
    tf.logging.info(" Batch size = %d", FLAGS.predict_batch_size)

    predict_drop_remainder = True if FLAGS.use_tpu else False
    predict_input_fn = file_based_input_fn_builder(
        input_file=predict_file,
        seq_length=FLAGS.max_seq_length,
        is_training=False,
        drop_remainder=predict_drop_remainder)

    result = estimator.predict(input_fn=predict_input_fn)

    output_predict_file = os.path.join(FLAGS.output_dir, "test_results.tsv")
    with tf.gfile.GFile(output_predict_file, "w") as writer:
      num_written_lines = 0
      tf.logging.info("***** Predict results *****")
      for (i, prediction) in enumerate(result):
        probabilities = prediction["probabilities"]
        # Stop before the fake padding examples appended for the TPU.
        if i >= num_actual_predict_examples:
          break
        output_line = "\t".join(
            str(class_probability)
            for class_probability in probabilities) + "\n"
        writer.write(output_line)
        num_written_lines += 1
    assert num_written_lines == num_actual_predict_examples


if __name__ == "__main__":
  flags.mark_flag_as_required("data_dir")
  flags.mark_flag_as_required("task_name")
flags.mark_flag_as_required("vocab_file") flags.mark_flag_as_required("bert_config_file") flags.mark_flag_as_required("output_dir") tf.app.run() ================================================ FILE: run_pretraining.py ================================================ # coding=utf-8 # Copyright 2018 The Google AI Language Team Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. # You may obtain a copy of the License at # # http://www.apache.org/licenses/LICENSE-2.0 # # Unless required by applicable law or agreed to in writing, software # distributed under the License is distributed on an "AS IS" BASIS, # WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. # See the License for the specific language governing permissions and # limitations under the License. """Run masked LM/next sentence masked_lm pre-training for BERT.""" from __future__ import absolute_import from __future__ import division from __future__ import print_function import os import modeling import optimization import tensorflow as tf flags = tf.flags FLAGS = flags.FLAGS ## Required parameters flags.DEFINE_string( "bert_config_file", None, "The config json file corresponding to the pre-trained BERT model. " "This specifies the model architecture.") flags.DEFINE_string( "input_file", None, "Input TF example files (can be a glob or comma separated).") flags.DEFINE_string( "output_dir", None, "The output directory where the model checkpoints will be written.") ## Other parameters flags.DEFINE_string( "init_checkpoint", None, "Initial checkpoint (usually from a pre-trained BERT model).") flags.DEFINE_integer( "max_seq_length", 128, "The maximum total input sequence length after WordPiece tokenization. " "Sequences longer than this will be truncated, and sequences shorter " "than this will be padded. 
Must match data generation.") flags.DEFINE_integer( "max_predictions_per_seq", 20, "Maximum number of masked LM predictions per sequence. " "Must match data generation.") flags.DEFINE_bool("do_train", False, "Whether to run training.") flags.DEFINE_bool("do_eval", False, "Whether to run eval on the dev set.") flags.DEFINE_integer("train_batch_size", 32, "Total batch size for training.") flags.DEFINE_integer("eval_batch_size", 8, "Total batch size for eval.") flags.DEFINE_float("learning_rate", 5e-5, "The initial learning rate for Adam.") flags.DEFINE_integer("num_train_steps", 100000, "Number of training steps.") flags.DEFINE_integer("num_warmup_steps", 10000, "Number of warmup steps.") flags.DEFINE_integer("save_checkpoints_steps", 1000, "How often to save the model checkpoint.") flags.DEFINE_integer("iterations_per_loop", 1000, "How many steps to make in each estimator call.") flags.DEFINE_integer("max_eval_steps", 100, "Maximum number of eval steps.") flags.DEFINE_bool("use_tpu", False, "Whether to use TPU or GPU/CPU.") tf.flags.DEFINE_string( "tpu_name", None, "The Cloud TPU to use for training. This should be either the name " "used when creating the Cloud TPU, or a grpc://ip.address.of.tpu:8470 " "url.") tf.flags.DEFINE_string( "tpu_zone", None, "[Optional] GCE zone where the Cloud TPU is located in. If not " "specified, we will attempt to automatically detect the GCE project from " "metadata.") tf.flags.DEFINE_string( "gcp_project", None, "[Optional] Project name for the Cloud TPU-enabled project. If not " "specified, we will attempt to automatically detect the GCE project from " "metadata.") tf.flags.DEFINE_string("master", None, "[Optional] TensorFlow master URL.") flags.DEFINE_integer( "num_tpu_cores", 8, "Only used if `use_tpu` is True. 
Total number of TPU cores to use.")


def model_fn_builder(bert_config, init_checkpoint, learning_rate,
                     num_train_steps, num_warmup_steps, use_tpu,
                     use_one_hot_embeddings):
  """Returns `model_fn` closure for TPUEstimator."""

  def model_fn(features, labels, mode, params):  # pylint: disable=unused-argument
    """The `model_fn` for TPUEstimator."""

    tf.logging.info("*** Features ***")
    for name in sorted(features.keys()):
      tf.logging.info(" name = %s, shape = %s" % (name, features[name].shape))

    input_ids = features["input_ids"]
    input_mask = features["input_mask"]
    segment_ids = features["segment_ids"]
    masked_lm_positions = features["masked_lm_positions"]
    masked_lm_ids = features["masked_lm_ids"]
    masked_lm_weights = features["masked_lm_weights"]
    next_sentence_labels = features["next_sentence_labels"]

    is_training = (mode == tf.estimator.ModeKeys.TRAIN)

    model = modeling.BertModel(
        config=bert_config,
        is_training=is_training,
        input_ids=input_ids,
        input_mask=input_mask,
        token_type_ids=segment_ids,
        use_one_hot_embeddings=use_one_hot_embeddings)

    (masked_lm_loss,
     masked_lm_example_loss, masked_lm_log_probs) = get_masked_lm_output(
         bert_config, model.get_sequence_output(), model.get_embedding_table(),
         masked_lm_positions, masked_lm_ids, masked_lm_weights)

    # TODO: the next-sentence output is still computed (for eval metrics), but
    # it is not counted toward the training loss.
    (next_sentence_loss, next_sentence_example_loss,
     next_sentence_log_probs) = get_next_sentence_output(
         bert_config, model.get_pooled_output(), next_sentence_labels)
    # batch_size=masked_lm_log_probs.shape[0]
    # next_sentence_example_loss=tf.zeros((batch_size)) #tf.constant(0.0,dtype=tf.float32)
    # next_sentence_log_probs=tf.zeros((batch_size,2))

    # Next-sentence loss removed from the objective on 2019-08-08
    # (previously: masked_lm_loss + next_sentence_loss).
    total_loss = masked_lm_loss

    tvars = tf.trainable_variables()

    initialized_variable_names = {}
    print("init_checkpoint:", init_checkpoint)
    scaffold_fn = None
    if init_checkpoint:
      (assignment_map, initialized_variable_names
      ) = modeling.get_assignment_map_from_checkpoint(tvars, init_checkpoint)
      if use_tpu:

        def
tpu_scaffold(): tf.train.init_from_checkpoint(init_checkpoint, assignment_map) return tf.train.Scaffold() scaffold_fn = tpu_scaffold else: tf.train.init_from_checkpoint(init_checkpoint, assignment_map) tf.logging.info("**** Trainable Variables ****") for var in tvars: init_string = "" if var.name in initialized_variable_names: init_string = ", *INIT_FROM_CKPT*" tf.logging.info(" name = %s, shape = %s%s", var.name, var.shape, init_string) output_spec = None if mode == tf.estimator.ModeKeys.TRAIN: train_op = optimization.create_optimizer( total_loss, learning_rate, num_train_steps, num_warmup_steps, use_tpu) output_spec = tf.contrib.tpu.TPUEstimatorSpec( mode=mode, loss=total_loss, train_op=train_op, scaffold_fn=scaffold_fn) elif mode == tf.estimator.ModeKeys.EVAL: def metric_fn(masked_lm_example_loss, masked_lm_log_probs, masked_lm_ids, masked_lm_weights, next_sentence_example_loss, next_sentence_log_probs, next_sentence_labels): """Computes the loss and accuracy of the model.""" masked_lm_log_probs = tf.reshape(masked_lm_log_probs,[-1, masked_lm_log_probs.shape[-1]]) masked_lm_predictions = tf.argmax(masked_lm_log_probs, axis=-1, output_type=tf.int32) masked_lm_example_loss = tf.reshape(masked_lm_example_loss, [-1]) masked_lm_ids = tf.reshape(masked_lm_ids, [-1]) masked_lm_weights = tf.reshape(masked_lm_weights, [-1]) masked_lm_accuracy = tf.metrics.accuracy( labels=masked_lm_ids, predictions=masked_lm_predictions, weights=masked_lm_weights) masked_lm_mean_loss = tf.metrics.mean( values=masked_lm_example_loss, weights=masked_lm_weights) next_sentence_log_probs = tf.reshape( next_sentence_log_probs, [-1, next_sentence_log_probs.shape[-1]]) next_sentence_predictions = tf.argmax( next_sentence_log_probs, axis=-1, output_type=tf.int32) next_sentence_labels = tf.reshape(next_sentence_labels, [-1]) next_sentence_accuracy = tf.metrics.accuracy( labels=next_sentence_labels, predictions=next_sentence_predictions) next_sentence_mean_loss = tf.metrics.mean( 
values=next_sentence_example_loss) return { "masked_lm_accuracy": masked_lm_accuracy, "masked_lm_loss": masked_lm_mean_loss, "next_sentence_accuracy": next_sentence_accuracy, "next_sentence_loss": next_sentence_mean_loss, } # next_sentence_example_loss=0.0 TODO # next_sentence_log_probs=0.0 # TODO eval_metrics = (metric_fn, [ masked_lm_example_loss, masked_lm_log_probs, masked_lm_ids, masked_lm_weights, next_sentence_example_loss, next_sentence_log_probs, next_sentence_labels ]) output_spec = tf.contrib.tpu.TPUEstimatorSpec( mode=mode, loss=total_loss, eval_metrics=eval_metrics, scaffold_fn=scaffold_fn) else: raise ValueError("Only TRAIN and EVAL modes are supported: %s" % (mode)) return output_spec return model_fn def get_masked_lm_output(bert_config, input_tensor, output_weights, positions, label_ids, label_weights): """Get loss and log probs for the masked LM.""" input_tensor = gather_indexes(input_tensor, positions) with tf.variable_scope("cls/predictions"): # We apply one more non-linear transformation before the output layer. # This matrix is not used after pre-training. with tf.variable_scope("transform"): input_tensor = tf.layers.dense( input_tensor, units=bert_config.hidden_size, activation=modeling.get_activation(bert_config.hidden_act), kernel_initializer=modeling.create_initializer( bert_config.initializer_range)) input_tensor = modeling.layer_norm(input_tensor) # The output weights are the same as the input embeddings, but there is # an output-only bias for each token. 
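    # Shape sketch for the projection below (symbols are illustrative: B is
    # the batch size, P is max_predictions_per_seq, H is hidden_size, V is
    # vocab_size; this assumes the embedding width equals hidden_size, as in
    # the stock BERT modeling.py). After gather_indexes, input_tensor is
    # [B * P, H]; output_weights is the embedding table, [V, H]; so
    # matmul(input_tensor, output_weights, transpose_b=True) yields logits of
    # shape [B * P, V], and output_bias ([V]) is broadcast across rows.
    # With this file's flag defaults on a single CPU/GPU device
    # (train_batch_size=32, max_predictions_per_seq=20) that is
    # 32 * 20 = 640 rows per step.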
output_bias = tf.get_variable( "output_bias", shape=[bert_config.vocab_size], initializer=tf.zeros_initializer()) logits = tf.matmul(input_tensor, output_weights, transpose_b=True) logits = tf.nn.bias_add(logits, output_bias) log_probs = tf.nn.log_softmax(logits, axis=-1) label_ids = tf.reshape(label_ids, [-1]) label_weights = tf.reshape(label_weights, [-1]) one_hot_labels = tf.one_hot(label_ids, depth=bert_config.vocab_size, dtype=tf.float32) # The `positions` tensor might be zero-padded (if the sequence is too # short to have the maximum number of predictions). The `label_weights` # tensor has a value of 1.0 for every real prediction and 0.0 for the # padding predictions. per_example_loss = -tf.reduce_sum(log_probs * one_hot_labels, axis=[-1]) numerator = tf.reduce_sum(label_weights * per_example_loss) denominator = tf.reduce_sum(label_weights) + 1e-5 loss = numerator / denominator return (loss, per_example_loss, log_probs) def get_next_sentence_output(bert_config, input_tensor, labels): """Get loss and log probs for the next sentence prediction.""" # Simple binary classification. Note that 0 is "next sentence" and 1 is # "random sentence". This weight matrix is not used after pre-training. 
with tf.variable_scope("cls/seq_relationship"): output_weights = tf.get_variable( "output_weights", shape=[2, bert_config.hidden_size], initializer=modeling.create_initializer(bert_config.initializer_range)) output_bias = tf.get_variable( "output_bias", shape=[2], initializer=tf.zeros_initializer()) logits = tf.matmul(input_tensor, output_weights, transpose_b=True) logits = tf.nn.bias_add(logits, output_bias) log_probs = tf.nn.log_softmax(logits, axis=-1) labels = tf.reshape(labels, [-1]) one_hot_labels = tf.one_hot(labels, depth=2, dtype=tf.float32) per_example_loss = -tf.reduce_sum(one_hot_labels * log_probs, axis=-1) loss = tf.reduce_mean(per_example_loss) return (loss, per_example_loss, log_probs) def gather_indexes(sequence_tensor, positions): """Gathers the vectors at the specific positions over a minibatch.""" sequence_shape = modeling.get_shape_list(sequence_tensor, expected_rank=3) batch_size = sequence_shape[0] seq_length = sequence_shape[1] width = sequence_shape[2] flat_offsets = tf.reshape( tf.range(0, batch_size, dtype=tf.int32) * seq_length, [-1, 1]) flat_positions = tf.reshape(positions + flat_offsets, [-1]) flat_sequence_tensor = tf.reshape(sequence_tensor, [batch_size * seq_length, width]) output_tensor = tf.gather(flat_sequence_tensor, flat_positions) return output_tensor def input_fn_builder(input_files, max_seq_length, max_predictions_per_seq, is_training, num_cpu_threads=4): """Creates an `input_fn` closure to be passed to TPUEstimator.""" def input_fn(params): """The actual input function.""" batch_size = params["batch_size"] name_to_features = { "input_ids": tf.FixedLenFeature([max_seq_length], tf.int64), "input_mask": tf.FixedLenFeature([max_seq_length], tf.int64), "segment_ids": tf.FixedLenFeature([max_seq_length], tf.int64), "masked_lm_positions": tf.FixedLenFeature([max_predictions_per_seq], tf.int64), "masked_lm_ids": tf.FixedLenFeature([max_predictions_per_seq], tf.int64), "masked_lm_weights": 
            tf.FixedLenFeature([max_predictions_per_seq], tf.float32),
        "next_sentence_labels":
            tf.FixedLenFeature([1], tf.int64),
    }

    # For training, we want a lot of parallel reading and shuffling.
    # For eval, we want no shuffling and parallel reading doesn't matter.
    if is_training:
      d = tf.data.Dataset.from_tensor_slices(tf.constant(input_files))
      d = d.repeat()
      d = d.shuffle(buffer_size=len(input_files))

      # `cycle_length` is the number of parallel files that get read.
      cycle_length = min(num_cpu_threads, len(input_files))

      # `sloppy` mode means that the interleaving is not exact. This adds
      # even more randomness to the training pipeline.
      d = d.apply(
          tf.contrib.data.parallel_interleave(
              tf.data.TFRecordDataset,
              sloppy=is_training,
              cycle_length=cycle_length))
      d = d.shuffle(buffer_size=100)
    else:
      d = tf.data.TFRecordDataset(input_files)
      # Since we evaluate for a fixed number of steps we don't want to encounter
      # out-of-range exceptions.
      d = d.repeat()

    # We must `drop_remainder` on training because the TPU requires fixed
    # size dimensions. For eval, the stated intent is to evaluate on the CPU
    # or GPU and *not* drop the remainder, so that every sample is covered.
    # Note that the call below passes `drop_remainder=True` in both modes;
    # because the eval dataset is repeated, its batches are always full.
    d = d.apply(
        tf.contrib.data.map_and_batch(
            lambda record: _decode_record(record, name_to_features),
            batch_size=batch_size,
            num_parallel_batches=num_cpu_threads,
            drop_remainder=True))
    return d

  return input_fn


def _decode_record(record, name_to_features):
  """Decodes a record to a TensorFlow example."""
  example = tf.parse_single_example(record, name_to_features)

  # tf.Example only supports tf.int64, but the TPU only supports tf.int32.
  # So cast all int64 to int32.
  for name in list(example.keys()):
    t = example[name]
    if t.dtype == tf.int64:
      t = tf.to_int32(t)
    example[name] = t

  return example


def main(_):
  tf.logging.set_verbosity(tf.logging.INFO)

  # The run must be a training run, an evaluation run, or both.
  if not FLAGS.do_train and not FLAGS.do_eval:
    raise ValueError("At least one of `do_train` or `do_eval` must be True.")

  # Load the model configuration from the JSON file.
  bert_config = modeling.BertConfig.from_json_file(FLAGS.bert_config_file)

  tf.gfile.MakeDirs(FLAGS.output_dir)

  # The input may be several files separated by commas, or a glob pattern
  # such as "input_x*".
  input_files = []
  for input_pattern in FLAGS.input_file.split(","):
    input_files.extend(tf.gfile.Glob(input_pattern))

  tf.logging.info("*** Input Files ***")
  for input_file in input_files:
    tf.logging.info(" %s" % input_file)

  # Only build a TPU cluster resolver when a TPU is actually requested, so
  # that the CPU/GPU fallback described below keeps working.
  tpu_cluster_resolver = None
  if FLAGS.use_tpu and FLAGS.tpu_name:
    tpu_cluster_resolver = tf.contrib.cluster_resolver.TPUClusterResolver(  # TODO
        tpu=FLAGS.tpu_name, zone=FLAGS.tpu_zone, project=FLAGS.gcp_project)

  print("###tpu_cluster_resolver:", tpu_cluster_resolver,
        ";FLAGS.use_tpu:", FLAGS.use_tpu,
        ";FLAGS.tpu_name:", FLAGS.tpu_name,
        ";FLAGS.tpu_zone:", FLAGS.tpu_zone)
  # Sample output:
  # ###tpu_cluster_resolver: ;FLAGS.use_tpu: True ;FLAGS.tpu_name: grpc://10.240.1.83:8470

  is_per_host = tf.contrib.tpu.InputPipelineConfig.PER_HOST_V2
  run_config = tf.contrib.tpu.RunConfig(
      keep_checkpoint_max=20,  # 10
      cluster=tpu_cluster_resolver,
      master=FLAGS.master,
      model_dir=FLAGS.output_dir,
      save_checkpoints_steps=FLAGS.save_checkpoints_steps,
      tpu_config=tf.contrib.tpu.TPUConfig(
          iterations_per_loop=FLAGS.iterations_per_loop,
          num_shards=FLAGS.num_tpu_cores,
          per_host_input_for_training=is_per_host))

  model_fn = model_fn_builder(
      bert_config=bert_config,
      init_checkpoint=FLAGS.init_checkpoint,
      learning_rate=FLAGS.learning_rate,
      num_train_steps=FLAGS.num_train_steps,
      num_warmup_steps=FLAGS.num_warmup_steps,
      use_tpu=FLAGS.use_tpu,
      use_one_hot_embeddings=FLAGS.use_tpu)

  # If TPU is not available, this will fall back to normal Estimator on CPU
  # or GPU.
estimator = tf.contrib.tpu.TPUEstimator( use_tpu=FLAGS.use_tpu, model_fn=model_fn, config=run_config, train_batch_size=FLAGS.train_batch_size, eval_batch_size=FLAGS.eval_batch_size) if FLAGS.do_train: tf.logging.info("***** Running training *****") tf.logging.info(" Batch size = %d", FLAGS.train_batch_size) train_input_fn = input_fn_builder( input_files=input_files, max_seq_length=FLAGS.max_seq_length, max_predictions_per_seq=FLAGS.max_predictions_per_seq, is_training=True) estimator.train(input_fn=train_input_fn, max_steps=FLAGS.num_train_steps) if FLAGS.do_eval: tf.logging.info("***** Running evaluation *****") tf.logging.info(" Batch size = %d", FLAGS.eval_batch_size) eval_input_fn = input_fn_builder( input_files=input_files, max_seq_length=FLAGS.max_seq_length, max_predictions_per_seq=FLAGS.max_predictions_per_seq, is_training=False) result = estimator.evaluate(input_fn=eval_input_fn, steps=FLAGS.max_eval_steps) output_eval_file = os.path.join(FLAGS.output_dir, "eval_results.txt") with tf.gfile.GFile(output_eval_file, "w") as writer: tf.logging.info("***** Eval results *****") for key in sorted(result.keys()): tf.logging.info(" %s = %s", key, str(result[key])) writer.write("%s = %s\n" % (key, str(result[key]))) if __name__ == "__main__": flags.mark_flag_as_required("input_file") flags.mark_flag_as_required("bert_config_file") flags.mark_flag_as_required("output_dir") tf.app.run() ================================================ FILE: tokenization.py ================================================ # coding=utf-8 # Copyright 2018 The Google AI Language Team Authors. # # Licensed under the Apache License, Version 2.0 (the "License"); # you may not use this file except in compliance with the License. 
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
"""Tokenization classes."""

from __future__ import absolute_import
from __future__ import division
from __future__ import print_function

import collections
import re
import unicodedata
import six
import tensorflow as tf


def validate_case_matches_checkpoint(do_lower_case, init_checkpoint):
  """Checks whether the casing config is consistent with the checkpoint name."""

  # The casing has to be passed in by the user and there is no explicit check
  # as to whether it matches the checkpoint. The casing information probably
  # should have been stored in the bert_config.json file, but it's not, so
  # we have to heuristically detect it to validate.

  if not init_checkpoint:
    return

  # Extract the directory name that directly contains bert_model.ckpt.
  m = re.match("^.*?([A-Za-z0-9_-]+)/bert_model.ckpt", init_checkpoint)
  if m is None:
    return

  model_name = m.group(1)

  lower_models = [
      "uncased_L-24_H-1024_A-16", "uncased_L-12_H-768_A-12",
      "multilingual_L-12_H-768_A-12", "chinese_L-12_H-768_A-12"
  ]

  cased_models = [
      "cased_L-12_H-768_A-12", "cased_L-24_H-1024_A-16",
      "multi_cased_L-12_H-768_A-12"
  ]

  is_bad_config = False
  if model_name in lower_models and not do_lower_case:
    is_bad_config = True
    actual_flag = "False"
    case_name = "lowercased"
    opposite_flag = "True"

  if model_name in cased_models and do_lower_case:
    is_bad_config = True
    actual_flag = "True"
    case_name = "cased"
    opposite_flag = "False"

  if is_bad_config:
    raise ValueError(
        "You passed in `--do_lower_case=%s` with `--init_checkpoint=%s`. "
        "However, `%s` seems to be a %s model, so you "
        "should pass in `--do_lower_case=%s` so that the fine-tuning matches "
        "how the model was pre-trained. 
If this error is wrong, please "
        "just comment out this check." % (actual_flag, init_checkpoint,
                                          model_name, case_name, opposite_flag))


def convert_to_unicode(text):
  """Converts `text` to Unicode (if it's not already), assuming utf-8 input."""
  if six.PY3:
    if isinstance(text, str):
      return text
    elif isinstance(text, bytes):
      return text.decode("utf-8", "ignore")
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  elif six.PY2:
    if isinstance(text, str):
      return text.decode("utf-8", "ignore")
    elif isinstance(text, unicode):
      return text
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  else:
    raise ValueError("Not running on Python2 or Python 3?")


def printable_text(text):
  """Returns text encoded in a way suitable for print or `tf.logging`."""

  # These functions want `str` for both Python2 and Python3, but in one case
  # it's a Unicode string and in the other it's a byte string.
  if six.PY3:
    if isinstance(text, str):
      return text
    elif isinstance(text, bytes):
      return text.decode("utf-8", "ignore")
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  elif six.PY2:
    if isinstance(text, str):
      return text
    elif isinstance(text, unicode):
      return text.encode("utf-8")
    else:
      raise ValueError("Unsupported string type: %s" % (type(text)))
  else:
    raise ValueError("Not running on Python2 or Python 3?")


def load_vocab(vocab_file):
  """Loads a vocabulary file into a dictionary."""
  vocab = collections.OrderedDict()
  index = 0
  with tf.gfile.GFile(vocab_file, "r") as reader:
    while True:
      token = convert_to_unicode(reader.readline())
      if not token:
        break
      token = token.strip()
      vocab[token] = index
      index += 1
  return vocab


def convert_by_vocab(vocab, items):
  """Converts a sequence of [tokens|ids] using the vocab."""
  # Example `items` from a debug run (truncated):
  # ['[CLS]', '日', '##期', '，', '但', '被', '##告', ...]
  output = []
  for item in items:
    output.append(vocab[item])
  return output


def convert_tokens_to_ids(vocab, tokens):
  return convert_by_vocab(vocab, tokens)


def convert_ids_to_tokens(inv_vocab, ids):
  return convert_by_vocab(inv_vocab, ids)


def whitespace_tokenize(text):
  """Runs basic whitespace cleaning and splitting on a piece of text."""
  text = text.strip()
  if not text:
    return []
  tokens = text.split()
  return tokens


class FullTokenizer(object):
  """Runs end-to-end tokenization."""

  def __init__(self, vocab_file, do_lower_case=True):
    self.vocab = load_vocab(vocab_file)
    self.inv_vocab = {v: k for k, v in self.vocab.items()}
    self.basic_tokenizer = BasicTokenizer(do_lower_case=do_lower_case)
    self.wordpiece_tokenizer = WordpieceTokenizer(vocab=self.vocab)

  def tokenize(self, text):
    split_tokens = []
    for token in self.basic_tokenizer.tokenize(text):
      for sub_token in self.wordpiece_tokenizer.tokenize(token):
        split_tokens.append(sub_token)

    return split_tokens

  def convert_tokens_to_ids(self, tokens):
    return convert_by_vocab(self.vocab, tokens)

  def convert_ids_to_tokens(self, ids):
    return convert_by_vocab(self.inv_vocab, ids)


class BasicTokenizer(object):
  """Runs basic tokenization (punctuation splitting, lower casing, etc.)."""

  def __init__(self, do_lower_case=True):
    """Constructs a BasicTokenizer.
Args: do_lower_case: Whether to lower case the input. """ self.do_lower_case = do_lower_case def tokenize(self, text): """Tokenizes a piece of text.""" text = convert_to_unicode(text) text = self._clean_text(text) # This was added on November 1st, 2018 for the multilingual and Chinese # models. This is also applied to the English models now, but it doesn't # matter since the English models were not trained on any Chinese data # and generally don't have any Chinese data in them (there are Chinese # characters in the vocabulary because Wikipedia does have some Chinese # words in the English Wikipedia.). text = self._tokenize_chinese_chars(text) orig_tokens = whitespace_tokenize(text) split_tokens = [] for token in orig_tokens: if self.do_lower_case: token = token.lower() token = self._run_strip_accents(token) split_tokens.extend(self._run_split_on_punc(token)) output_tokens = whitespace_tokenize(" ".join(split_tokens)) return output_tokens def _run_strip_accents(self, text): """Strips accents from a piece of text.""" text = unicodedata.normalize("NFD", text) output = [] for char in text: cat = unicodedata.category(char) if cat == "Mn": continue output.append(char) return "".join(output) def _run_split_on_punc(self, text): """Splits punctuation on a piece of text.""" chars = list(text) i = 0 start_new_word = True output = [] while i < len(chars): char = chars[i] if _is_punctuation(char): output.append([char]) start_new_word = True else: if start_new_word: output.append([]) start_new_word = False output[-1].append(char) i += 1 return ["".join(x) for x in output] def _tokenize_chinese_chars(self, text): """Adds whitespace around any CJK character.""" output = [] for char in text: cp = ord(char) if self._is_chinese_char(cp): output.append(" ") output.append(char) output.append(" ") else: output.append(char) return "".join(output) def _is_chinese_char(self, cp): """Checks whether CP is the codepoint of a CJK character.""" # This defines a "chinese character" as anything 
in the CJK Unicode block: # https://en.wikipedia.org/wiki/CJK_Unified_Ideographs_(Unicode_block) # # Note that the CJK Unicode block is NOT all Japanese and Korean characters, # despite its name. The modern Korean Hangul alphabet is a different block, # as is Japanese Hiragana and Katakana. Those alphabets are used to write # space-separated words, so they are not treated specially and handled # like the all of the other languages. if ((cp >= 0x4E00 and cp <= 0x9FFF) or # (cp >= 0x3400 and cp <= 0x4DBF) or # (cp >= 0x20000 and cp <= 0x2A6DF) or # (cp >= 0x2A700 and cp <= 0x2B73F) or # (cp >= 0x2B740 and cp <= 0x2B81F) or # (cp >= 0x2B820 and cp <= 0x2CEAF) or (cp >= 0xF900 and cp <= 0xFAFF) or # (cp >= 0x2F800 and cp <= 0x2FA1F)): # return True return False def _clean_text(self, text): """Performs invalid character removal and whitespace cleanup on text.""" output = [] for char in text: cp = ord(char) if cp == 0 or cp == 0xfffd or _is_control(char): continue if _is_whitespace(char): output.append(" ") else: output.append(char) return "".join(output) class WordpieceTokenizer(object): """Runs WordPiece tokenziation.""" def __init__(self, vocab, unk_token="[UNK]", max_input_chars_per_word=200): self.vocab = vocab self.unk_token = unk_token self.max_input_chars_per_word = max_input_chars_per_word def tokenize(self, text): """Tokenizes a piece of text into its word pieces. This uses a greedy longest-match-first algorithm to perform tokenization using the given vocabulary. For example: input = "unaffable" output = ["un", "##aff", "##able"] Args: text: A single token or whitespace separated tokens. This should have already been passed through `BasicTokenizer. Returns: A list of wordpiece tokens. 
""" text = convert_to_unicode(text) output_tokens = [] for token in whitespace_tokenize(text): chars = list(token) if len(chars) > self.max_input_chars_per_word: output_tokens.append(self.unk_token) continue is_bad = False start = 0 sub_tokens = [] while start < len(chars): end = len(chars) cur_substr = None while start < end: substr = "".join(chars[start:end]) if start > 0: substr = "##" + substr if substr in self.vocab: cur_substr = substr break end -= 1 if cur_substr is None: is_bad = True break sub_tokens.append(cur_substr) start = end if is_bad: output_tokens.append(self.unk_token) else: output_tokens.extend(sub_tokens) return output_tokens def _is_whitespace(char): """Checks whether `chars` is a whitespace character.""" # \t, \n, and \r are technically contorl characters but we treat them # as whitespace since they are generally considered as such. if char == " " or char == "\t" or char == "\n" or char == "\r": return True cat = unicodedata.category(char) if cat == "Zs": return True return False def _is_control(char): """Checks whether `chars` is a control character.""" # These are technically control characters but we count them as whitespace # characters. if char == "\t" or char == "\n" or char == "\r": return False cat = unicodedata.category(char) if cat in ("Cc", "Cf"): return True return False def _is_punctuation(char): """Checks whether `chars` is a punctuation character.""" cp = ord(char) # We treat all non-letter/number ASCII as punctuation. # Characters such as "^", "$", and "`" are not in the Unicode # Punctuation class but we treat them as punctuation anyways, for # consistency. if ((cp >= 33 and cp <= 47) or (cp >= 58 and cp <= 64) or (cp >= 91 and cp <= 96) or (cp >= 123 and cp <= 126)): return True cat = unicodedata.category(char) if cat.startswith("P"): return True return False