[
  {
    "path": "LICENSE",
    "content": "BSD 3-Clause License\n\nCopyright (c) 2017, Xiang Zhang\nAll rights reserved.\n\nRedistribution and use in source and binary forms, with or without\nmodification, are permitted provided that the following conditions are met:\n\n* Redistributions of source code must retain the above copyright notice, this\n  list of conditions and the following disclaimer.\n\n* Redistributions in binary form must reproduce the above copyright notice,\n  this list of conditions and the following disclaimer in the documentation\n  and/or other materials provided with the distribution.\n\n* Neither the name of the copyright holder nor the names of its\n  contributors may be used to endorse or promote products derived from\n  this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS IS\"\nAND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE\nDISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE\nFOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL\nDAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR\nSERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER\nCAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,\nOR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\nOF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n"
  },
  {
    "path": "README.md",
    "content": "# Glyph\n\nThis repository publishes all the code used for the following article:\n\n[Xiang Zhang, Yann LeCun, Which Encoding is the Best for Text Classification in Chinese, English, Japanese and Korean?, arXiv 1708.02657](https://arxiv.org/abs/1708.02657)\n\nThe code and datasets have been fully released as of January 2018, including all the code for crawling, preprocessing and training on the datasets. However, the documentation may not be complete yet. In the meantime, readers can refer to the `doc` directory for an example of reproducing all the results for the Dianping dataset, and extend it to other datasets in similar ways.\n\n## Reproducibility Manifesto\n\nFor every number in our paper, there is a script one can execute to reproduce it. No user should be burdened with figuring out any experimental parameter buried in the paper's text.\n\n## Datasets\n\nThe `data` directory contains the preprocessing scripts for all the datasets used in the paper. These datasets are released separately from their processing source code. See below for details.\n\n### Summary\n\nThe following table summarizes the datasets. 
Most of them have millions of samples for training.\n\n| Dataset        | Language     | Classes | Train      | Test      |\n|----------------|--------------|---------|------------|-----------|\n| Dianping       | Chinese      | 2       | 2,000,000  | 500,000   |\n| JD full        | Chinese      | 5       | 3,000,000  | 250,000   |\n| JD binary      | Chinese      | 2       | 4,000,000  | 360,000   |\n| Rakuten full   | Japanese     | 5       | 4,000,000  | 500,000   |\n| Rakuten binary | Japanese     | 2       | 3,400,000  | 400,000   |\n| 11st full      | Korean       | 5       | 750,000    | 100,000   |\n| 11st binary    | Korean       | 2       | 4,000,000  | 400,000   |\n| Amazon full    | English      | 5       | 3,000,000  | 650,000   |\n| Amazon binary  | English      | 2       | 3,600,000  | 400,000   |\n| Ifeng          | Chinese      | 5       | 800,000    | 50,000    |\n| Chinanews      | Chinese      | 7       | 1,400,000  | 112,000   |\n| NYTimes        | English      | 7       | 1,400,000  | 105,000   |\n| Joint full     | Multilingual | 5       | 10,750,000 | 1,500,000 |\n| Joint binary   | Multilingual | 2       | 15,000,000 | 1,560,000 |\n\n### Download\n\nDatasets are released separately from the source code via links from Google Drive. 
*These datasets should only be used for the purpose of research*.\n\n| Dataset        | Train                          | Test                          |\n|----------------|--------------------------------|-------------------------------|\n| Dianping       | [Link](https://goo.gl/uKPxyo)  | [Link](https://goo.gl/2QZpLx) |\n| JD full        | [Link](https://goo.gl/u3vsak)  | [Link](https://goo.gl/hLZRky) |\n| JD binary      | [Link](https://goo.gl/ZPj1ip)  | [Link](https://goo.gl/bqiEfP) |\n| Rakuten full   | [Link](https://goo.gl/A7y14i)  | [Link](https://goo.gl/ve4mup) |\n| Rakuten binary | [Link](https://goo.gl/3kYQ2f)  | [Link](https://goo.gl/m8FpeH) |\n| 11st full      | [Link](https://goo.gl/F1oPBX)  | [Link](https://goo.gl/ZpTLND) |\n| 11st binary    | [Link](https://goo.gl/8Qi7ao)  | [Link](https://goo.gl/nbBhFq) |\n| Amazon full    | [Link](https://goo.gl/UzQWaj)  | [Link](https://goo.gl/EXkzWs) |\n| Amazon binary  | [Link](https://goo.gl/u7AxWS)  | [Link](https://goo.gl/2fft8x) |\n| Ifeng          | [Link](https://goo.gl/AtKsq4)  | [Link](https://goo.gl/tLWojy) |\n| Chinanews      | [Link](https://goo.gl/1p4kdx)  | [Link](https://goo.gl/rxvhCJ) |\n| NYTimes        | [Link](https://goo.gl/2hZeqd)  | [Link](https://goo.gl/66EDa5) |\n| Joint full     | [Link](https://goo.gl/AJfzLC)  | [Link](https://goo.gl/mibMsV) |\n| Joint binary   | [Link](https://goo.gl/YLMqNe)  | [Link](https://goo.gl/WRXQuJ) |\n\n## GNU Unifont\n\nThe `glyphnet` scripts require the GNU Unifont character images to run. The file `unifont-8.0.01.t7b.xz` can be downloaded via [this link](https://goo.gl/aFxYHq).\n"
  },
  {
    "path": "data/11st/construct_rr.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Korean datasets to Revised Romanization of Korean (RR, MC2000)\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_rr.py -i [input] -o [output]\n'''\n\n# Input file\nINPUT = '../data/11st/sentiment/full_train.csv'\n# Output file\nOUTPUT = '../data/11st/sentiment/full_train_rr.csv'\n\nimport argparse\nimport csv\nimport hanja\nimport unidecode\n\n# Hangul romanization libraries\nfrom hangul_romanize import Transliter\nfrom hangul_romanize.rule import academic\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    transliter = Transliter(academic)\n\n    convertRoman(transliter)\n\ndef romanizeText(transliter, text):\n    text = text.strip()\n    if text != '':\n        hangul_text = hanja.translate(text, 'substitution')\n        return transliter.translit(hangul_text)\n    return text\n\n# Convert the Korean text to Revised Romanization\ndef convertRoman(transliter):\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            new_row.append(unidecode.unidecode(romanizeText(\n                        transliter, row[i])).strip().replace('\\n','\\\\n'))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed 
lines: {}'.format(n))\n    ifd.close()\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/11st/create_post.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of posts\nCopyright 2016 Xiang Zhang\n\nUsage: python3 create_post.py -i [input file pattern] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/11st/post/*.json.xz'\nOUTPUT = '../data/11st/sentiment/post.csv'\n\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file pattern', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    createData()\n\ndef createData():\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    files = glob.glob(INPUT)\n    n = 0\n    filecount = 0\n    for filename in files:\n        filecount = filecount + 1\n        print('Processing file {}/{}: {}. Processed items {}.'.format(\n                filecount, len(files), filename, n))\n        try:\n            ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n            for line in ifd:\n                review = json.loads(line)\n                star = review.get('star', '')\n                title = review.get('title', '')\n                content = review.get('content', '')\n                if star != '':\n                    n = n + 1\n                    writer.writerow([star, title.replace('\\n', '\\\\n'),\n                                     content.replace('\\n', '\\\\n')])\n            ifd.close()\n        except Exception as e:\n            print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/11st/create_review.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of reviews\nCopyright 2016 Xiang Zhang\n\nUsage: python3 create_review.py -i [input file pattern] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/11st/review/*.json.xz'\nOUTPUT = '../data/11st/sentiment/review.csv'\n\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file pattern', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    createData()\n\ndef createData():\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    files = glob.glob(INPUT)\n    n = 0\n    filecount = 0\n    for filename in files:\n        filecount = filecount + 1\n        print('Processing file {}/{}: {}. Processed items {}.'.format(\n                filecount, len(files), filename, n))\n        try:\n            ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n            for line in ifd:\n                review = json.loads(line)\n                star = review.get('star', '')\n                title = review.get('title', '')\n                content = review.get('content', '')\n                if star != '':\n                    n = n + 1\n                    writer.writerow([star, title.replace('\\n', '\\\\n'),\n                                     content.replace('\\n', '\\\\n')])\n            ifd.close()\n        except Exception as e:\n            print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/11st/segment_rr_word.lua",
    "content": "--[[\nCreate romanized word data from romanized data in csv for Korean\nCopyright 2016 Xiang Zhang\n\nUsage: th segment_rr_word.lua [input] [output] [list] [read]\n--]]\n\nlocal ffi = require('ffi')\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/11st/sentiment/full_train_rr.csv'\n   local output = arg[2] or '../data/11st/sentiment/full_train_rr_word.csv'\n   local list = arg[3] or '../data/11st/sentiment/full_train_rr_word_list.csv'\n   local read = (arg[4] == 'true')\n\n   local word_index, word_total\n   if read then\n      print('Reading word index')\n      word_index, word_total = joe.readWords(list)\n   else\n      print('Counting words')\n      local word_count, word_freq = joe.splitWords(input)\n      print('Sorting words by count')\n      word_index, word_total = joe.sortWords(list, word_count, word_freq)\n   end\n\n   print('Constructing word index output')\n   joe.constructWords(input, output, word_index, word_total)\nend\n\nfunction joe.readWords(list)\n   local word_index = tds.Hash()\n   local fd = io.open(list)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: '..n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      content[1] = content[1]:gsub('\\\\n', '\\n')\n      word_index[content[1]] = n\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n   return word_index, n\nend\n\nfunction joe.splitWords(input)\n   local word_count, word_freq = tds.Hash(), tds.Hash()\n   local fd = io.open(input)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local field_set 
= {}\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         -- All punctuation characters except for hyphen \"-\"\n         content[i] = content[i]:gsub(\n            '([!\"#$%%&\\'()*+,./:;<=>?@%[\\\\%]^_`{|}~])', ' %1 ')\n         for word in content[i]:gmatch('[%S]+') do\n            word_count[word] = (word_count[word] or 0) + 1\n            if not field_set[word] then\n               field_set[word] = true\n               word_freq[word] = (word_freq[word] or 0) + 1\n            end\n         end\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   -- Normalizing word frequencies\n   for key, value in pairs(word_freq) do\n      word_freq[key] = value / n\n   end\n\n   return word_count, word_freq\nend\n\nfunction joe.sortWords(list, word_count, word_freq)\n   -- Sort the list of words\n   word_list = tds.Vec()\n   for word, _ in pairs(word_count) do\n      word_list[#word_list + 1] = word\n   end\n   word_list:sort(function (w, v) return word_count[w] > word_count[v] end)\n\n   -- Create the word index\n   word_index = tds.Hash()\n   for index, word in ipairs(word_list) do\n      word_index[word] = index\n   end\n\n   -- Write it to file\n   fd = io.open(list, 'w')\n   for index, word in ipairs(word_list) do\n      fd:write('\"', word:gsub(\"\\n\", \"\\\\n\"):gsub(\"\\\"\", \"\\\"\\\"\"), '\",\"',\n               word_count[word], '\",\"', word_freq[word], '\"\\n')\n   end\n\n   return word_index, #word_list\nend\n\nfunction joe.constructWords(input, output, word_index, word_total)\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n   local n = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n\n      ofd:write('\"', content[1], '\"')\n      for i = 2, #content do\n         content[i] 
= content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         -- All punctuation characters except for hyphen \"-\"\n         content[i] = content[i]:gsub(\n            '([!\"#$%%&\\'()*+,./:;<=>?@%[\\\\%]^_`{|}~])', ' %1 ')\n         local first_write = true\n         ofd:write(',\"')\n         for word in content[i]:gmatch('[%S]+') do\n            local index = word_index[word] or word_total + 1\n            if first_write then\n               first_write = false\n               ofd:write(index)\n            else\n               ofd:write(' ', index)\n            end\n         end\n         ofd:write('\"')\n      end\n\n      ofd:write('\\n')\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no 
separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/11st/segment_word.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Korean datasets to Index of Words\nCopyright 2016 Xiang Zhang\n\nUsage: python3 segment_word.py -i [input] -l [list] -o [output] [-r]\n'''\n\n# Input file\nINPUT = '../data/11st/sentiment/full_train.csv'\n# Output file\nOUTPUT = '../data/11st/sentiment/full_train_word.csv'\n# List file\nLIST = '../data/11st/sentiment/full_train_word_list.csv'\n# Read already defined word list\nREAD = False\n\n# Korean dictionary path for MeCab\nMECAB_DICT_PATH = '/home/xiang/.usr/lib/mecab/dic/mecab-ko-dic'\n\nimport argparse\nimport csv\nfrom konlpy.tag import Mecab\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n    global LIST\n    global READ\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n    parser.add_argument('-l', '--list', help = 'Word list file', default = LIST)\n    parser.add_argument(\n        '-r', '--read', help = 'Read from list file', action = 'store_true')\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n    LIST = args.list\n    READ = args.read\n\n    if READ:\n        print('Reading word index')\n        word_index = readWords()\n    else:\n        print('Counting words')\n        word_count, word_freq = segmentWords()\n        print('Sorting words by count')\n        word_index = sortWords(word_count, word_freq)\n    print('Constructing word index output')\n    convertWords(word_index)\n\n# Read from pre-existing word list\ndef readWords():\n    # Open the files\n    ifd = open(LIST, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_index = dict()\n    n = 0\n    for row in reader:\n        word = row[0].replace('\\\\n', '\\n')\n        word_index[word] = n + 1\n        n = n + 1\n        if n % 1000 == 0:\n            
print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    return word_index\n\n# Segment the Korean text into words\ndef segmentWords():\n    mecab = Mecab(MECAB_DICT_PATH)\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_count = dict()\n    word_freq = dict()\n    n = 0\n    for row in reader:\n        field_set = set()\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = mecab.morphs(field)\n            for word in field_list:\n                word_count[word] = word_count.get(word, 0) + 1\n                if word not in field_set:\n                    field_set.add(word)\n                    word_freq[word] = word_freq.get(word, 0) + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    # Normalizing word frequency\n    for word in word_freq:\n        word_freq[word] = float(word_freq[word]) / float(n)\n    return word_count, word_freq\n\n# Sort words for a given count dictionary object\ndef sortWords(word_count, word_freq):\n    # Sort the words\n    word_list = sorted(\n        word_count, key = lambda word: word_count[word], reverse = True)\n    # Open the files\n    ofd = open(LIST, 'w', encoding = 'utf-8', newline = '')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over all the words\n    word_index = dict()\n    n = 0\n    for i in range(len(word_list)):\n        word = word_list[i]\n        row = [word.replace('\\n', '\\\\n'), str(word_count[word]),\n               str(word_freq[word])]\n        writer.writerow(row)\n        word_index[word] = i + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing word: {}'.format(n), end = '')\n    
print('\\rProcessed words: {}'.format(n))\n    ofd.close()\n    return word_index\n\n# Convert the Korean text to lists of word indices\ndef convertWords(word_index):\n    mecab = Mecab(MECAB_DICT_PATH)\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = mecab.morphs(field)\n            new_row.append(' '.join(map(\n                str, map(lambda word: word_index.get(word, len(word_index) + 1),\n                         field_list))))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/README.md",
    "content": "# Datasets\n\nThis directory contains the preprocessing scripts for all the datasets used in the paper. These datasets are released separately from their processing source code.\n"
  },
  {
    "path": "data/chinanews/construct_topic.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of news articles\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_topic.py -i [input directory] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/chinanews/article'\nOUTPUT = '../data/chinanews/topic/news.csv'\nCATEGORY_FILE = '../data/chinanews/category/category.json'\n\ndef main():\n    global INPUT\n    global OUTPUT\n    global CATEGORY_FILE\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file directory', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n    parser.add_argument(\n        '-c', '--category', help = 'Category file', default = CATEGORY_FILE)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n    CATEGORY_FILE = args.category\n\n    createData()\n\ndef createData():\n    # Open the category file\n    classes = dict()\n    cfd = open(CATEGORY_FILE, encoding = 'utf-8')\n    i = 1\n    for line in cfd:\n        category = json.loads(line)\n        classes[category['code']] = i\n        i = i + 1\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    for prefix in classes:\n        files = glob.glob(INPUT + '/' + prefix + '_*.json.xz')\n        index = classes[prefix]\n        n = 0\n        filecount = 0\n        for filename in files:\n            filecount = filecount + 1\n            print('Processing file {}/{}: {}. 
Processed items {}.'.format(\n                    filecount, len(files), filename, n))\n            try:\n                ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n                for line in ifd:\n                    news = json.loads(line)\n                    title = news.get('title', '')\n                    content = news.get('content', list())\n                    abstract = ''\n                    if len(content) > 0:\n                        abstract = content[0]\n                    n = n + 1\n                    writer.writerow([index, title.replace('\\n', '\\\\n'),\n                                     abstract.replace('\\n', '\\\\n')])\n                ifd.close()\n            except Exception as e:\n                print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/data/README.txt",
    "content": "This directory should contain training and testing datasets.\n"
  },
  {
    "path": "data/dianping/combine_gram_count.lua",
    "content": "--[[\nCombine sorted gram counts\nCopyright 2016 Xiang Zhang\n\nUsage: th combine_gram_count.lua [input_prefix] [output] [samples] [chunks]\n\nComment: This program also outputs lines with counts as the first unquoted csv\n   value, so that one can use GNU sort easily.\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal string = require('string')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input_prefix = arg[1] or '../data/dianping/train_chargram_count_sort/'\n   local output = arg[2] or '../data/dianping/train_chargram_count_combine.csv'\n   local samples = arg[3] and tonumber(arg[3]) or 2000000\n   local chunks = arg[4] and tonumber(arg[4]) or 100\n\n   print('Combine chunks')\n   joe.combineChunks(input_prefix, output, samples, chunks)\nend\n\nfunction joe.combineChunks(input_prefix, output, samples, chunks)\n   local n = 0\n   local ofd = io.open(output, 'w')\n   local current = {}\n   for i = 1, chunks do\n      local ifd = io.open(input_prefix..i..'.csv')\n      for line in ifd:lines() do\n         n = n + 1\n         if math.fmod(n, 100000) == 0 then\n            io.write('\\rProcessing line ', n)\n            io.flush()\n         end\n         local content = joe.parseCSVLine(line)\n         if current[1] ~= content[1] then\n            if current[1] ~= nil then\n               ofd:write(current[3], ',\"', current[1], '\",\"',\n                         current[2]:gsub('\"', '\"\"'), '\",\"',\n                         current[4] / samples, '\",\"', current[3], '\"\\n')\n            end\n            current = content\n         else\n            current[3] = current[3] + content[3]\n            current[4] = current[4] + content[4]\n         end\n      end\n      ifd:close()\n   end\n   ofd:write(current[3], ',\"', current[1], '\",\"',\n             current[2]:gsub('\"', '\"\"'), '\",\"',\n             current[4] / samples, '\",\"', current[3], '\"\\n')\n   ofd:close()\n   print('\\rProcessed lines: 
'..n)\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_charbag.lua",
    "content": "--[[\nConstruct unicode character bag-of-element format from unicode serialization\nCopyright 2016 Xiang Zhang\n\nUsage: th construct_charbag.lua [input] [output] [list] [read] [limit] [replace]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_code.t7b'\n   local output = arg[2] or '../data/dianping/train_charbag.t7b'\n   local list = arg[3] or '../data/dianping/train_charbag_list.csv'\n   local read = (arg[4] == 'true')\n   local limit = arg[5] and tonumber(arg[5]) or 200000\n   local replace = arg[6] and tonumber(arg[6]) or 200001\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Counting characters')\n   local count, freq = joe.countBag(data, limit, replace)\n   print('Total number of values: '..count)\n\n   if read == true then\n      print('Reading frequency from '..list)\n      freq = joe.readList(list)\n   else\n      print('Outputting frequency list to '..list)\n      joe.writeList(freq, list)\n   end\n\n   print('Constructing character bag data')\n   local bag = joe.constructBag(data, count, limit, replace)\n\n   print('Saving to '..output)\n   torch.save(output, bag)\nend\n\nfunction joe.writeList(freq, list)\n   local fd = io.open(list, 'w')\n   for i = 1, freq:size(1) do\n      local char = (i <= 65536) and joe.utf8str(i - 1) or ''\n      -- Do not print control characters\n      if i < 11 or (i > 11 and i < 33) then\n         char = ''\n      end\n      fd:write('\"', i, '\",\"', char:gsub('\\n', '\\\\n'):gsub('\"', '\"\"'), '\",\"',\n               freq[i], '\"\\n')\n   end\n   fd:close()\nend\n\nfunction joe.readList(list)\n   local freq = {}\n   local fd = io.open(list)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[2] = content[2]:gsub('\\\\n', '\\n')\n      freq[#freq + 1] = tonumber(content[3])\n   end\n   
return torch.Tensor(freq)\nend\n\nfunction joe.countBag(data, limit, replace)\n   local code, code_value = data.code, data.code_value\n\n   local count = 0\n   local freq = torch.zeros(math.max(limit, replace))\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n         end\n         local index = {}\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            for l = 1, code[i][j][k][2] do\n               local char = code_value[code[i][j][k][1] + l - 1]\n               if char > limit then\n                  char = replace\n               end\n               if not index[char] then\n                  count = count + 1\n                  index[char] = 1\n                  freq[char] = freq[char] + 1\n               else\n                  index[char] = index[char] + 1\n               end\n            end\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   -- Normalizing the frequency\n   local sum = 0\n   for i = 1, #code do\n      sum = sum + code[i]:size(1)\n   end\n   freq:div(sum)\n   return count, freq\nend\n\nfunction joe.constructBag(data, count, limit, replace)\n   local code, code_value = data.code, data.code_value\n   local bag = {}\n   local bag_index = torch.LongTensor(count)\n   local bag_value = torch.DoubleTensor(count)\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      bag[i] = torch.LongTensor(code[i]:size(1), 2)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            
io.flush()\n         end\n         local index = {}\n         local pointer = {}\n         bag[i][j][1] = count + 1\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            for l = 1, code[i][j][k][2] do\n               local char = code_value[code[i][j][k][1] + l - 1]\n               if char > limit then\n                  char = replace\n               end\n               if not index[char] then\n                  count = count + 1\n                  index[char] = 1\n                  pointer[#pointer + 1] = char\n               else\n                  index[char] = index[char] + 1\n               end\n            end\n         end\n         table.sort(pointer)\n         bag[i][j][2] = #pointer\n         for m = 1, #pointer do\n            bag_index[bag[i][j][1] + m - 1] = pointer[m]\n            if pointer[m] > limit then\n               bag_value[bag[i][j][1] + m - 1] = 0\n            else\n               bag_value[bag[i][j][1] + m - 1] = index[pointer[m]]\n            end\n         end\n        if #pointer > 0 and\n        bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum() ~= 0 then\n           bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):div(\n              bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum())\n        end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return {bag = bag, bag_index = bag_index, bag_value = bag_value}\nend\n\njoe.bytemarkers = {{0x7FF, 192}, {0xFFFF, 224}, {0x1FFFFF, 240}}\nfunction joe.utf8str(decimal)\n   local bytemarkers = joe.bytemarkers\n   if decimal < 128 then return string.char(decimal) end\n   local charbytes = {}\n   for bytes,vals in ipairs(bytemarkers) do\n      if decimal <= vals[1] then\n        for b = bytes + 1, 2, -1 do\n          local mod = decimal % 64\n          decimal = (decimal - mod) / 64\n          charbytes[b] = string.char(128+mod)\n        end\n        charbytes[1] = string.char(vals[2] + decimal)\n  
      break\n      end\n    end\n   return table.concat(charbytes)\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_chargram.lua",
    "content": "--[[\nConstruct unicode character ngram format from unicode serialization\nCopyright 2016 Xiang Zhang\n\nUsage: th construct_chargram.lua [input] [output] [list] [read] [gram] [limit]\n   [replace]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_code.t7b'\n   local output = arg[2] or '../data/dianping/train_chargram.t7b'\n   local list = arg[3] or '../data/dianping/train_chargram_list.csv'\n   local read = (arg[4] == nil) or (arg[4] == 'true')\n   local gram = arg[5] and tonumber(arg[5]) or 5\n   local limit = arg[6] and tonumber(arg[6]) or 1000000\n   local replace = arg[7] and tonumber(arg[7]) or 1000001\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   local freq, dict, ngrams\n   if read == true then\n      print('Reading frequency from '..list)\n      freq, dict = joe.readList(list)\n   else\n      print('Constructing dictionary and frequency list')\n      freq, dict, ngrams = joe.constructList(data, gram)\n      print('Outputting frequency list to '..list)\n      joe.writeList(freq, ngrams, list)\n   end\n\n   print('Counting character ngrams in data')\n   local count = joe.countBag(data, dict, gram, limit, replace)\n   print('Total number of ngrams in data is '..count)\n\n   print('Constructing character ngram bag data')\n   local bag = joe.constructBag(data, dict, count, gram, limit, replace)\n\n   print('Saving to '..output)\n   torch.save(output, bag)\nend\n\nfunction joe.constructList(data, gram)\n   local count = tds.Hash()\n   local docs = tds.Hash()\n   local code, code_value = data.code, data.code_value\n\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n       
     io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n            collectgarbage()\n         end\n         local index = {}\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            -- Iterate through the grams\n            for n = 1, gram do\n               -- Iterate through the positions\n               for l = 1, code[i][j][k][2] - n + 1 do\n                  local ngram = tostring(code_value[code[i][j][k][1] + l - 1])\n                  for m = 2, n do\n                     ngram = ngram..' '..tostring(\n                        code_value[code[i][j][k][1] + l - 1 + m - 1])\n                  end\n                  if not index[ngram] then\n                     docs[ngram] = (docs[ngram] or 0) + 1\n                     index[ngram] = 0\n                  end\n                  index[ngram] = index[ngram] + 1\n                  count[ngram] = (count[ngram] or 0) + 1\n               end\n            end\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   local ngrams = tds.Vec()\n   for ngram, value in pairs(count) do\n      ngrams[#ngrams + 1] = ngram\n   end\n   ngrams:sort(function(a, b) return count[a] > count[b] end)\n\n   local sum = 0\n   for i = 1, #code do\n      sum = sum + code[i]:size(1)\n   end\n\n   local dict = tds.Hash()\n   local freq = torch.Tensor(#ngrams)\n   for index, ngram in ipairs(ngrams) do\n      dict[ngram] = index\n      freq[index] = (docs[ngram] or 0) / sum\n   end\n\n   return freq, dict, ngrams\nend\n\nfunction joe.writeList(freq, ngrams, list)\n   local fd = io.open(list, 'w')\n   for i = 1, freq:size(1) do\n      local ngram_string = ''\n      for code in ngrams[i]:gmatch('[%S]+') do\n         local code = tonumber(code)\n         local char = (code <= 65536 and (code > 32 or code == 11)) and\n            joe.utf8str(code - 1) or ' '\n         ngram_string = ngram_string..char\n      end\n      
fd:write('\"', ngrams[i], '\",\"',\n               ngram_string:gsub('\\n', '\\\\n'):gsub('\"', '\"\"'), '\",\"',\n               freq[i], '\"\\n')\n   end\nend\n\nfunction joe.readList(list)\n   local freq_table = tds.Vec()\n   local dict = tds.Hash()\n   local fd = io.open(list)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[2] = content[2]:gsub('\\\\n', '\\n')\n      freq_table[#freq_table + 1] = tonumber(content[3])\n      dict[content[1]] = #freq_table\n   end\n\n   local freq = torch.Tensor(#freq_table)\n   for i, v in ipairs(freq_table) do\n      freq[i] = v\n   end\n   return freq, dict\nend\n\nfunction joe.countBag(data, dict, gram, limit, replace)\n   local count = 0\n   local code, code_value = data.code, data.code_value\n\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n            collectgarbage()\n         end\n         local index = {}\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            -- Iterate through the grams\n            for n = 1, gram do\n               -- Iterate through the positions\n               for l = 1, code[i][j][k][2] - n + 1 do\n                  local ngram = tostring(code_value[code[i][j][k][1] + l - 1])\n                  for m = 2, n do\n                     ngram = ngram..' 
'..tostring(\n                        code_value[code[i][j][k][1] + l - 1 + m - 1])\n                  end\n                  local ngram_index = dict[ngram]\n                  if ngram_index == nil or ngram_index > limit then\n                     ngram_index = replace\n                  end\n                  if not index[ngram_index] then\n                     index[ngram_index] = 0\n                     count = count + 1\n                  end\n                  index[ngram_index] = index[ngram_index] + 1\n               end\n            end\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return count\nend\n\nfunction joe.constructBag(data, dict, count, gram, limit, replace)\n   local code, code_value = data.code, data.code_value\n   local bag = {}\n   local bag_index = torch.LongTensor(count)\n   local bag_value = torch.DoubleTensor(count)\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      bag[i] = torch.LongTensor(code[i]:size(1), 2)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n            collectgarbage()\n         end\n         local index = {}\n         local pointer = {}\n         bag[i][j][1] = count + 1\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            -- Iterate through the grams\n            for n = 1, gram do\n               -- Iterate through the positions\n               for l = 1, code[i][j][k][2] - n + 1 do\n                  local ngram = tostring(code_value[code[i][j][k][1] + l - 1])\n                  for m = 2, n do\n                     ngram = ngram..' 
'..tostring(\n                        code_value[code[i][j][k][1] + l - 1 + m - 1])\n                  end\n                  local ngram_index = dict[ngram]\n                  if ngram_index == nil or ngram_index > limit then\n                     ngram_index = replace\n                  end\n                  if not index[ngram_index] then\n                     count = count + 1\n                     index[ngram_index] = 0\n                     pointer[#pointer + 1] = ngram_index\n                  end\n                  index[ngram_index] = index[ngram_index] + 1\n               end\n            end\n         end\n         table.sort(pointer)\n         bag[i][j][2] = #pointer\n         for m = 1, #pointer do\n            bag_index[bag[i][j][1] + m - 1] = pointer[m]\n            if pointer[m] > limit then\n               bag_value[bag[i][j][1] + m - 1] = 0\n            else\n               bag_value[bag[i][j][1] + m - 1] = index[pointer[m]]\n            end\n         end\n         if #pointer > 0 and\n         bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum() ~= 0 then\n            bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):div(\n               bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum())\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return {bag = bag, bag_index = bag_index, bag_value = bag_value}\nend\n\njoe.bytemarkers = {{0x7FF, 192}, {0xFFFF, 224}, {0x1FFFFF, 240}}\nfunction joe.utf8str(decimal)\n   local bytemarkers = joe.bytemarkers\n   if decimal < 128 then return string.char(decimal) end\n   local charbytes = {}\n   for bytes,vals in ipairs(bytemarkers) do\n      if decimal <= vals[1] then\n        for b = bytes + 1, 2, -1 do\n          local mod = decimal % 64\n          decimal = (decimal - mod) / 64\n          charbytes[b] = string.char(128+mod)\n        end\n        charbytes[1] = string.char(vals[2] + decimal)\n        break\n      end\n    end\n   return 
table.concat(charbytes)\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_chartoken.lua",
    "content": "--[[\nCreate chartoken format for fastText\nCopyright 2017 Xiang Zhang\n\nUsage: th construct_chartoken.lua [input] [output]\n--]]\n\nlocal bit32 = require('bit32')\nlocal io = require('io')\nlocal math = require('math')\nlocal string = require('string')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train.csv'\n   local output = arg[2] or '../data/dianping/train_chartoken.txt'\n\n   print('Construct token')\n   joe.constructToken(input, output)\nend\n\nfunction joe.constructToken(input, output)\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n\n   local n = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      ofd:write('__label__', class)\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', ' '):gsub(\n            '[%z\\001-\\031\\127]', ' '):gsub('^%s*(.-)%s*$', '%1')\n         local sequence = joe.utf8to32(content[i])\n         for j, code in ipairs(sequence) do\n            if code > 32 then\n               ofd:write(' ', joe.utf8str(code))\n            end\n         end\n      end\n\n      ofd:write('\\n')\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos 
= endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\n-- UTF-8 decoding function\n-- Ref: http://lua-users.org/wiki/LuaUnicode\nfunction joe.utf8to32(utf8str)\n   assert(type(utf8str) == 'string')\n   local res, seq, val = {}, 0, nil\n   for i = 1, #utf8str do\n      local c = string.byte(utf8str, i)\n      if seq == 0 then\n         table.insert(res, val)\n         seq = c < 0x80 and 1 or c < 0xE0 and 2 or c < 0xF0 and 3 or\n            c < 0xF8 and 4 or --c < 0xFC and 5 or c < 0xFE and 6 or\n            error('Invalid UTF-8 character sequence')\n         val = bit32.band(c, 2^(8-seq) - 1)\n      else\n         val = bit32.bor(bit32.lshift(val, 6), bit32.band(c, 0x3F))\n      end\n      seq = seq - 1\n   end\n   table.insert(res, val)\n   table.insert(res, 0)\n   return res\nend\n\n-- UTF-8 encoding function\n-- Ref: http://stackoverflow.com/questions/7983574/how-to-write-a-unicode-symbol\n--      -in-lua\nfunction joe.utf8str(decimal)\n   local bytemarkers = {{0x7FF, 192}, {0xFFFF, 224}, {0x1FFFFF, 240}}\n   if decimal < 128 then return string.char(decimal) end\n   local charbytes = {}\n   for bytes,vals in ipairs(bytemarkers) do\n 
     if decimal <= vals[1] then\n         for b = bytes + 1, 2, -1 do\n            local mod = decimal % 64\n            decimal = (decimal - mod) / 64\n            charbytes[b] = string.char(128+mod)\n         end\n         charbytes[1] = string.char(vals[2] + decimal)\n         break\n      end\n   end\n   return table.concat(charbytes)\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_code.lua",
    "content": "--[[\nConstruct unicode serialization format from string serialization format\nCopyright 2015-2016 Xiang Zhang\n\nUsage: th construct_code.lua [input] [output] [limit] [replace]\n--]]\n\nlocal bit32 = require('bit32')\nlocal ffi = require('ffi')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_string.t7b'\n   local output = arg[2] or '../data/dianping/train_code.t7b'\n   local limit = arg[3] and tonumber(arg[3]) or 65536\n   local replace = arg[4] and tonumber(arg[4]) or 33\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Counting UTF-8 code')\n   local count = joe.countCode(data)\n   print('Total number of codes: '..count)\n\n   print('Constructing UTF-8 code data')\n   local code = joe.constructCode(data, count, limit, replace)\n\n   print('Saving to '..output)\n   torch.save(output, code)\nend\n\nfunction joe.countCode(data)\n   local index, content = data.index, data.content\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #index do\n      print('Processing for class '..i)\n      -- Iterate through the samples\n      for j = 1, index[i]:size(1) do\n         if math.fmod(j, 10000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', index[i]:size(1))\n            io.flush()\n         end\n         -- Iterate through the fields\n         for k = 1, index[i][j]:size(1) do\n            local text = ffi.string(\n               torch.data(content:narrow(1, index[i][j][k][1], 1)))\n            local sequence = joe.utf8to32(text)\n            count = count + #sequence\n         end\n      end\n      print('\\rProcessed texts: '..index[i]:size(1)..'/'..index[i]:size(1))\n   end\n\n   return count\nend\n\nfunction joe.constructCode(data, count, limit, replace)\n   local index, content = data.index, data.content\n   local code = {}\n   local code_value = 
torch.LongTensor(count)\n\n   local p = 1\n   -- Iterate through the classes\n   for i = 1, #index do\n      print('Processing for class '..i)\n      code[i] = index[i]:clone():zero()\n      -- Iterate through the samples\n      for j = 1, index[i]:size(1) do\n         if math.fmod(j, 10000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', index[i]:size(1))\n            io.flush()\n         end\n         -- Iterate through the fields\n         for k = 1, index[i][j]:size(1) do\n            local text = ffi.string(\n               torch.data(content:narrow(1, index[i][j][k][1], 1)))\n            local sequence = joe.utf8to32(text)\n            code[i][j][k][1] = p\n            code[i][j][k][2] = #sequence\n            for l = 1, #sequence do\n               code_value[p + l - 1] = sequence[l] + 1\n               if limit and code_value[p + l - 1] > limit then\n                  code_value[p + l - 1] = replace\n               end\n            end\n            p = p + #sequence\n         end\n      end\n      print('\\rProcessed texts: '..index[i]:size(1)..'/'..index[i]:size(1))\n   end\n\n   return {code = code, code_value = code_value}\nend\n\n-- UTF-8 decoding function\n-- Ref: http://lua-users.org/wiki/LuaUnicode\nfunction joe.utf8to32(utf8str)\n   assert(type(utf8str) == 'string')\n   local res, seq, val = {}, 0, nil\n   for i = 1, #utf8str do\n      local c = string.byte(utf8str, i)\n      if seq == 0 then\n         table.insert(res, val)\n         seq = c < 0x80 and 1 or c < 0xE0 and 2 or c < 0xF0 and 3 or\n            c < 0xF8 and 4 or --c < 0xFC and 5 or c < 0xFE and 6 or\n            error('Invalid UTF-8 character sequence')\n         val = bit32.band(c, 2^(8-seq) - 1)\n      else\n         val = bit32.bor(bit32.lshift(val, 6), bit32.band(c, 0x3F))\n      end\n      seq = seq - 1\n   end\n   table.insert(res, val)\n   table.insert(res, 0)\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_pinyin.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Chinese datasets to Pinyin format\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_pinyin.py -i [input] -o [output]\n'''\n\n# Input file\nINPUT = '../data/dianping/train.csv'\n# Output file\nOUTPUT = '../data/dianping/train_pinyin.csv'\n\nimport argparse\nimport csv\nimport pypinyin\nimport unidecode\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    convertPinyin()\n\n# Convert the text in Chinese to pinyin\ndef convertPinyin():\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            new_row.append(' '.join(map(\n                str.strip,\n                map(lambda s: s.replace('\\n', '\\\\n'),\n                    map(unidecode.unidecode,\n                        pypinyin.lazy_pinyin(\n                            row[i], style = pypinyin.TONE2))))))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/dianping/construct_reviews.lua",
    "content": "--[[\nCreate reviews in csv format from original txt file\nCopyright 2015-2016 Xiang Zhang\n\nUsage: th construct_reviews.lua [input] [output]\n--]]\n\nlocal cjson = require('cjson')\nlocal io = require('io')\nlocal math = require('math')\n\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/reviews.txt'\n   local output = arg[2] or '../data/dianping/reviews.csv'\n\n   local ifd = io.open(input)\n   local ofd = io.open(output, \"w\")\n   local n = 0\n   local valid = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n, ', valid: ', valid)\n         io.flush()\n      end\n\n      -- Skip the first line\n      if n > 1 then\n         -- Break content into url and json\n         local point = line:find('%^')\n         local data = line:sub(point + 2):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         -- Parse the data\n         local parsed = cjson.decode(data)\n         local content = parsed.content:gsub(\"^%s*(.-)%s*$\", \"%1\")\n         local rate = tonumber(parsed.rate)\n         -- Record to csv\n         if rate and rate >= 0 and #content > 0 then\n            valid = valid + 1\n            content = content:gsub(\"\\n\", \"\\\\n\"):gsub(\"\\\"\", \"\\\"\\\"\")\n            ofd:write('\"'..rate..'\",\"'..content..'\"\\n')\n         end\n      end\n   end\n   ifd:close()\n   ofd:close()\n   print('\\rProcessed lines: '..n..', valid: '..valid)\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_string.lua",
    "content": "--[[\nCreate string serialization format from csv files\nCopyright 2015-2016 Xiang Zhang\n\nUsage: th construct_string.lua [input] [output]\n--]]\n\nlocal ffi = require('ffi')\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train.csv'\n   local output = arg[2] or '../data/dianping/train_string.t7b'\n\n   print('Counting samples')\n   local count, bytes, fields = joe.countSamples(input)\n   for i, v in ipairs(count) do\n      print('Number of samples in class '..i..': '..v)\n   end\n   print('Total number of bytes: '..bytes)\n   print('Number of text fields: '..fields)\n\n   print('Constructing data')\n   local data = joe.constructData(input, count, bytes, fields)\n   print('Saving to '..output)\n   torch.save(output, data)\nend\n\nfunction joe.countSamples(input)\n   local count = {}\n   local bytes = 0\n   local fields = nil\n   local n = 0\n   local fd = io.open(input)\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      count[class] = count[class] and count[class] + 1 or 1\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         bytes = bytes + content[i]:len() + 1\n      end\n      fields = fields or #content - 1\n      if fields ~= #content - 1 then\n         error('Number of fields is not '..fields..' 
at line '..n)\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return count, bytes, fields\nend\n\nfunction joe.constructData(input, count, bytes, fields)\n   local data = torch.ByteTensor(bytes)\n   local index = {}\n   for i, v in ipairs(count) do\n      index[i] = torch.LongTensor(v, fields, 2)\n   end\n\n   local progress = {}\n   local n = 0\n   local p = 1\n   local fd = io.open(input)\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      progress[class] = progress[class] and progress[class] + 1 or 1\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         index[class][progress[class]][i - 1][1] = p\n         index[class][progress[class]][i - 1][2] = content[i]:len()\n         ffi.copy(torch.data(data:narrow(1, p, content[i]:len() + 1)),\n                  content[i])\n         p = p + content[i]:len() + 1\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return {content = data, index = index}\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without 
separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_tfidf.lua",
    "content": "--[[\nConstruct tfidf format from bag format\nCopyright 2016 Xiang Zhang\n\nUsage: th construct_tfidf.lua [input] [output] [list] [limit]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_charbag.t7b'\n   local output = arg[2] or '../data/dianping/train_charbagtfidf.t7b'\n   local list = arg[3] or '../data/dianping/train_charbag_list.csv'\n   local limit = arg[4] and tonumber(arg[4]) or 200000\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Loading frequency list from '..list)\n   local freq = joe.readList(list)\n   print('Frequency list length '..freq:size(1))\n\n   print('Constructing bag-of-elements TFIDF data')\n   local tfidf = joe.constructTfidf(data, freq, limit)\n\n   print('Saving to '..output)\n   torch.save(output, tfidf)\nend\n\nfunction joe.readList(list)\n   local freq = {}\n   local fd = io.open(list)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[2] = content[2]:gsub('\\\\n', '\\n')\n      freq[#freq + 1] = tonumber(content[3])\n   end\n   return torch.Tensor(freq)\nend\n\nfunction joe.constructTfidf(data, freq, limit)\n   local bag, bag_index, bag_value = data.bag, data.bag_index, data.bag_value\n   local tfidf_value = bag_value:clone()\n\n   local freq = freq\n   if freq:size(1) > limit then\n      freq:narrow(1, limit + 1, freq:size(1) - limit):zero()\n   elseif freq:size(1) < limit + 1 then\n      local new_freq = freq.new(limit + 1):zero()\n      new_freq:narrow(1, 1, freq:size(1)):copy(freq)\n      freq = new_freq\n   end\n\n   freq:apply(function (x) return x > 0 and math.log(1/x) or 0 end)\n   local indexed = freq:index(1, bag_index)\n   tfidf_value:cmul(indexed)\n\n   -- Iterate through the classes\n   for i = 1, #bag do\n      print('Processing for class '..i)\n      -- Iterate 
through the samples\n      for j = 1, bag[i]:size(1) do\n         if math.fmod(j, 10000) == 0 then\n            io.write('\\rProcessing sample: ', j, '/', bag[i]:size(1))\n            io.flush()\n         end\n         if bag[i][j][2] > 0 and\n         tfidf_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum() ~= 0 then\n            tfidf_value:narrow(1, bag[i][j][1], bag[i][j][2]):div(\n               tfidf_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum())\n         end\n      end\n      print('\\rProcessed samples: '..bag[i]:size(1)..'/'..bag[i]:size(1))\n   end\n\n   return {bag = bag, bag_index = bag_index, bag_value = tfidf_value}\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n     
       break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_word.lua",
    "content": "--[[\nCreate word serialization format from csv files\nCopyright 2015-2016 Xiang Zhang\n\nUsage: th construct_word.lua [input] [output]\n--]]\n\nlocal ffi = require('ffi')\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.csv'\n   local output = arg[2] or '../data/dianping/train_word.t7b'\n\n   print('Counting samples')\n   local count, length, fields = joe.countSamples(input)\n   for i, v in ipairs(count) do\n      print('Number of samples in class '..i..': '..v)\n   end\n   print('Total number of words: '..length)\n   print('Number of text fields: '..fields)\n\n   print('Constructing data')\n   local data = joe.constructData(input, count, length, fields)\n   print('Saving to '..output)\n   torch.save(output, data)\nend\n\nfunction joe.countSamples(input)\n   local count = {}\n   local length = 0\n   local fields = nil\n   local n = 0\n   local fd = io.open(input)\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      count[class] = count[class] and count[class] + 1 or 1\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub('^%s*(.-)%s*$', '%1')\n         local _, current_length = content[i]:gsub('(%d+)', '%1')\n         length = length + current_length\n      end\n      fields = fields or #content - 1\n      if fields ~= #content - 1 then\n         error('Number of fields is not '..fields..' 
at line '..n)\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return count, length, fields\nend\n\nfunction joe.constructData(input, count, length, fields)\n   local data = torch.LongTensor(length)\n   local index = {}\n   for i, v in ipairs(count) do\n      index[i] = torch.LongTensor(v, fields, 2)\n   end\n\n   local progress = {}\n   local n = 0\n   local p = 1\n   local fd = io.open(input)\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      progress[class] = progress[class] and progress[class] + 1 or 1\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub('^%s*(.-)%s*$', '%1')\n         index[class][progress[class]][i - 1][1] = p\n         local current_length = 0\n         for word in content[i]:gmatch('%d+') do\n            data[p] = tonumber(word)\n            p = p + 1\n         end\n         index[class][progress[class]][i - 1][2] =\n            p - index[class][progress[class]][i - 1][1]\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return {code = index, code_value = data}\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it 
is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_wordbag.lua",
    "content": "--[[\nConstruct word bag-of-element format\nCopyright 2016 Xiang Zhang\n\nUsage: th construct_wordbag.lua [input] [output] [limit] [replace]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.t7b'\n   local output = arg[2] or '../data/dianping/train_wordbag.t7b'\n   local limit = arg[3] and tonumber(arg[3]) or 200000\n   local replace = arg[4] and tonumber(arg[4]) or 200001\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Counting words')\n   local count = joe.countBag(data, limit, replace)\n   print('Total number of values: '..count)\n\n   print('Constructing word bag data')\n   local bag = joe.constructBag(data, count, limit, replace)\n\n   print('Saving to '..output)\n   torch.save(output, bag)\nend\n\nfunction joe.countBag(data, limit, replace)\n   local code, code_value = data.code, data.code_value\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n         end\n         local index = {}\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            for l = 1, code[i][j][k][2] do\n               local word = code_value[code[i][j][k][1] + l - 1]\n               if word > limit then\n                  word = replace\n               end\n               if not index[word] then\n                  count = count + 1\n                  index[word] = 1\n               else\n                  index[word] = index[word] + 1\n               end\n            end\n         end\n      end\n      print('\\rProcessed texts: 
'..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return count\nend\n\nfunction joe.constructBag(data, count, limit, replace)\n   local code, code_value = data.code, data.code_value\n   local bag = {}\n   local bag_index = torch.LongTensor(count)\n   local bag_value = torch.DoubleTensor(count)\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      bag[i] = torch.LongTensor(code[i]:size(1), 2)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n         end\n         local index = {}\n         local pointer = {}\n         bag[i][j][1] = count + 1\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            for l = 1, code[i][j][k][2] do\n               local word = code_value[code[i][j][k][1] + l - 1]\n               if word > limit then\n                  word = replace\n               end\n               if not index[word] then\n                  count = count + 1\n                  index[word] = 1\n                  pointer[#pointer + 1] = word\n               else\n                  index[word] = index[word] + 1\n               end\n            end\n         end\n         table.sort(pointer)\n         bag[i][j][2] = #pointer\n         for m = 1, #pointer do\n            bag_index[bag[i][j][1] + m - 1] = pointer[m]\n            if pointer[m] > limit then\n               bag_value[bag[i][j][1] + m - 1] = 0\n            else\n               bag_value[bag[i][j][1] + m - 1] = index[pointer[m]]\n            end\n         end\n         if #pointer > 0 and\n         bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum() ~= 0 then\n            bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):div(\n               bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum())\n         end\n      end\n      
print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return {bag = bag, bag_index = bag_index, bag_value = bag_value}\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_wordgram.lua",
    "content": "--[[\nConstructngrams format from serialization\nCopyright 2016 Xiang Zhang\n\nUsage: th construct_wordgram.lua [input] [output] [list] [gram] [limit]\n   [replace]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.t7b'\n   local output = arg[2] or '../data/dianping/train_wordgram.t7b'\n   local list = arg[3] or '../data/dianping/train_wordgram_list.csv'\n   local gram = arg[4] and tonumber(arg[4]) or 5\n   local limit = arg[5] and tonumber(arg[5]) or 1000000\n   local replace = arg[6] and tonumber(arg[6]) or 1000001\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Reading frequency from '..list)\n   local freq, dict = joe.readList(list)\n\n   print('Counting character ngrams data')\n   local count = joe.countBag(data, dict, gram, limit, replace)\n   print('Total number of ngrams in data is '..count)\n\n   print('Constructing character bag data')\n   local bag = joe.constructBag(data, dict, count, gram, limit, replace)\n\n   print('Saving to '..output)\n   torch.save(output, bag)\nend\n\nfunction joe.readList(list)\n   local freq_table = tds.Vec()\n   local dict = tds.Hash()\n   local fd = io.open(list)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[2] = content[2]:gsub('\\\\n', '\\n')\n      freq_table[#freq_table + 1] = tonumber(content[3])\n      dict[content[1]] = #freq_table\n   end\n\n   local freq = torch.Tensor(#freq_table)\n   for i, v in ipairs(freq_table) do\n      freq[i] = v\n   end\n   return freq, dict\nend\n\nfunction joe.countBag(data, dict, gram, limit, replace)\n   local count = 0\n   local code, code_value = data.code, data.code_value\n\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      -- Iterate 
through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n            collectgarbage()\n         end\n         local index = {}\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            -- Iterate through the grams\n            for n = 1, gram do\n               -- Iterate through the positions\n               for l = 1, code[i][j][k][2] - n + 1 do\n                  local ngram = tostring(code_value[code[i][j][k][1] + l - 1])\n                  for m = 2, n do\n                     ngram = ngram..' '..tostring(\n                        code_value[code[i][j][k][1] + l - 1 + m - 1])\n                  end\n                  local ngram_index = dict[ngram]\n                  if ngram_index == nil or ngram_index > limit then\n                     ngram_index = replace\n                  end\n                  if not index[ngram_index] then\n                     index[ngram_index] = 0\n                     count = count + 1\n                  end\n                  index[ngram_index] = index[ngram_index] + 1\n               end\n            end\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return count\nend\n\nfunction joe.constructBag(data, dict, count, gram, limit, replace)\n   local code, code_value = data.code, data.code_value\n   local bag = {}\n   local bag_index = torch.LongTensor(count)\n   local bag_value = torch.DoubleTensor(count)\n\n   local count = 0\n   -- Iterate through the classes\n   for i = 1, #code do\n      print('Processing for class '..i)\n      bag[i] = torch.LongTensor(code[i]:size(1), 2)\n      -- Iterate through the samples\n      for j = 1, code[i]:size(1) do\n         if math.fmod(j, 1000) == 0 then\n            io.write('\\rProcessing text: ', j, '/', code[i]:size(1))\n            io.flush()\n        
    collectgarbage()\n         end\n         local index = {}\n         local pointer = {}\n         bag[i][j][1] = count + 1\n         -- Iterate through the fields\n         for k = 1, code[i][j]:size(1) do\n            -- Iterate through the grams\n            for n = 1, gram do\n               -- Iterate through the positions\n               for l = 1, code[i][j][k][2] - n + 1 do\n                  local ngram = tostring(code_value[code[i][j][k][1] + l - 1])\n                  for m = 2, n do\n                     ngram = ngram..' '..tostring(\n                        code_value[code[i][j][k][1] + l - 1 + m - 1])\n                  end\n                  local ngram_index = dict[ngram]\n                  if ngram_index == nil or ngram_index > limit then\n                     ngram_index = replace\n                  end\n                  if not index[ngram_index] then\n                     count = count + 1\n                     index[ngram_index] = 0\n                     pointer[#pointer + 1] = ngram_index\n                  end\n                  index[ngram_index] = index[ngram_index] + 1\n               end\n            end\n         end\n         table.sort(pointer)\n         bag[i][j][2] = #pointer\n         for m = 1, #pointer do\n            bag_index[bag[i][j][1] + m - 1] = pointer[m]\n            if pointer[m] > limit then\n               bag_value[bag[i][j][1] + m - 1] = 0\n            else\n               bag_value[bag[i][j][1] + m - 1] = index[pointer[m]]\n            end\n         end\n         if #pointer > 0 and\n         bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum() ~= 0 then\n            bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):div(\n               bag_value:narrow(1, bag[i][j][1], bag[i][j][2]):sum())\n         end\n      end\n      print('\\rProcessed texts: '..code[i]:size(1)..'/'..code[i]:size(1))\n   end\n\n   return {bag = bag, bag_index = bag_index, bag_value = bag_value}\nend\n\njoe.bytemarkers = {{0x7FF, 192}, {0xFFFF, 
224}, {0x1FFFFF, 240}}\nfunction joe.utf8str(decimal)\n   local bytemarkers = joe.bytemarkers\n   if decimal < 128 then return string.char(decimal) end\n   local charbytes = {}\n   for bytes,vals in ipairs(bytemarkers) do\n      if decimal <= vals[1] then\n        for b = bytes + 1, 2, -1 do\n          local mod = decimal % 64\n          decimal = (decimal - mod) / 64\n          charbytes[b] = string.char(128+mod)\n        end\n        charbytes[1] = string.char(vals[2] + decimal)\n        break\n      end\n    end\n   return table.concat(charbytes)\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return 
res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/construct_wordtoken.lua",
    "content": "--[[\nConstruct word token format from csv files\nCopyright 2017 Xiang Zhang\n\nUsage: th construct_wordtoken [input] [list] [output]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.csv'\n   local list = arg[2] or '../data/dianping/train_word_list.csv'\n   local output = arg[3] or '../data/dianping/train_wordtoken.txt'\n\n   print('Reading list from '..list)\n   local word_list = joe.readList(list)\n\n   print('Constructing word token')\n   joe.constructToken(input, output, word_list)\nend\n\nfunction joe.readList(list)\n   local word_list = tds.Vec()\n   local fd = io.open(list)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      word_list[#word_list + 1] =\n         content[1]:gsub('\\\\n', '\\n'):gsub('[%z\\001-\\032\\127]', ' '):gsub(\n            '^%s*(.-)%s*$', '%1')\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return word_list\nend\n\nfunction joe.constructToken(input, output, word_list)\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n\n   local n = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      ofd:write('__label__', class)\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub('^%s*(.-)%s*$', '%1')\n         for word in content[i]:gmatch('%d+') do\n            local word_string = word_list[tonumber(word)] or '<unk>'\n            ofd:write(' ', word_string)\n         end\n      
end\n      ofd:write('\\n')\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n\n"
  },
  {
    "path": "data/dianping/convert_string_code.lua",
    "content": "--[[\nConvert string serialization to code\nCopyright 2016 Xiang Zhang\n\nUsage: th convert_string_code.lua [input] [output]\n--]]\n\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_string.t7b'\n   local output = arg[2] or '../data/dianping/train_string_code.t7b'\n\n   print('Reading from '..input)\n   local input_data = torch.load(input)\n   print('Converting to code format')\n   local output_data = joe.convert(input_data)\n   print('Saving to '..output)\n   torch.save(output, output_data)\nend\n\nfunction joe.convert(input_data)\n   local output_data = {}\n   output_data.code = input_data.index\n   output_data.code_value = input_data.content\n   return output_data\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/count_chargram.lua",
    "content": "--[[\nParallelized chargram counting program\nCopyright Xiang Zhang 2016\n\nUsage: th count_chargram.lua [input] [output_prefix] [grams] [chunks] [threads]\n   [batch] [buffer]\n\nComment: This program is a map-reduce like process. During map, each sample is\n   separated into character-ngrams. During reduce, these character-ngrams are\n   aggregated per-batch samples and output to file chunks. Which files chunk to\n   put the gram is determined by a hash value of the gram string, therefore\n   instances of the same gram always end up in the same file. This program is\n   necessary because a linear aggregation program can easily overflow memory for\n   several millions of samples.\n--]]\n\nlocal hash = require('hash')\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal threads = require('threads')\nlocal torch = require('torch')\n\nlocal Queue = require('queue')\n\n-- Library configurations\nthreads.serialization('threads.sharedserialize')\n\n-- A Logic Named Joe\nlocal joe = {}\n\n-- Constant values\njoe.SEED = 0\n\n-- Main program entry\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_code.t7b'\n   local output_prefix = arg[2] or '../data/dianping/train_chargram_count/'\n   local num_grams = arg[3] and tonumber(arg[3]) or 5\n   local chunks = arg[4] and tonumber(arg[4]) or 100\n   local num_threads = arg[5] and tonumber(arg[5]) or 10\n   local batch = arg[6] and tonumber(arg[6]) or 100000\n   local buffer = arg[7] and tonumber(arg[7]) or 1000\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n   print('Opening output files with prefix '..output_prefix)\n   local fds = {}\n   for i = 1, chunks do\n      fds[i] = io.open(output_prefix..tostring(i)..'.csv', 'w')\n   end\n   joe.fds = fds\n   print('Setting finished threads to 0')\n   joe.finished = 0\n   print('Creating record')\n   joe.record = tds.Hash()\n   print('Setting item counter to 0')\n   joe.count = 
0\n   print('Storing options')\n   joe.batch = batch\n\n   print('Creating queues')\n   local queue = Queue(buffer)\n   print('Creating mutex')\n   local mutex = threads.Mutex()\n   print('Creating '..num_threads..' threads')\n   local init_thread = joe.initThread()\n   local block = threads.Threads(num_threads, init_thread)\n   block:specific(true)\n   print('Deploying thread jobs')\n   joe.deployThreads(data, num_grams, queue, mutex, block, num_threads)\n\n   print('Entering main thread loop')\n   while joe.finished < num_threads do\n      local rpc = queue:pop()\n      joe[rpc.func](unpack(rpc.arg))\n   end\n   if math.fmod(joe.count, batch) ~= 0 then\n      print('Writing records to files at '..joe.count)\n      joe.writeRecord()\n   end\n\n   print('Destroying mutex')\n   mutex:free()\n   print('Closing files')\n   for _, fd in ipairs(fds) do\n      fd:close()\n   end\n\n   print('Synchronizing and terminating the threads')\n   block:synchronize()\n   block:terminate()\nend\n\n-- Thread initialization callback\nfunction joe.initThread()\n   return function ()\n      local torch = require('torch')\n      local Queue = require('queue')\n   end\nend\n\n-- Thread job deploying threads\nfunction joe.deployThreads(data, num_grams, queue, mutex, block, num_threads)\n   local progress = torch.LongTensor(2)\n   progress[1] = 1\n   progress[2] = 0\n   for i = 1, num_threads do\n      print('Deploying job for thread '..i)\n      local thread_job = joe.threadJob(\n         data, num_grams, queue, mutex:id(), progress, i)\n      block:addjob(i, thread_job)\n      local rpc = queue:pop()\n      while rpc.func ~= 'notifyDeploy' do\n         joe[rpc.func](unpack(rpc.arg))\n         rpc = queue:pop()\n      end\n      print('rpc = notifyDeploy, thread = '..rpc.arg[1])\n   end\nend\n\n-- Write records to file\nfunction joe.writeRecord()\n   for code, item in pairs(joe.record) do\n      local chunk = hash.hash(code, joe.SEED, #joe.fds) + 1\n      joe.fds[chunk]:write(\n         
'\"', code, '\",\"', item[1]:gsub('\\n', '\\\\n'):gsub('\"', '\"\"'), '\",\"',\n         item[2], '\",\"', item[3], '\"\\n')\n   end\n   joe.record = tds.Hash()\n   collectgarbage()\nend\n\n-- Thread job\nfunction joe.threadJob(data, num_grams, queue, mutex_id, progress, thread_id)\n   local utf8str = joe.utf8str()\n   return function()\n      local math = require('math')\n      local string = require('string')\n      local threads = require('threads')\n      local mutex = threads.Mutex(mutex_id)\n\n      -- Notify the deployment\n      queue:push{func = 'notifyDeploy', arg = {__threadid}}\n\n      local code, code_value = data.code, data.code_value\n      local class, item\n\n      -- Obtain next sample\n      local function nextSample()\n         mutex:lock()\n         if code[progress[1]] == nil then\n            class = progress[1]\n            item = progress[2]\n         elseif code[progress[1]]:size(1) < progress[2] + 1 then\n            progress[1] = progress[1] + 1\n            progress[2] = 1\n            class = progress[1]\n            item = progress[2]\n         else\n            progress[2] = progress[2] + 1\n            class = progress[1]\n            item = progress[2]\n         end\n         mutex:unlock()\n      end\n\n      local n = 0\n      nextSample()\n      while code[class] ~= nil do\n         n = n + 1\n         if math.fmod(n, 100) == 0 then\n            queue:push{\n               func = 'print',\n               arg = {__threadid,\n                      'Processing class '..class..', item '..item..\n                         ', total '..n}}\n            collectgarbage()\n         end\n         local term_count, doc_count = {}, {}\n         -- Iterate through the fields\n         for i = 1, code[class][item]:size(1) do\n            -- Iterate through the grams\n            for j = 1, num_grams do\n               -- Iterate through the positions\n               for k = 1, code[class][item][i][2] - j + 1 do\n                  local 
code_string = tostring(\n                     code_value[code[class][item][i][1] + k - 1])\n                  for l = 2, j do\n                     code_string = code_string..' '..tostring(\n                        code_value[code[class][item][i][1] + k - 1 + l - 1])\n                  end\n                  if not term_count[code_string] then\n                     term_count[code_string] = 1\n                     doc_count[code_string] = 1\n                  else\n                     term_count[code_string] = term_count[code_string] + 1\n                  end\n               end\n            end\n         end\n         -- Compress record to data\n         local items = {}\n         for code_string, _ in pairs(term_count) do\n            local gram_string = ''\n            for value in code_string:gmatch('[%S]+') do\n               local value = tonumber(value)\n               gram_string = gram_string..\n                  ((value <= 65536 and (value > 32 or value == 11)) and\n                      utf8str(value - 1) or ' ')\n            end\n            items[#items + 1] = {\n               code_string, gram_string, term_count[code_string],\n               doc_count[code_string]}\n         end\n         -- Send data to record\n         queue:push{func = 'recordItem', arg = {__threadid, items}}\n         nextSample()\n      end\n\n      -- Notify main thread that this thread has ended\n      queue:push{func = 'notifyExit', arg = {__threadid}}\n   end\nend\n\n-- Record item\nfunction joe.recordItem(thread_id, items)\n   for _, item in pairs(items) do\n      if joe.record[item[1]] then\n         joe.record[item[1]][2] = joe.record[item[1]][2] + item[3]\n         joe.record[item[1]][3] = joe.record[item[1]][3] + item[4]\n      else\n         joe.record[item[1]] = tds.Vec{item[2], item[3], item[4]}\n      end\n   end\n   joe.count = joe.count + 1\n\n   -- Check write\n   if math.fmod(joe.count, joe.batch) == 0 then\n      print('Writing records to files at 
'..joe.count)\n      joe.writeRecord()\n   end\nend\n\n\n\n-- Print information\nfunction joe.print(thread_id, message)\n   print('rpc = print, thread = '..thread_id..', message = '..message)\nend\n\n-- Notify exit\nfunction joe.notifyExit(thread_id)\n   joe.finished = joe.finished + 1\n   print('rpc = notifyExit, thread = '..thread_id..\n            ', finished = '..joe.finished)\nend\n\n-- UTF-8 encoding function\n-- Ref: http://stackoverflow.com/questions/7983574/how-to-write-a-unicode-symbol\n--      -in-lua\nfunction joe.utf8str()\n   local bytemarkers = {{0x7FF, 192}, {0xFFFF, 224}, {0x1FFFFF, 240}}\n   return function (decimal)\n      local string = require('string')\n      if decimal < 128 then return string.char(decimal) end\n      local charbytes = {}\n      for bytes,vals in ipairs(bytemarkers) do\n         if decimal <= vals[1] then\n            for b = bytes + 1, 2, -1 do\n               local mod = decimal % 64\n               decimal = (decimal - mod) / 64\n               charbytes[b] = string.char(128+mod)\n            end\n            charbytes[1] = string.char(vals[2] + decimal)\n            break\n         end\n      end\n      return table.concat(charbytes)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/count_wordgram.lua",
    "content": "--[[\nParallelized wordgram counting program\nCopyright Xiang Zhang 2016\n\nUsage: th count_wordgram.lua [input] [output_prefix] [list] [grams] [chunks]\n   [threads] [batch] [buffer]\n\nComment: This program is a map-reduce-like process. During map, each sample is\n   separated into word ngrams. During reduce, these word ngrams are\n   aggregated over batches of samples and output to file chunks. The file chunk\n   a gram goes to is determined by a hash value of the gram string, therefore\n   instances of the same gram always end up in the same file. This program is\n   necessary because a linear aggregation program can easily overflow memory for\n   several million samples.\n--]]\n\nlocal hash = require('hash')\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal threads = require('threads')\nlocal torch = require('torch')\n\nlocal Queue = require('queue')\n\n-- Library configurations\nthreads.serialization('threads.sharedserialize')\n\n-- A Logic Named Joe\nlocal joe = {}\n\n-- Constant values\njoe.SEED = 0\n\n-- Main program entry\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.t7b'\n   local output_prefix = arg[2] or '../data/dianping/train_wordgram_count/'\n   local list = arg[3] or '../data/dianping/train_word_list.csv'\n   local num_grams = arg[4] and tonumber(arg[4]) or 5\n   local chunks = arg[5] and tonumber(arg[5]) or 100\n   local num_threads = arg[6] and tonumber(arg[6]) or 10\n   local batch = arg[7] and tonumber(arg[7]) or 100000\n   local buffer = arg[8] and tonumber(arg[8]) or 1000\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n   print('Loading list from '..list)\n   local freq, word_list = joe.readList(list)\n   print('Opening output files with prefix '..output_prefix)\n   local fds = {}\n   for i = 1, chunks do\n      fds[i] = io.open(output_prefix..tostring(i)..'.csv', 'w')\n   end\n   joe.fds = fds\n   print('Setting 
finished threads to 0')\n   joe.finished = 0\n   print('Creating record')\n   joe.record = tds.Hash()\n   print('Setting item counter to 0')\n   joe.count = 0\n   print('Storing options')\n   joe.batch = batch\n\n   print('Creating queues')\n   local queue = Queue(buffer)\n   print('Creating mutex')\n   local mutex = threads.Mutex()\n   print('Creating '..num_threads..' threads')\n   local init_thread = joe.initThread()\n   local block = threads.Threads(num_threads, init_thread)\n   block:specific(true)\n   print('Deploying thread jobs')\n   joe.deployThreads(\n      data, word_list, num_grams, queue, mutex, block, num_threads)\n\n   print('Entering main thread loop')\n   while joe.finished < num_threads do\n      local rpc = queue:pop()\n      joe[rpc.func](unpack(rpc.arg))\n   end\n   if math.fmod(joe.count, batch) ~= 0 then\n      print('Writing records to files at '..joe.count)\n      joe.writeRecord()\n   end\n\n   print('Destroying mutex')\n   mutex:free()\n   print('Closing files')\n   for _, fd in ipairs(fds) do\n      fd:close()\n   end\n\n   print('Synchronizing and terminating the threads')\n   block:synchronize()\n   block:terminate()\nend\n\n-- Thread initialization callback\nfunction joe.initThread()\n   return function ()\n      local torch = require('torch')\n      local Queue = require('queue')\n   end\nend\n\n-- Thread job deploying threads\nfunction joe.deployThreads(\n      data, word_list, num_grams, queue, mutex, block, num_threads)\n   local progress = torch.LongTensor(2)\n   progress[1] = 1\n   progress[2] = 0\n   for i = 1, num_threads do\n      print('Deploying job for thread '..i)\n      local thread_job = joe.threadJob(\n         data, word_list, num_grams, queue, mutex:id(), progress, i)\n      block:addjob(i, thread_job)\n      local rpc = queue:pop()\n      while rpc.func ~= 'notifyDeploy' do\n         joe[rpc.func](unpack(rpc.arg))\n         rpc = queue:pop()\n      end\n      print('rpc = notifyDeploy, thread = '..rpc.arg[1])\n   
end\nend\n\n-- Write records to file\nfunction joe.writeRecord()\n   for code, item in pairs(joe.record) do\n      local chunk = hash.hash(code, joe.SEED, #joe.fds) + 1\n      joe.fds[chunk]:write(\n         '\"', code, '\",\"', item[1]:gsub('\\n', '\\\\n'):gsub('\"', '\"\"'), '\",\"',\n         item[2], '\",\"', item[3], '\"\\n')\n   end\n   joe.record = tds.Hash()\n   collectgarbage()\nend\n\n-- Thread job\nfunction joe.threadJob(\n      data, word_list, num_grams, queue, mutex_id, progress, thread_id)\n   local utf8str = joe.utf8str()\n   return function()\n      local math = require('math')\n      local string = require('string')\n      local threads = require('threads')\n      local mutex = threads.Mutex(mutex_id)\n\n      -- Notify the deployment\n      queue:push{func = 'notifyDeploy', arg = {__threadid}}\n\n      local code, code_value = data.code, data.code_value\n      local class, item\n\n      -- Obtain next sample\n      local function nextSample()\n         mutex:lock()\n         if code[progress[1]] == nil then\n            class = progress[1]\n            item = progress[2]\n         elseif code[progress[1]]:size(1) < progress[2] + 1 then\n            progress[1] = progress[1] + 1\n            progress[2] = 1\n            class = progress[1]\n            item = progress[2]\n         else\n            progress[2] = progress[2] + 1\n            class = progress[1]\n            item = progress[2]\n         end\n         mutex:unlock()\n      end\n\n      local n = 0\n      nextSample()\n      while code[class] ~= nil do\n         n = n + 1\n         if math.fmod(n, 100) == 0 then\n            queue:push{\n               func = 'print',\n               arg = {__threadid,\n                      'Processing class '..class..', item '..item..\n                         ', total '..n}}\n            collectgarbage()\n         end\n         local term_count, doc_count = {}, {}\n         -- Iterate through the fields\n         for i = 1, 
code[class][item]:size(1) do\n            -- Iterate through the grams\n            for j = 1, num_grams do\n               -- Iterate through the positions\n               for k = 1, code[class][item][i][2] - j + 1 do\n                  local code_string = tostring(\n                     code_value[code[class][item][i][1] + k - 1])\n                  for l = 2, j do\n                     code_string = code_string..' '..tostring(\n                        code_value[code[class][item][i][1] + k - 1 + l - 1])\n                  end\n                  if not term_count[code_string] then\n                     term_count[code_string] = 1\n                     doc_count[code_string] = 1\n                  else\n                     term_count[code_string] = term_count[code_string] + 1\n                  end\n               end\n            end\n         end\n         -- Compress record to data\n         local items = {}\n         for code_string, _ in pairs(term_count) do\n            local gram_string = ''\n            for value in code_string:gmatch('[%S]+') do\n               local value = tonumber(value)\n               gram_string = gram_string..' 
'..(word_list[value] or '')\n            end\n            items[#items + 1] = {\n               code_string, gram_string, term_count[code_string],\n               doc_count[code_string]}\n         end\n         -- Send data to record\n         queue:push{func = 'recordItem', arg = {__threadid, items}}\n         nextSample()\n      end\n\n      -- Notify main thread that this thread has ended\n      queue:push{func = 'notifyExit', arg = {__threadid}}\n   end\nend\n\n-- Record item\nfunction joe.recordItem(thread_id, items)\n   for _, item in pairs(items) do\n      if joe.record[item[1]] then\n         joe.record[item[1]][2] = joe.record[item[1]][2] + item[3]\n         joe.record[item[1]][3] = joe.record[item[1]][3] + item[4]\n      else\n         joe.record[item[1]] = tds.Vec{item[2], item[3], item[4]}\n      end\n   end\n   joe.count = joe.count + 1\n\n   -- Check write\n   if math.fmod(joe.count, joe.batch) == 0 then\n      print('Writing records to files at '..joe.count)\n      joe.writeRecord()\n   end\nend\n\n\n\n-- Print information\nfunction joe.print(thread_id, message)\n   print('rpc = print, thread = '..thread_id..', message = '..message)\nend\n\n-- Notify exit\nfunction joe.notifyExit(thread_id)\n   joe.finished = joe.finished + 1\n   print('rpc = notifyExit, thread = '..thread_id..\n            ', finished = '..joe.finished)\nend\n\n-- UTF-8 encoding function\n-- Ref: http://stackoverflow.com/questions/7983574/how-to-write-a-unicode-symbol\n--      -in-lua\nfunction joe.utf8str()\n   local bytemarkers = {{0x7FF, 192}, {0xFFFF, 224}, {0x1FFFFF, 240}}\n   return function (decimal)\n      local string = require('string')\n      if decimal < 128 then return string.char(decimal) end\n      local charbytes = {}\n      for bytes,vals in ipairs(bytemarkers) do\n         if decimal <= vals[1] then\n            for b = bytes + 1, 2, -1 do\n               local mod = decimal % 64\n               decimal = (decimal - mod) / 64\n               charbytes[b] = 
string.char(128+mod)\n            end\n            charbytes[1] = string.char(vals[2] + decimal)\n            break\n         end\n      end\n      return table.concat(charbytes)\n   end\nend\n\nfunction joe.readList(list)\n   local freq = {}\n   local word_list = tds.Hash()\n   local fd = io.open(list)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[2] = content[2]:gsub('\\\\n', '\\n')\n      freq[#freq + 1] = tonumber(content[3])\n      word_list[#freq] = content[1]:gsub('\\\\n', '\\n')\n   end\n   return torch.Tensor(freq), word_list\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   
return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/limit_code.lua",
    "content": "--[[\nLimit the maximum code value\nCopyright 2016 Xiang Zhang\n\nUsage: th limit_code.lua [input] [output] [limit]\n--]]\n\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_word.t7b'\n   local output = arg[2] or '../data/dianping/train_word_limit.t7b'\n   local limit = arg[3] and tonumber(arg[3]) or 200000\n\n   print('Loading data from '..input)\n   local data = torch.load(input)\n\n   print('Limiting code to '..limit)\n   local code = joe.limitCode(data, limit)\n\n   print('Saving to '..output)\n   torch.save(output, code)\nend\n\nfunction joe.limitCode(data, limit)\n   local code, code_value = data.code, data.code_value\n   local preserve = code_value:le(limit):long()\n   local replace = code_value:gt(limit):long()\n   code_value:cmul(preserve):add(replace:mul(limit + 1))\n   return {code = code, code_value = code_value}\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/limit_csvlines.sh",
    "content": "#!/bin/bash\n\n# Limit csv files to a designated number of lines\n# Copyright 2015 Xiang Zhang\n#\n# Usage: bash limit_csvlines.sh [input] [output] [limit]\n\nset -x;\nset -e;\n\nhead -n \"${3:-1000001}\" \"$1\" > \"$2\";\n"
  },
  {
    "path": "data/dianping/queue.lua",
    "content": "--[[\nMultithreaded queue based on tds\nCopyright 2015 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal ffi = require('ffi')\nlocal serialize = require('threads.sharedserialize')\nlocal tds = require('tds')\nlocal threads = require('threads')\nlocal torch = require('torch')\n\n-- Append an underscore to distinguish between metatable and class name\nlocal Queue_ = torch.class('Queue')\n\n-- Constructor\n-- n: buffer size\nfunction Queue_:__init(size)\n   self.data = tds.hash()\n   self.pointer = torch.LongTensor(3):fill(1)\n   self.pointer[3] = 0\n   self.size = size or 10\n   self.mutex = threads.Mutex()\n   self.added_condition = threads.Condition()\n   self.removed_condition = threads.Condition()\nend\n\nfunction Queue_:push(item)\n   local storage = serialize.save(item)\n   self.mutex:lock()\n   while self.pointer[3] == self.size do\n      self.removed_condition:wait(self.mutex)\n   end\n   self.data[self.pointer[1]] = storage:string()\n   self.pointer[1] = math.fmod(self.pointer[1], self.size) + 1\n   self.pointer[3] = self.pointer[3] + 1\n   self.mutex:unlock()\n   self.added_condition:signal()\nend\n\nfunction Queue_:pop()\n   self.mutex:lock()\n   while self.pointer[3] == 0 do\n      self.added_condition:wait(self.mutex)\n   end\n   local storage = torch.CharStorage():string(self.data[self.pointer[2]])\n   self.pointer[2] = math.fmod(self.pointer[2], self.size) + 1\n   self.pointer[3] = self.pointer[3] - 1\n   self.mutex:unlock()\n   self.removed_condition:signal()\n   local item = serialize.load(storage)\n   return item\nend\n\nfunction Queue_:push_async(item)\n   if self.pointer[3] == self.size then\n      return\n   end\n   local storage = serialize.save(item)\n   self.mutex:lock()\n   if self.pointer[3] == self.size then\n      self.mutex:unlock()\n      return\n   end\n   self.data[self.pointer[1]] = storage:string()\n   self.pointer[1] = math.fmod(self.pointer[1], self.size) + 1\n   self.pointer[3] = self.pointer[3] + 1\n   
self.mutex:unlock()\n   self.added_condition:signal()\n   return item\nend\n\nfunction Queue_:pop_async()\n   if self.pointer[3] == 0 then\n      return\n   end\n   self.mutex:lock()\n   if self.pointer[3] == 0 then\n      self.mutex:unlock()\n      return\n   end\n   local storage = torch.CharStorage():string(self.data[self.pointer[2]])\n   self.pointer[2] = math.fmod(self.pointer[2], self.size) + 1\n   self.pointer[3] = self.pointer[3] - 1\n   self.mutex:unlock()\n   self.removed_condition:signal()\n   local item = serialize.load(storage)\n   return item\nend\n\nfunction Queue_:free()\n   self.mutex:free()\n   self.added_condition:free()\n   self.removed_condition:free()\nend\n\nfunction Queue_:__write(f)\n   local data = self.data\n   f:writeLong(torch.pointer(data))\n   tds.C.tds_hash_retain(data)\n\n   local pointer = self.pointer\n   f:writeLong(torch.pointer(pointer))\n   pointer:retain()\n\n   f:writeObject(self.size)\n   f:writeObject(self.mutex:id())\n   f:writeObject(self.added_condition:id())\n   f:writeObject(self.removed_condition:id())\nend\n\nfunction Queue_:__read(f)\n   local data = f:readLong()\n   data = ffi.cast('tds_hash&', data)\n   ffi.gc(data, tds.C.tds_hash_free)\n   self.data = data\n\n   local pointer = f:readLong()\n   pointer = torch.pushudata(pointer, 'torch.LongTensor')\n   self.pointer = pointer\n   \n   self.size = f:readObject()\n   self.mutex = threads.Mutex(f:readObject())\n   self.added_condition = threads.Condition(f:readObject())\n   self.removed_condition = threads.Condition(f:readObject())\nend\n\n-- Return class name, not the underscored metatable\nreturn Queue\n"
  },
  {
    "path": "data/dianping/remove_duplication.py",
    "content": "#!/usr/bin/python3\n\n'''\nRemove duplication from csv format file\nCopyright 2015 Xiang Zhang\n\nUsage: python3 remove_duplication.py -i [input] -o [output]\n'''\n\n# Python 3 compatibility\nfrom __future__ import absolute_import\nfrom __future__ import division\nfrom __future__ import print_function\nfrom __future__ import unicode_literals\n\n# Input file\nINPUT = '../data/dianping/reviews_nonull.csv'\n# Output file\nOUTPUT = '../data/dianping/reviews_nodup.csv'\n\nimport argparse\nimport csv\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    removeDuplicate()\n\n# Deduplicate the text using python set\ndef removeDuplicate():\n    # Open the files\n    ifd = open(INPUT, newline = '', encoding = 'utf-8')\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    valid = 0\n    s = set()\n    for row in reader:\n        line = ' '.join(row[1:])\n        n = n + 1\n        if line not in s:\n            valid = valid + 1\n            s.add(line)\n            writer.writerow(row)\n        if n % 10000 == 0:\n            print('\\rProcessing line: {}, valid: {}'.format(n, valid), end = '')\n    print('\\rProcessed lines: {}, valid: {}'.format(n, valid))\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/dianping/remove_null.sh",
    "content": "#!/bin/bash\n\n# Remove NULL characters from a file\n# Copyright 2015 Xiang Zhang\n#\n# Usage: bash remove_null.sh [input] [output]\n\nset -x;\nset -e;\n\ntr -d '\\000' < \"$1\" > \"$2\";\n"
  },
  {
    "path": "data/dianping/segment_roman_word.lua",
    "content": "--[[\nCreate romanized word data from romanized data in csv\nCopyright 2016 Xiang Zhang\n\nUsage: th segment_roman_word.lua [input] [output] [list] [read]\n--]]\n\nlocal ffi = require('ffi')\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/dianping/train_pinyin.csv'\n   local output = arg[2] or '../data/dianping/train_pinyin_word.csv'\n   local list = arg[3] or '../data/dianping/train_pinyin_word_list.csv'\n   local read = (arg[4] == 'true')\n\n   local word_index, word_total\n   if read then\n      print('Reading word index')\n      word_index, word_total = joe.readWords(list)\n   else\n      print('Counting words')\n      local word_count, word_freq = joe.splitWords(input)\n      print('Sorting words by count')\n      word_index, word_total = joe.sortWords(list, word_count, word_freq)\n   end\n\n   print('Constructing word index output')\n   joe.constructWords(input, output, word_index, word_total)\nend\n\nfunction joe.readWords(list)\n   local word_index = tds.Hash()\n   local fd = io.open(list)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: '..n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      content[1] = content[1]:gsub('\\\\n', '\\n')\n      word_index[content[1]] = n\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n   return word_index, n\nend\n\nfunction joe.splitWords(input)\n   local word_count, word_freq = tds.Hash(), tds.Hash()\n   local fd = io.open(input)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local field_set = {}\n      for i = 2, 
#content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         content[i] = content[i]:gsub('(%p)', ' %1 ')\n         for word in content[i]:gmatch('[%S]+') do\n            word_count[word] = (word_count[word] or 0) + 1\n            if not field_set[word] then\n               field_set[word] = true\n               word_freq[word] = (word_freq[word] or 0) + 1\n            end\n         end\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   -- Normalizing word frequencies\n   for key, value in pairs(word_freq) do\n      word_freq[key] = value / n\n   end\n\n   return word_count, word_freq\nend\n\nfunction joe.sortWords(list, word_count, word_freq)\n   -- Sort the list of words\n   local word_list = tds.Vec()\n   for word, _ in pairs(word_count) do\n      word_list[#word_list + 1] = word\n   end\n   word_list:sort(function (w, v) return word_count[w] > word_count[v] end)\n\n   -- Create the word index\n   local word_index = tds.Hash()\n   for index, word in ipairs(word_list) do\n      word_index[word] = index\n   end\n\n   -- Write it to file\n   local fd = io.open(list, 'w')\n   for index, word in ipairs(word_list) do\n      fd:write('\"', word:gsub(\"\\n\", \"\\\\n\"):gsub(\"\\\"\", \"\\\"\\\"\"), '\",\"',\n               word_count[word], '\",\"', word_freq[word], '\"\\n')\n   end\n   fd:close()\n\n   return word_index, #word_list\nend\n\nfunction joe.constructWords(input, output, word_index, word_total)\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n   local n = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n\n      ofd:write('\"', content[1], '\"')\n      for i = 2, #content do\n         content[i] = content[i]:gsub('\\\\n', '\\n'):gsub(\"^%s*(.-)%s*$\", \"%1\")\n         content[i] = 
content[i]:gsub('(%p)', ' %1 ')\n         local first_write = true\n         ofd:write(',\"')\n         for word in content[i]:gmatch('[%S]+') do\n            local index = word_index[word] or word_total + 1\n            if first_write then\n               first_write = false\n               ofd:write(index)\n            else\n               ofd:write(' ', index)\n            end\n         end\n         ofd:write('\"')\n      end\n\n      ofd:write('\\n')\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/segment_word.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Chinese datasets to Index of Words\nCopyright 2016 Xiang Zhang\n\nUsage: python3 segment_word.py -i [input] -l [list] -o [output] [-r]\n'''\n\n#Input file\nINPUT = '../data/dianping/train.csv'\n#Output file\nOUTPUT = '../data/dianping/train_word.csv'\n# List file\nLIST = '../data/dianping/train_word_list.csv'\n# Read already defined word list\nREAD = False\n\nimport argparse\nimport csv\nimport jieba\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n    global LIST\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n    parser.add_argument('-l', '--list', help = 'Word list file', default = LIST)\n    parser.add_argument(\n        '-r', '--read', help = 'Read from list file', action = 'store_true')\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n    LIST = args.list\n    READ = args.read\n\n    if READ:\n        print('Reading word index')\n        word_index = readWords()\n    else:\n        print('Counting words')\n        word_count, word_freq = segmentWords()\n        print('Sorting words by count')\n        word_index = sortWords(word_count, word_freq)\n    print('Constructing word index output')\n    convertWords(word_index)\n\n# Read from pre-existing word list\ndef readWords():\n    # Open the files\n    ifd = open(LIST, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_index = dict()\n    n = 0\n    for row in reader:\n        word = row[0].replace('\\\\n', '\\n')\n        word_index[word] = n + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    return word_index\n\n# Segment the text in Chinese\ndef 
segmentWords():\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_count = dict()\n    word_freq = dict()\n    n = 0\n    for row in reader:\n        field_set = set()\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = jieba.cut(field)\n            for word in field_list:\n                word_count[word] = word_count.get(word, 0) + 1\n                if word not in field_set:\n                    field_set.add(word)\n                    word_freq[word] = word_freq.get(word, 0) + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    # Normalizing word frequency\n    for word in word_freq:\n        word_freq[word] = float(word_freq[word]) / float(n)\n    return word_count, word_freq\n\n# Sort words for a given count dictionary object\ndef sortWords(word_count, word_freq):\n    # Sort the words\n    word_list = sorted(\n        word_count, key = lambda word: word_count[word], reverse = True)\n    # Open the files\n    ofd = open(LIST, 'w', encoding = 'utf-8', newline = '')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over all the words\n    word_index = dict()\n    n = 0\n    for i in range(len(word_list)):\n        word = word_list[i]\n        row = [word.replace('\\n', '\\\\n'), str(word_count[word]),\n               str(word_freq[word])]\n        writer.writerow(row)\n        word_index[word] = i + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing word: {}'.format(n), end = '')\n    print('\\rProcessed words: {}'.format(n))\n    ofd.close()\n    return word_index\n\n# Convert the text in Chinese to word list\ndef convertWords(word_index):\n    # Open the files\n    ifd = 
open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = jieba.cut(field)\n            new_row.append(' '.join(map(\n                str, map(lambda word: word_index.get(word, len(word_index) + 1),\n                         field_list))))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/dianping/select_data.lua",
    "content": "--[[\nSelect data from non-duplicate datasets\nCopyright 2015 Xiang Zhang\n\nUsage: th select_data.lua [count] [input] [output]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local count = arg[1] or '../data/dianping/reviews_count.csv'\n   local input = arg[2] or '../data/dianping/reviews_nodup.csv'\n   local output = arg[3] or '../data/dianping/data.csv'\n\n   local map = {}\n   local index = {}\n   local cfd = io.open(count)\n   for line in cfd:lines() do\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n      local target = tonumber(content[2])\n      local total = tonumber(content[3])\n      local choose = tonumber(content[4])\n\n      print('Constructing index '..class..'>'..target..': '..choose..'/'..total)\n      map[class] = target\n      index[class] = torch.ByteTensor(total):fill(1)\n      local perm = torch.randperm(total)\n      for i = 1, total - choose do\n         index[class][perm[i]] = 0\n      end\n   end\n   cfd:close()\n\n   local n = 0\n   local progress = {}\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 100000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n      local target = map[class]\n\n      progress[class] = progress[class] and progress[class] + 1 or 1\n      if index[class] and index[class][progress[class]] == 1 then\n         ofd:write(\n            '\"', target, '\"', (line:sub(content[1]:len() + 3) or ''), '\\n')\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine (line,sep) \n   local res = {}\n   local pos = 1\n 
  sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/dianping/shuffle_lines.sh",
    "content": "#!/bin/bash\n\n# Shuffle lines in a text file\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash shuffle_lines.sh [input] [output]\n\nset -x;\nset -e;\n\nshuf $1 > $2;\n"
  },
  {
    "path": "data/dianping/sort_gram_count.sh",
    "content": "#!/bin/bash\n\n# Sort distributed grams file\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash sort_gram_count.sh [input_directory] [output_directory] [temporary] [memory]\n\nset -x;\nset -e;\n\nfor file in $1/*.csv; do\n    sort -S ${4:-50%} -t ',' -k1,1 -T ${3:-/scratch} $file > $2/`basename $file`\ndone;\n"
  },
  {
    "path": "data/dianping/sort_gram_list.sh",
    "content": "#!/bin/bash\n\n# Sort list of grams and cut the count\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash sort_gram_list.sh [input] [output] [temporary] [memory]\n\nset -x;\nset -e;\n\nsort -S ${4:-50%} -t ',' -k1,1nr -T ${3:-/scratch} $1 | cut -f 2- -d ',' > $2;\n"
  },
  {
    "path": "data/dianping/split_lines.sh",
    "content": "#!/bin/bash\n\n# Split lines in a text file\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash split_lines.sh [lines] [input] [output_prefix]\n#\n# Note: .txt postfix will be automatically added.\n\nset -x;\nset -e;\n\nsplit -d -a 1 --additional-suffix=.txt -l $1 $2 $3;\n"
  },
  {
    "path": "data/dianping/split_train.lua",
    "content": "--[[\nSplit data into training and testing subsets\nCopyright 2015 Xiang Zhang\n\nUsage: th split_train [count] [input] [train] [test]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local count = arg[1] or '../data/dianping/data_count.csv'\n   local input = arg[2] or '../data/dianping/data.csv'\n   local train = arg[3] or '../data/dianping/train.csv'\n   local test = arg[4] or '../data/dianping/test.csv'\n\n   local index = {}\n   local cfd = io.open(count)\n   for line in cfd:lines() do\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n      local total = tonumber(content[2])\n      local train_count = tonumber(content[3])\n      local test_count = tonumber(content[4])\n\n      print('Constructing index '..class..': '..train_count..\n               ','..test_count..','..total)\n      index[class] = torch.ByteTensor(total):zero()\n      local perm = torch.randperm(total)\n      for i = 1, test_count do\n         index[class][perm[i]] = 1\n      end\n   end\n   cfd:close()\n\n   local n = 0\n   local progress = {}\n   local ifd = io.open(input)\n   local trfd = io.open(train, 'w')\n   local tefd = io.open(test, 'w')\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 100000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n\n      progress[class] = progress[class] and progress[class] + 1 or 1\n      if index[class] and index[class][progress[class]] == 0 then\n         trfd:write(line, '\\n')\n      else\n         tefd:write(line, '\\n')\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\n   trfd:close()\n   tefd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine (line,sep) \n   
local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/ifeng/construct_topic.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of news articles\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_topic.py -i [input directory] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/ifeng/article'\nOUTPUT = '../data/ifeng/topic/news.csv'\n\n# Classes\n# 1: Mainlaind China Politics\n# 2: International\n# 3: Taiwan, Hong Kong and Macau Politics\n# 4: Military\n# 5: Society\nCLASSES = {'11528': 1, '11574': 2, '11490': 3, '7609': 3, '4550': 4, '7837': 5}\n\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file pattern', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    createData()\n\ndef createData():\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    for prefix in CLASSES:\n        files = glob.glob(INPUT + '/' + prefix + '_*.json.xz')\n        index = CLASSES[prefix]\n        n = 0\n        filecount = 0\n        for filename in files:\n            filecount = filecount + 1\n            print('Processing file {}/{}: {}. 
Processed items {}.'.format(\n                    filecount, len(files), filename, n))\n            try:\n                ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n                for line in ifd:\n                    news = json.loads(line)\n                    title = news.get('title', '')\n                    content = news.get('content', list())\n                    abstract = ''\n                    if len(content) > 0:\n                        abstract = content[0]\n                    n = n + 1\n                    writer.writerow([index, title.replace('\\n', '\\\\n'),\n                                     abstract.replace('\\n', '\\\\n')])\n                ifd.close()\n            except Exception as e:\n                print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/jd/count_data.lua",
    "content": "--[[\nCount data for each class and length\nCopyright 2016 Xiang Zhang\n\nUsage: th count_data.lua [input] [output]\n--]]\n\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/jd/sentiment/comment_sorted_nonull.csv'\n   local output = arg[2] or '../data/jd/sentiment/comment_sorted_count.t7b'\n\n   print('Counting data')\n   local count = joe.count(input)\n   joe.count = count\n   print('Saving to '..output)\n   torch.save(output, count)\n   print('Plotting result')\n   joe.plot(count)\nend\n\nfunction joe.count(input)\n   local count = {}\n   local max_class = 0\n   local max_length = 0\n   local fd = io.open(input)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 100000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n      local length = 0\n      for i = 2, #content do\n         length = length + content[i]:gsub(\"^%s*(.-)%s*$\", \"%1\"):len()\n      end\n      count[class] = count[class] or {}\n      count[class][length] = (count[class][length] or 0) + 1\n\n      if class > max_class then\n         max_class = class\n      end\n      if length > max_length then\n         max_length = length\n      end\n   end\n   print('\\rProcessed lines: '..n)\n   print('total classes = '..max_class..', maximum length = '..max_length)\n   fd:close()\n\n   local result = torch.Tensor(max_class, max_length):zero()\n   for class, class_count in pairs(count) do\n      if class > 0 then\n         for length, length_count in pairs(class_count) do\n            if length > 0 then\n               result[class][length] = length_count\n            end\n         end\n      end\n   end\n\n   return result\nend\n\nfunction joe.plot(count)\n   require('gnuplot')\n   local cumulated = count:cumsum(2)\n   local plots = {}\n   for 
class = 1, cumulated:size(1) do\n      plots[class] = {tostring(class), cumulated[class], '-'}\n   end\n   local figure = gnuplot.figure()\n   gnuplot.plot(unpack(plots))\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine (line,sep)\n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/jd/create_comment.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of comments\nCopyright 2016 Xiang Zhang\n\nUsage: python3 create_data.py -i [input file pattern] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/jd/comment/*.json.xz'\nOUTPUT = '../data/jd/sentiment/comment.csv'\n\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file pattern', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    createData()\n\ndef createData():\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    files = glob.glob(INPUT)\n    n = 0\n    filecount = 0\n    for filename in files:\n        filecount = filecount + 1\n        print('Processing file {}/{}: {}. Processed items {}.'.format(\n                filecount, len(files), filename, n))\n        try:\n            ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n            for line in ifd:\n                review = json.loads(line)\n                score = int(review['content'].get('score', -1))\n                title = review['content'].get('title', '')\n                content = review['content'].get('content', '')\n                if score != -1:\n                    n = n + 1\n                    writer.writerow([score, title.replace('\\n', '\\\\n'),\n                                     content.replace('\\n', '\\\\n')])\n            ifd.close()\n        except Exception as e:\n            print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/jd/limit_length.lua",
    "content": "--[[\nLimit length for data\nCopyright 2016 Xiang Zhang\n\nUsage: th limit_length.lua [input] [output] [min] [max]\n--]]\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/jd/sentiment/comment_sorted_nonull.csv'\n   local output = arg[2] or '../data/jd/sentiment/comment_sorted_limited.csv'\n   local min = tonumber(arg[3] or 0)\n   local max = tonumber(arg[4] or math.huge)\n\n   print('Limiting data')\n   joe.limit(input, output, min, max)\nend\n\nfunction joe.limit(input, output, min, max)\n   local ifd = io.open(input)\n   local ofd = io.open(output, 'w')\n   local n = 0\n   local m = 0\n   for line in ifd:lines() do\n      n = n + 1\n\n      local content = joe.parseCSVLine(line)\n      local length = 0\n      for i = 2, #content do\n         length = length + content[i]:gsub(\"^%s*(.-)%s*$\", \"%1\"):len()\n      end\n\n      if length >= min and length <= max then\n         m = m + 1\n         ofd:write(line, '\\n')\n      end\n\n      if math.fmod(n, 100000) == 0 then\n         io.write('\\rProcessing line: ', n, ', Saved lines: ', m)\n         io.flush()\n      end\n   end\n   print('\\rProcessed lines: '..n..', Saved lines: '..m)\n   ifd:close()\n   ofd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine (line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without 
separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/jd/sort_data.sh",
    "content": "#!/bin/bash\n\n# Sort comma-separated file starting from the second field\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash sort_data.sh [input_file] [output_file] [temporary] [memory]\n\nset -x;\nset -e;\n\nsort -S ${4:-50%} -t ',' -k2 -u -T ${3:-/scratch} $1 > $2;\n"
  },
  {
    "path": "data/joint/combine_word.lua",
    "content": "--[[\nCombine two word data together\nCopyright 2016 Xiang Zhang\n\nUsage: th combine_word_list.lua [input_1] [list_1] [input_2] [list_2] ...\n   [output] [list]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = {}\n   local input_list = {}\n   for i = 1, math.floor(#arg / 2) - 1 do\n      input[i] = arg[2 * i - 1]\n      input_list[i] = arg[2 * i]\n   end\n   local output = arg[math.floor(#arg / 2) * 2 - 1] or\n      '../data/joint/binary_train_word.csv'\n   local output_list = arg[math.floor(#arg / 2) * 2] or\n      '../data/joint/binary_train_word_list.csv'\n\n   print('Loading output list from '..output_list)\n   local list, count, freq, dict = joe.readList(output_list)\n   print('Opening output file '..output)\n   local ofd = io.open(output, 'w')\n\n   for i = 1, #input do\n      print('Loading input list from '..input_list[i])\n      local local_list, local_count, local_freq, local_dict =\n         joe.readList(input_list[i])\n      print('Building input to output map')\n      local map = joe.buildMap(local_list, dict)\n      print('Processing data from '..input[i])\n      joe.processInput(input[i], map, ofd, list)\n   end\n\n   print('Closing output file '..output)\n   ofd:close()\nend\n\nfunction joe.readList(file)\n   local list = tds.Vec()\n   local count = tds.Vec()\n   local freq = tds.Vec()\n   local dict = tds.Hash()\n   local fd = io.open(file)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[1] = content[1]:gsub('\\\\n', '\\n')\n      list:insert(content[1])\n      count:insert(tonumber(content[2]))\n      freq:insert(tonumber(content[3]))\n      dict[content[1]] = #list\n   end\n   fd:close()\n   return list, count, freq, dict\nend\n\nfunction joe.buildMap(input_list, dict)\n   local map = tds.Vec()\n   for i = 1, #input_list do\n      
map[i] = dict[input_list[i]]\n   end\n   return map\nend\n\nfunction joe.processInput(input, map, ofd, list)\n   local ifd = io.open(input)\n   local n = 0\n   for line in ifd:lines() do\n      n = n + 1\n      if math.fmod(n, 10000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      -- Write class\n      local content = joe.parseCSVLine(line)\n      ofd:write('\"', content[1], '\"')\n\n      -- Write title and comment\n      for i = 2, #content do\n         ofd:write(',\"')\n         for word in content[i]:gmatch('%d+') do\n            ofd:write(map[tonumber(word)] or #list + 1, ' ')\n         end\n         ofd:write('\"')\n      end\n\n      -- Write end of line\n      ofd:write('\\n')\n   end\n   print('\\rProcessed lines: '..n)\n   ifd:close()\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n         
   pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/joint/combine_word_list.lua",
    "content": "--[[\nCombine two word data together\nCopyright 2016 Xiang Zhang\n\nUsage: th combine_word_list.lua [list_1] [size_1] [list_2] [size_2] ... [output]\n--]]\n\nlocal io = require('io')\nlocal math = require('math')\nlocal tds = require('tds')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input_list = {}\n   local input_size = {}\n   for i = 1, math.floor(#arg / 2) do\n      input_list[i] = arg[2 * i - 1]\n      input_size[i] = arg[2 * i]\n   end\n   local output_list = arg[math.floor(#arg / 2) * 2 + 1] or\n      '../data/joint/binary_train_word_list.csv'\n\n   local word = {}\n   for i = 1, #input_list do\n      print('Loading list from '..input_list[i])\n      local list, count, freq, dict = joe.readInputList(input_list[i])\n      word[i] = {list = list, count = count, freq = freq, dict = dict}\n   end\n   print('Merging word lists')\n   local list, count_table, freq_table, dict =\n      joe.mergeWords(word, input_size)\n   print('Writing merged word list to '..output_list)\n   joe.writeOutputList(output_list, list, count_table, freq_table, dict)\nend\n\nfunction joe.readInputList(file)\n   local list = tds.Vec()\n   local count = tds.Vec()\n   local freq = tds.Vec()\n   local dict = tds.Hash()\n   local fd = io.open(file)\n   for line in fd:lines() do\n      local content = joe.parseCSVLine(line)\n      content[1] = content[1]:gsub('\\\\n', '\\n')\n      list:insert(content[1])\n      count:insert(tonumber(content[2]))\n      freq:insert(tonumber(content[3]))\n      dict[content[1]] = #list\n   end\n   fd:close()\n   return list, count, freq, dict\nend\n\nfunction joe.writeOutputList(file, list, count_table, freq_table, dict)\n   local fd = io.open(file, 'w')\n   for index, word in ipairs(list) do\n      fd:write('\"', word:gsub('\\n', '\\\\n'):gsub('\"', '\"\"'), '\",\"',\n               count_table[word], '\",\"', freq_table[word], '\"\\n')\n   end\n   fd:close()\nend\n\nfunction 
joe.mergeWords(word, size)\n   local total_size = 0\n   for i, s in ipairs(size) do\n      total_size = total_size + s\n   end\n\n   local list = tds.Vec()\n   local count_table = tds.Hash()\n   local freq_table = tds.Hash()\n   for i, w in ipairs(word) do\n      for j, v in ipairs(w.list) do\n         if count_table[v] == nil then\n            list:insert(v)\n            count_table[v] = w.count[j]\n            freq_table[v] = w.freq[j] * size[i] / total_size\n         else\n            count_table[v] = count_table[v] + w.count[j]\n            freq_table[v] = freq_table[v] + w.freq[j] * size[i] / total_size\n         end\n         if math.fmod(j, 100000) == 0 then\n            io.write('\\rProcessing list ', i, ': ', j, '/', #w.list)\n            io.flush()\n         end\n      end\n      print('\\rProcessed list '..i..': '..(#w.list)..'/'..(#w.list))\n   end\n\n   print('Sorting merged word list')\n   list:sort(function(a, b) return count_table[a] > count_table[b] end)\n\n   print('Constructing merged word dictionary')\n   local dict = tds.Hash()\n   for i, w in ipairs(list) do\n      dict[w] = i\n   end\n\n   return list, count_table, freq_table, dict\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine(line,sep) \n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote 
char in a quote.\n         until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/nytimes/construct_topic.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of news articles\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_topic.py -i [input directory] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\nimport re\nimport urllib.parse\n\nINPUT = '../data/nytimes/article'\nOUTPUT = '../data/nytimes/topic/news.csv'\nCLASS = '../data/nytimes/topic/class.csv'\n\ndef main():\n    global INPUT\n    global OUTPUT\n    global CLASS\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file directory', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n    parser.add_argument(\n        '-c', '--classes', help = 'Class file', default = CLASS)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n    CLASS = args.classes\n\n    createData()\n\ndef createData():\n    # Open the category file\n    classes = dict()\n    count = 0\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    files = glob.glob(INPUT + '/*.json.xz')\n    n = 0\n    filecount = 0\n    for filename in files:\n        filecount = filecount + 1\n        print('Processing file {}/{}: {}. 
Processed items {}.'.format(\n                filecount, len(files), filename, n))\n        try:\n            ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n            for line in ifd:\n                news = json.loads(line)\n                title = news.get('title', '')\n                content = news.get('content', list())\n                abstract = ''\n                if len(content) > 0:\n                    abstract = content[0]\n                url = news.get('url', '')\n                if url != '':\n                    path = urllib.parse.urlparse(url).path\n                    start_match = re.match(r'/\\d\\d\\d\\d/\\d\\d/\\d\\d/', path)\n                    end_match = re.match(r'/\\d\\d\\d\\d/\\d\\d/\\d\\d/[^/]+', path)\n                    if start_match != None and end_match != None:\n                        classname = path[start_match.end():end_match.end()]\n                        if classes.get(classname, None) == None:\n                            classes[classname] = count + 1\n                            count = count + 1\n                        index = classes[classname]\n                        writer.writerow([index, title.replace('\\n', '\\\\n'),\n                                         abstract.replace('\\n', '\\\\n')])\n                n = n + 1\n            ifd.close()\n        except Exception as e:\n            print('Exception (ignored): {}'.format(e))\n    ofd.close()\n    # Open the class file\n    cfd = open(CLASS, 'w', newline = '', encoding = 'utf-8')\n    class_writer = csv.writer(\n        cfd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    for key in classes:\n        class_writer.writerow([classes[key], key])\n    cfd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/nytimes/count_class.lua",
    "content": "--[[\nCount data for each class and length\nCopyright 2016 Xiang Zhang\n\nUsage: th count_data.lua [input] [output]\n--]]\n\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or '../data/nytimes/topic/news_sorted.csv'\n   local output = arg[2] or '../data/nytimes/topic/news_sorted_class.t7b'\n\n   print('Counting data')\n   local count = joe.count(input)\n   joe.count = count\n   print('Saving to '..output)\n   torch.save(output, count)\nend\n\nfunction joe.count(input)\n   local count = {}\n   local fd = io.open(input)\n   local n = 0\n   for line in fd:lines() do\n      n = n + 1\n      if math.fmod(n, 100000) == 0 then\n         io.write('\\rProcessing line: ', n)\n         io.flush()\n      end\n\n      local content = joe.parseCSVLine(line)\n      local class = tonumber(content[1])\n      local length = 0\n      for i = 2, #content do\n         length = length + content[i]:gsub(\"^%s*(.-)%s*$\", \"%1\"):len()\n      end\n      count[class] = (count[class] or 0) + 1\n   end\n   print('\\rProcessed lines: '..n)\n   fd:close()\n\n   return count\nend\n\n-- Parsing csv line\n-- Ref: http://lua-users.org/wiki/LuaCsv\nfunction joe.parseCSVLine (line,sep)\n   local res = {}\n   local pos = 1\n   sep = sep or ','\n   while true do \n      local c = string.sub(line,pos,pos)\n      if (c == \"\") then break end\n      if (c == '\"') then\n         -- quoted value (ignore separator within)\n         local txt = \"\"\n         repeat\n            local startp,endp = string.find(line,'^%b\"\"',pos)\n            txt = txt..string.sub(line,startp+1,endp-1)\n            pos = endp + 1\n            c = string.sub(line,pos,pos) \n            if (c == '\"') then txt = txt..'\"' end \n            -- check first char AFTER quoted string, if it is another\n            -- quoted string without separator, then append it\n            -- this is the way to \"escape\" the quote char in a quote.\n        
 until (c ~= '\"')\n         table.insert(res,txt)\n         assert(c == sep or c == \"\")\n         pos = pos + 1\n      else\n         -- no quotes used, just look for the first separator\n         local startp,endp = string.find(line,sep,pos)\n         if (startp) then \n            table.insert(res,string.sub(line,pos,startp-1))\n            pos = endp + 1\n         else\n            -- no separator found -> use rest of string and terminate\n            table.insert(res,string.sub(line,pos))\n            break\n         end \n      end\n   end\n   return res\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "data/rakuten/construct_hepburn.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Japanese datasets to Hepburn Romanization\nCopyright 2016 Xiang Zhang\n\nUsage: python3 construct_hepburn.py -i [input] -o [output]\n'''\n\n# Input file\nINPUT = '../data/rakuten/sentiment/full_train.csv'\n# Output file\nOUTPUT = '../data/rakuten/sentiment/full_train_hepburn.csv'\n\nimport argparse\nimport csv\nimport MeCab\nimport romkan\nimport unidecode\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    mecab = MeCab.Tagger()\n\n    convertRoman(mecab)\n\ndef romanizeText(mecab, text):\n    parsed = mecab.parse(text)\n    result = list()\n    for token in parsed.split('\\n'):\n        splitted = token.split('\\t')\n        if len(splitted) == 2:\n            word = splitted[0]\n            features = splitted[1].split(',')\n            if len(features) > 7 and features[7] != '*':\n                result.append(romkan.to_hepburn(features[7]))\n            else:\n                result.append(word)\n    return result\n\n# Convert the text in Chinese to pintin\ndef convertRoman(mecab):\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            new_row.append(' '.join(map(\n                str.strip,\n                map(lambda s: s.replace('\\n', '\\\\n'),\n                    map(unidecode.unidecode,\n           
             romanizeText(mecab, row[i]))))))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/rakuten/create_review.py",
    "content": "#!/usr/bin/python3\n\n'''\nCreate data from list of LZMA compressed archives of reviews\nCopyright 2016 Xiang Zhang\n\nUsage: python3 create_review.py -i [input file pattern] -o [output file]\n'''\n\nimport argparse\nimport csv\nimport glob\nimport json\nimport lzma\n\nINPUT = '../data/rakuten/review/*.json.xz'\nOUTPUT = '../data/rakuten/sentiment/review.csv'\n\ndef main():\n    global INPUT\n    global OUTPUT\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument(\n        '-i', '--input', help = 'Input file pattern', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n\n    createData()\n\ndef createData():\n    # Open the output file\n    ofd = open(OUTPUT, 'w', newline = '', encoding = 'utf-8')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Grab the files\n    files = glob.glob(INPUT)\n    n = 0\n    filecount = 0\n    for filename in files:\n        filecount = filecount + 1\n        print('Processing file {}/{}: {}. Processed items {}.'.format(\n                filecount, len(files), filename, n))\n        try:\n            ifd = lzma.open(filename, 'rt', encoding = 'utf-8')\n            for line in ifd:\n                review = json.loads(line)\n                rate = review.get('rate', '')\n                title = review.get('title', '')\n                comment = review.get('comment', '')\n                if rate != '':\n                    n = n + 1\n                    writer.writerow([rate, title.replace('\\n', '\\\\n'),\n                                     comment.replace('\\n', '\\\\n')])\n            ifd.close()\n        except Exception as e:\n            print('Exception (ignored): {}'.format(e))\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "data/rakuten/segment_word.py",
    "content": "#!/usr/bin/python3\n\n'''\nConvert Japanese datasets to Index of Words\nCopyright 2016 Xiang Zhang\n\nUsage: python3 segment_word.py -i [input] -l [list] -o [output] [-r]\n'''\n\n# Input file\nINPUT = '../data/rakuten/sentiment/full_train.csv'\n# Output file\nOUTPUT = '../data/rakuten/sentiment/full_train_word.csv'\n# List file\nLIST = '../data/rakuten/sentiment/full_train_word_list.csv'\n# Read already defined word list\nREAD = False\n\nimport argparse\nimport csv\nimport MeCab\n\n# Main program\ndef main():\n    global INPUT\n    global OUTPUT\n    global LIST\n    global READ\n\n    parser = argparse.ArgumentParser()\n    parser.add_argument('-i', '--input', help = 'Input file', default = INPUT)\n    parser.add_argument(\n        '-o', '--output', help = 'Output file', default = OUTPUT)\n    parser.add_argument('-l', '--list', help = 'Word list file', default = LIST)\n    parser.add_argument(\n        '-r', '--read', help = 'Read from list file', action = 'store_true')\n\n    args = parser.parse_args()\n\n    INPUT = args.input\n    OUTPUT = args.output\n    LIST = args.list\n    READ = args.read\n\n    if READ:\n        print('Reading word index')\n        word_index = readWords()\n    else:\n        print('Counting words')\n        word_count, word_freq = segmentWords()\n        print('Sorting words by count')\n        word_index = sortWords(word_count, word_freq)\n    print('Constructing word index output')\n    convertWords(word_index)\n\n# Read from pre-existing word list\ndef readWords():\n    # Open the files\n    ifd = open(LIST, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_index = dict()\n    n = 0\n    for row in reader:\n        word = row[0].replace('\\\\n', '\\n')\n        word_index[word] = n + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    return 
word_index\n\n# Segment the Japanese text into words\ndef segmentWords():\n    mecab = MeCab.Tagger()\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    # Loop over the csv rows\n    word_count = dict()\n    word_freq = dict()\n    n = 0\n    for row in reader:\n        field_set = set()\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = list()\n            parsed_result = mecab.parse(field)\n            for token in parsed_result.split('\\n'):\n                splitted_token = token.split('\\t')\n                if len(splitted_token) == 2:\n                    word = splitted_token[0]\n                    field_list.append(word)\n            for word in field_list:\n                word_count[word] = word_count.get(word, 0) + 1\n                if word not in field_set:\n                    field_set.add(word)\n                    word_freq[word] = word_freq.get(word, 0) + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    # Normalizing word frequency\n    for word in word_freq:\n        word_freq[word] = float(word_freq[word]) / float(n)\n    return word_count, word_freq\n\n# Sort words for a given count dictionary object\ndef sortWords(word_count, word_freq):\n    # Sort the words\n    word_list = sorted(\n        word_count, key = lambda word: word_count[word], reverse = True)\n    # Open the files\n    ofd = open(LIST, 'w', encoding = 'utf-8', newline = '')\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over all the words\n    word_index = dict()\n    n = 0\n    for i in range(len(word_list)):\n        word = word_list[i]\n        row = [word.replace('\\n', '\\\\n'), str(word_count[word]),\n               str(word_freq[word])]\n        
writer.writerow(row)\n        word_index[word] = i + 1\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing word: {}'.format(n), end = '')\n    print('\\rProcessed words: {}'.format(n))\n    ofd.close()\n    return word_index\n\n# Convert the Japanese text to sequences of word indices\ndef convertWords(word_index):\n    mecab = MeCab.Tagger()\n    # Open the files\n    ifd = open(INPUT, encoding = 'utf-8', newline = '')\n    ofd = open(OUTPUT, 'w', encoding = 'utf-8', newline = '')\n    reader = csv.reader(ifd, quoting = csv.QUOTE_ALL)\n    writer = csv.writer(ofd, quoting = csv.QUOTE_ALL, lineterminator = '\\n')\n    # Loop over the csv rows\n    n = 0\n    for row in reader:\n        new_row = list()\n        new_row.append(row[0])\n        for i in range(1, len(row)):\n            field = row[i].replace('\\\\n', '\\n')\n            field_list = list()\n            parsed_result = mecab.parse(field)\n            for token in parsed_result.split('\\n'):\n                splitted_token = token.split('\\t')\n                if len(splitted_token) == 2:\n                    word = splitted_token[0]\n                    field_list.append(word)\n            new_row.append(' '.join(map(\n                str, map(lambda word: word_index.get(word, len(word_index) + 1),\n                         field_list))))\n        writer.writerow(new_row)\n        n = n + 1\n        if n % 1000 == 0:\n            print('\\rProcessing line: {}'.format(n), end = '')\n    print('\\rProcessed lines: {}'.format(n))\n    ifd.close()\n    ofd.close()\n\nif __name__ == '__main__':\n    main()\n"
  },
  {
    "path": "doc/dianping.md",
    "content": "# Dianping\n\nThis documentation contains information on how to reproduce all the results for the `Dianping` dataset in the paper.\n\nThe root directory `/` in this documentation indicates the root directory of this repository.\n\n## Download the dataset\n\nOriginal text data for training and testing are available via these two links: [`train.csv.xz`](https://goo.gl/uKPxyo) [`test.csv.xz`](https://goo.gl/2QZpLx). When you download them, make sure to put them in the `/data/data/dianping` directory and decompress them with `unxz` so that you have `train.csv` and `test.csv` available.\n\n## GlyphNet\n\nThis section introduces how to prepare and run GlyphNet experiments.\n\n### Prepare GNU Unifont\n\nRunning the GlyphNet training script requires the GNU Unifont character images. We have built these images into a Torch 7 binary serialization file, which can be downloaded via this link: [`unifont-8.0.1.t7b.xz`](https://goo.gl/aFxYHq). After downloading, put it in the `/unifont/unifont` directory and decompress it with `unxz` so that you have `unifont-8.0.1.t7b` available.\n\n### Build Byte Serialization Files\n\nThe next step is to build the serialized code files, starting with the string serialization files. Switch to the `/data/dianping` directory, then execute the following commands\n\n```bash\nth construct_string.lua ../data/dianping/train.csv ../data/dianping/train_string.t7b\nth construct_string.lua ../data/dianping/test.csv ../data/dianping/test_string.t7b\n```\n\nThese two commands build byte serialization files for the samples in their original language. They assume the texts are contained in a comma-separated-value format in which the first field is treated as the class index (starting from 1), and the remaining fields are all texts.\n\nThe output files contain a Lua table that has the following members\n\n* `index`: a table that contains index tensors for each class. 
For example `index[i]` is an n x m x 2 `LongTensor` that contains the starting position and length of the byte string representing each sample in class i. We assume that class i contains n samples, and there are m text fields in the CSV file.\n* `content`: a `ByteTensor` that contains the serialization of the strings of all samples. Each string is terminated by a 0 byte, which is not included in the length count in `index`.\n\n### Build Unicode Serialization Files\n\nFrom this byte-level serialization, we can construct serialization files that contain Unicode values to be used in the `glyphnet` training scripts. To do this, execute the following two commands\n\n```bash\nth construct_code.lua ../data/dianping/train_string.t7b ../data/dianping/train_code.t7b\nth construct_code.lua ../data/dianping/test_string.t7b ../data/dianping/test_code.t7b\n```\n\nEach of these code files contains a Lua table that has two `LongTensor` members: `code` and `code_value`. They have a similar structure to the `index` and `content` members of the byte serialization files, but in this case they hold Unicode values.\n\n### Execute the Experiments\n\nThen, you can switch to `/glyphnet`, and execute the following scripts to run the training program for the large GlyphNet\n\n```bash\nmkdir -p models/dianping/spatial8temporal12length512feature256\n./archive/dianping_spatial8temporal12length512feature256.sh\n```\n\nThe first command simply creates a directory where checkpoint files will be written during training. Note that the shell scripts also accept command-line parameters and pass them directly to the training program. The most useful ones are probably `-driver_visualize false` and `-driver_plot false`, which disable visualization and plotting so that you can run the training programs on a headless server. You can also use `-driver_resume true` to resume from checkpointed experiments. 
These parameters are available for all Torch 7 training programs.\n\nSimilarly, the following commands execute the experiment for the small GlyphNet\n\n```bash\nmkdir -p models/dianping/spatial6temporal8length486feature256\n./archive/dianping_spatial6temporal8length486feature256.sh\n```\n\n## OnehotNet\n\nThis section details how to execute OnehotNet experiments. Note that the OnehotNet models in this article operate at the byte level for either the original text or the romanized text. In the case of romanized text, this is the same as the character level.\n\n### Byte-Level OnehotNet for Original Text\n\nTo train OnehotNet for the original text, we only need the previously built byte serialization files. If you do not have them, see the previous sections on the `construct_string.lua` data processing script.\n\n#### Execute the Experiments\n\nAssuming your current working directory is `/onehotnet`, the following commands execute experiments for the large OnehotNet on the original text samples.\n\n```bash\nmkdir -p models/dianping/onehot4temporal12length2048feature256\n./archive/dianping_onehot4temporal12length2048feature256.sh\n```\n\nSimilarly, the small OnehotNet experiments can be done using the following commands\n\n```bash\nmkdir -p models/dianping/onehot4temporal8length1944feature256\n./archive/dianping_onehot4temporal8length1944feature256.sh\n```\n\n### Character-Level OnehotNet for Romanized Text\n\nThis section details how to execute OnehotNet for romanized text. Before that, we need to build the romanized data.\n\n#### Build Romanized Text Serialization Files\n\nThe first step is to convert the original text into a romanized format. This is done automatically using the [`pypinyin`](https://github.com/mozillazg/python-pinyin) package (version 0.12 for the results in the paper). You also want to install [`jieba`](https://github.com/fxsjy/jieba) (version 0.38 for the results in the paper) so that `pypinyin` can use it for word segmentation. 
All these packages were installed in a Python 3 environment.\n\nWith the working directory set to `/data/dianping`, the following commands convert the original text of the Dianping dataset to a romanized format.\n\n```bash\npython3 construct_pinyin.py -i ../data/dianping/train.csv -o ../data/dianping/train_pinyin.csv\npython3 construct_pinyin.py -i ../data/dianping/test.csv -o ../data/dianping/test_pinyin.csv\n```\n\nThen, we can use `construct_string.lua` again to construct the byte serialization of the romanized texts.\n\n```bash\nth construct_string.lua ../data/dianping/train_pinyin.csv ../data/dianping/train_pinyin_string.t7b\nth construct_string.lua ../data/dianping/test_pinyin.csv ../data/dianping/test_pinyin_string.t7b\n```\n\n#### Execute the Experiments\n\nAssuming your current working directory is `/onehotnet`, the following commands execute experiments for the large OnehotNet on the romanized text samples.\n\n```bash\nmkdir -p models/dianping/onehot4temporal12length2048feature256roman\n./archive/dianping_onehot4temporal12length2048feature256roman.sh\n```\n\nSimilarly, the small OnehotNet experiments can be done using the following commands\n\n```bash\nmkdir -p models/dianping/onehot4temporal8length1944feature256roman\n./archive/dianping_onehot4temporal8length1944feature256roman.sh\n```\n\n## EmbedNet\n\nThis section introduces how to build the data files and execute experiments for EmbedNet.\n\n### Character-Level EmbedNet for Original Text\n\nSince we already built the serialization data files of Unicode characters for GlyphNet, we can directly use them. 
The only step required is to run the commands for training the models.\n\nAssuming the current working directory is `/embednet`, the following commands will start the training process for the large character-level EmbedNet.\n\n```bash\nmkdir -p models/dianping/temporal12length512feature256\n./archive/dianping_temporal12length512feature256.sh\n```\n\nAnd for the small character-level EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal8length486feature256\n./archive/dianping_temporal8length486feature256.sh\n```\n\n### Byte-Level EmbedNet for Original Text\n\nThis section details how to train byte-level EmbedNet for the original text.\n\n#### Convert Byte Serialization Files\n\nSince the EmbedNet training program assumes the data files contain a table with two members, `code` and `code_value`, we need to change the variable names in the string serialization files to match this. This can be done in `/data/dianping` by executing the following commands\n\n```bash\nth convert_string_code.lua ../data/dianping/train_string.t7b ../data/dianping/train_string_code.t7b\nth convert_string_code.lua ../data/dianping/test_string.t7b ../data/dianping/test_string_code.t7b\n```\n\n#### Execute the Experiments\n\nAssuming the current working directory is `/embednet`, the following commands start the training process for the large byte-level EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal12length512feature256byte\n./archive/dianping_temporal12length512feature256byte.sh\n```\n\nAnd for the small byte-level EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal8length486feature256byte\n./archive/dianping_temporal8length486feature256byte.sh\n```\n\n### Character-Level EmbedNet for Romanized Text\n\nNote that characters in romanized text are the same as bytes. 
Therefore, the steps are exactly the same as for the byte-level EmbedNet, except for romanized text instead of original text.\n\n#### Convert Byte Serialization Files\n\nIn `/data/dianping`, execute the following commands\n\n```bash\nth convert_string_code.lua ../data/dianping/train_pinyin_string.t7b ../data/dianping/train_pinyin_string_code.t7b\nth convert_string_code.lua ../data/dianping/test_pinyin_string.t7b ../data/dianping/test_pinyin_string_code.t7b\n```\n\n#### Execute the Experiments\n\nAssuming the current working directory is `/embednet`, the following commands start the training process for the large character-level EmbedNet for romanized text\n\n```bash\nmkdir -p models/dianping/temporal12length512feature256roman\n./archive/dianping_temporal12length512feature256roman.sh\n```\n\nAnd for the small EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal8length486feature256roman\n./archive/dianping_temporal8length486feature256roman.sh\n```\n\n### Word-Level EmbedNet for Original Text\n\nThis section introduces how to segment words from the text, build the word serialization files, and execute the commands.\n\n#### Build Word Serialization Files for Original Text\n\nThe first step in building the word serialization files is to segment the words. This is done by executing a Python 3 script as follows, assuming you have the [`jieba`](https://github.com/fxsjy/jieba) package installed (version 0.38 for the results in the paper) and the working directory is `/data/dianping`.\n\n```bash\npython3 segment_word.py -i ../data/dianping/train.csv -o ../data/dianping/train_word.csv -l ../data/dianping/train_word_list.csv\npython3 segment_word.py -i ../data/dianping/test.csv -o ../data/dianping/test_word.csv -l ../data/dianping/train_word_list.csv -r\n```\n\nThe first command generates two data files. `train_word.csv` contains sequences of indices of segmented words from the original text fields, whereas `train_word_list.csv` contains the list of words. 
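The index sequences follow the same out-of-vocabulary convention used by the repository's `segment_word.py` scripts: any word missing from the list maps to one past the largest index. A minimal Python sketch (the helper name `words_to_indices` is ours, for illustration only):

```python
def words_to_indices(words, word_index):
    """Map segmented words to 1-based indices; out-of-vocabulary
    words map to len(word_index) + 1, as in segment_word.py."""
    oov = len(word_index) + 1
    return [word_index.get(word, oov) for word in words]

# Words are ranked by training-data frequency: index 1 is the most frequent.
index = {'好': 1, '吃': 2}
print(words_to_indices(['好', '吃', '新词'], index))  # → [1, 2, 3]
```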
The second command reads the same list of words generated from the training data (hence the `-r` option) and uses that list to build sequences for the testing data. This is done deliberately so that new words not in the training data are not considered for classification results.\n\nThe second step is to build the word serialization files from the segmentation results.\n\n```bash\nth construct_word.lua ../data/dianping/train_word.csv ../data/dianping/train_word.t7b\nth construct_word.lua ../data/dianping/test_word.csv ../data/dianping/test_word.t7b\n```\n\n#### Execute the Experiments\n\nWhen we have `train_word.t7b` and `test_word.t7b`, we can start executing the experiments for word-level EmbedNet models. Assuming the current directory is `/embednet`, the following commands start the training process for the large word-level EmbedNet for original text\n\n```bash\nmkdir -p models/dianping/temporal12length512feature256word\n./archive/dianping_temporal12length512feature256word.sh\n```\n\nAnd for the small EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal8length486feature256word\n./archive/dianping_temporal8length486feature256word.sh\n```\n\n### Word-Level EmbedNet for Romanized Text\n\nSimilar to the original text, romanized text also requires word segmentation before it can be passed through the EmbedNet training program.\n\n#### Build Word Serialization Files for Romanized Text\n\nWord segmentation for romanized text is pretty simple. 
Assuming you are in `/data/dianping`, the following commands do the job\n\n```bash\nth segment_roman_word.lua ../data/dianping/train_pinyin.csv ../data/dianping/train_pinyin_word.csv ../data/dianping/train_pinyin_word_list.csv\nth segment_roman_word.lua ../data/dianping/test_pinyin.csv ../data/dianping/test_pinyin_word.csv ../data/dianping/train_pinyin_word_list.csv true\n```\n\nNote the additional `true` argument in the second command, which informs the script to use the training word list when constructing the indices for the testing data.\n\nThen, word serialization files can be built from the segmentation results using the following commands.\n\n```bash\nth construct_word.lua ../data/dianping/train_pinyin_word.csv ../data/dianping/train_pinyin_word.t7b\nth construct_word.lua ../data/dianping/test_pinyin_word.csv ../data/dianping/test_pinyin_word.t7b\n```\n\n#### Execute the Experiments\n\nWhen we have `train_pinyin_word.t7b` and `test_pinyin_word.t7b`, we can start executing the experiments for word-level EmbedNet models. 
Assuming the current directory is `/embednet`, the following commands start the training process for the large word-level EmbedNet for romanized text\n\n```bash\nmkdir -p models/dianping/temporal12length512feature256romanword\n./archive/dianping_temporal12length512feature256romanword.sh\n```\n\nAnd for the small EmbedNet\n\n```bash\nmkdir -p models/dianping/temporal8length486feature256romanword\n./archive/dianping_temporal8length486feature256romanword.sh\n```\n\n## Linear Model\n\nThis section details how to reproduce the results for linear models.\n\n### Character-Level 1-Gram Linear Model for Original Text\n\nTo run the linear model using bag-of-character features, we need to build the feature serialization files first.\n\n#### Build Character-Level 1-Gram Feature Serialization Files\n\nTo build the character-level 1-gram feature serialization files, execute the following commands from `/data/dianping`.\n\n```bash\nth construct_charbag.lua ../data/dianping/train_code.t7b ../data/dianping/train_charbag.t7b ../data/dianping/train_charbag_list.csv\nth construct_charbag.lua ../data/dianping/test_code.t7b ../data/dianping/test_charbag.t7b ../data/dianping/train_charbag_list.csv true\n```\n\nThe first command creates a file `train_charbag.t7b`, which contains a table that has the following members\n\n* `bag`: a table where `bag[i]` contains an n-by-2 `LongTensor`. It contains the beginning index and length of values in `bag_index` and `bag_value` for each sample.\n* `bag_index`: a 1-D `LongTensor` that contains the character indices of all samples.\n* `bag_value`: a 1-D `DoubleTensor` that contains the frequency of the corresponding character indices.\n\nThe second command creates the feature serialization file for the testing data, but using the same character index that was created from the training data. 
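The actual construction code is `construct_charbag.lua`; purely as an illustration of the `bag`/`bag_index`/`bag_value` layout it describes, here is a rough Python sketch (the function name and the use of normalized counts as "frequency" are our assumptions, not the repository's code):

```python
from collections import Counter

def build_charbag(samples, char_index):
    """Sketch of the bag / bag_index / bag_value layout:
    per-sample (start, length) pairs pointing into two flat arrays."""
    bag = []        # (start, length) per sample, 1-based like Lua tensors
    bag_index = []  # character indices, concatenated over all samples
    bag_value = []  # frequency of each corresponding character index
    for text in samples:
        counts = Counter(c for c in text if c in char_index)
        start = len(bag_index) + 1
        for char, count in counts.items():
            bag_index.append(char_index[char])
            bag_value.append(count / len(text))
        bag.append((start, len(counts)))
    return bag, bag_index, bag_value
```

Slicing `bag_index`/`bag_value` with a sample's (start, length) pair recovers that sample's sparse feature vector.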
The additional `true` parameter tells the script to read from the existing list rather than create a new one.\n\nAll of the feature serialization files for linear models have the same data structure design.\n\nTo prepare feature serialization files for the TFIDF variant of the bag-of-character linear model, execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_charbag.t7b ../data/dianping/train_charbagtfidf.t7b ../data/dianping/train_charbag_list.csv\nth construct_tfidf.lua ../data/dianping/test_charbag.t7b ../data/dianping/test_charbagtfidf.t7b ../data/dianping/train_charbag_list.csv\n```\n\nNote that constructing serialization files for the testing data still uses the character frequency list from the training data.\n\n#### Execute the Experiments\n\nTo execute the experiment for the character-level 1-gram linear model, execute the following commands from `/linearnet`\n\n```bash\nmkdir -p models/dianping/charbag\n./archive/dianping_charbag.sh\n```\n\nTo execute the experiment for the TFIDF version, execute the following commands from `/linearnet`\n\n```bash\nmkdir -p models/dianping/charbagtfidf\n./archive/dianping_charbagtfidf.sh\n```\n\n### Character-Level 5-Gram Linear Model for Original Text\n\nBefore being able to execute the 5-gram experiments, we have to build the feature serialization files first.\n\n#### Build Character-Level 5-Gram Feature Serialization Files\n\nIn this work, 5-gram features actually mean features of grams of length 1 to 5. It is usually infeasible to store all of these features in memory, and building the features could take a significant amount of time. 
Therefore, we first build a list of grams ranked by their frequency via a multi-threaded program, and then build the 5-gram feature serialization files using it.\n\nTo build the list of character grams, execute the following commands from `/data/dianping`\n\n```bash\nmkdir -p ../data/dianping/train_chargram_count\nth count_chargram.lua ../data/dianping/train_code.t7b ../data/dianping/train_chargram_count/\n\nmkdir -p ../data/dianping/train_chargram_count_sort\n./sort_gram_count.sh ../data/dianping/train_chargram_count ../data/dianping/train_chargram_count_sort /tmp\n\nth combine_gram_count.lua ../data/dianping/train_chargram_count_sort/ ../data/dianping/train_chargram_count_combine.csv\n\n./sort_gram_list.sh ../data/dianping/train_chargram_count_combine.csv ../data/dianping/train_chargram_list.csv\n\n./limit_csv_lines.sh ../data/dianping/train_chargram_list.csv ../data/dianping/train_chargram_list_limit.csv 1000001\n```\n\nThe commands proceed by first using 10 threads to construct chunks of character-gram counts, and then sorting and combining the chunks to form a combined list. That list is then sorted by gram frequency, and finally we keep the 1,000,001 most frequent grams. 
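Recall that "5-gram" here means all grams of length 1 through 5. A minimal Python sketch of that enumeration (the helper name is ours, for illustration only):

```python
def extract_grams(tokens, max_n=5):
    """Enumerate all contiguous grams of length 1..max_n,
    matching the paper's definition of "5-gram" features."""
    return [tuple(tokens[i:i + n])
            for n in range(1, max_n + 1)
            for i in range(len(tokens) - n + 1)]

print(extract_grams(['a', 'b', 'c'], 2))
# → [('a',), ('b',), ('c',), ('a', 'b'), ('b', 'c')]
```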
This should be enough because we are limiting the number of features in 5-gram models to 1,000,000.\n\nThen, you can build the character-level 5-gram feature serialization files using the following commands from `/data/dianping`\n\n```bash\nth construct_chargram.lua ../data/dianping/train_code.t7b ../data/dianping/train_chargram.t7b ../data/dianping/train_chargram_list_limit.csv\nth construct_chargram.lua ../data/dianping/test_code.t7b ../data/dianping/test_chargram.t7b ../data/dianping/train_chargram_list_limit.csv\n```\n\nNote that the features for the testing data are built using the gram list from the training data.\n\nTo build the feature serialization files for the TFIDF version of the model, execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_chargram.t7b ../data/dianping/train_chargramtfidf.t7b ../data/dianping/train_chargram_list_limit.csv 1000000\nth construct_tfidf.lua ../data/dianping/test_chargram.t7b ../data/dianping/test_chargramtfidf.t7b ../data/dianping/train_chargram_list_limit.csv 1000000\n```\n\n#### Execute the Experiments\n\nTo execute the experiment for the character-level 5-gram linear model, run the following commands from `/linearnet`\n\n```bash\nmkdir -p models/dianping/chargram\n./archive/dianping_chargram.sh\n```\n\nAnd for the TFIDF version\n\n```bash\nmkdir -p models/dianping/chargramtfidf\n./archive/dianping_chargramtfidf.sh\n```\n\n### Word-Level 1-Gram Linear Model for Original Text\n\nThis section first introduces how to build bag-of-word features, and then details how to execute the experiments.\n\n#### Build Word-Level 1-Gram Feature Serialization Files\n\nThe following commands from `/data/dianping` create the word-level 1-gram features for the linear model\n\n```bash\nth construct_wordbag.lua ../data/dianping/train_word.t7b ../data/dianping/train_wordbag.t7b 200000 200001\nth construct_wordbag.lua ../data/dianping/test_word.t7b ../data/dianping/test_wordbag.t7b 200000 
200001\n```\n\nThis is possible because the word segmentation process previously done for word-level EmbedNet already sorts the words by their frequency in the training data. The program also automatically limits the number of features to 200000 and maps all other features to the 200001-th one.\n\nTo construct the TFIDF features, simply execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_wordbag.t7b ../data/dianping/train_wordbagtfidf.t7b ../data/dianping/train_word_list.csv 200000\nth construct_tfidf.lua ../data/dianping/test_wordbag.t7b ../data/dianping/test_wordbagtfidf.t7b ../data/dianping/train_word_list.csv 200000\n```\n\n#### Execute the Experiments\n\nFrom `/linearnet`, the following commands execute the experiment for the bag-of-word model\n\n```bash\nmkdir -p models/dianping/wordbag\n./archive/dianping_wordbag.sh\n```\n\nAnd for the TFIDF version\n\n```bash\nmkdir -p models/dianping/wordbagtfidf\n./archive/dianping_wordbagtfidf.sh\n```\n\n### Word-Level 5-Gram Linear Model for Original Text\n\nThis section introduces how to build word-level 5-gram feature serialization files and how to execute the experiments.\n\n#### Build Word-Level 5-Gram Feature Serialization Files\n\nSimilar to the character-level 5-gram features, we need a multi-threaded program to build the list of grams before being able to build the feature serialization files. 
The list can be built by executing the following commands from `/data/dianping`\n\n```bash\nmkdir -p ../data/dianping/train_wordgram_count\nth count_wordgram.lua ../data/dianping/train_word.t7b ../data/dianping/train_wordgram_count/ ../data/dianping/train_word_list.csv\n\nmkdir -p ../data/dianping/train_wordgram_count_sort\n./sort_gram_count.sh ../data/dianping/train_wordgram_count ../data/dianping/train_wordgram_count_sort /tmp\n\nth combine_gram_count.lua ../data/dianping/train_wordgram_count_sort/ ../data/dianping/train_wordgram_count_combine.csv\n\n./sort_gram_list.sh ../data/dianping/train_wordgram_count_combine.csv ../data/dianping/train_wordgram_list.csv\n\n./limit_csv_lines.sh ../data/dianping/train_wordgram_list.csv ../data/dianping/train_wordgram_list_limit.csv 1000001\n```\n\nThe commands proceed by first using 10 threads to construct chunks of word-gram counts, and then sorting and combining the chunks to form a combined list. That list is then sorted by gram frequency, and finally we keep the 1,000,001 most frequent grams. 
This should be enough because we are limiting the number of features in 5-gram models to 1,000,000.\n\nThen, you can build the word-level 5-gram feature serialization files using the following commands from `/data/dianping`\n\n```bash\nth construct_wordgram.lua ../data/dianping/train_word.t7b ../data/dianping/train_wordgram.t7b ../data/dianping/train_wordgram_list_limit.csv\nth construct_wordgram.lua ../data/dianping/test_word.t7b ../data/dianping/test_wordgram.t7b ../data/dianping/train_wordgram_list_limit.csv\n```\n\nNote that the features for the testing data are built using the gram list from the training data.\n\nTo build the feature serialization files for the TFIDF version of the model, execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_wordgram.t7b ../data/dianping/train_wordgramtfidf.t7b ../data/dianping/train_wordgram_list_limit.csv 1000000\nth construct_tfidf.lua ../data/dianping/test_wordgram.t7b ../data/dianping/test_wordgramtfidf.t7b ../data/dianping/train_wordgram_list_limit.csv 1000000\n```\n\n#### Execute the Experiments\n\nTo execute the experiment for the word-level 5-gram linear model, run the following commands from `/linearnet`\n\n```bash\nmkdir -p models/dianping/wordgram\n./archive/dianping_wordgram.sh\n```\n\nAnd for the TFIDF version\n\n```bash\nmkdir -p models/dianping/wordgramtfidf\n./archive/dianping_wordgramtfidf.sh\n```\n\n### Word-Level 1-Gram Linear Model for Romanized Text\n\nThis section first introduces how to build bag-of-word features for romanized text, and then details how to execute the experiments.\n\n#### Build Word-Level 1-Gram Feature Serialization Files\n\nThe following commands from `/data/dianping` create the word-level 1-gram features for romanized text\n\n```bash\nth construct_wordbag.lua ../data/dianping/train_pinyin_word.t7b ../data/dianping/train_pinyin_wordbag.t7b 200000 200001\nth construct_wordbag.lua ../data/dianping/test_pinyin_word.t7b 
../data/dianping/test_pinyin_wordbag.t7b 200000 200001\n```\n\nThis is possible because the word segmentation process previously done for romanized word-level EmbedNet already sorts the words by its frequency from the training data. The program also automatically limit the number of features to 200000 and replace all other features to the 200001-th one.\n\nTo construct the TFIDF feature, simply execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_pinyin_wordbag.t7b ../data/dianping/train_pinyin_wordbagtfidf.t7b ../data/dianping/train_pinyin_word_list.csv 200000\nth construct_tfidf.lua ../data/dianping/test_pinyin_wordbag.t7b ../data/dianping/test_pinyin_wordbagtfidf.t7b ../data/dianping/train_pinyin_word_list.csv 200000\n```\n\n#### Execute the Experiments\n\nFrom `/linearnet`, the following commands execute the experiment for bag-of-word model for romanized text\n\n```bash\nmkdir -p models/dianping/wordbagroman\n./archive/dianping_wordbagroman.sh\n```\n\nAnd for the TFIDF version\n\n```bash\nmkdir -p models/dianping/wordbagtfidfroman\n./archive/dianping_wordbagtfidfroman.sh\n```\n\n### Word-Level 5-Gram Linear Model for Romanized Text\n\nThis section introduces how to build word-level 5-gram feature serialization files for romanized text and how to execute the experiments.\n\n#### Build Word-Level 5-Gram Feature Serialization Files\n\nSimilar to the character-level 5-gram features, we need a multi-threaded program to build the list of grams first before being able to build the feature serialization files. 
The list can be built by executing the following commands from `/data/dianping`\n\n```bash\nmkdir -p ../data/dianping/train_pinyin_wordgram_count\nth count_wordgram.lua ../data/dianping/train_pinyin_word.t7b ../data/dianping/train_pinyin_wordgram_count/ ../data/dianping/train_pinyin_word_list.csv\n\nmkdir -p ../data/dianping/train_pinyin_wordgram_count_sort\n./sort_gram_count.sh ../data/dianping/train_pinyin_wordgram_count ../data/dianping/train_pinyin_wordgram_count_sort /tmp\n\nth combine_gram_count.lua ../data/dianping/train_pinyin_wordgram_count_sort/ ../data/dianping/train_pinyin_wordgram_count_combine.csv\n\n./sort_gram_list.sh ../data/dianping/train_pinyin_wordgram_count_combine.csv ../data/dianping/train_pinyin_wordgram_list.csv\n\n./limit_csv_lines.sh ../data/dianping/train_pinyin_wordgram_list.csv ../data/dianping/train_pinyin_wordgram_list_limit.csv 1000001\n```\n\nThe commands proceed by first using 10 threads to construct chunks of word-gram counts, then sorting and combining the chunks into a single list. The combined list is then sorted by gram frequency, and finally the 1,000,001 most frequent grams are kept.
This should be enough because we are limiting the number of features in 5-gram models to 1,000,000.\n\nThen, you can build the word-level 5-gram feature serialization files for romanized text using the following commands from `/data/dianping`\n\n```bash\nth construct_wordgram.lua ../data/dianping/train_pinyin_word.t7b ../data/dianping/train_pinyin_wordgram.t7b ../data/dianping/train_pinyin_wordgram_list_limit.csv\nth construct_wordgram.lua ../data/dianping/test_pinyin_word.t7b ../data/dianping/test_pinyin_wordgram.t7b ../data/dianping/train_pinyin_wordgram_list_limit.csv\n```\n\nNote that the features for the testing data are built using the gram list from the training data.\n\nTo build the feature serialization files for the TFIDF version of the model, execute the following commands from `/data/dianping`\n\n```bash\nth construct_tfidf.lua ../data/dianping/train_pinyin_wordgram.t7b ../data/dianping/train_pinyin_wordgramtfidf.t7b ../data/dianping/train_pinyin_wordgram_list_limit.csv 1000000\nth construct_tfidf.lua ../data/dianping/test_pinyin_wordgram.t7b ../data/dianping/test_pinyin_wordgramtfidf.t7b ../data/dianping/train_pinyin_wordgram_list_limit.csv 1000000\n```\n\n#### Execute the Experiments\n\nTo execute the experiment for the word-level 5-gram linear model on romanized text, run the following commands from `/linearnet`\n\n```bash\nmkdir -p models/dianping/wordgramroman\n./archive/dianping_wordgramroman.sh\n```\n\nAnd for the TFIDF version\n\n```bash\nmkdir -p models/dianping/wordgramtfidfroman\n./archive/dianping_wordgramtfidfroman.sh\n```\n\n## fastText\n\nThis section introduces how to build the token files and run experiments for the fastText models.
Note that before executing the experiments in this section, you must make sure that you have [fastText](https://github.com/facebookresearch/fastText) installed and that the `fasttext` command is in your `PATH`.\n\n### Character-Level fastText for Original Text\n\nWe first build the token files for character-level fastText, and then detail how to execute the experiments.\n\n#### Build Character-Level Token Files\n\nTo build the character token files from the original text files, execute the following commands from `/data/dianping`\n\n```bash\nth construct_chartoken.lua ../data/dianping/train.csv ../data/dianping/train_chartoken.txt\nth construct_chartoken.lua ../data/dianping/test.csv ../data/dianping/test_chartoken.txt\n```\n\nOptionally, you can also build the evaluation token files by splitting the training dataset at a 1:9 ratio.\n\n```bash\n./shuffle_lines.sh ../data/dianping/train_chartoken.txt ../data/dianping/train_chartoken_shuffle.txt\n./split_lines.sh 1800000 ../data/dianping/train_chartoken_shuffle.txt ../data/dianping/train_chartoken_shuffle_split_\n```\n\nNote that the second command above will produce 2 files `train_chartoken_shuffle_split_0.txt` and `train_chartoken_shuffle_split_1.txt`.\n\n#### Execute the Experiments\n\nTo execute the character-level 1-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charunigram_evaluation\n./archive/dianping_charunigram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data.
You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the character-level 1-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charunigram_tuned\n./archive/dianping_charunigram_tuned.sh\n```\n\nTo execute the character-level 2-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charbigram_evaluation\n./archive/dianping_charbigram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data. You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the character-level 2-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charbigram_tuned\n./archive/dianping_charbigram_tuned.sh\n```\n\nTo execute the character-level 5-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charpentagram_evaluation\n./archive/dianping_charpentagram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data.
You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the character-level 5-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/charpentagram_tuned\n./archive/dianping_charpentagram_tuned.sh\n```\n\n### Word-Level fastText for Original Text\n\nWe first build the token files for word-level fastText, and then detail how to execute the experiments.\n\n#### Build Word-Level Token Files\n\nTo build the word token files from the original text files, execute the following commands from `/data/dianping`\n\n```bash\nth construct_wordtoken.lua ../data/dianping/train_word.csv ../data/dianping/train_word_list.csv ../data/dianping/train_wordtoken.txt\nth construct_wordtoken.lua ../data/dianping/test_word.csv ../data/dianping/train_word_list.csv ../data/dianping/test_wordtoken.txt\n```\n\nOptionally, you can also build the evaluation token files by splitting the training dataset at a 1:9 ratio.\n\n```bash\n./shuffle_lines.sh ../data/dianping/train_wordtoken.txt ../data/dianping/train_wordtoken_shuffle.txt\n./split_lines.sh 1800000 ../data/dianping/train_wordtoken_shuffle.txt ../data/dianping/train_wordtoken_shuffle_split_\n```\n\nNote that the second command above will produce 2 files `train_wordtoken_shuffle_split_0.txt` and `train_wordtoken_shuffle_split_1.txt`.\n\n#### Execute the Experiments\n\nTo execute the word-level 1-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordunigram_evaluation\n./archive/dianping_wordunigram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data.
You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 1-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordunigram_tuned\n./archive/dianping_wordunigram_tuned.sh\n```\n\nTo execute the word-level 2-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordbigram_evaluation\n./archive/dianping_wordbigram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data. You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 2-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordbigram_tuned\n./archive/dianping_wordbigram_tuned.sh\n```\n\nTo execute the word-level 5-gram evaluation experiment, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordpentagram_evaluation\n./archive/dianping_wordpentagram_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data.
You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 5-gram experiment, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordpentagram_tuned\n./archive/dianping_wordpentagram_tuned.sh\n```\n\n### Word-Level fastText for Romanized Text\n\nWe first build the token files for word-level fastText on romanized text, and then detail how to execute the experiments.\n\n#### Build Word-Level Token Files\n\nTo build the word token files from the romanized text files, execute the following commands from `/data/dianping`\n\n```bash\nth construct_wordtoken.lua ../data/dianping/train_pinyin_word.csv ../data/dianping/train_pinyin_word_list.csv ../data/dianping/train_pinyin_wordtoken.txt\nth construct_wordtoken.lua ../data/dianping/test_pinyin_word.csv ../data/dianping/train_pinyin_word_list.csv ../data/dianping/test_pinyin_wordtoken.txt\n```\n\nOptionally, you can also build the evaluation token files by splitting the training dataset at a 1:9 ratio.\n\n```bash\n./shuffle_lines.sh ../data/dianping/train_pinyin_wordtoken.txt ../data/dianping/train_pinyin_wordtoken_shuffle.txt\n./split_lines.sh 1800000 ../data/dianping/train_pinyin_wordtoken_shuffle.txt ../data/dianping/train_pinyin_wordtoken_shuffle_split_\n```\n\nNote that the second command above will produce 2 files `train_pinyin_wordtoken_shuffle_split_0.txt` and `train_pinyin_wordtoken_shuffle_split_1.txt`.\n\n#### Execute the Experiments\n\nTo execute the word-level 1-gram evaluation experiment on romanized text, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordunigramroman_evaluation\n./archive/dianping_wordunigramroman_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data.
You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 1-gram experiment on romanized text, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordunigramroman_tuned\n./archive/dianping_wordunigramroman_tuned.sh\n```\n\nTo execute the word-level 2-gram evaluation experiment on romanized text, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordbigramroman_evaluation\n./archive/dianping_wordbigramroman_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data. You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 2-gram experiment on romanized text, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordbigramroman_tuned\n./archive/dianping_wordbigramroman_tuned.sh\n```\n\nTo execute the word-level 5-gram evaluation experiment on romanized text, run the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordpentagramroman_evaluation\n./archive/dianping_wordpentagramroman_evaluation.sh\n```\n\nThis will iterate through 2, 5 and 10 epochs to find the best option on the evaluation data. You can check whether the evaluated hyperparameter matches the one in the paper.\n\nTo execute the word-level 5-gram experiment on romanized text, use the following commands from `/fasttext`\n\n```bash\nmkdir -p models/dianping/wordpentagramroman_tuned\n./archive/dianping_wordpentagramroman_tuned.sh\n```\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal12length512feature256 -train_data_file data/11st/sentiment/binary_train_code.t7b -test_data_file data/11st/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/11st/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/11st/sentiment/binary_train_rr_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/binary_test_rr_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/11st/sentiment/binary_train_rr_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/binary_test_rr_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/11st/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal8length486feature256 -driver_variation small -train_data_file data/11st/sentiment/binary_train_code.t7b -test_data_file data/11st/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/11st/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/11st/sentiment/binary_train_rr_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/binary_test_rr_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/11st/sentiment/binary_train_rr_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/binary_test_rr_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stbinary_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/11st/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal12length512feature256 -train_data_file data/11st/sentiment/full_train_code.t7b -test_data_file data/11st/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/11st/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/11st/sentiment/full_train_rr_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/full_test_rr_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/11st/sentiment/full_train_rr_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/full_test_rr_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/11st/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal8length486feature256 -driver_variation small -train_data_file data/11st/sentiment/full_train_code.t7b -test_data_file data/11st/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/11st/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/11st/sentiment/full_train_rr_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/11st/sentiment/full_test_rr_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/11st/sentiment/full_train_rr_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/full_test_rr_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/11stfull_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/11st/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/11st/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonbinary_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonbinary/temporal12length512feature256 -train_data_file data/amazon/binary_train_code.t7b -test_data_file data/amazon/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonbinary_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonbinary/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/amazon/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/amazon/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonbinary_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonbinary/temporal8length486feature256 -driver_variation small -train_data_file data/amazon/binary_train_code.t7b -test_data_file data/amazon/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonbinary_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonbinary/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/amazon/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/amazon/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonfull_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonfull/temporal12length512feature256 -train_data_file data/amazon/full_train_code.t7b -test_data_file data/amazon/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonfull_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonfull/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/amazon/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/amazon/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonfull_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonfull/temporal8length486feature256 -driver_variation small -train_data_file data/amazon/full_train_code.t7b -test_data_file data/amazon/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/amazonfull_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonfull/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/amazon/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/amazon/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/chinanews/topic/train_code.t7b -test_data_file data/chinanews/topic/test_code.t7b -driver_location models/chinanews/temporal12length512feature256 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/chinanews/topic/train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/chinanews/topic/test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/chinanews/topic/train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/chinanews/topic/test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/chinanews/topic/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/chinanews/topic/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/chinanews/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/chinanews/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/chinanews/topic/train_code.t7b -test_data_file data/chinanews/topic/test_code.t7b -driver_location models/chinanews/temporal8length486feature256 -driver_variation small \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/chinanews/topic/train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/chinanews/topic/test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/chinanews/topic/train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/chinanews/topic/test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/chinanews/topic/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/chinanews/topic/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/chinanews_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/chinanews/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/chinanews/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/dianping/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/dianping/train_string_code.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/dianping/test_string_code.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/dianping/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/dianping/train_pinyin_string_code.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/dianping/test_pinyin_string_code.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/dianping/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/dianping/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/dianping/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/dianping/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/dianping/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/dianping/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/temporal8length486feature256 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/temporal8length486feature256byte -driver_dimension 257 -train_data_file data/dianping/train_string_code.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/dianping/test_string_code.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/temporal8length486feature256roman -driver_dimension 257 -train_data_file data/dianping/train_pinyin_string_code.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/dianping/test_pinyin_string_code.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/temporal8length486feature256romanword -driver_dimension 200002 -train_data_file data/dianping/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/dianping/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/dianping_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/temporal8length486feature256word -driver_dimension 200002 -train_data_file data/dianping/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/dianping/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/ifeng/topic/train_code.t7b -test_data_file data/ifeng/topic/test_code.t7b -driver_location models/ifeng/temporal12length512feature256 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/ifeng/topic/train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/ifeng/topic/test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/ifeng/topic/train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/ifeng/topic/test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/ifeng/topic/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/ifeng/topic/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/ifeng/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/ifeng/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/ifeng/topic/train_code.t7b -test_data_file data/ifeng/topic/test_code.t7b -driver_location models/ifeng/temporal8length486feature256 -driver_variation small \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/ifeng/topic/train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/ifeng/topic/test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/ifeng/topic/train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/ifeng/topic/test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/ifeng/topic/train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/ifeng/topic/test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/ifeng_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/ifeng/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/ifeng/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/binary_train_code.t7b -test_data_file data/jd/sentiment/binary_test_code.t7b -driver_location models/jdbinary/temporal12length512feature256 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/jd/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/jd/sentiment/binary_train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/binary_test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/jd/sentiment/binary_train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/binary_test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/jd/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal8length486feature256 -driver_variation small -train_data_file data/jd/sentiment/binary_train_code.t7b -test_data_file data/jd/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/jd/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/jd/sentiment/binary_train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/binary_test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/jd/sentiment/binary_train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/binary_test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdbinary_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/jd/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/full_train_code.t7b -test_data_file data/jd/sentiment/full_test_code.t7b -driver_location models/jdfull/temporal12length512feature256 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/jd/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/jd/sentiment/full_train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/full_test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/jd/sentiment/full_train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/full_test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/jd/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal8length486feature256 -driver_variation small -train_data_file data/jd/sentiment/full_train_code.t7b -test_data_file data/jd/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/jd/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/jd/sentiment/full_train_pinyin_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/jd/sentiment/full_test_pinyin_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/jd/sentiment/full_train_pinyin_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/full_test_pinyin_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jdfull_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/jd/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/jd/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/binary_train_code.t7b -test_data_file data/joint/binary_test_code.t7b -driver_location models/jointbinary/temporal12length512feature256 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/joint/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/joint/binary_train_roman_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/binary_test_roman_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal12length512feature256romanword -driver_steps 400000 -driver_dimension 200002 -train_data_file data/joint/binary_train_roman_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/binary_test_roman_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/joint/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/binary_test_word_limit.t7b -test_data_replace 200002 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/binary_train_code.t7b -test_data_file data/joint/binary_test_code.t7b -driver_location models/jointbinary/temporal8length486feature256 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jointbinary/temporal8length486feature256byte -driver_dimension 257 -train_data_file data/joint/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/joint/binary_train_roman_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/binary_test_roman_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/temporal8length486feature256romanword -driver_variation small -driver_steps 400000 -driver_dimension 200002 -train_data_file data/joint/binary_train_roman_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/binary_test_roman_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointbinary_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jointbinary/temporal8length486feature256word -driver_dimension 200002 -train_data_file data/joint/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/binary_test_word_limit.t7b -test_data_replace 200002 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/full_train_code.t7b -test_data_file data/joint/full_test_code.t7b -driver_location models/jointfull/temporal12length512feature256 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/joint/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/joint/full_train_roman_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/full_test_roman_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal12length512feature256romanword -driver_steps 400000 -driver_dimension 200002 -train_data_file data/joint/full_train_roman_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/full_test_roman_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/joint/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/full_test_word_limit.t7b -test_data_replace 200002 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/full_train_code.t7b -test_data_file data/joint/full_test_code.t7b -driver_location models/jointfull/temporal8length486feature256 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/joint/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/joint/full_train_roman_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/joint/full_test_roman_byte.t7b -test_data_replace 257 -test_data_shift 1 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal8length486feature256romanword -driver_variation small -driver_steps 400000 -driver_dimension 200002 -train_data_file data/joint/full_train_roman_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/full_test_roman_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/jointfull_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/joint/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/joint/full_test_word_limit.t7b -test_data_replace 200002 -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "embednet/archive/nytimes_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/nytimes/temporal12length512feature256 -train_data_file data/nytimes/topic/train_code.t7b -test_data_file data/nytimes/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/nytimes_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/nytimes/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/nytimes/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/nytimes/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/nytimes_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/nytimes/temporal8length486feature256 -driver_variation small -train_data_file data/nytimes/topic/train_code.t7b -test_data_file data/nytimes/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/nytimes_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/nytimes/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/nytimes/topic/train_word_limit.t7b -train_data_replace 200002 -test_data_file data/nytimes/topic/test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal12length512feature256 -train_data_file data/rakuten/sentiment/binary_train_code.t7b -test_data_file data/rakuten/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/rakuten/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/rakuten/sentiment/binary_train_hepburn_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/binary_test_hepburn_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/rakuten/sentiment/binary_train_hepburn_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/binary_test_hepburn_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/rakuten/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal8length486feature256 -driver_variation small -train_data_file data/rakuten/sentiment/binary_train_code.t7b -test_data_file data/rakuten/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/rakuten/sentiment/binary_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/binary_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/rakuten/sentiment/binary_train_hepburn_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/binary_test_hepburn_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/rakuten/sentiment/binary_train_hepburn_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/binary_test_hepburn_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenbinary_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/rakuten/sentiment/binary_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/binary_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal12length512feature256 -train_data_file data/rakuten/sentiment/full_train_code.t7b -test_data_file data/rakuten/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal12length512feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal12length512feature256byte -driver_dimension 257 -train_data_file data/rakuten/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal12length512feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal12length512feature256roman -driver_dimension 257 -train_data_file data/rakuten/sentiment/full_train_hepburn_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/full_test_hepburn_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal12length512feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal12length512feature256romanword -driver_dimension 200002 -train_data_file data/rakuten/sentiment/full_train_hepburn_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/full_test_hepburn_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal12length512feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal12length512feature256word -driver_dimension 200002 -train_data_file data/rakuten/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal8length486feature256 -driver_variation small -train_data_file data/rakuten/sentiment/full_train_code.t7b -test_data_file data/rakuten/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal8length486feature256byte.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal8length486feature256byte -driver_variation small -driver_dimension 257 -train_data_file data/rakuten/sentiment/full_train_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/full_test_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal8length486feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal8length486feature256roman -driver_variation small -driver_dimension 257 -train_data_file data/rakuten/sentiment/full_train_hepburn_byte.t7b -train_data_replace 257 -train_data_shift 1 -test_data_file data/rakuten/sentiment/full_test_hepburn_byte.t7b -test_data_replace 257 -test_data_shift 1 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal8length486feature256romanword.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal8length486feature256romanword -driver_variation small -driver_dimension 200002 -train_data_file data/rakuten/sentiment/full_train_hepburn_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/full_test_hepburn_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/archive/rakutenfull_temporal8length486feature256word.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/temporal8length486feature256word -driver_variation small -driver_dimension 200002 -train_data_file data/rakuten/sentiment/full_train_word_limit.t7b -train_data_replace 200002 -test_data_file data/rakuten/sentiment/full_test_word_limit.t7b -test_data_replace 200002 \"$@\";\n"
  },
  {
    "path": "embednet/config.lua",
    "content": "--[[\nConfiguration for EmbedNet\nCopyright Xiang Zhang 2016\n--]]\n\n-- Name space\nlocal config = {}\n\n-- Training data configurations\nconfig.train_data = {}\nconfig.train_data.file = 'data/dianping/train_code.t7b'\nconfig.train_data.batch = 16\nconfig.train_data.replace = 65537\nconfig.train_data.shift = 0\n\n-- Testing data configurations\nconfig.test_data = {}\nconfig.test_data.file = 'data/dianping/test_code.t7b'\nconfig.test_data.batch = 16\nconfig.test_data.replace = 65537\nconfig.test_data.shift = 0\n\n-- Model configurations\nconfig.model = {}\nconfig.model.cudnn = true\n\n-- Model variations configuration\nconfig.variation = {}\n\n-- Large model configuration\nlocal embedding = {}\nembedding[1] = {name = 'nn.LookupTable', nIndex = 65537, nOutput = 256,\n                paddingValue = config.train_data.replace}\nembedding[2] = {name = 'nn.Transpose', permutations = {{2, 3}}}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                
outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[16] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[17] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[18] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[19] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[20] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[21] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[22] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[23] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[24] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[25] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[26] = {name = 'nn.Reshape', size = 4096, batchMode = true}\ntemporal[27] = {name = 'nn.Linear', inputSize = 4096, outputSize = 1024}\ntemporal[28] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[29] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[30] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[31] = {name = 'nn.LogSoftMax'}\nconfig.variation['large'] =\n   {embedding = embedding, temporal = temporal, length = 512}\n\n-- Small model configuration\nlocal embedding = {}\nembedding[1] = {name = 'nn.LookupTable', nIndex = 65537, nOutput = 256,\n                
paddingValue = config.train_data.replace}\nembedding[2] = {name = 'nn.Transpose', permutations = {{2, 3}}}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[16] = {name = 'nn.Reshape', size = 4608, batchMode = true}\ntemporal[17] = {name = 'nn.Linear', inputSize = 4608, outputSize = 1024}\ntemporal[18] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[19] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[20] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[21] = {name = 'nn.LogSoftMax'}\nconfig.variation['small'] =\n  
 {embedding = embedding, temporal = temporal, length = 486}\n\n-- Trainer settings\nconfig.train = {}\nconfig.train.momentum = 0.9\nconfig.train.decay = 1e-5\n-- These are just multipliers to config.driver.rate\n-- For every config.driver.schedule * config.driver.steps\nconfig.train.rates =\n   {1/1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, 1/256, 1/512, 1/1024}\n\n-- Tester settings\nconfig.test = {}\n\n-- Visualizer settings\nconfig.visualizer = {}\nconfig.visualizer.width = 1200\nconfig.visualizer.scale = 4\nconfig.visualizer.height = 64\n\n-- Driver configurations\nconfig.driver = {}\nconfig.driver.type = 'torch.CudaTensor'\nconfig.driver.device = 1\nconfig.driver.loss = 'nn.ClassNLLCriterion'\nconfig.driver.variation = 'large'\nconfig.driver.dimension = 65537\nconfig.driver.steps = 100000\nconfig.driver.epoches = 100\nconfig.driver.schedule = 8\nconfig.driver.rate = 1e-5\nconfig.driver.interval = 5\nconfig.driver.location = 'models/dianping/temporal12length512feature256'\nconfig.driver.plot = true\nconfig.driver.visualize = true\nconfig.driver.debug = false\nconfig.driver.resume = false\n\n-- Main configuration\nconfig.joe = {}\n\nreturn config\n"
  },
  {
    "path": "embednet/data.lua",
    "content": "--[[\nData class for Embedding Net\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal torch = require('torch')\n\nlocal parent = require('glyphnet/data')\n\nlocal Data = class(parent)\n\n-- Constructor for Data\n-- config: configuration table\n--   .file: file for data\n--   .batch: batch of data\n--   .replace: the code used for replacing padding space\nfunction Data:_init(config)\n   self.data = torch.load(config.file)\n   self.length = config.length or 512\n   self.batch = config.batch or 16\n   self.replace = config.replace or 65537\n   self.shift = config.shift or 0\nend\n\nfunction Data:initSample(sample, label)\n   local sample = sample or torch.Tensor(self.batch, self.length)\n   local label = label or torch.Tensor(self.batch)\n   sample:fill(self.replace)\n   return sample, label\nend\n\nfunction Data:index(sample, class, item)\n   local code, code_value = self.data.code, self.data.code_value\n   local position = 1\n\n   for field = 1, code[class][item]:size(1) do\n      -- Break if current position is larger than sample length\n      if position > sample:size(1) then\n         break\n      end\n      -- Determine the actual length\n      local length = code[class][item][field][2]\n      if position + length - 1 > sample:size(1) then\n         length = sample:size(1) - position + 1\n      end\n      -- Copy the data over\n      if length > 0 then\n         sample:narrow(1, position, length):copy(\n            code_value:narrow(1, code[class][item][field][1], length)):add(\n            self.shift)\n      end\n      -- Increment the position value\n      position = position + length\n   end\n\n   return sample\nend\n\nreturn Data\n"
  },
  {
    "path": "embednet/driver.lua",
    "content": "--[[\nDriver for EmbedNet training\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\n\nlocal parent = require('glyphnet/driver')\nlocal Driver = class(parent)\n\n-- Initialize variation\nfunction Driver:initVariation()\n   print('Driver using model variation '..self.variation)\n   self.options.model.embedding =\n      self.options.variation[self.variation].embedding\n   self.options.model.temporal = self.options.variation[self.variation].temporal\n\n   print('Driver adjusting data length to '..\n            self.options.variation[self.variation].length)\n   self.options.train_data.length =\n      self.options.variation[self.variation].length\n   self.options.test_data.length =\n      self.options.variation[self.variation].length\n\n   self.dimension = self.options.driver.dimension\n   print('Driver adjusting data index dimension to '..self.dimension)\n   self.options.model.embedding[1].nIndex = self.dimension\n   self.options.model.embedding[1].paddingValue =\n      self.options.train_data.replace\nend\n\n-- Visualize the model\nfunction Driver:visualizeModel()\n   local Visualizer = require('visualizer')\n   self.options.visualizer.title = 'Embedding model'\n   self.embedding_visualizer = self.embedding_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = 'Temporal model'\n   self.temporal_visualizer = self.temporal_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = nil\n\n   self.embedding_visualizer:drawSequential(self.model.embedding)\n   self.temporal_visualizer:drawSequential(self.model.temporal)\nend\n\nreturn Driver\n"
  },
  {
    "path": "embednet/model.lua",
    "content": "--[[\nModel for EmbedNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal nn = require('nn')\n\nlocal parent = require('glyphnet/model')\n\nlocal Model = class(parent)\n\n-- Model constructor\n-- config: configuration table\n--   .embedding: configuration table of the embedding model\n--   .temporal: configuration table of the temporal model\n--   .file: the model file to load\n--   .pretrain: whether to keep the embedding pretrained\n--   .embedding_file: the file for pretrained embedding model\n--   .cudnn: whether to use NVidia CUDNN\nfunction Model:_init(config)\n   -- Read or create model\n   if config.file then\n      local model = torch.load(config.file)\n      self.embedding = self:makeCleanSequential(model.embedding)\n      self.temporal = self:makeCleanSequential(model.temporal)\n   else\n      if config.embedding_file then\n         self.embedding = self:makeCleanSequential(\n            torch.load(config.embedding_file))\n      else\n         self.embedding = self:createCleanSequential(config.embedding)\n         self:initSequential(self.embedding)\n      end\n      self.temporal = self:createCleanSequential(config.temporal)\n      self:initSequential(self.temporal)\n   end\n\n   -- Saving configurations\n   self.pretrain = config.pretrain\n   self.cudnn = config.cudnn\n   self.config = config\n   self.tensortype = torch.getdefaulttensortype()\nend\n\nfunction Model:forward(input)\n   self.feature = self.embedding:forward(input)\n   self.output = self.temporal:forward(self.feature)\n   return self.output\nend\n\nfunction Model:backward(input, grad_output)\n   self.grad_feature = self.temporal:backward(self.feature, grad_output)\n   if self.pretrain then\n      return self.grad_feature\n   else\n      self.grad_input = self.embedding:backward(input, self.grad_feature)\n      return self.grad_input\n   end\nend\n\nfunction Model:getParameters()\n   return nn.Module.getParameters(self)\nend\n\nfunction 
Model:parameters()\n   local parameters, gradients = {}, {}\n\n   if not self.pretrain then\n      local embedding_parameters, embedding_gradients =\n         self.embedding:parameters()\n      for i = 1, #embedding_parameters do\n         parameters[#parameters + 1] = embedding_parameters[i]\n         gradients[#gradients + 1] = embedding_gradients[i]\n      end\n   end\n\n   local temporal_parameters, temporal_gradients = self.temporal:parameters()\n   for i = 1, #temporal_parameters do\n      parameters[#parameters + 1] = temporal_parameters[i]\n      gradients[#gradients + 1] = temporal_gradients[i]\n   end\n\n   return parameters, gradients\nend\n\nfunction Model:type(tensortype)\n   if tensortype ~= nil and tensortype ~= self.tensortype then\n      if tensortype == 'torch.CudaTensor' then\n         require('cunn')\n         self.embedding = self:makeCudaSequential(self.embedding)\n         self.temporal = self:makeCudaSequential(self.temporal)\n      else\n         self.embedding = self:makeCleanSequential(self.embedding)\n         self.temporal = self:makeCleanSequential(self.temporal)\n      end\n      self.embedding:type(tensortype)\n      self.temporal:type(tensortype)\n      self.tensortype = tensortype\n   end\n\n   return self.tensortype\nend\n\nfunction Model:setMode(mode)\n   self:setModeSequential(self.embedding, mode)\n   self:setModeSequential(self.temporal, mode)\nend\n\n\nfunction Model:save(file)\n   local embedding = self:clearSequential(\n      self:makeCleanSequential(self.embedding))\n   local temporal = self:clearSequential(\n      self:makeCleanSequential(self.temporal))\n   torch.save(file, {embedding = embedding, temporal = temporal})\nend\n\nModel.initModule['nn.LookupTable'] = function (self, m)\n   m.weight:normal(0, math.sqrt(1 / m.weight:size(2)))\n   if m.paddingValue > 0 then\n      m.weight[m.paddingValue]:zero()\n   end\nend\nModel.initModule['nn.Transpose'] = function (self, m) 
end\n\nModel.setModeModule['train']['nn.LookupTable'] = function (self, m) end\nModel.setModeModule['train']['nn.Transpose'] = function (self, m) end\n\nModel.setModeModule['test']['nn.LookupTable'] = function(self, m) end\nModel.setModeModule['test']['nn.Transpose'] = function(self, m) end\n\nModel.createCleanModule['nn.LookupTable'] = function (self, m)\n   return nn.LookupTable(m.nIndex, m.nOutput, m.paddingValue)\nend\nModel.createCleanModule['nn.Transpose'] = function (self, m)\n   return nn.Transpose(unpack(m.permutations))\nend\n\nModel.makeCleanModule['nn.LookupTable'] = function(self, m)\n   local new = nn.LookupTable(\n      m.weight:size(1), m.weight:size(2), m.paddingValue)\n   new.weight:copy(m.weight)\n   return new\nend\nModel.makeCleanModule['nn.Transpose'] = function (self, m)\n   return nn.Transpose(unpack(m.permutations))\nend\n\nModel.makeCudaModule['nn.LookupTable'] = function (self, m)\n   local new = nn.LookupTable(\n      m.weight:size(1), m.weight:size(2), m.paddingValue)\n   new.weight:copy(m.weight)\n   return new\nend\nModel.makeCudaModule['nn.Transpose'] = function (self, m)\n   return nn.Transpose(unpack(m.permutations))\nend\n\nreturn Model\n"
  },
  {
    "path": "embednet/unittest/data.lua",
    "content": "--[[\nUnit test for EmbedNet data component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Data = require('data')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.length = 512\n   config.test_data.length = 512\n\n   print('Creating testing data object')\n   local data = Data(config.test_data)\n\n   self.config = config\n   self.data = data\nend\n\nfunction joe:printSample(sample, label, count)\n   local count = count or sample:size(1)\n   for i = 1, count do\n      io.write(label[i], ':')\n      for j = 1, sample:size(2) do\n         io.write(' ', sample[i][j])\n      end\n      io.write('\\n')\n   end\n   io.flush()\nend\n\nfunction joe:getBatchTest()\n   local data = self.data\n   print('Getting a batch')\n   local sample, label = data:getBatch()\n   self:printSample(sample, label)\n   print('Getting a second batch')\n   sample, label = data:getBatch(sample, label)\n   self:printSample(sample, label)\nend\n\nfunction joe:iteratorTest()\n   local data = self.data\n   for sample, label, count in data:iterator() do\n      io.write(count, ':')\n      for i = 1, count do\n         io.write(' ', label[i])\n      end\n      io.write('\\n')\n      io.flush()\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/driver.lua",
    "content": "--[[\nUnit test for EmbedNet driver component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Driver = require('driver')\n\n--  A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Creating driver')\n   config.train_data.file = 'data/dianping/unittest_code.t7b'\n   config.test_data.file = 'data/dianping/unittest_code.t7b'\n   config.driver.debug = true\n   config.driver.device = 3\n   config.driver.steps = 10\n   config.driver.epoches = 5\n   local driver = Driver(config, config.driver)\n\n   self.config = config\n   self.driver = driver\nend\n\nfunction joe:driverTest()\n   local driver = self.driver\n   print('Testing driver')\n   driver:run()\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/model.lua",
    "content": "--[[\nUnit Test for EmbedNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n\n   local model = Model(config.model)\n   print('Embedding model:')\n   print(model.embedding)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(2, 512):mul(65537):ceil()\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size())\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to 
test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Embedding model')\n   print(loaded.embedding)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/model_cudnn.lua",
    "content": "--[[\nUnit Test for EmbedNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   config.model.cudnn = true\n\n   local model = Model(config.model)\n   model:cuda()\n   print('Embedding model:')\n   print(model.embedding)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(16, 512):mul(65537):ceil():cuda()\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   cutorch.synchronize()\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size()):cuda()\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   cutorch.synchronize()\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in 
ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Embedding model')\n   print(loaded.embedding)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/model_cunn.lua",
    "content": "--[[\nUnit Test for EmbedNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   config.model.cudnn = nil\n\n   local model = Model(config.model)\n   model:cuda()\n   print('Embedding model:')\n   print(model.embedding)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(16, 512):mul(65537):ceil():cuda()\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   cutorch.synchronize()\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size()):cuda()\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   cutorch.synchronize()\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in 
ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Embedding model')\n   print(loaded.embedding)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/test.lua",
    "content": "--[[\nUnit test for EmbedNet test component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/test_cuda.lua",
    "content": "--[[\nUnit test for EmbedNet test component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/train.lua",
    "content": "--[[\nUnit test for EmbedNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create trainer')\n   for i, v in pairs(config.train.rates) do\n      config.train.rates[i] = v * config.driver.rate\n   end\n   local train = Train(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 10 steps')\n   train:run(100, callback)\nend\n\nfunction joe:callback()\n   self.time = os.time()\n   return function (train, i)\n      if os.difftime(os.time(), self.time) >= 5 then\n         print('stp: '..train.step..', rat: '..train.rate..\n                  ', err: '..train.error..', obj: '..train.objective..\n                  ', dat: '..train.time.data..', fwd: '..train.time.forward..\n                  ', bwd: 
'..train.time.backward..', upd: '..train.time.update)\n         self.time = os.time()\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/unittest/train_cuda.lua",
    "content": "--[[\nUnit test for EmbedNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.embedding = config.variation['large'].embedding\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create trainer')\n   for i, v in pairs(config.train.rates) do\n      config.train.rates[i] = v * config.driver.rate\n   end\n   local train = Train(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 100000 steps')\n   train:run(100000, callback)\nend\n\nfunction joe:callback()\n   self.time = os.time()\n   return function (train, i)\n      if os.difftime(os.time(), self.time) >= 5 then\n         print('stp: '..train.step..', rat: '..train.rate..\n                  ', err: '..train.error..', obj: 
'..train.objective..\n                  ', dat: '..train.time.data..', fwd: '..train.time.forward..\n                  ', bwd: '..train.time.backward..', upd: '..train.time.update)\n         self.time = os.time()\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "embednet/visualizer.lua",
    "content": "--[[\nVisualization module for EmbedNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\n\nlocal parent = require('glyphnet/visualizer')\nlocal Visualizer = class(parent)\n\nVisualizer.drawModule['nn.LookupTable'] = Visualizer.drawModule['nn.Linear']\n\nreturn Visualizer\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charbigram;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charbigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charbigram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charpentagram;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charpentagram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charpentagram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charunigram;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charunigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/charunigram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigram;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigramroman;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordbigramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagram;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagramroman;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordpentagramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigram;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigram_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigramroman;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stbinary_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stbinary/wordunigramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/binary_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/binary_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charbigram;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charbigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charbigram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charpentagram;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charpentagram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charpentagram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charunigram;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charunigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/charunigram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigram;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigramroman;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordbigramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagram;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagramroman;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordpentagramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigram;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigram_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigram_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigramroman;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigramroman_evaluation;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/11stfull_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/11stfull/wordunigramroman_tuned;\nTRAIN_DATA=data/11st/sentiment/full_train_rr_wordtoken_shuffle.txt;\nTEST_DATA=data/11st/sentiment/full_test_rr_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charbigram;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charbigram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charbigram_tuned;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charpentagram;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charpentagram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charpentagram_tuned;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charunigram;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charunigram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/charunigram_tuned;\nTRAIN_DATA=data/amazon/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordbigram;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordbigram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordbigram_tuned;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordpentagram;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordpentagram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordpentagram_tuned;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordunigram;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordunigram_evaluation;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonbinary_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonbinary/wordunigram_tuned;\nTRAIN_DATA=data/amazon/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charbigram;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charbigram_evaluation;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charbigram_tuned;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charpentagram;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charpentagram_evaluation;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charpentagram_tuned;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charunigram;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charunigram_evaluation;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/charunigram_tuned;\nTRAIN_DATA=data/amazon/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordbigram;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordbigram_evaluation;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordbigram_tuned;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordpentagram;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordpentagram_evaluation;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordpentagram_tuned;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordunigram;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordunigram_evaluation;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/amazon/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/amazonfull_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/amazonfull/wordunigram_tuned;\nTRAIN_DATA=data/amazon/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/amazon/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charbigram;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charbigram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charbigram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charpentagram;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charpentagram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charpentagram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charunigram;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charunigram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/charunigram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigram;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigramroman;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigramroman_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordbigramroman_tuned;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagram;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagramroman;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagramroman_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordpentagramroman_tuned;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigram;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigram_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigram_tuned;\nTRAIN_DATA=data/chinanews/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigramroman;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigramroman_evaluation;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/chinanews_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/chinanews/wordunigramroman_tuned;\nTRAIN_DATA=data/chinanews/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/chinanews/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charbigram;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charbigram_evaluation;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charbigram_tuned;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charpentagram;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charpentagram_evaluation;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charpentagram_tuned;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charunigram;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charunigram_evaluation;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/charunigram_tuned;\nTRAIN_DATA=data/dianping/train_chartoken_shuffle.txt;\nTEST_DATA=data/dianping/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigram;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigram_evaluation;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigram_tuned;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigramroman;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigramroman_evaluation;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordbigramroman_tuned;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagram;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagram_evaluation;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagram_tuned;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagramroman;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagramroman_evaluation;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordpentagramroman_tuned;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigram;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigram_evaluation;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigram_tuned;\nTRAIN_DATA=data/dianping/train_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigramroman;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigramroman_evaluation;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/dianping/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/dianping_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/dianping/wordunigramroman_tuned;\nTRAIN_DATA=data/dianping/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/dianping/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charbigram;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charbigram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charbigram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charpentagram;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charpentagram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charpentagram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charunigram;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charunigram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/charunigram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigram;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigramroman;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigramroman_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordbigramroman_tuned;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 2 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagram;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagramroman;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagramroman_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordpentagramroman_tuned;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 2 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigram;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigram_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigram_tuned;\nTRAIN_DATA=data/ifeng/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigramroman;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigramroman_evaluation;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/ifeng_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/ifeng/wordunigramroman_tuned;\nTRAIN_DATA=data/ifeng/topic/train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/ifeng/topic/test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charbigram;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charbigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charbigram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charpentagram;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charpentagram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charpentagram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charunigram;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charunigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/charunigram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigram;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigramroman;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordbigramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagram;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagramroman;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordpentagramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigram;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigram_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigramroman;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdbinary_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdbinary/wordunigramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/binary_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/binary_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charbigram;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charbigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charbigram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charpentagram;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charpentagram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charpentagram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charunigram;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charunigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/charunigram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigram;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigramroman;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordbigramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagram;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagramroman;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordpentagramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigram;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigram_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigram_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigramroman;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigramroman_evaluation;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jdfull_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jdfull/wordunigramroman_tuned;\nTRAIN_DATA=data/jd/sentiment/full_train_pinyin_wordtoken_shuffle.txt;\nTEST_DATA=data/jd/sentiment/full_test_pinyin_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charbigram;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charbigram_evaluation;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charbigram_tuned;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charpentagram;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charpentagram_evaluation;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charpentagram_tuned;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charunigram;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charunigram_evaluation;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/charunigram_tuned;\nTRAIN_DATA=data/joint/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigram;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigram_evaluation;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigram_tuned;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigramroman;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigramroman_evaluation;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordbigramroman_tuned;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagram;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagram_evaluation;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagram_tuned;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagramroman;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagramroman_evaluation;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordpentagramroman_tuned;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigram;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigram_evaluation;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigram_tuned;\nTRAIN_DATA=data/joint/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigramroman;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigramroman_evaluation;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/binary_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointbinary_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointbinary/wordunigramroman_tuned;\nTRAIN_DATA=data/joint/binary_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/binary_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charbigram;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charbigram_evaluation;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charbigram_tuned;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charpentagram;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charpentagram_evaluation;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charpentagram_tuned;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charunigram;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charunigram_evaluation;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/charunigram_tuned;\nTRAIN_DATA=data/joint/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigram;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigram_evaluation;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigram_tuned;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigramroman;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigramroman_evaluation;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordbigramroman_tuned;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagram;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagram_evaluation;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagram_tuned;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagramroman;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagramroman_evaluation;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordpentagramroman_tuned;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigram;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigram_evaluation;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigram_tuned;\nTRAIN_DATA=data/joint/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigramroman;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigramroman_evaluation;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/joint/full_train_roman_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/jointfull_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/jointfull/wordunigramroman_tuned;\nTRAIN_DATA=data/joint/full_train_roman_wordtoken_shuffle.txt;\nTEST_DATA=data/joint/full_test_roman_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charbigram;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charbigram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charbigram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charpentagram;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charpentagram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charpentagram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charunigram;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charunigram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/charunigram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_chartoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordbigram;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordbigram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordbigram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordpentagram;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordpentagram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordpentagram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordunigram;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordunigram_evaluation;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/nytimes/topic/train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/nytimes_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/nytimes/wordunigram_tuned;\nTRAIN_DATA=data/nytimes/topic/train_wordtoken_shuffle.txt;\nTEST_DATA=data/nytimes/topic/test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charbigram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charbigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charbigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charpentagram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charpentagram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charpentagram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charunigram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charunigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/charunigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigramroman;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordbigramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagramroman;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordpentagramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigram;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigramroman;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenbinary_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenbinary/wordunigramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/binary_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/binary_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charbigram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charbigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charbigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charpentagram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charpentagram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charpentagram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charunigram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charunigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_charunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/charunigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_chartoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_chartoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigramroman;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordbigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordbigramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 2 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagramroman;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordpentagramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordpentagramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 5 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigram;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigram_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigram_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigram_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigram_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigramroman;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigramroman_evaluation.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigramroman_evaluation;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_0.txt;\nTEST_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle_split_1.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_2 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 2 -thread 10;\nfasttext test $LOCATION/model_2.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_5 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 5 -thread 10;\nfasttext test $LOCATION/model_5.bin $TEST_DATA;\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model_10 -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model_10.bin $TEST_DATA;\n"
  },
  {
    "path": "fasttext/archive/rakutenfull_wordunigramroman_tuned.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2017 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nLOCATION=models/rakutenfull/wordunigramroman_tuned;\nTRAIN_DATA=data/rakuten/sentiment/full_train_hepburn_wordtoken_shuffle.txt;\nTEST_DATA=data/rakuten/sentiment/full_test_hepburn_wordtoken_shuffle.txt;\n\nfasttext supervised -input $TRAIN_DATA -output $LOCATION/model -dim 10 -lr 0.1 -wordNgrams 1 -minCount 1 -bucket 10000000 -epoch 10 -thread 10;\nfasttext test $LOCATION/model.bin $TRAIN_DATA;\nfasttext test $LOCATION/model.bin $TEST_DATA;\n"
  },
  {
    "path": "glyphnet/archive/11stbinary_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/11stbinary/spatial6temporal8length486feature256 -train_data_file data/11st/sentiment/binary_train_code.t7b -test_data_file data/11st/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/11stbinary_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stbinary/spatial8temporal12length512feature256 -train_data_file data/11st/sentiment/binary_train_code.t7b -test_data_file data/11st/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/11stfull_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/11stfull/spatial6temporal8length486feature256 -train_data_file data/11st/sentiment/full_train_code.t7b -test_data_file data/11st/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/11stfull_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/11stfull/spatial8temporal12length512feature256 -train_data_file data/11st/sentiment/full_train_code.t7b -test_data_file data/11st/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/amazonbinary_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/amazonbinary/spatial6temporal8length486feature256 -train_data_file data/amazon/binary_train_code.t7b -test_data_file data/amazon/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/amazonbinary_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonbinary/spatial8temporal12length512feature256 -train_data_file data/amazon/binary_train_code.t7b -test_data_file data/amazon/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/amazonfull_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/amazonfull/spatial6temporal8length486feature256 -train_data_file data/amazon/full_train_code.t7b -test_data_file data/amazon/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/amazonfull_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/amazonfull/spatial8temporal12length512feature256 -train_data_file data/amazon/full_train_code.t7b -test_data_file data/amazon/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/chinanews_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/chinanews/spatial6temporal8length486feature256 -train_data_file data/chinanews/topic/train_code.t7b -test_data_file data/chinanews/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/chinanews_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/chinanews/spatial8temporal12length512feature256 -train_data_file data/chinanews/topic/train_code.t7b -test_data_file data/chinanews/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/dianping_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/spatial6temporal8length486feature256 \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/dianping_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/ifeng_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/ifeng/spatial6temporal8length486feature256 -train_data_file data/ifeng/topic/train_code.t7b -test_data_file data/ifeng/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/ifeng_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/ifeng/spatial8temporal12length512feature256 -train_data_file data/ifeng/topic/train_code.t7b -test_data_file data/ifeng/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jdbinary_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jdbinary/spatial6temporal8length486feature256 -train_data_file data/jd/sentiment/binary_train_code.t7b -test_data_file data/jd/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jdbinary_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdbinary/spatial8temporal12length512feature256 -train_data_file data/jd/sentiment/binary_train_code.t7b -test_data_file data/jd/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jdfull_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jdfull/spatial6temporal8length486feature256 -train_data_file data/jd/sentiment/full_train_code.t7b -test_data_file data/jd/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jdfull_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jdfull/spatial8temporal12length512feature256 -train_data_file data/jd/sentiment/full_train_code.t7b -test_data_file data/jd/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jointbinary_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jointbinary/spatial6temporal8length486feature256 -train_data_file data/joint/binary_train_code.t7b -test_data_file data/joint/binary_test_code.t7b -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jointbinary_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointbinary/spatial8temporal12length512feature256 -train_data_file data/joint/binary_train_code.t7b -test_data_file data/joint/binary_test_code.t7b -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jointfull_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/jointfull/spatial6temporal8length486feature256 -train_data_file data/joint/full_train_code.t7b -test_data_file data/joint/full_test_code.t7b -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/jointfull_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/jointfull/spatial8temporal12length512feature256 -train_data_file data/joint/full_train_code.t7b -test_data_file data/joint/full_test_code.t7b -driver_steps 400000 \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/nytimes_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/nytimes/spatial6temporal8length486feature256 -train_data_file data/nytimes/topic/train_code.t7b -test_data_file data/nytimes/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/nytimes_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/nytimes/spatial8temporal12length512feature256 -train_data_file data/nytimes/topic/train_code.t7b -test_data_file data/nytimes/topic/test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/rakutenbinary_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/rakutenbinary/spatial6temporal8length486feature256 -train_data_file data/rakuten/sentiment/binary_train_code.t7b -test_data_file data/rakuten/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/rakutenbinary_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenbinary/spatial8temporal12length512feature256 -train_data_file data/rakuten/sentiment/binary_train_code.t7b -test_data_file data/rakuten/sentiment/binary_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/rakutenfull_spatial6temporal8length486feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/rakutenfull/spatial6temporal8length486feature256 -train_data_file data/rakuten/sentiment/full_train_code.t7b -test_data_file data/rakuten/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/archive/rakutenfull_spatial8temporal12length512feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/rakutenfull/spatial8temporal12length512feature256 -train_data_file data/rakuten/sentiment/full_train_code.t7b -test_data_file data/rakuten/sentiment/full_test_code.t7b \"$@\";\n"
  },
  {
    "path": "glyphnet/config.lua",
    "content": "--[[\nConfiguration for GlyphNet\nCopyright Xiang Zhang 2015-2016\n--]]\n\n-- Name space\nlocal config = {}\n\n-- Training data configurations\nconfig.train_data = {}\nconfig.train_data.file = 'data/dianping/train_code.t7b'\nconfig.train_data.unifont = 'unifont/unifont-8.0.01.t7b'\nconfig.train_data.batch = 16\n\n-- Testing data configurations\nconfig.test_data = {}\nconfig.test_data.file = 'data/dianping/test_code.t7b'\nconfig.test_data.unifont = 'unifont/unifont-8.0.01.t7b'\nconfig.test_data.batch = 16\n\n-- Model configurations\nconfig.model = {}\nconfig.model.cudnn = true\nconfig.model.group = 16\n\n-- Model variations configuration\nconfig.variation = {}\n\n-- Large network configuration\nlocal spatial = {}\nspatial[1] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 1, nOutputPlane = 64,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[3] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 64, nOutputPlane = 64,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[5] = {name = 'nn.SpatialMaxPooling',\n              kW = 2, kH = 2, dW = 2, dH = 2, padW = 0, padH = 0}\nspatial[6] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 64, nOutputPlane = 128,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[8] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 128, nOutputPlane = 128,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[10] = {name = 'nn.SpatialMaxPooling',\n               kW = 2, kH = 2, dW = 2, dH = 2, padW = 0, padH = 0}\nspatial[11] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 128, nOutputPlane = 
256,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[13] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 256, nOutputPlane = 256,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[15] = {name = 'nn.SpatialMaxPooling',\n              kW = 2, kH = 2, dW = 2, dH = 2, padW = 0, padH = 0}\nspatial[16] = {name = 'nn.Reshape', size = 1024, batchMode = true}\nspatial[17] = {name = 'nn.Linear', inputSize = 1024, outputSize = 1024}\nspatial[18] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[19] = {name = 'nn.Linear', inputSize = 1024, outputSize = 256}\nspatial[20] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th 
= 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[16] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[17] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[18] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[19] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[20] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[21] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[22] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[23] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[24] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[25] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[26] = {name = 'nn.Reshape', size = 4096, batchMode = true}\ntemporal[27] = {name = 'nn.Linear', inputSize = 4096, outputSize = 1024}\ntemporal[28] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[29] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[30] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[31] = {name = 'nn.LogSoftMax'}\nconfig.variation['large'] =\n   {spatial = spatial, temporal = temporal, length = 512}\n\n-- Small network configuration\nlocal spatial = {}\nspatial[1] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 1, nOutputPlane = 64,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 2, padH = 2}\nspatial[2] = {name = 
'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[3] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 64, nOutputPlane = 64,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[5] = {name = 'nn.SpatialMaxPooling',\n              kW = 3, kH = 3, dW = 3, dH = 3, padW = 0, padH = 0}\nspatial[6] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 64, nOutputPlane = 128,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[8] = {name = 'nn.SpatialConvolution',\n              nInputPlane = 128, nOutputPlane = 128,\n              kW = 3, kH = 3, dW = 1, dH = 1, padW = 1, padH = 1}\nspatial[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[10] = {name = 'nn.SpatialMaxPooling',\n               kW = 3, kH = 3, dW = 3, dH = 3, padW = 0, padH = 0}\nspatial[11] = {name = 'nn.Reshape', size = 512, batchMode = true}\nspatial[12] = {name = 'nn.Linear', inputSize = 512, outputSize = 256}\nspatial[13] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nspatial[14] = {name = 'nn.Linear', inputSize = 256, outputSize = 256}\nspatial[15] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 
1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[16] = {name = 'nn.Reshape', size = 4608, batchMode = true}\ntemporal[17] = {name = 'nn.Linear', inputSize = 4608, outputSize = 1024}\ntemporal[18] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[19] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[20] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[21] = {name = 'nn.LogSoftMax'}\nconfig.variation['small'] =\n   {spatial = spatial, temporal = temporal, length = 486}\n\n-- Trainer settings\nconfig.train = {}\nconfig.train.momentum = 0.9\nconfig.train.decay = 1e-5\n-- These are just multipliers to config.driver.rate\n-- For every config.driver.schedule * config.driver.steps\nconfig.train.rates =\n   {1/1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, 1/256, 1/512, 1/1024}\n\n-- Tester settings\nconfig.test = {}\n\n-- Visualizer settings\nconfig.visualizer = {}\nconfig.visualizer.width = 1200\nconfig.visualizer.scale = 4\nconfig.visualizer.height = 64\n\n-- Driver configurations\nconfig.driver = {}\nconfig.driver.type = 'torch.CudaTensor'\nconfig.driver.device = 1\nconfig.driver.loss = 'nn.ClassNLLCriterion'\nconfig.driver.variation = 'large'\nconfig.driver.steps = 
100000\nconfig.driver.epoches = 100\nconfig.driver.schedule = 8\nconfig.driver.rate = 1e-5\nconfig.driver.interval = 5\nconfig.driver.location = 'models/dianping/spatial8temporal12length512feature256'\nconfig.driver.plot = true\nconfig.driver.visualize = true\nconfig.driver.debug = false\nconfig.driver.resume = false\n\n-- Main configuration\nconfig.joe = {}\n\nreturn config\n"
  },
  {
    "path": "glyphnet/data.lua",
    "content": "--[[\nData program for GlyphNet\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal torch = require('torch')\n\nlocal Data = class()\n\n-- Constructor for Data\n-- config: configuration table\n--   .file: the data file location\n--   .unifont: the unifont data location\n--   .length: the text length in the data\n--   .batch: the batch size\nfunction Data:_init(config)\n   self.data = torch.load(config.file)\n   self.unifont = torch.load(config.unifont or 'unifont/unifont-8.0.01.t7b')\n   self.length = config.length or 512\n   self.batch = config.batch or 16\nend\n\nfunction Data:getClasses()\n   return #self.data.code\nend\n\nfunction Data:getBatch(sample, label)\n   local code, code_value = self.data.code, self.data.code_value\n   local sample, label = self:initSample(sample, label)\n\n   -- Loop over batch dimension\n   for i = 1, sample:size(1) do\n      local class = torch.random(#code)\n      local item = torch.random(code[class]:size(1))\n\n      -- Assign sample\n      self:index(sample[i], class, item)\n      -- Assign label\n      label[i] = class\n   end\n\n   return sample, label\nend\n\nfunction Data:iterator(sample, label)\n   local code, code_value = self.data.code, self.data.code_value\n   local sample, label = self:initSample(sample, label)\n\n   local class = 1\n   local item = 1\n   local count = 0\n\n   return function()\n      if code[class] == nil then return end\n\n      sample, label = self:initSample(sample, label)\n      count = 0\n      for i = 1, sample:size(1) do\n         if item > code[class]:size(1) then\n            class = class + 1\n            item = 1\n            if code[class] == nil then\n               if count > 0 then\n                  break\n               else\n                  return\n               end\n            end\n         end\n         self:index(sample[i], class, item)\n         label[i] = class\n         count = count + 1\n         
item = item + 1\n      end\n\n      return sample, label, count\n   end\nend\n\nfunction Data:initSample(sample, label)\n   local height, width = self.unifont:size(3), self.unifont:size(2)\n   local sample = sample or\n      torch.Tensor(self.batch, self.length, height, width)\n   local label = label or torch.Tensor(self.batch)\n   sample:zero()\n   return sample, label\nend\n\nfunction Data:index(sample, class, item)\n   local code, code_value = self.data.code, self.data.code_value\n   local position = 1\n   for field = 1, code[class][item]:size(1) do\n      -- Break if current position is larger than sample length\n      if position > sample:size(1) then\n         break\n      end\n      -- Determine the actual length\n      local length = code[class][item][field][2]\n      if position + length - 1 > sample:size(1) then\n         length = sample:size(1) - position + 1\n      end\n      -- Copy the data over\n      sample:narrow(1, position, length):index(\n         self.unifont, 1, code_value:narrow(\n            1, code[class][item][field][1], length))\n      position = position + length\n   end\n\n   return sample\nend\n\nreturn Data\n"
  },
  {
    "path": "glyphnet/driver.lua",
    "content": "--[[\nDriver for GlyphNet training\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal nn = require('nn')\nlocal os = require('os')\nlocal paths = require('paths')\nlocal torch = require('torch')\n\nlocal Data = require('data')\nlocal Model = require('model')\nlocal Train = require('train')\nlocal Test = require('test')\n\nlocal Driver = class()\n\n-- Constructor for driver\n-- options: configuration table for other classes\n-- config: configuration table for driver\n--   .type: tensor type to do computation\n--   .device: device id for CUDA. Only valid for .type = 'torch.CudaTensor'\n--   .loss: the loss class to be used\n--   .variation: the variation of the model\n--   .steps: number of steps for each epoch\n--   .epoches: number of epoches\n--   .rate: initial learning rate\n--   .schedule: rate change schedule\n--   .interval: print time interval\n--   .location: save location\n--   .plot: whether to plot the result\n--   .visualize: whether to visualize the models\n--   .debug: whether to do debugging\n--   .resume: whether to do resumption\nfunction Driver:_init(options, config)\n   local config = config or {}\n   self.type = config.type or 'torch.DoubleTensor'\n   self.device = config.device or 1\n   self.loss = config.loss or 'nn.ClassNLLCriterion'\n   self.variation = config.variation or 'large'\n   self.steps = config.steps or 100000\n   self.epoches = config.epoches or 100\n   self.rate = config.rate or 1e-3\n   self.schedule = config.schedule or 8\n   self.interval = config.interval or 5\n   self.location = config.location or '.'\n   self.plot = config.plot\n   self.visualize = config.visualize\n   self.debug = config.debug\n   self.resume = config.resume\n   self.options = options\n\n   -- Update the rates for training\n   local rates = {}\n   for i, v in pairs(self.options.train.rates) do\n      rates[(i - 1) * self.steps * self.schedule + 1] = v * self.rate\n      
self.options.train.rates = rates\n   end\n\n   -- CUDA settings\n   if self.type == 'torch.CudaTensor' then\n      local cutorch = require('cutorch')\n      print('Driver setting device to '..self.device)\n      cutorch.setDevice(self.device)\n   end\n\n   -- Initialize random seed\n   math.randomseed(os.time())\n   torch.manualSeed(os.time())\n\n   -- Handle model variation\n   self:initVariation()\n\n   -- Load data\n   print('Driver loading training data')\n   self.train_data = Data(self.options.train_data)\n   print('Driver loading testing data')\n   self.test_data = Data(self.options.test_data)\n\n   -- Handle final output number of classes. Assuming last module is nn.Linear.\n   local num_class = self.train_data:getClasses()\n   for i = #self.options.model.temporal, 1, -1 do\n      if self.options.model.temporal[i].name == 'nn.Linear' then\n         print('Driver adjusting number of classes in model to '..num_class)\n         self.options.model.temporal[i].outputSize = num_class\n         break\n      end\n   end\n\n   -- Handle resumption\n   if self.resume then\n      local record_file = paths.concat(self.location, 'record.t7b')\n      print('Driver loading resumption from '..record_file)\n      self.record = torch.load(record_file)\n\n      local model_file = paths.concat(\n         self.location, 'model_'..#self.record..'.t7b')\n      print('Driver loading model from '..model_file)\n      self.options.model.file = model_file\n      self.model = Model(self.options.model)\n\n      local state_file = paths.concat(\n         self.location, 'state_'..#self.record..'.t7b')\n      print('Driver loading training state from '..state_file)\n      self.options.train.state = torch.load(state_file)\n \n      print('Driver setting train step to '..(#self.record * self.steps))\n      self.options.train.step = #self.record * self.steps\n\n      for i = 1, #self.record do\n         self:printResult(i)\n      end\n      if self.plot then\n         self:plotRecord()\n      
end\n   else\n      self.record = {}\n      print('Driver loading model')\n      self.model = Model(self.options.model)\n   end\n\n   print('Driver setting model type to '..self.type)\n   self.model:type(self.type)\n   print('Driver loading trainer')\n   self.trainer_loss = nn[self.loss:sub(4)]()\n   self.trainer_loss:type(self.type)\n   self.trainer = Train(\n      self.train_data, self.model, self.trainer_loss, self.options.train)\n   print('Driver loading tester for training data')\n   self.train_tester_loss = nn[self.loss:sub(4)]()\n   self.train_tester_loss:type(self.type)\n   self.train_tester = Test(\n      self.train_data, self.model, self.train_tester_loss, self.options.test)\n   print('Driver loading tester for testing data')\n   self.test_tester_loss = nn[self.loss:sub(4)]()\n   self.test_tester_loss:type(self.type)\n   self.test_tester = Test(\n      self.test_data, self.model, self.test_tester_loss, self.options.test)\n\n   if self.visualize then\n      self:visualizeModel()\n   end\n\n   self.time = os.time()\nend\n\n-- Initialize variation\nfunction Driver:initVariation()\n   print('Driver using model variation '..self.variation)\n   self.options.model.spatial = self.options.variation[self.variation].spatial\n   self.options.model.temporal = self.options.variation[self.variation].temporal\n\n   print('Driver adjusting data length to '..\n            self.options.variation[self.variation].length)\n   self.options.train_data.length =\n      self.options.variation[self.variation].length\n   self.options.test_data.length =\n      self.options.variation[self.variation].length\nend\n\n-- Run the training process\nfunction Driver:run()\n   local begin_epoch = #self.record + 1\n   local end_epoch = #self.record + self.epoches\n   for i = begin_epoch, end_epoch do\n      print('Driver setting model to training mode')\n      self.model:setModeTrain()\n      print('Driver training for epoch '..i)\n      self.trainer:run(\n         self.steps, function(train, 
step) self:logTrain(train, step) end)\n      if self.visualize then\n         self:visualizeModel()\n      end\n\n      print('Driver setting model to testing mode')\n      self.model:setModeTest()\n      print('Driver testing on training data for epoch '..i)\n      self.train_tester:run(function(test, step) self:logTest(test, step) end)\n      print('Driver testing on testing data for epoch '..i)\n\n      self.test_tester:run(function(test, step) self:logTest(test, step) end)\n      print('Driver saving for epoch '..i)\n      self:save()\n      self:printResult()\n      if self.plot then\n         self:plotRecord()\n      end\n   end\nend\n\n-- Save the record and the model\nfunction Driver:save(epoch)\n   local epoch = epoch or #self.record + 1\n\n   -- Make a backup for the record\n   print('Driver backing up record.t7b')\n   local record_file = paths.concat(self.location, 'record.t7b')\n   os.rename(record_file, record_file..'.backup')\n\n   -- Save the new record\n   print('Driver saving new records to '..record_file)\n   self.record[epoch] = {\n      train_loss = self.train_tester.total_objective,\n      test_loss = self.test_tester.total_objective,\n      train_error = self.train_tester.total_error,\n      test_error = self.test_tester.total_error\n   }\n   torch.save(record_file, self.record)\n\n   -- Save the model\n   local model_file = paths.concat(self.location, 'model_'..epoch..'.t7b')\n   print('Driver saving model to '..model_file)\n   self.model:save(model_file)\n\n   -- Save the training state\n   local state_file = paths.concat(self.location, 'state_'..epoch..'.t7b')\n   print('Driver saving training state to '..state_file)\n   torch.save(state_file, self.trainer.state:type(torch.getdefaulttensortype()))\nend\n\n-- Print current result\nfunction Driver:printResult(epoch)\n   local epoch = epoch or #self.record\n   print('Driver epoch = '..epoch..\n            ', train_error = '..self.record[epoch].train_error..\n            ', test_error = 
'..self.record[epoch].test_error..\n            ', train_loss = '..self.record[epoch].train_loss..\n            ', test_loss = '..self.record[epoch].test_loss)\nend\n\n-- Plot the record\nfunction Driver:plotRecord()\n   require('gnuplot')\n   self.error_figure = self.error_figure or gnuplot.figure()\n   self.loss_figure = self.loss_figure or gnuplot.figure()\n\n   local epoch = torch.linspace(1, #self.record, #self.record)\n   local train_error = torch.Tensor(epoch:size())\n   local test_error = torch.Tensor(epoch:size())\n   local train_loss = torch.Tensor(epoch:size())\n   local test_loss = torch.Tensor(epoch:size())\n   for i = 1, #self.record do\n      train_error[i] = self.record[i].train_error\n      test_error[i] = self.record[i].test_error\n      train_loss[i] = self.record[i].train_loss\n      test_loss[i] = self.record[i].test_loss\n   end\n\n   gnuplot.figure(self.error_figure)\n   gnuplot.plot({'Training error', epoch, train_error},\n                {'Testing error', epoch, test_error})\n   gnuplot.title('Training and testing error')\n   gnuplot.figure(self.loss_figure)\n   gnuplot.plot({'Training loss', epoch, train_loss},\n                {'Testing loss', epoch, test_loss})\n   gnuplot.title('Training and testing loss')\nend\n\n-- Visualize the model\nfunction Driver:visualizeModel()\n   local Visualizer = require('visualizer')\n   self.options.visualizer.title = 'Spatial model'\n   self.spatial_visualizer = self.spatial_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = 'Temporal model'\n   self.temporal_visualizer = self.temporal_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = nil\n\n   self.spatial_visualizer:drawSequential(self.model.spatial)\n   self.temporal_visualizer:drawSequential(self.model.temporal)\nend\n\n-- Log training\nfunction Driver:logTrain(train, step)\n   -- If it is not time to log, return\n   if os.difftime(os.time(), self.time) < 
self.interval then return end\n\n   local message = 'Train step = '..train.step..\n      ', rate = '..string.format('%.2e', train.rate)..\n      ', error = '..string.format('%.2e', train.error)..\n      ', loss = '..string.format('%.2e', train.objective)..\n      ', data = '..string.format('%.2e', train.time.data)..\n      ', forward = '..string.format('%.2e', train.time.forward)..\n      ', backward = '..string.format('%.2e', train.time.backward)..\n      ', update = '..string.format('%.2e', train.time.update)\n\n   if self.debug then\n      message = message..\n         ', input = ['..string.format(\"%.2e\",train.input:min())..\n         ' '..string.format(\"%.2e\",train.input:max())..\n         ' '..string.format(\"%.2e\",train.input:mean())..\n         ' '..string.format(\"%.2e\",train.input:std())..']'..\n         ', params = ['..string.format(\"%.2e\",train.params:min())..\n         ' '..string.format(\"%.2e\",train.params:max())..\n         ' '..string.format(\"%.2e\",train.params:mean())..\n         ' '..string.format(\"%.2e\",train.params:std())..']'..\n         ', grads = ['..string.format(\"%.2e\",train.grads:min())..\n         ' '..string.format(\"%.2e\",train.grads:max())..\n         ' '..string.format(\"%.2e\",train.grads:mean())..\n         ' '..string.format(\"%.2e\",train.grads:std())..']'..\n         ', state = ['..string.format(\"%.2e\",train.state:min())..\n         ' '..string.format(\"%.2e\",train.state:max())..\n         ' '..string.format(\"%.2e\",train.state:mean())..\n         ' '..string.format(\"%.2e\",train.state:std())..']'\n\n      if self.visualize then\n         self:visualizeModel()\n      end\n   end\n\n   print(message)\n   self.time = os.time()\nend\n\n-- Log testing\nfunction Driver:logTest(test)\n   -- If it is not time to log, return\n   if os.difftime(os.time(), self.time) < self.interval then return end\n\n   local message = 'Test count = '..test.total_count..\n      ', error = '..string.format('%.2e', test.error)..\n      ', 
loss = '..string.format('%.2e', test.objective)..\n      ', total_error = '..string.format('%.2e', test.total_error)..\n      ', total_loss = '..string.format('%.2e', test.total_objective)..\n      ', data = '..string.format('%.2e', test.time.data)..\n      ', forward = '..string.format('%.2e', test.time.forward)..\n      ', update = '..string.format('%.2e', test.time.update)\n\n   if self.debug then\n      message = message..\n         ', input = ['..string.format(\"%.2e\",test.input:min())..\n         ' '..string.format(\"%.2e\",test.input:max())..\n         ' '..string.format(\"%.2e\",test.input:mean())..\n         ' '..string.format(\"%.2e\",test.input:std())..']'\n   end\n\n   print(message)\n   self.time = os.time()\nend\n\nreturn Driver\n"
  },
  {
    "path": "glyphnet/main.lua",
    "content": "--[[\nMain program for GlyphNet training\nCopyright 2015 Xiang Zhang\n--]]\n\nlocal torch = require('torch')\n\nlocal Driver = require('driver')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main(arg)\n   -- Load the configuration\n   local config = dofile('config.lua')\n   -- Build parameter table based on configuration\n   local params = joe.buildArgumentTable(config)\n   -- Parse arguments based on configuration\n   config = joe.parseArguments(arg, params, config)\n   -- Build the driver\n   local driver = Driver(config, config.driver)\n   -- Start the driver\n   driver:run()\nend\n\nfunction joe.buildArgumentTable(config, params, prefix)\n   local params = params or {}\n   local prefix = prefix or ''\n   for key, val in pairs(config) do\n      if type(key) == 'string' then\n         local val_type = type(val)\n         if val_type == 'string' or val_type == 'number' then\n            params[prefix..key] = val\n         elseif val_type == 'boolean' then\n            params[prefix..key] = tostring(val)\n         elseif val_type == 'table' then\n            params = joe.buildArgumentTable(val, params, prefix..key..'_')\n         else\n            print('Joe argument '..prefix..key..' type unsupported')\n         end\n      else\n         print('Joe argument key '..prefix..tostring(key)..' 
not a string')\n      end\n   end\n   return params\nend\n\nfunction joe.parseArguments(arg, params, config)\n   local cmd = torch.CmdLine()\n   for key, val in pairs(params) do\n      cmd:option('-'..key, val)\n   end\n\n   local parsed = cmd:parse(arg)\n   return joe.parseArgumentTable(config, parsed)\nend\n\nfunction joe.parseArgumentTable(config, params, prefix)\n   local prefix = prefix or ''\n\n   for key, val in pairs(config) do\n      if type(key) == 'string' then\n         local val_type = type(val)\n         if val_type == 'string' or val_type == 'number' then\n            config[key] = params[prefix..key] or val\n         elseif val_type == 'boolean' then\n            if params[prefix..key] == 'true' then\n               config[key] = true\n            elseif params[prefix..key] == 'false' then\n               config[key] = false\n            else\n               error('Argument '..prefix..key..' must be true or false')\n            end\n         elseif val_type == 'table' then\n            config[key] = joe.parseArgumentTable(val, params, prefix..key..'_')\n         end\n      end\n   end\n   return config\nend\n\n-- Call the main program\njoe.main(arg)\n"
  },
  {
    "path": "glyphnet/model.lua",
    "content": "--[[\nModel for GlyphNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal cudnn\nlocal nn = require('nn')\nlocal torch = require('torch')\n\nlocal Modules = require('modules')\n\nlocal Model = class()\n\n-- Model constructor\n-- config: configuration table\n--   .spatial: configuration table of the spatial network\n--   .temporal: configuration table of the temporal network\n--   .file: (optional) the model file\n--   .cudnn: (optional) whether to use NVidia cudnn\n--   .group: (optional) number of spatial network groups\nfunction Model:_init(config)\n   -- Read or create model\n   if config.file then\n      local model = torch.load(config.file)\n      self.spatial = self:makeCleanSequential(model.spatial)\n      self.temporal = self:makeCleanSequential(model.temporal)\n   else\n      self.spatial = self:createCleanSequential(config.spatial)\n      self.temporal = self:createCleanSequential(config.temporal)\n      self:initSequential(self.spatial)\n      self:initSequential(self.temporal)\n   end\n\n   -- Saving configurations\n   self.cudnn = config.cudnn\n   self.config = config\n   self.tensortype = torch.getdefaulttensortype()\n\n   -- Initialize intermediate values\n   self.feature = torch.Tensor()\n   self.feature_cache = torch.Tensor()\n   self.grad_feature = torch.Tensor()\n   self.grad_input = torch.Tensor()\n\n   -- Initialize groups\n   self:initGroup(config.group)\nend\n\nfunction Model:initGroup(group)\n   local group = group or 1\n\n   -- Clean current network group\n   if self.group then\n      self.group = nil\n      collectgarbage()\n   end\n\n   -- Create new group\n   self.group = {}\n   for i = 1, group do\n      self.group[i] = self.spatial:clone(\n         'weight', 'bias', 'gradWeight', 'gradBias')\n   end\nend\n\nfunction Model:forward(input)\n   -- Do forward propagation for spatial model group\n   local input_group = input:view(\n      #self.group, -1, 1, input:size(3), input:size(4))\n   local 
feature = self.group[1]:forward(input_group[1])\n   self.feature_cache:resize(#self.group, feature:size(1), feature:size(2))\n   self.feature_cache[1]:copy(feature)\n   for i = 2, #self.group do\n      local feature = self.group[i]:forward(input_group[i])\n      self.feature_cache[i]:copy(feature)\n   end\n\n   -- Do forward propagation for temporal model\n   self.feature:resize(\n      input:size(1), self.feature_cache:size(3), input:size(2)):copy(\n      self.feature_cache:view(\n         input:size(1), input:size(2), self.feature_cache:size(3)):transpose(\n         2, 3))\n   self.output = self.temporal:forward(self.feature)\n\n   return self.output\nend\n\nfunction Model:backward(input, grad_output)\n   -- Do backward propagation for temporal model\n   local grad_feature = self.temporal:backward(self.feature, grad_output)\n   self.grad_feature:resizeAs(self.feature_cache):view(\n      input:size(1), input:size(2), self.feature_cache:size(3)):copy(\n      grad_feature:transpose(2, 3)):div(input:size(2))\n\n   -- Do backward propagation for spatial model group\n   local input_group = input:view(\n      #self.group, -1, 1, input:size(3), input:size(4))\n   self.grad_input:resizeAs(input)\n   local grad_input_group = self.grad_input:view(\n      #self.group, -1, 1, input:size(3), input:size(4))\n   for i = 1, #self.group do\n      local grad_input = self.group[i]:backward(\n         input_group[i], self.grad_feature[i])\n      grad_input_group[i]:copy(grad_input)\n   end\n\n   return self.grad_input\nend\n\nfunction Model:getParameters()\n   local parameters, gradients = nn.Module.getParameters(self)\n   self:initGroup(#self.group)\n   return parameters, gradients\nend\n\nfunction Model:parameters()\n   local parameters, gradients = {}, {}\n\n   local spatial_parameters, spatial_gradients = self.spatial:parameters()\n   for i = 1, #spatial_parameters do\n      parameters[#parameters + 1] = spatial_parameters[i]\n      gradients[#gradients + 1] = 
spatial_gradients[i]\n   end\n\n   local temporal_parameters, temporal_gradients = self.temporal:parameters()\n   for i = 1, #temporal_parameters do\n      parameters[#parameters + 1] = temporal_parameters[i]\n      gradients[#gradients + 1] = temporal_gradients[i]\n   end\n\n   return parameters, gradients\nend\n\nfunction Model:type(tensortype)\n   if tensortype ~= nil and tensortype ~= self.tensortype then\n      if tensortype == 'torch.CudaTensor' then\n         require('cunn')\n         self.spatial = self:makeCudaSequential(self.spatial)\n         self.temporal = self:makeCudaSequential(self.temporal)\n      else\n         self.spatial = self:makeCleanSequential(self.spatial)\n         self.temporal = self:makeCleanSequential(self.temporal)\n      end\n      self.spatial:type(tensortype)\n      self.temporal:type(tensortype)\n      self.feature = self.feature:type(tensortype)\n      self.feature_cache = self.feature_cache:type(tensortype)\n      self.grad_feature = self.grad_feature:type(tensortype)\n      self.grad_input = self.grad_input:type(tensortype)\n      self.tensortype = tensortype\n      self:initGroup(#self.group)\n   end\n\n   return self.tensortype\nend\n\nfunction Model:cuda()\n   return self:type('torch.CudaTensor')\nend\n\nfunction Model:double()\n   return self:type('torch.DoubleTensor')\nend\n\nfunction Model:float()\n   return self:type('torch.FloatTensor')\nend\n\nfunction Model:setMode(mode)\n   self:setModeSequential(self.temporal, mode)\n   self:setModeSequential(self.spatial, mode)\n   for i = 1, #self.group do\n      self:setModeSequential(self.group[i], mode)\n   end\nend\n\nfunction Model:setModeTrain()\n   self:setMode('train')\nend\n\nfunction Model:setModeTest()\n   self:setMode('test')\nend\n\nfunction Model:save(file)\n   local spatial = self:clearSequential(\n      self:makeCleanSequential(self.spatial))\n   local temporal = self:clearSequential(\n      self:makeCleanSequential(self.temporal))\n   torch.save(file, {spatial = 
spatial, temporal = temporal})\nend\n\n-- Clear sequential model\nfunction Model:clearSequential(sequential)\n   local function recursiveClear(key, param)\n      local param = param\n      if torch.type(param) == 'table' then\n         for k, v in pairs(param) do\n            param[k] = recursiveClear(k, v)\n         end\n      elseif torch.isTensor(param) and key ~= 'weight' and key ~= 'bias' then\n         param = param.new()\n      end\n      return param\n   end\n\n   for _, m in ipairs(sequential.modules) do\n      for k, v in pairs(m) do\n         m[k] = recursiveClear(k, v)\n      end\n   end\n\n   return sequential\nend\n\n-- Initialize sequential using microsoft initialization\nfunction Model:initSequential(sequential)\n   for _, m in ipairs(sequential.modules) do\n      self.initModule[torch.type(m)](self, m)\n   end\nend\n\n-- Setting the mode of sequential modules\nfunction Model:setModeSequential(sequential, mode)\n   for _, m in ipairs(sequential.modules) do\n      self.setModeModule[mode][torch.type(m)](self, m)\n   end\nend\n\n-- Create a clean sequential\nfunction Model:createCleanSequential(config)\n   local new = nn.Sequential()\n   for _, m in ipairs(config) do\n      new:add(self.createCleanModule[m.name](self, m))\n   end\n   return new\nend\n\n-- Make a clean sequential\nfunction Model:makeCleanSequential(sequential)\n   local new = nn.Sequential()\n   for _, m in ipairs(sequential.modules) do\n      new:add(self.makeCleanModule[torch.type(m)](self, m))\n   end\n   return new\nend\n\n-- Make a CUDA sequential\nfunction Model:makeCudaSequential(sequential)\n   if self.cudnn then\n      cudnn = require('cudnn')\n   end\n   local new = nn.Sequential()\n   for _, m in ipairs(sequential.modules) do\n      new:add(self.makeCudaModule[torch.type(m)](self, m))\n   end\n   return new\nend\n\n-- Initialize modules\nModel.initModule = {}\nModel.initModule['nn.LogSoftMax'] = function (self, m) end\nModel.initModule['nn.Threshold'] = function (self, m) 
end\nModel.initModule['nn.Reshape'] = function (self, m) end\nModel.initModule['nn.Dropout'] = function (self, m) end\nModel.initModule['nn.Linear'] = function (self, m)\n   m.bias:zero()\n   m.weight:normal(0, math.sqrt(2 / m.weight:size(1)))\nend\nModel.initModule['nn.SpatialConvolution'] = function (self, m)\n   m.bias:zero()\n   m.weight:normal(\n      0, math.sqrt(2 / m.weight:size(1) / m.weight:size(3) / m.weight:size(4)))\nend\nModel.initModule['nn.SpatialMaxPooling'] = function (self, m) end\nModel.initModule['nn.TemporalConvolutionMM'] = function (self, m)\n   m.bias:zero()\n   m.weight:normal(0, math.sqrt(2 / m.weight:size(1) / m.weight:size(3)))\nend\nModel.initModule['nn.TemporalMaxPoolingMM'] = function (self, m) end\n\n-- Set module mode to train\nModel.setModeModule = {}\nModel.setModeModule['train'] = {}\nModel.setModeModule['train']['nn.LogSoftMax'] = function (self, m) end\nModel.setModeModule['train']['cudnn.LogSoftMax'] =\n   Model.setModeModule['train']['nn.LogSoftMax']\nModel.setModeModule['train']['nn.Threshold'] = function (self, m) end\nModel.setModeModule['train']['nn.Reshape'] = function (self, m) end\nModel.setModeModule['train']['nn.Dropout'] = function (self, m)\n   m.train = true\nend\nModel.setModeModule['train']['nn.Linear'] = function (self, m) end\nModel.setModeModule['train']['nn.SpatialConvolution'] = function (self, m) end\nModel.setModeModule['train']['cudnn.SpatialConvolution'] =\n   Model.setModeModule['train']['nn.SpatialConvolution']\nModel.setModeModule['train']['nn.SpatialMaxPooling'] = function (self, m) end\nModel.setModeModule['train']['cudnn.SpatialMaxPooling'] =\n   Model.setModeModule['train']['nn.SpatialMaxPooling']\nModel.setModeModule['train']['nn.TemporalConvolutionMM'] =\n   function (self, m) end\nModel.setModeModule['train']['cudnn.TemporalConvolutionCudnn'] =\n   function (self, m) end\nModel.setModeModule['train']['nn.TemporalMaxPoolingMM'] = function (self, m) 
end\nModel.setModeModule['train']['cudnn.TemporalMaxPoolingCudnn'] =\n   Model.setModeModule['train']['nn.TemporalMaxPoolingMM']\n\n-- Set module mode to test\nModel.setModeModule['test'] = {}\nModel.setModeModule['test']['nn.LogSoftMax'] = function (self, m) end\nModel.setModeModule['test']['cudnn.LogSoftMax'] =\n   Model.setModeModule['test']['nn.LogSoftMax']\nModel.setModeModule['test']['nn.Threshold'] = function (self, m) end\nModel.setModeModule['test']['nn.Reshape'] = function (self, m) end\nModel.setModeModule['test']['nn.Dropout'] = function (self, m)\n   m.train = false\nend\nModel.setModeModule['test']['nn.Linear'] = function (self, m) end\nModel.setModeModule['test']['nn.SpatialConvolution'] = function (self, m) end\nModel.setModeModule['test']['cudnn.SpatialConvolution'] =\n   Model.setModeModule['test']['nn.SpatialConvolution']\nModel.setModeModule['test']['nn.SpatialMaxPooling'] = function (self, m) end\nModel.setModeModule['test']['cudnn.SpatialMaxPooling'] =\n   Model.setModeModule['test']['nn.SpatialMaxPooling']\nModel.setModeModule['test']['nn.TemporalConvolutionMM'] =\n   function (self, m) end\nModel.setModeModule['test']['cudnn.TemporalConvolutionCudnn'] =\n   function (self, m) end\nModel.setModeModule['test']['nn.TemporalMaxPoolingMM'] = function (self, m) end\nModel.setModeModule['test']['cudnn.TemporalMaxPoolingCudnn'] =\n   Model.setModeModule['test']['nn.TemporalMaxPoolingMM']\n\n-- Create clean modules\nModel.createCleanModule = {}\nModel.createCleanModule['nn.LogSoftMax'] = function (self, m)\n   return nn.LogSoftMax()\nend\nModel.createCleanModule['nn.Threshold'] = function (self, m)\n   return nn.Threshold(m.th, m.v, m.ip)\nend\nModel.createCleanModule['nn.Reshape'] = function (self, m)\n   return nn.Reshape(m.size, m.batchMode)\nend\nModel.createCleanModule['nn.Dropout'] = function (self, m)\n   return nn.Dropout(m.p, not m.v2, m.inplace)\nend\nModel.createCleanModule['nn.Linear'] = function (self, m)\n   return 
nn.Linear(m.inputSize, m.outputSize, m.bias)\nend\nModel.createCleanModule['nn.SpatialConvolution'] = function (self, m)\n   return nn.SpatialConvolution(\n      m.nInputPlane, m.nOutputPlane, m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\nend\nModel.createCleanModule['nn.SpatialMaxPooling'] = function (self, m)\n   return nn.SpatialMaxPooling(m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\nend\nModel.createCleanModule['nn.TemporalConvolutionMM'] = function (self, m)\n   return nn.TemporalConvolutionMM(\n      m.inputFrameSize, m.outputFrameSize, m.kW, m.dW, m.padW)\nend\nModel.createCleanModule['nn.TemporalMaxPoolingMM'] = function (self, m)\n   return nn.TemporalMaxPoolingMM(m.kW, m.dW)\nend\n\n-- Make clean modules\nModel.makeCleanModule = {}\nModel.makeCleanModule['nn.LogSoftMax'] = function (self, m)\n   return nn.LogSoftMax()\nend\nModel.makeCleanModule['cudnn.LogSoftMax'] =\n   Model.makeCleanModule['nn.LogSoftMax']\nModel.makeCleanModule['nn.Threshold'] = function (self, m)\n   return nn.Threshold(m.threshold, m.val, m.inplace)\nend\nModel.makeCleanModule['nn.Reshape'] = function (self, m)\n   return nn.Reshape(m.size, m.batchMode)\nend\nModel.makeCleanModule['nn.Dropout'] = function (self, m)\n   return nn.Dropout(m.p, not m.v2, m.inplace)\nend\nModel.makeCleanModule['nn.Linear'] = function (self, m)\n   local new = nn.Linear(m.weight:size(2), m.weight:size(1), m.bias)\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCleanModule['nn.SpatialConvolution'] = function (self, m)\n   local new = nn.SpatialConvolution(\n      m.nInputPlane, m.nOutputPlane, m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCleanModule['cudnn.SpatialConvolution'] =\n   Model.makeCleanModule['nn.SpatialConvolution']\nModel.makeCleanModule['nn.SpatialMaxPooling'] = function (self, m)\n   return nn.SpatialMaxPooling(m.kW, m.kH, m.dW, m.dH, m.padW, 
m.padH)\nend\nModel.makeCleanModule['cudnn.SpatialMaxPooling'] =\n   Model.makeCleanModule['nn.SpatialMaxPooling']\nModel.makeCleanModule['nn.TemporalConvolutionMM'] = function (self, m)\n   local new = nn.TemporalConvolutionMM(\n      m.input_feature, m.output_feature, m.kernel, m.stride, m.pad)\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCleanModule['cudnn.TemporalConvolutionCudnn'] = function (self, m)\n   local new = nn.TemporalConvolutionMM(\n      m.nInputPlane, m.nOutputPlane, m.kW, m.dW, m.padW)\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCleanModule['nn.TemporalMaxPoolingMM'] = function (self, m)\n   return nn.TemporalMaxPoolingMM(m.kW, m.dW)\nend\nModel.makeCleanModule['cudnn.TemporalMaxPoolingCudnn'] =\n   Model.makeCleanModule['nn.TemporalMaxPoolingMM']\n\n-- Make CUDA modules\nModel.makeCudaModule = {}\nModel.makeCudaModule['nn.LogSoftMax'] = function (self, m)\n   if self.cudnn and cudnn.LogSoftMax then\n      return cudnn.LogSoftMax()\n   else\n      return nn.LogSoftMax()\n   end\nend\nModel.makeCudaModule['cudnn.LogSoftMax'] = Model.makeCudaModule['nn.LogSoftMax']\nModel.makeCudaModule['nn.Threshold'] = function (self, m)\n   return nn.Threshold(m.threshold, m.val, m.inplace)\nend\nModel.makeCudaModule['nn.Reshape'] = function (self, m)\n   return nn.Reshape(m.size, m.batchMode)\nend\nModel.makeCudaModule['nn.Dropout'] = function (self, m)\n   return nn.Dropout(m.p, not m.v2, m.inplace)\nend\nModel.makeCudaModule['nn.Linear'] = function (self, m)\n   local new = nn.Linear(m.weight:size(2), m.weight:size(1), m.bias)\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCudaModule['nn.SpatialConvolution'] = function (self, m)\n   local new\n   if self.cudnn then\n      new = cudnn.SpatialConvolution(\n         m.nInputPlane, m.nOutputPlane, m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\n   else\n      new = nn.SpatialConvolution(\n         
m.nInputPlane, m.nOutputPlane, m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\n   end\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCudaModule['cudnn.SpatialConvolution'] =\n   Model.makeCudaModule['nn.SpatialConvolution']\nModel.makeCudaModule['nn.SpatialMaxPooling'] = function (self, m)\n   if self.cudnn then\n      return cudnn.SpatialMaxPooling(m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\n   else\n      return nn.SpatialMaxPooling(m.kW, m.kH, m.dW, m.dH, m.padW, m.padH)\n   end\nend\nModel.makeCudaModule['cudnn.SpatialMaxPooling'] =\n   Model.makeCudaModule['nn.SpatialMaxPooling']\nModel.makeCudaModule['nn.TemporalConvolutionMM'] = function (self, m)\n   local new\n   if self.cudnn then\n      new = cudnn.TemporalConvolutionCudnn(\n         m.input_feature, m.output_feature, m.kernel, m.stride, m.pad)\n   else\n      new = nn.TemporalConvolutionMM(\n         m.input_feature, m.output_feature, m.kernel, m.stride, m.pad)\n   end\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCudaModule['cudnn.TemporalConvolutionCudnn'] = function (self, m)\n   local new\n   if self.cudnn then\n      new = cudnn.TemporalConvolutionCudnn(\n         m.nInputPlane, m.nOutputPlane, m.kW, m.dW, m.padW)\n   else\n      new = nn.TemporalConvolutionMM(\n         m.nInputPlane, m.nOutputPlane, m.kW, m.dW, m.padW)\n   end\n   new.weight:copy(m.weight)\n   new.bias:copy(m.bias)\n   return new\nend\nModel.makeCudaModule['nn.TemporalMaxPoolingMM'] = function (self, m)\n   if self.cudnn then\n      return cudnn.TemporalMaxPoolingCudnn(m.kW, m.dW)\n   else\n      return nn.TemporalMaxPoolingMM(m.kW, m.dW)\n   end\nend\nModel.makeCudaModule['cudnn.TemporalMaxPoolingCudnn'] =\n   Model.makeCudaModule['nn.TemporalMaxPoolingMM']\n\nreturn Model\n"
  },
  {
    "path": "glyphnet/modules/TemporalConvolutionCudnn.lua",
    "content": "--[[\nTemporal max pooling module .with data order consistent with MM\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal TemporalConvolutionCudnn, parent =\n   torch.class('cudnn.TemporalConvolutionCudnn', 'cudnn.SpatialConvolution')\n\nfunction TemporalConvolutionCudnn:__init(\n      input_feature, output_feature, kW, dW, padW)\n   parent.__init(self, input_feature, output_feature, kW, 1, dW, 1, padW, 0)\nend\n\nfunction TemporalConvolutionCudnn:updateOutput(input)\n   local input_view\n\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n   end\n\n   local output = parent.updateOutput(self, input_view)\n\n   if input:dim() ~= output:dim() then\n      if input:dim() == 2 then\n         self.output = output:view(output:size(1), output:size(3))\n      else\n         self.output = output:view(output:size(1), output:size(2), output:size(4))\n      end\n   end\n\n   return self.output\nend\n\nfunction TemporalConvolutionCudnn:updateGradInput(input, grad_output)\n   local input_view\n   local grad_output_view\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), 1, grad_output:size(2))\n      self.output = self.output:view(\n         self.output:size(1), 1, self.output:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), grad_output:size(2), 1, grad_output:size(3))\n      self.output = self.output:view(\n         self.output:size(1), self.output:size(2), 1, self.output:size(3))\n   end\n\n   local grad_input = parent.updateGradInput(self, input_view, grad_output_view)\n\n   if self.gradInput:dim() ~= input:dim() then\n      if input:dim() == 2 then\n         self.output = self.output:view(\n            
self.output:size(1), self.output:size(3))\n         self.gradInput = grad_input:view(grad_input:size(1), grad_input:size(3))\n      else\n         self.output = self.output:view(\n            self.output:size(1), self.output:size(2), self.output:size(4))\n         self.gradInput = grad_input:view(\n            grad_input:size(1), grad_input:size(2), grad_input:size(4))\n      end\n   end\n\n   return self.gradInput\nend\n\nfunction TemporalConvolutionCudnn:accGradParameters(input, grad_output, scale)\n   local input_view\n   local grad_output_view\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), 1, grad_output:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), grad_output:size(2), 1, grad_output:size(3))\n   end\n\n   parent.accGradParameters(self, input_view, grad_output_view, scale)\nend\n\nfunction TemporalConvolutionCudnn:__tostring__()\n   return string.format(\n      '%s(%d -> %d, %d, %d, %d)',\n      torch.type(self), self.nInputPlane, self.nOutputPlane,\n      self.kW, self.dW, self.padW)\nend\n"
  },
  {
    "path": "glyphnet/modules/TemporalConvolutionMM.lua",
    "content": "--[[\nTemporal convolution module that supports padding\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal TemporalConvolutionMM, parent =\n   torch.class('nn.TemporalConvolutionMM', 'nn.Module')\n\nfunction TemporalConvolutionMM:__init(\n      input_feature, output_feature, kernel, stride, pad)\n   parent.__init(self)\n   \n   self.input_feature = input_feature\n   self.output_feature = output_feature\n   self.kernel = kernel\n   self.stride = stride or 1\n   self.pad = pad or 0\n   \n   self.weight = torch.Tensor(output_feature, input_feature, kernel)\n   self.bias = torch.Tensor(output_feature)\n   self.gradWeight = torch.Tensor(output_feature, input_feature, kernel)\n   self.gradBias = torch.Tensor(output_feature)\n   \n   self.pad_cache = torch.Tensor()\n   self.unfold_cache = torch.Tensor()\n   self.interlace_cache = torch.Tensor()\n   self.weight_cache = torch.Tensor(\n      self.weight:size(2), self.weight:size(1), self.weight:size(3))\n   self.reverse_index = torch.LongTensor(self.kernel)\n   \n   for i = 1, self.kernel do\n      self.reverse_index[i] = self.kernel - i + 1\n   end\n   \n   self:reset()\nend\n\nfunction TemporalConvolutionMM:reset(stdv)\n   if stdv then\n      stdv = stdv * math.sqrt(3)\n   else\n      stdv = 1/math.sqrt(self.kernel * self.input_feature)\n   end\n   self.weight:uniform(-stdv, stdv)\n   self.bias:uniform(-stdv, stdv)\nend\n\nfunction TemporalConvolutionMM:updateOutput(input)\n   if input:dim() ~= 2 and input:dim() ~= 3 then\n      error('Input dimension must be 2 or 3')\n   end\n   \n   -- Create temporary input cache that is to be unfolded\n   if input:dim() == 2 then\n      self.pad_cache:resize(\n         input:size(1), input:size(2) + 2 * self.pad):zero():narrow(\n         2, self.pad + 1, input:size(2)):copy(input)\n   else\n      self.pad_cache:resize(\n         input:size(1), input:size(2),\n         input:size(3) + 2 * self.pad):zero():narrow(\n         3, self.pad + 1, input:size(3)):copy(input)\n   end\n   
\n   -- Unfold the input cache\n   local unfolded = self.pad_cache:unfold(\n      self.pad_cache:dim(), self.kernel, self.stride):transpose(\n      self.pad_cache:dim(), self.pad_cache:dim() + 1)\n   self.unfold_cache:resizeAs(unfolded):copy(unfolded)\n   \n   -- Do matrix multiplication\n   if input:dim() == 2 then\n      self.output:resize(\n         self.output_feature, self.unfold_cache:size(3)):copy(\n         self.bias:view(-1, 1):expandAs(self.output))\n      self.output:addmm(\n         1, self.output, 1,\n         self.weight:view(self.weight:size(1), -1),\n         self.unfold_cache:view(-1, self.unfold_cache:size(3)))\n   else\n      self.output:resize(\n         self.unfold_cache:size(1), self.output_feature,\n         self.unfold_cache:size(4)):copy(\n         self.bias:view(1, -1, 1):expandAs(self.output))\n      local weight = self.weight:view(\n         1, self.weight:size(1),\n         self.weight:size(2) * self.weight:size(3)):expand(\n         self.unfold_cache:size(1), self.weight:size(1),\n         self.weight:size(2) * self.weight:size(3))\n      self.output:baddbmm(\n         1, self.output, 1, weight,\n         self.unfold_cache:view(\n            self.unfold_cache:size(1), -1, self.unfold_cache:size(4)))\n   end\n   \n   return self.output\nend\n\nfunction TemporalConvolutionMM:updateGradInput(input, grad_output)\n   -- Reverse the weight on the kernel dimension\n   self.weight_cache:indexCopy(\n      3, self.reverse_index, self.weight:transpose(1, 2))\n   \n   -- Resize the initialize the interlace cache\n   if input:dim() == 2 then\n      self.interlace_cache:resize(\n         grad_output:size(1),\n         self.stride * (grad_output:size(2) - 1) + 1):zero()\n      self.interlace_cache:narrow(\n         2, 1, self.interlace_cache:size(2) - 1):unfold(\n         2, self.stride, self.stride):select(3, 1):copy(\n         grad_output:narrow(2, 1, grad_output:size(2) - 1))\n      self.interlace_cache:select(2, 
self.interlace_cache:size(2)):copy(\n         grad_output:select(2, grad_output:size(2)))\n   else\n      self.interlace_cache:resize(\n         grad_output:size(1), grad_output:size(2),\n         self.stride * (grad_output:size(3) - 1) + 1):zero()\n      self.interlace_cache:narrow(\n         3, 1, self.interlace_cache:size(3) - 1):unfold(\n         3, self.stride, self.stride):select(4, 1):copy(\n         grad_output:narrow(3, 1, grad_output:size(3) - 1))\n      self.interlace_cache:select(3, self.interlace_cache:size(3)):copy(\n         grad_output:select(3, grad_output:size(3)))\n   end\n   \n   -- Resize and initialize the padded cache\n   if input:dim() == 2 then\n      self.pad_cache:resize(\n         grad_output:size(1), input:size(2) + self.kernel - 1)\n      local length = math.min(\n         self.pad_cache:size(2), self.interlace_cache:size(2))\n      self.pad_cache:zero():narrow(\n         2, (self.pad_cache:size(2) - length) / 2 + 1, length):copy(\n         self.interlace_cache:narrow(\n            2, (self.interlace_cache:size(2) - length) / 2 + 1, length))\n   else\n      self.pad_cache:resize(\n         grad_output:size(1), grad_output:size(2),\n         input:size(3) + self.kernel - 1)\n      local length = math.min(\n         self.pad_cache:size(3), self.interlace_cache:size(3))\n      self.pad_cache:zero():narrow(\n         3, (self.pad_cache:size(3) - length) / 2 + 1, length):copy(\n         self.interlace_cache:narrow(\n            3, (self.interlace_cache:size(3) - length) / 2 + 1, length))\n   end\n   \n   -- Unfold the output cache\n   local unfolded = self.pad_cache:unfold(\n      self.pad_cache:dim(), self.kernel, 1):transpose(\n      self.pad_cache:dim(), self.pad_cache:dim() + 1)\n   self.unfold_cache:resizeAs(unfolded):copy(unfolded)\n   \n   -- Do matrix multiplication\n   self.gradInput:resizeAs(input):zero()\n   if input:dim() == 2 then\n      self.gradInput:addmm(\n         1, self.gradInput, 1,\n         
self.weight_cache:view(self.weight:size(2), -1),\n         self.unfold_cache:view(-1, self.unfold_cache:size(3)))\n   else\n      local weight = self.weight_cache:view(\n         1, self.weight:size(2),\n         self.weight:size(1) * self.weight:size(3)):expand(\n         unfolded:size(1), self.weight:size(2),\n         self.weight:size(1) * self.weight:size(3))\n      self.gradInput:baddbmm(\n         1, self.gradInput, 1, weight,\n         self.unfold_cache:view(\n            self.unfold_cache:size(1), -1, self.unfold_cache:size(4)))\n   end\n   \n   return self.gradInput\nend\n\nfunction TemporalConvolutionMM:accGradParameters(input, grad_output, scale)\n   local scale = scale or 1\n   \n   -- Create temporary input cache that is to be unfolded\n   if input:dim() == 2 then\n      self.pad_cache:resize(\n         input:size(1), input:size(2) + 2 * self.pad):zero():narrow(\n         2, self.pad + 1, input:size(2)):copy(input)\n   else\n      self.pad_cache:resize(\n         input:size(1), input:size(2),\n         input:size(3) + 2 * self.pad):zero():narrow(\n         3, self.pad + 1, input:size(3)):copy(input)\n   end\n\n   -- Unfold the input cache\n   local unfolded = self.pad_cache:unfold(\n      self.pad_cache:dim(), self.kernel, self.stride):transpose(\n      self.pad_cache:dim() - 1, self.pad_cache:dim())\n   self.unfold_cache:resizeAs(unfolded):copy(unfolded)\n\n   -- Do matrix multiplication\n   local grad_weight = self.gradWeight:view(self.weight:size(1), -1)\n   if input:dim() == 2 then\n      grad_weight:addmm(\n         1, grad_weight, scale, grad_output,\n         self.unfold_cache:view(unfolded:size(1), -1))\n      self.gradBias:add(scale, grad_output:sum(2))\n   else\n      if grad_weight.addbmm then\n         grad_weight:addbmm(\n            1, grad_weight, scale, grad_output,\n            self.unfold_cache:view(\n               self.unfold_cache:size(1), self.unfold_cache:size(2), -1))\n      else\n         for i = 1, grad_output:size(1) do\n     
       grad_weight:addmm(\n               1, grad_weight, scale, grad_output:select(1, i),\n               self.unfold_cache:select(1, i):view(\n                  self.unfold_cache:size(2), -1))\n         end\n      end\n      self.gradBias:add(scale, grad_output:sum(3):sum(1))\n   end\nend\n\nTemporalConvolutionMM.sharedAccUpdateGradParameters =\n   TemporalConvolutionMM.accUpdateGradParameters\n\nfunction TemporalConvolutionMM:__tostring__()\n   return string.format(\n      '%s(%d -> %d, %d, %d, %d)', torch.type(self), self.input_feature,\n      self.output_feature, self.kernel, self.stride, self.pad)\nend\n"
  },
  {
    "path": "glyphnet/modules/TemporalMaxPoolingCudnn.lua",
    "content": "--[[\nTemporal max pooling module with data order consistent with MM\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal TemporalMaxPoolingCudnn, parent =\n   torch.class('cudnn.TemporalMaxPoolingCudnn', 'cudnn.SpatialMaxPooling')\n\nfunction TemporalMaxPoolingCudnn:__init(kW, dW, padW)\n   parent.__init(self, kW, 1, dW, 1, padW, 0)\nend\n\nfunction TemporalMaxPoolingCudnn:updateOutput(input)\n   local input_view\n\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n   end\n\n   local output = parent.updateOutput(self, input_view)\n\n   if self.output:dim() ~= input:dim() then\n      if input:dim() == 2 then\n         self.output = output:view(output:size(1), output:size(3))\n      else\n         self.output = output:view(output:size(1), output:size(2), output:size(4))\n      end\n   end\n\n   return self.output\nend\n\nfunction TemporalMaxPoolingCudnn:updateGradInput(input, grad_output)\n   local input_view\n   local grad_output_view\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), 1, grad_output:size(2))\n      self.output = self.output:view(\n         self.output:size(1), 1, self.output:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), grad_output:size(2), 1, grad_output:size(3))\n      self.output = self.output:view(\n         self.output:size(1), self.output:size(2), 1, self.output:size(3))\n   end\n\n   local grad_input = parent.updateGradInput(self, input_view, grad_output_view)\n\n   if self.gradInput:dim() ~= input:dim() then\n      if input:dim() == 2 then\n         self.output = self.output:view(\n            self.utput:size(1), self.output:size(3))\n         self.gradInput = 
grad_input:view(\n            grad_input:size(1), grad_input:size(3))\n      else\n         self.output = self.output:view(\n            self.output:size(1), self.output:size(2), self.output:size(4))\n         self.gradInput = grad_input:view(\n            grad_input:size(1), grad_input:size(2), grad_input:size(4))\n      end\n   end\n\n   return self.gradInput\nend\n\nfunction TemporalMaxPoolingCudnn:__tostring__()\n   return string.format('%s(%d, %d)', torch.type(self), self.kW, self.dW)\nend\n"
  },
  {
    "path": "glyphnet/modules/TemporalMaxPoolingMM.lua",
    "content": "--[[\nTemporal max pooling module with data order consistent with MM\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal TemporalMaxPoolingMM, parent =\n   torch.class('nn.TemporalMaxPoolingMM', 'nn.SpatialMaxPooling')\n\nfunction TemporalMaxPoolingMM:__init(kW, dW)\n   parent.__init(self, kW, 1, dW, 1)\nend\n\nfunction TemporalMaxPoolingMM:updateOutput(input)\n   local input_view\n\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n   end\n\n   local output = parent.updateOutput(self, input_view)\n\n   if input:dim() == 2 then\n      self.output = output:view(output:size(1), output:size(3))\n   else\n      self.output = output:view(output:size(1), output:size(2), output:size(4))\n   end\n\n   return self.output\nend\n\nfunction TemporalMaxPoolingMM:updateGradInput(input, grad_output)\n   local input_view\n   local grad_output_view\n   if input:dim() == 2 then\n      input_view = input:view(input:size(1), 1, input:size(2))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), 1, grad_output:size(2))\n      self.output = self.output:view(\n         self.output:size(1), 1, self.output:size(2))\n   else\n      input_view = input:view(input:size(1), input:size(2), 1, input:size(3))\n      grad_output_view = grad_output:view(\n         grad_output:size(1), grad_output:size(2), 1, grad_output:size(3))\n      self.output = self.output:view(\n         self.output:size(1), self.output:size(2), 1, self.output:size(3))\n   end\n\n   local grad_input = parent.updateGradInput(self, input_view, grad_output_view)\n\n   if input:dim() == 2 then\n      self.output = self.output:view(\n         self.utput:size(1), self.output:size(3))\n      self.gradInput = grad_input:view(grad_input:size(1), grad_input:size(3))\n   else\n      self.output = self.output:view(\n         self.output:size(1), self.output:size(2), 
self.output:size(4))\n      self.gradInput = grad_input:view(\n         grad_input:size(1), grad_input:size(2), grad_input:size(4))\n   end\n\n   return self.gradInput\nend\n\nfunction TemporalMaxPoolingMM:__tostring__()\n   return string.format('%s(%d, %d)', torch.type(self), self.kW, self.dW)\nend\n"
  },
  {
    "path": "glyphnet/modules.lua",
    "content": "--[[\nAdditional modules for GlyphNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal status, cudnn = pcall(require, 'cudnn')\nlocal nn = require('nn')\n\n-- nn.TemporalConvolutionMM\nif not nn.TemporalConvolutionMM then\n   dofile('modules/TemporalConvolutionMM.lua')\nend\n\n-- nn.TemporalMaxPoolingMM\nif not nn.TemporalMaxPoolingMM then\n   dofile('modules/TemporalMaxPoolingMM.lua')\nend\n\n-- cudnn.TemporalConvolutionCudnn\nif status == true and not cudnn.TemporalMaxPoolingCudnn then\n   dofile('modules/TemporalConvolutionCudnn.lua')\nend\n\n-- cudnn.TemporalMaxPoolingCudnn\nif status == true and not cudnn.TemporalMaxPoolingCudnn then\n   dofile('modules/TemporalMaxPoolingCudnn.lua')\nend\n\nreturn nn\n"
  },
  {
    "path": "glyphnet/scroll.lua",
    "content": "--[[\nThe schollable UI\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\n\nlocal Scroll = class()\n\n-- Initialize a scroll interface\n-- width: (optional) the pixel width of the scollable area. Default is 600.\n-- title: (optional) title for the window\nfunction Scroll:_init(width,title)\n   require('qtuiloader')\n   require('qtwidget')\n   require('qttorch')\n\n   self.file = 'scroll.ui'\n   self.win = qtuiloader.load(self.file)\n   self.frame = self.win.frame\n   self.painter = qt.QtLuaPainter(self.frame)\n   self.width = width or 600\n   self.height = 0\n   self.fontSize = 15\n   self.x = 0\n   self.y = 0\n   self.border = 1\n\n   self:resize(self.width, self.height)\n   self:setFontSize(self.fontSize)\n   if title then \n      self:setTitle(title)\n   end\n   self:show()\nend\n\n-- Resize the window to designated width and height\nfunction Scroll:resize(width, height)\n   self.width = width or self.width\n   self.height = height or self.height\n\n   self.frame.size = qt.QSize{width = self.width,height = self.height}\nend\n\n-- Set the text width\nfunction Scroll:setFontSize(size)\n   self.painter:setfontsize(size or 15)\n   self.fontSize = size\nend\n\n-- Set border width\nfunction Scroll:setBorder(width)\n   self.border = width\nend\n\n-- Draw text\nfunction Scroll:drawText(text)\n   -- Drawing text must happen on a new line\n   if self.x ~= 0 then\n      self.x = 0\n      self.y = self.height\n   end\n\n   -- Determine height and resize if necessary\n   if self.height < self.y+self.fontSize+1 then\n      self:resize(self.width,self.y+self.fontSize+1+self.border)\n   end\n\n   -- Draw the yellow main text\n   self.painter:gbegin()\n   self.painter:moveto(self.x,self.y+self.fontSize-1)\n   self.painter:setcolor(1,1,0,1)\n   self.painter:show(text)\n   self.painter:stroke()\n   self.painter:gend()\n\n   -- Draw the black shadow text\n   self.painter:gbegin()\n   self.painter:moveto(self.x,self.y+self.fontSize+1-1)\n   
self.painter:setcolor(0,0,0,1)\n   self.painter:show(text)\n   self.painter:stroke()\n   self.painter:gend()\n   \n   -- Move the cursor to next line\n   self.x = 0\n   if self.height < self.y+self.fontSize+1+self.border then\n      self:resize(self.width,self.y+self.fontSize+1+self.border)\n   end\n   self.y = self.height\nend\n\n-- Draw image\nfunction Scroll:drawImage(im, scale)\n   -- Get the image height and width\n   local scale = scale or 1\n   local height, width\n   if im:dim() == 2 then\n      height = im:size(1) * scale\n      width = im:size(2) * scale\n   elseif im:dim() == 3 then\n      height = im:size(2) * scale\n      width = im:size(3) * scale\n   else\n      error(\"Image must be 2-dim or 3-dim data\")\n   end\n\n   -- Determine whether a new line is needed\n   if self.x ~= 0 and self.x + width > self.width then\n      self.x = 0\n      self.y = self.height\n   end\n\n   -- Determine whether need to resize the document area\n   if self.y + height > self.height then\n      self:resize(self.width, self.y + height + self.border)\n   end\n\n   -- Draw the image\n   self.painter:gbegin()\n   self.painter:image(self.x, self.y, width, height, qt.QImage.fromTensor(im))\n   self.painter:stroke()\n   self.painter:gend()\n\n   -- Move the cursor\n   self.x = self.x + width + self.border\nend\n\n-- Draw a new line\nfunction Scroll:drawEndOfLine()\n   self.x = 0\n   self.y = self.height\nend\n\n-- Hint for heights\nfunction Scroll:hintImageHeight(im, scale)\n   -- Get the image height and width\n   local scale = scale or 1\n   local height, width\n   if im:dim() == 2 then\n      height = im:size(1) * scale\n      width = im:size(2) * scale\n   elseif im:dim() == 3 then\n      height = im:size(2) * scale\n      width = im:size(3) * scale\n   else\n      error(\"Image must be 2-dim or 3-dim data\")\n   end\n\n   -- Determine whether a new line is needed\n   if self.x ~= 0 and self.x + width > self.width then\n      return self.height\n   else\n      return 
self.y\n   end\nend\n\n-- Show the window\nfunction Scroll:show()\n   self.win:show()\nend\n\n-- Hide the window\nfunction Scroll:hide()\n   self.win:hide()\nend\n\n-- Save to file\nfunction Scroll:save(file)\n   self.painter:write(file)\nend\n\n-- Set window title\nfunction Scroll:setTitle(title)\n   self.win:setWindowTitle(title)\nend\n\n-- Reset the drawing area\nfunction Scroll:clear()\n   self:resize(self.width,0)\n   self.x = 0\n   self.y = 0\nend\n\nreturn Scroll\n"
  },
  {
    "path": "glyphnet/scroll.ui",
    "content": "<?xml version=\"1.0\" encoding=\"UTF-8\"?>\n<ui version=\"4.0\">\n <class>window</class>\n <widget class=\"QWidget\" name=\"window\">\n  <property name=\"geometry\">\n   <rect>\n    <x>0</x>\n    <y>0</y>\n    <width>640</width>\n    <height>480</height>\n   </rect>\n  </property>\n  <property name=\"windowTitle\">\n   <string>Scrollable Window</string>\n  </property>\n  <layout class=\"QGridLayout\" name=\"gridLayout\">\n   <item row=\"0\" column=\"0\">\n    <widget class=\"QScrollArea\" name=\"scrollArea\">\n     <property name=\"sizePolicy\">\n      <sizepolicy hsizetype=\"Expanding\" vsizetype=\"Expanding\">\n       <horstretch>0</horstretch>\n       <verstretch>0</verstretch>\n      </sizepolicy>\n     </property>\n     <property name=\"midLineWidth\">\n      <number>0</number>\n     </property>\n     <property name=\"widgetResizable\">\n      <bool>false</bool>\n     </property>\n     <widget class=\"QWidget\" name=\"frame\">\n      <property name=\"geometry\">\n       <rect>\n        <x>0</x>\n        <y>0</y>\n        <width>600</width>\n        <height>440</height>\n       </rect>\n      </property>\n      <property name=\"sizePolicy\">\n       <sizepolicy hsizetype=\"Preferred\" vsizetype=\"Preferred\">\n        <horstretch>0</horstretch>\n        <verstretch>0</verstretch>\n       </sizepolicy>\n      </property>\n     </widget>\n    </widget>\n   </item>\n  </layout>\n </widget>\n <resources/>\n <connections/>\n</ui>\n"
  },
  {
    "path": "glyphnet/test.lua",
    "content": "--[[\nTester for GlyphNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal torch = require('torch')\nlocal sys = require('sys')\n\nlocal Test = class()\n\n-- Constructor for Test\n-- data: the data object\n-- model: the model object\n-- loss: the loss object\n-- config: configuration table\nfunction Test:_init(data, model, loss, config)\n   self.data = data\n   self.model = model\n   self.loss = loss\n\n   self.time = {}\nend\n\n-- Run for all the data\n-- callback: (optional) a callback function to execute after each step\nfunction Test:run(callback)\n   self.total_error = 0\n   self.total_objective = 0\n   self.total_count = 0\n   self.clock = sys.clock()\n   for input, label, count in self.data:iterator() do\n      self:runStep(input, label, count)\n      if callback then callback(self) end\n      self.clock = sys.clock()\n   end\nend\n\n-- Run for one minibatch step\nfunction Test:runStep(input, label, count)\n   -- Get a batch of data\n   self.input_untyped, self.label_untyped = input, label\n   self.input = self.input or self.input_untyped:type(self.model:type())\n   self.input:copy(self.input_untyped)\n   self.label = self.label or self.label_untyped:type(self.model:type())\n   self.label:copy(self.label_untyped)\n   self.count = count\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.data = sys.clock() - self.clock\n\n   -- Forward propagation\n   self.clock = sys.clock()\n   self.output = self.model:forward(self.input)\n   self.objective = self.loss:forward(self.output, self.label)\n   if type(self.objective) ~= 'number' then self.objective = self.objectve[1] end\n   self.max, self.decision = self.output:type(\n      torch.getdefaulttensortype()):max(2)\n   self.max = self.max:squeeze()\n   self.decision = self.decision:squeeze():narrow(1, 1, count):type(\n      torch.getdefaulttensortype())\n   self.error = torch.ne(\n      self.decision, 
self.label_untyped:narrow(1, 1, count)):type(\n      torch.getdefaulttensortype()):sum() / count\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.forward = sys.clock() - self.clock\n\n   -- Update the results\n   self.clock = sys.clock()\n   self.total_objective =\n      (self.total_objective * self.total_count + self.objective * count) /\n      (self.total_count + count)\n   self.total_error =\n      (self.total_error * self.total_count + self.error * count) /\n      (self.total_count + count)\n   self.total_count = self.total_count + count\n   self.time.update = sys.clock() - self.clock\nend\n\nreturn Test\n"
  },
  {
    "path": "glyphnet/train.lua",
    "content": "--[[\nTrainer for GlyphNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal torch = require('torch')\nlocal sys = require('sys')\n\nlocal Train = class()\n\n-- Constructor for Train\n-- data: the data object\n-- model: the model object\n-- loss: the loss object\n-- config: configuration table\nfunction Train:_init(data, model, loss, config)\n   self.data = data\n   self.model = model\n   self.loss = loss\n\n   self.rates = config.rates or {1e-3}\n   self.step = config.step or 0\n   self.momentum = config.momentum or 0\n   self.decay = config.decay or 0\n   self.recapture = config.recapture\n\n   self.params, self.grads = self.model:getParameters()\n   if config.state then\n      self.state = config.state:type(self.model:type())\n   else\n      self.state = self.grads:clone():zero()\n   end\n\n   -- Find current learning rate\n   local max_step = 1\n   self.rate = self.rates[1]\n   for step, rate in pairs(self.rates) do\n      if step <= self.step and step > max_step then\n         max_step = step\n         self.rate = rate\n      end\n   end\n\n   self.time = {}\nend\n\n-- Run for a number of steps\n-- steps: number of steps\n-- callback: (optional) a callback function to execute after each step\nfunction Train:run(steps, callback)\n   if self.recapture then\n      self.params, self.grads = self.model:getParameters()\n   end\n\n   for i = 1, steps do\n      self.step = self.step + 1\n      self:runStep()\n      if callback then callback(self, i) end\n   end\nend\n\n-- Run for one minibatch step\nfunction Train:runStep()\n   -- Get a batch of data/\n   self.clock = sys.clock()\n   self.input_untyped, self.label_untyped = self.data:getBatch(\n      self.input_untyped, self.label_untyped)\n   self.input = self.input or self.input_untyped:type(self.model:type())\n   self.input:copy(self.input_untyped)\n   self.label = self.label or self.label_untyped:type(self.model:type())\n   
self.label:copy(self.label_untyped)\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.data = sys.clock() - self.clock\n\n   -- Forward propagation\n   self.clock = sys.clock()\n   self.output = self.model:forward(self.input)\n   self.objective = self.loss:forward(self.output, self.label)\n   if type(self.objective) ~= 'number' then self.objective = self.objective[1] end\n   self.max, self.decision = self.output:type(\n      torch.getdefaulttensortype()):max(2)\n   self.max = self.max:squeeze()\n   self.decision = self.decision:squeeze():type(torch.getdefaulttensortype())\n   self.error = torch.ne(self.decision, self.label_untyped):type(\n      torch.getdefaulttensortype()):sum() / self.label:size(1)\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.forward = sys.clock() - self.clock\n\n   -- Backward propagation\n   self.clock = sys.clock()\n   self.grads:zero()\n   self.grad_output = self.loss:backward(self.output, self.label)\n   self.grad_input = self.model:backward(self.input, self.grad_output)\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.backward = sys.clock() - self.clock\n\n   -- Update the step\n   self.clock = sys.clock()\n   self:sgd()\n   if self.model:type() == 'torch.CudaTensor' then cutorch.synchronize() end\n   self.time.update = sys.clock() - self.clock\nend\n\nfunction Train:sgd()\n   self.rate = self.rates[self.step] or self.rate\n   if self.momentum and self.momentum > 0 then\n      self.state:mul(self.momentum):add(self.grads:mul(-self.rate))\n      self.params:mul(1 - self.rate * self.decay):add(self.state)\n   else\n      self.params:mul(1 - self.rate * self.decay):add(\n         self.grads:mul(-self.rate))\n   end\nend\n\nreturn Train\n"
  },
  {
    "path": "glyphnet/unittest/data.lua",
    "content": "--[[\nUnit test for GlyphNet data program\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Data = require('data')\n\nlocal image = require('image')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe.init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe.init()\n   local config = {}\n   config.file = 'data/dianping/test_code.t7b'\n   config.unifont = 'unifont/unifont-8.0.01.t7b'\n   config.length = 512\n   config.batch = 16\n\n   joe.config = config\n   joe.data = Data(config)\nend\n\nfunction joe.getBatchTest()\n   local data = joe.data\n   local sample, label = data:getBatch()\n\n   print('Size of sample: ')\n   print(sample:size())\n   print('Size of label: ')\n   print(label:size())\n\n   io.write('Labels:')\n   for i = 1, label:size(1) do\n      io.write(' ', label[i])\n   end\n   io.write('\\n')\n\n   image.display{image = sample[1]:narrow(1, 1, 100),\n                 nrow = 10, zoom = 4}\n\n   joe.sample = sample\n   joe.label = label\nend\n\nfunction joe.iteratorTest()\n   local data = joe.data\n\n   local window\n   local total = 0\n   for sample, label, count in data:iterator() do\n      total = total + count\n      io.write(total, ',', count, ':')\n      for i = 1, count do\n         window = image.display{\n            image = sample[1][1], nrow = 10, zoom = 4, win = window}\n         io.write(' ', label[i])\n      end\n      io.write('\\n')\n      io.flush()\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/driver.lua",
    "content": "--[[\nUnit test for GlyphNet driver component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Driver = require('driver')\n\n--  A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Creating driver')\n   config.train_data.file = 'data/dianping/unittest_code.t7b'\n   config.test_data.file = 'data/dianping/unittest_code.t7b'\n   config.driver.debug = true\n   config.driver.device = 3\n   config.driver.steps = 10\n   config.driver.epoches = 30\n   config.driver.schedule = 4\n   config.driver.variation = 'small'\n   config.driver.location = '/tmp'\n   local driver = Driver(config, config.driver)\n\n   self.config = config\n   self.driver = driver\nend\n\nfunction joe:driverTest()\n   local driver = self.driver\n   print('Training schedule')\n   for i, v in pairs(driver.options.train.rates) do\n      print(i, v)\n   end\n   print('Testing driver')\n   driver:run()\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/model.lua",
    "content": "--[[\nUnit test for GlyphNet model component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal os = require('os')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   local model = Model(config.model)\n\n   local parameters, gradients = model:getParameters()\n   print('Parameter pointers: '..torch.pointer(parameters:storage())..' '..\n            torch.pointer(gradients:storage()))\n   print('Parameter sizes: '..parameters:nElement()..' '..gradients:nElement())\n\n   self.config = config\n   self.model = model\n   self.parameters = parameters\n   self.gradients = gradients\n\n   self:printModel()\nend\n\nfunction joe:printModel(model)\n   local model = model or self.model\n   print('Created spatial model: ')\n   print(model.spatial)\n   print('Created temporal model: ')\n   print(model.temporal)\n   print('Spatial group pointers:')\n   print(0, torch.pointer(model.spatial.modules[1].weight:storage()),\n         torch.pointer(model.spatial.modules[1].gradWeight:storage()))\n   for i, m in ipairs(model.group) do\n      print(i, torch.pointer(m.modules[1].weight:storage()),\n            torch.pointer(m.modules[1].gradWeight:storage()))\n   end\nend\n\nfunction joe:forwardBackwardTest()\n   local model = self.model\n\n   print('Initializing input')\n   local input = torch.rand(4, 512, 16, 16)\n   print('Input size:')\n   print(input:size())\n\n   print('Running forward propagation')\n   local output = model:forward(input)\n   print('Feature size:')\n   print(model.feature:size())\n   print('Output size:')\n   print(output:size())\n\n   
print('Initializing output gradients')\n   local grad_output = torch.rand(output:size())\n   print('Running backward propagation')\n   local grad_input = model:backward(input, grad_output)\n   print('Feature gradient size:')\n   print(model.grad_feature:size())\n\n   self.input = input\n   self.grad_input = grad_input\n   self.output = output\n   self.grad_output = grad_output\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   local file = '/tmp/model.t7b'\n   print('Saving to '..file)\n   model:save(file)\n   print('Model saved')\n\n   local config = {}\n   config.file = file\n   config.cudnn = joe.config.model.cudnn\n   config.group = joe.config.model.group\n   print('Loading from '..file)\n   model = Model(config)\n\n   self:printModel(model)\nend\n\nfunction joe:modeTest()\n   local model = self.model\n   print('Setting to testing mode')\n   model:setModeTest()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\n\n   print('Setting to training mode')\n   model:setModeTrain()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/model_cuda.lua",
    "content": "--[[\nUnit test for GlyphNet model component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal os = require('os')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.cudnn = nil\n\n   print('Changing device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n\n   local model = Model(config.model)\n   model:cuda()\n\n   local parameters, gradients = model:getParameters()\n   print('Parameter pointers: '..torch.pointer(parameters:storage())..' '..\n            torch.pointer(gradients:storage()))\n   print('Parameter sizes: '..parameters:nElement()..' 
'..gradients:nElement())\n\n   self.config = config\n   self.model = model\n   self.parameters = parameters\n   self.gradients = gradients\n\n   self:printModel()\nend\n\nfunction joe:printModel(model)\n   local model = model or self.model\n   print('Type of model: '..model:type())\n   print('Created spatial model: ')\n   print(model.spatial)\n   print('Created temporal model: ')\n   print(model.temporal)\n   print('Spatial group pointers:')\n   print(0, torch.pointer(model.spatial.modules[1].weight:storage()),\n         torch.pointer(model.spatial.modules[1].gradWeight:storage()))\n   for i, m in ipairs(model.group) do\n      print(i, torch.pointer(m.modules[1].weight:storage()),\n            torch.pointer(m.modules[1].gradWeight:storage()))\n   end\nend\n\nfunction joe:forwardBackwardTest()\n   local model = self.model\n\n   print('Initializing input')\n   local input = torch.rand(16, 512, 16, 16):type(model:type())\n   print('Input size:')\n   print(input:size())\n\n   print('Running forward propagation')\n   cutorch.synchronize()\n   sys.tic()\n   local output = model:forward(input)\n   cutorch.synchronize()\n   sys.toc(true)\n   print('Feature size:')\n   print(model.feature:size())\n   print('Output size:')\n   print(output:size())\n\n   print('Initializing output gradients')\n   local grad_output = torch.rand(output:size()):type(model:type())\n   print('Running backward propagation')\n   cutorch.synchronize()\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   cutorch.synchronize()\n   sys.toc(true)\n   print('Feature gradient size:')\n   print(model.grad_feature:size())\n\n   self.input = input\n   self.grad_input = grad_input\n   self.output = output\n   self.grad_output = grad_output\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   local file = '/tmp/model.t7b'\n   print('Saving to '..file)\n   model:save(file)\n   print('Model saved')\n\n   local config = {}\n   config.file = file\n   config.cudnn = 
joe.config.model.cudnn\n   config.group = joe.config.model.group\n   print('Loading from '..file)\n   model = Model(config)\n\n   self:printModel(model)\nend\n\nfunction joe:modeTest()\n   local model = self.model\n   print('Setting to testing mode')\n   model:setModeTest()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\n\n   print('Setting to training mode')\n   model:setModeTrain()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/model_cudnn.lua",
    "content": "--[[\nUnit test for GlyphNet model component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal os = require('os')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.cudnn = true\n\n   print('Changing device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n\n   local model = Model(config.model)\n   model:cuda()\n\n   local parameters, gradients = model:getParameters()\n   print('Parameter pointers: '..torch.pointer(parameters:storage())..' '..\n            torch.pointer(gradients:storage()))\n   print('Parameter sizes: '..parameters:nElement()..' 
'..gradients:nElement())\n\n   self.config = config\n   self.model = model\n   self.parameters = parameters\n   self.gradients = gradients\n\n   self:printModel()\nend\n\nfunction joe:printModel(model)\n   local model = model or self.model\n   print('Type of model: '..model:type())\n   print('Created spatial model: ')\n   print(model.spatial)\n   print('Created temporal model: ')\n   print(model.temporal)\n   print('Spatial group pointers:')\n   print(0, torch.pointer(model.spatial.modules[1].weight:storage()),\n         torch.pointer(model.spatial.modules[1].gradWeight:storage()))\n   for i, m in ipairs(model.group) do\n      print(i, torch.pointer(m.modules[1].weight:storage()),\n            torch.pointer(m.modules[1].gradWeight:storage()))\n   end\nend\n\nfunction joe:forwardBackwardTest()\n   local model = self.model\n\n   print('Initializing input')\n   local input = torch.rand(16, 512, 16, 16):type(model:type())\n   print('Input size:')\n   print(input:size())\n\n   print('Running forward propagation')\n   cutorch.synchronize()\n   sys.tic()\n   local output = model:forward(input)\n   cutorch.synchronize()\n   sys.toc(true)\n   print('Feature size:')\n   print(model.feature:size())\n   print('Output size:')\n   print(output:size())\n\n   print('Initializing output gradients')\n   local grad_output = torch.rand(output:size()):type(model:type())\n   print('Running backward propagation')\n   cutorch.synchronize()\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   cutorch.synchronize()\n   sys.toc(true)\n   print('Feature gradient size:')\n   print(model.grad_feature:size())\n\n   self.input = input\n   self.grad_input = grad_input\n   self.output = output\n   self.grad_output = grad_output\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   local file = '/tmp/model.t7b'\n   print('Saving to '..file)\n   model:save(file)\n   print('Model saved')\n\n   local config = {}\n   config.file = file\n   config.cudnn = 
joe.config.model.cudnn\n   config.group = joe.config.model.group\n   print('Loading from '..file)\n   model = Model(config)\n\n   self:printModel(model)\nend\n\nfunction joe:modeTest()\n   local model = self.model\n   print('Setting to testing mode')\n   model:setModeTest()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\n\n   print('Setting to training mode')\n   model:setModeTrain()\n   print('Temporal mode:')\n   for i, m in ipairs(model.temporal.modules) do\n      print(i, torch.type(m), m.train)\n   end\n   print('Spatial mode:')\n   for i, m in ipairs(model.spatial.modules) do\n      print(i, torch.type(m), m.train)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/modules_temporal.lua",
    "content": "--[[\nUnit test for modules\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal nn = require('modules')\n\nlocal cunn = require('cunn')\nlocal cutorch = require('cutorch')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe.init(joe)\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local device = 1\n   cutorch.setDevice(device)\n   print('Device set to '..device)\n   self.jacobian = nn.Jacobian\nend\n\nfunction joe:noBatchCPU(kernel, stride, pad)\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = nn.TemporalConvolutionMM(\n      input_feature, output_feature, kernel, stride, pad)\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n   \n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(input_feature, input_length)\n   print('Input size:')\n   print(input:size())\n   \n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n   \n   local input_pad = torch.zeros(input_feature, input_length + 2 * pad)\n   input_pad:narrow(2, pad + 1, input_length):copy(input)\n   for i = 1, output:size(2) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = input_pad:narrow(\n\t 2, input_begin, kernel):contiguous():view(\n\t 1, input_feature, kernel):expand(output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n\t temporal.weight, input_chunk):sum(3):sum(2):squeeze()\n    
  output_slice:add(1, temporal.bias:viewAs(output_slice))\n      print('Error of output slice '..i..': '..\n\t       output_slice:add(-1, output:select(2, i)):abs():mean())\n   end\n   \n   local grad_output = torch.rand(output:size())\n   print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n   \n   local grad_output_pad = torch.Tensor(\n      output_feature, input_length + kernel - 1):zero()\n   local interlace_length = stride * (grad_output:size(2) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(2) - interlace_length) / 2\n   for i = 1, grad_output:size(2) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n\t and grad_output_pad_begin <= grad_output_pad:size(2) then\n\t grad_output_pad:select(2, grad_output_pad_begin):copy(\n\t    grad_output:select(2, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(temporal.weight:size())\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, temporal.weight)\n   for i = 1, grad_input:size(2) do\n      local grad_output_pad_begin = i\n      local grad_output_pad_chunk = grad_output_pad:narrow(\n\t 2, grad_output_pad_begin, kernel):contiguous():view(\n\t output_feature, 1, kernel):expand(\n\t output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n\t weight_reverse, grad_output_pad_chunk):sum(3):sum(1):squeeze()\n      print('Error of input gradient slice '..i..': '..\n\t       grad_input_slice:add(-1, grad_input:select(2, i)):abs():mean())\n   end\n   \n   local input_unfold = input_pad:unfold(2, kernel, stride)\n   for i = 1, temporal.weight:size(3) do\n      local grad_weight_slice = torch.mm(\n\t grad_output, input_unfold:select(3, i):transpose(1, 2))\n      print('Error 
of weight gradient slice '..i..': '..grad_weight_slice:add(\n\t\t  -1, temporal.gradWeight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(2)\n   print('Error of bias gradient: '..grad_bias:add(\n\t       -1, temporal.gradBias):abs():mean())\n   \n   local jacobian = self.jacobian\n   local err = jacobian.testJacobian(temporal, input)\n   print('Error of jacobian test: '..err)\n   local err = jacobian.testJacobianParameters(\n      temporal, input, temporal.weight, temporal.gradWeight)\n   print('Error of jacobian test for weight: '..err)\n   local err = jacobian.testJacobianParameters(\n      temporal, input, temporal.bias, temporal.gradBias)\n   print('Error of jacobian test for bias: '..err)\n   local err = jacobian.testJacobianUpdateParameters(\n      temporal, input, temporal.weight)\n   print('Error of jacobian test for weight update: '..err)\n   local err = jacobian.testJacobianUpdateParameters(\n      temporal, input, temporal.bias)\n   print('Error of jacobian test for bias update: '..err)\n   for t,err in pairs(\n      jacobian.testAllUpdate(temporal, input, 'weight', 'gradWeight')) do\n      print('Error of jacobian test for '..t..' all update: '..err)\n   end\n   for t,err in pairs(\n      jacobian.testAllUpdate(temporal, input, 'bias', 'gradBias')) do\n      print('Error of jacobian test for '..t..' 
all update: '..err)\n   end\nend\n\nfunction joe:noBatchCPUTest()\n   for _, kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n\t for _, pad in ipairs({0, 1, 2, 3, 5}) do\n\t    self:noBatchCPU(kernel, stride, pad)\n\t end\n      end\n   end\nend\n\nfunction joe:batchCPU(kernel, stride, pad)\n   local batch = 4\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = nn.TemporalConvolutionMM(\n      input_feature, output_feature, kernel, stride, pad)\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n   \n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(batch, input_feature, input_length)\n   print('Input size:')\n   print(input:size())\n   \n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n\n   local input_pad = torch.zeros(batch, input_feature, input_length + 2 * pad)\n   input_pad:narrow(3, pad + 1, input_length):copy(input)\n   local weight = temporal.weight:view(\n      1, output_feature, input_feature, kernel):expand(\n      batch, output_feature, input_feature, kernel)\n   for i = 1, output:size(3) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = input_pad:narrow(\n\t 3, input_begin, kernel):contiguous():view(\n\t batch, 1, input_feature, kernel):expand(\n\t batch, output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n\t weight, input_chunk):sum(4):sum(3):squeeze()\n      output_slice:add(\n\t 1, temporal.bias:view(1, output_feature):expandAs(output_slice))\n      print('Error of output slice '..i..': '..\n\t       output_slice:add(-1, output:select(3, i)):abs():mean())\n   end\n\n   local grad_output = torch.rand(output:size())\n   
print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n\n   local grad_output_pad = torch.Tensor(\n      batch, output_feature, input_length + kernel - 1):zero()\n   local interlace_length = stride * (grad_output:size(3) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(3) - interlace_length) / 2\n   for i = 1, grad_output:size(3) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n\t and grad_output_pad_begin <= grad_output_pad:size(3) then\n\t grad_output_pad:select(3, grad_output_pad_begin):copy(\n\t    grad_output:select(3, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(temporal.weight:size())\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, temporal.weight)\n   for i = 1, grad_input:size(3) do\n      local grad_output_pad_begin = i\n      local grad_output_pad_chunk = grad_output_pad:narrow(\n\t 3, grad_output_pad_begin, kernel):contiguous():view(\n\t batch, output_feature, 1, kernel):expand(\n\t batch, output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n\t weight_reverse:view(1, output_feature, input_feature, kernel):expand(\n\t    batch, output_feature, input_feature, kernel),\n\t grad_output_pad_chunk):sum(4):sum(2):squeeze()\n      print('Error of input gradient slice '..i..': '..\n\t       grad_input_slice:add(-1, grad_input:select(3, i)):abs():mean())\n   end\n\n   local input_unfold = input_pad:unfold(3, kernel, stride)\n   for i = 1, temporal.weight:size(3) do\n      local grad_weight_slice = torch.bmm(\n\t grad_output, input_unfold:select(4, i):transpose(2, 3)):sum(\n\t 1):squeeze()\n      print('Error of weight gradient slice '..i..': '..grad_weight_slice:add(\n\t\t  -1, 
temporal.gradWeight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(3):sum(1)\n   print('Error of bias gradient: '..grad_bias:add(\n\t       -1, temporal.gradBias):abs():mean())\n\n   local jacobian = self.jacobian\n   local err = jacobian.testJacobian(temporal, input)\n   print('Error of jacobian test: '..err)\n   local err = jacobian.testJacobianParameters(\n      temporal, input, temporal.weight, temporal.gradWeight)\n   print('Error of jacobian test for weight: '..err)\n   local err = jacobian.testJacobianParameters(\n      temporal, input, temporal.bias, temporal.gradBias)\n   print('Error of jacobian test for bias: '..err)\n   local err = jacobian.testJacobianUpdateParameters(\n      temporal, input, temporal.weight)\n   print('Error of jacobian test for weight update: '..err)\n   local err = jacobian.testJacobianUpdateParameters(\n      temporal, input, temporal.bias)\n   print('Error of jacobian test for bias update: '..err)\n   for t,err in pairs(\n      jacobian.testAllUpdate(temporal, input, 'weight', 'gradWeight')) do\n      print('Error of jacobian test for '..t..' all update: '..err)\n   end\n   for t,err in pairs(\n      jacobian.testAllUpdate(temporal, input, 'bias', 'gradBias')) do\n      print('Error of jacobian test for '..t..' 
all update: '..err)\n   end\nend\n\nfunction joe:batchCPUTest()\n   for _, kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n\t for _, pad in ipairs({0, 1, 2, 3, 5}) do\n\t    self:batchCPU(kernel, stride, pad)\n\t end\n      end\n   end\nend\n\nfunction joe:noBatchGPU(kernel, stride, pad)\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = nn.TemporalConvolutionMM(\n      input_feature, output_feature, kernel, stride, pad):cuda()\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n   \n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(input_feature, input_length):cuda()\n   print('Input size:')\n   print(input:size())\n   \n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n   \n   local input_pad = torch.zeros(input_feature, input_length + 2 * pad):cuda()\n   input_pad:narrow(2, pad + 1, input_length):copy(input)\n   for i = 1, output:size(2) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = input_pad:narrow(\n         2, input_begin, kernel):contiguous():view(\n         1, input_feature, kernel):expand(output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n         temporal.weight, input_chunk):sum(3):sum(2):squeeze()\n      output_slice:add(1, temporal.bias:viewAs(output_slice))\n      print('Error of output slice '..i..': '..\n               output_slice:add(-1, output:select(2, i)):abs():mean())\n   end\n   \n   local grad_output = torch.rand(output:size()):cuda()\n   print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n   \n   
local grad_output_pad = torch.Tensor(\n      output_feature, input_length + kernel - 1):zero():cuda()\n   local interlace_length = stride * (grad_output:size(2) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(2) - interlace_length) / 2\n   for i = 1, grad_output:size(2) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n         and grad_output_pad_begin <= grad_output_pad:size(2) then\n            grad_output_pad:select(2, grad_output_pad_begin):copy(\n               grad_output:select(2, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(temporal.weight:size()):cuda()\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, temporal.weight)\n   for i = 1, grad_input:size(2) do\n      local grad_output_pad_begin = i\n      local grad_output_pad_chunk = grad_output_pad:narrow(\n         2, grad_output_pad_begin, kernel):contiguous():view(\n         output_feature, 1, kernel):expand(\n         output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n         weight_reverse, grad_output_pad_chunk):sum(3):sum(1):squeeze()\n      print('Error of input gradient slice '..i..': '..\n               grad_input_slice:add(-1, grad_input:select(2, i)):abs():mean())\n   end\n   \n   local input_unfold = input_pad:unfold(2, kernel, stride)\n   for i = 1, temporal.weight:size(3) do\n      local grad_weight_slice = torch.mm(\n         grad_output, input_unfold:select(3, i):transpose(1, 2))\n      print('Error of weight gradient slice '..i..': '..grad_weight_slice:add(\n                  -1, temporal.gradWeight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(2)\n   print('Error of bias gradient: '..grad_bias:add(\n               -1, temporal.gradBias):abs():mean())\nend\n\nfunction joe:noBatchGPUTest()\n   for _, 
kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n\t for _, pad in ipairs({0, 1, 2, 3, 5}) do\n\t    self:noBatchGPU(kernel, stride, pad)\n\t end\n      end\n   end\nend\n\nfunction joe:batchGPU(kernel, stride, pad)\n   local batch = 4\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = nn.TemporalConvolutionMM(\n      input_feature, output_feature, kernel, stride, pad):cuda()\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n   \n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(batch, input_feature, input_length):cuda()\n   print('Input size:')\n   print(input:size())\n   \n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n\n   local input_pad = torch.zeros(\n      batch, input_feature, input_length + 2 * pad):cuda()\n   input_pad:narrow(3, pad + 1, input_length):copy(input)\n   local weight = temporal.weight:view(\n      1, output_feature, input_feature, kernel):expand(\n      batch, output_feature, input_feature, kernel)\n   for i = 1, output:size(3) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = input_pad:narrow(\n         3, input_begin, kernel):contiguous():view(\n         batch, 1, input_feature, kernel):expand(\n         batch, output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n         weight, input_chunk):sum(4):sum(3):squeeze()\n      output_slice:add(\n         1, temporal.bias:view(1, output_feature):expandAs(output_slice))\n      print('Error of output slice '..i..': '..\n               output_slice:add(-1, output:select(3, i)):abs():mean())\n   end\n\n   local grad_output = torch.rand(output:size()):cuda()\n   
print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n\n   local grad_output_pad = torch.Tensor(\n      batch, output_feature, input_length + kernel - 1):zero():cuda()\n   local interlace_length = stride * (grad_output:size(3) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(3) - interlace_length) / 2\n   for i = 1, grad_output:size(3) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n         and grad_output_pad_begin <= grad_output_pad:size(3) then\n            grad_output_pad:select(3, grad_output_pad_begin):copy(\n               grad_output:select(3, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(temporal.weight:size()):cuda()\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, temporal.weight)\n   for i = 1, grad_input:size(3) do\n      local grad_output_pad_begin = i\n      local grad_output_pad_chunk = grad_output_pad:narrow(\n         3, grad_output_pad_begin, kernel):contiguous():view(\n         batch, output_feature, 1, kernel):expand(\n         batch, output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n         weight_reverse:view(1, output_feature, input_feature, kernel):expand(\n            batch, output_feature, input_feature, kernel),\n         grad_output_pad_chunk):sum(4):sum(2):squeeze()\n      print('Error of input gradient slice '..i..': '..\n               grad_input_slice:add(-1, grad_input:select(3, i)):abs():mean())\n   end\n\n   local input_unfold = input_pad:unfold(3, kernel, stride)\n   for i = 1, temporal.weight:size(3) do\n      local grad_weight_slice = torch.bmm(\n         grad_output, input_unfold:select(4, i):transpose(2, 3)):sum(\n         1):squeeze()\n      
print('Error of weight gradient slice '..i..': '..grad_weight_slice:add(\n                  -1, temporal.gradWeight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(3):sum(1)\n   print('Error of bias gradient: '..grad_bias:add(\n               -1, temporal.gradBias):abs():mean())\nend\n\nfunction joe:batchGPUTest()\n   for _, kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n         for _, pad in ipairs({0, 1, 2, 3, 5}) do\n            self:batchGPU(kernel, stride, pad)\n         end\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/modules_temporal_cudnn.lua",
    "content": "--[[\nUnit test for modules\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal nn = require('modules')\n\nlocal cudnn = require('cudnn')\nlocal cunn = require('cunn')\nlocal cutorch = require('cutorch')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe.init(joe)\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local device = 2\n   cutorch.setDevice(device)\n   print('Device set to '..device)\n   self.jacobian = nn.Jacobian\nend\n\nfunction joe:noBatchGPU(kernel, stride, pad)\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = cudnn.TemporalConvolutionCudnn(\n      input_feature, output_feature, kernel, stride, pad):cuda()\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n\n   local weight = temporal.weight:view(output_feature, input_feature, kernel)\n   local grad_weight = temporal.gradWeight:view(\n      output_feature, input_feature, kernel)\n\n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(input_feature, input_length):cuda()\n   print('Input size:')\n   print(input:size())\n\n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n   \n   local input_pad = torch.zeros(input_feature, input_length + 2 * pad):cuda()\n   input_pad:narrow(2, pad + 1, input_length):copy(input)\n   for i = 1, output:size(2) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = 
input_pad:narrow(\n         2, input_begin, kernel):contiguous():view(\n         1, input_feature, kernel):expand(output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n         weight, input_chunk):sum(3):sum(2):squeeze()\n      output_slice:add(1, temporal.bias:viewAs(output_slice))\n      print('Error of output slice '..i..': '..\n               output_slice:add(-1, output:select(2, i)):abs():mean())\n   end\n   \n   local grad_output = torch.rand(output:size()):cuda()\n   print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n   \n   local grad_output_pad = torch.Tensor(\n      output_feature, input_length + kernel - 1):zero():cuda()\n   local interlace_length = stride * (grad_output:size(2) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(2) - interlace_length) / 2\n   for i = 1, grad_output:size(2) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n         and grad_output_pad_begin <= grad_output_pad:size(2) then\n            grad_output_pad:select(2, grad_output_pad_begin):copy(\n               grad_output:select(2, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(weight:size()):cuda()\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, weight)\n   for i = 1, grad_input:size(2) do\n      local grad_output_pad_begin = i\n      local grad_output_pad_chunk = grad_output_pad:narrow(\n         2, grad_output_pad_begin, kernel):contiguous():view(\n         output_feature, 1, kernel):expand(\n         output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n         weight_reverse, grad_output_pad_chunk):sum(3):sum(1):squeeze()\n      print('Error of input gradient slice '..i..': '..\n   
            grad_input_slice:add(-1, grad_input:select(2, i)):abs():mean())\n   end\n   \n   local input_unfold = input_pad:unfold(2, kernel, stride)\n   for i = 1, weight:size(3) do\n      local grad_weight_slice = torch.mm(\n         grad_output, input_unfold:select(3, i):transpose(1, 2))\n      print('Error of weight gradient slice '..i..': '..grad_weight_slice:add(\n                  -1, grad_weight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(2)\n   print('Error of bias gradient: '..grad_bias:add(\n               -1, temporal.gradBias):abs():mean())\nend\n\nfunction joe:noBatchGPUTest()\n   for _, kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n         for _, pad in ipairs({0, 1, 2, 3, 5}) do\n            self:noBatchGPU(kernel, stride, pad)\n         end\n      end\n   end\nend\n\nfunction joe:batchGPU(kernel, stride, pad)\n   local batch = 4\n   local input_feature = 2\n   local output_feature = 4\n   local kernel = kernel or 3\n   local stride = stride or 1\n   local pad = pad or 0\n   local temporal = cudnn.TemporalConvolutionCudnn(\n      input_feature, output_feature, kernel, stride, pad):cuda()\n   print('Created module: '..tostring(temporal))\n   temporal.gradWeight:zero()\n   temporal.gradBias:zero()\n\n   local temporal_weight = temporal.weight:view(\n      output_feature, input_feature, kernel)\n   local temporal_grad_weight = temporal.gradWeight:view(\n      output_feature, input_feature, kernel)\n   \n   local output_length = 16\n   local input_length = (output_length - 1) * stride + kernel - 2 * pad\n   \n   local input = torch.rand(batch, input_feature, input_length):cuda()\n   print('Input size:')\n   print(input:size())\n   \n   print('Executing forward propagation')\n   local output = temporal:forward(input)\n   print('Output size: ')\n   print(output:size())\n\n   local input_pad = torch.zeros(\n      batch, input_feature, input_length + 2 * pad):cuda()\n   input_pad:narrow(3, pad + 1, 
input_length):copy(input)\n   local weight = temporal_weight:view(\n      1, output_feature, input_feature, kernel):expand(\n      batch, output_feature, input_feature, kernel)\n   for i = 1, output:size(3) do\n      local input_begin = (i - 1) * stride + 1\n      local input_chunk = input_pad:narrow(\n         3, input_begin, kernel):contiguous():view(\n         batch, 1, input_feature, kernel):expand(\n         batch, output_feature, input_feature, kernel)\n      local output_slice = torch.cmul(\n         weight, input_chunk):sum(4):sum(3):squeeze()\n      output_slice:add(\n         1, temporal.bias:view(1, output_feature):expandAs(output_slice))\n      print('Error of output slice '..i..': '..\n               output_slice:add(-1, output:select(3, i)):abs():mean())\n   end\n\n   local grad_output = torch.rand(output:size()):cuda()\n   print('Executing backward propagation')\n   local grad_input = temporal:backward(input, grad_output)\n   print('Input gradient size: ')\n   print(grad_input:size())\n\n   local grad_output_pad = torch.Tensor(\n      batch, output_feature, input_length + kernel - 1):zero():cuda()\n   local interlace_length = stride * (grad_output:size(3) - 1) + 1\n   local interlace_shift = (grad_output_pad:size(3) - interlace_length) / 2\n   for i = 1, grad_output:size(3) do\n      local grad_output_pad_begin = (i - 1) * stride + 1 + interlace_shift\n      if grad_output_pad_begin >= 1\n         and grad_output_pad_begin <= grad_output_pad:size(3) then\n            grad_output_pad:select(3, grad_output_pad_begin):copy(\n               grad_output:select(3, i))\n      end\n   end\n   local weight_reverse = torch.Tensor(temporal_weight:size()):cuda()\n   local weight_index = torch.LongTensor(kernel)\n   for i = 1, weight_index:size(1) do\n      weight_index[i] = kernel - i + 1\n   end\n   weight_reverse:indexCopy(3, weight_index, temporal_weight)\n   for i = 1, grad_input:size(3) do\n      local grad_output_pad_begin = i\n      local 
grad_output_pad_chunk = grad_output_pad:narrow(\n         3, grad_output_pad_begin, kernel):contiguous():view(\n         batch, output_feature, 1, kernel):expand(\n         batch, output_feature, input_feature, kernel)\n      local grad_input_slice = torch.cmul(\n         weight_reverse:view(1, output_feature, input_feature, kernel):expand(\n            batch, output_feature, input_feature, kernel),\n         grad_output_pad_chunk):sum(4):sum(2):squeeze()\n      print('Error of input gradient slice '..i..': '..\n               grad_input_slice:add(-1, grad_input:select(3, i)):abs():mean())\n   end\n\n   local input_unfold = input_pad:unfold(3, kernel, stride)\n   for i = 1, temporal_weight:size(3) do\n      local grad_weight_slice = torch.bmm(\n         grad_output, input_unfold:select(4, i):transpose(2, 3)):sum(\n         1):squeeze()\n      print('Error of weight gradient slice '..i..': '..grad_weight_slice:add(\n                  -1, temporal_grad_weight:select(3, i)):abs():mean())\n   end\n   \n   local grad_bias = grad_output:sum(3):sum(1)\n   print('Error of bias gradient: '..grad_bias:add(\n               -1, temporal.gradBias):abs():mean())\nend\n\nfunction joe:batchGPUTest()\n   for _, kernel in ipairs({3, 5}) do\n      for _, stride in ipairs({1, 2, 3, 5}) do\n         for _, pad in ipairs({0, 1, 2, 3, 5}) do\n            self:batchGPU(kernel, stride, pad)\n         end\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/test.lua",
    "content": "--[[\nUnit test for GlyphNet test component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   local data = Data(config.test_data)\n   print('Create model')\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/test_cuda.lua",
    "content": "--[[\nUnit test for GlyphNet test component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   local data = Data(config.test_data)\n   print('Create model')\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/train.lua",
    "content": "--[[\nUnit test for GlyphNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   local data = Data(config.test_data)\n   print('Create model')\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create trainer')\n   config.train.rates[4] = 1e-5\n   local train = Train(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 10 steps')\n   train:run(10, callback)\nend\n\nfunction joe:callback()\n   return function (train, i)\n      print('stp: '..train.step..', rat: '..train.rate..\n               ', obj: '..train.objective..', dat: '..train.time.data..\n               ', fwd: '..train.time.forward..', bwd: '..train.time.backward..\n               ', upd: '..train.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/unittest/train_cuda.lua",
    "content": "--[[\nUnit test for GlyphNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   local data = Data(config.test_data)\n   print('Create model')\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create trainer')\n   config.train.rates[79] = 1e-5\n   config.train.rates[85] = config.train.rates[1]\n   local train = Train(data, model, loss, config.train)\n\n   print('pmn: '..train.params:mean()..', psd: '..train.params:std()..\n            ', gmn: '..train.grads:mean()..', gsd: '..train.grads:std()..\n            ', smn: '..train.state:mean()..', ssd: '..train.state:std())\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 100 steps')\n   train:run(100, callback)\nend\n\nfunction joe:callback()\n   return function (train, i)\n      print('stp: '..train.step..', rat: '..train.rate..', err: '..train.error..\n               ', obj: '..train.objective..', dat: '..train.time.data..\n               ', fwd: 
'..train.time.forward..', bwd: '..train.time.backward..\n               ', upd: '..train.time.update..', pmn: '..train.params:mean()..\n               ', psd: '..train.params:std()..', gmn: '..train.grads:mean()..\n               ', gsd: '..train.grads:std()..', smn: '..train.state:mean()..\n               ', ssd: '..train.state:std())\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "glyphnet/visualizer.lua",
    "content": "--[[\nVisualization module for glyphnet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal torch = require('torch')\n\nlocal Scroll = require('scroll')\n\nlocal Visualizer = class()\n\n-- Constructor\n--  config: configuration table\n--    .width: (optional) width of scrollable window\n--    .scale: (optional) scale of visualizing weights\n--    .title: (optional) title of the scrollable window\n--    .height: (optional) maximum height of visualization for a module\nfunction Visualizer:_init(config)\n   local config = config or {}\n   self.width = config.width or 800\n   self.scale = config.scale or 4\n   self.title = config.title or \"Visualizer\"\n   self.height = config.height or 64\n   self.win = Scroll(self.width, self.title)\nend\n\n-- Save wrapper\nfunction Visualizer:save(...)\n   return self.win:save(...)\nend\n\n-- Visualize the weights of a sequential model\n-- model: the sequential model\nfunction Visualizer:drawSequential(model)\n   self.win:clear()\n   for i, m in ipairs(model.modules) do\n      self.win:drawText(tostring(i)..\": \"..tostring(m))\n      if self.drawModule[torch.type(m)] then\n         self.drawModule[torch.type(m)](self, m)\n      end\n   end\nend\n\n-- Draw an image with height hints\nfunction Visualizer:drawImage(im, y_zero, max, min)\n   local win = self.win\n   local y = win:hintImageHeight(im, self.scale)\n   if y - y_zero > self.height then\n      return false\n   end\n   local max = max or im:max()\n   local min = min or im:min()\n   local normalized = torch.Tensor(im:size()):copy(im):add(-min)\n   if max - min > 0 then\n      normalized:div(max - min)\n   end\n   win:drawImage(normalized, self.scale)\n   return true\nend\n\n-- A table of drawing functions indexed by module type\nVisualizer.drawModule = {}\nVisualizer.drawModule['nn.Linear'] = function (self, m)\n   local weight = m.weight\n   local y_zero = self.win.y\n\n   for i = 1, m.weight:size(1) do\n      local w = 
weight[i]:view(1, weight:size(2))\n      if not self:drawImage(w, y_zero) then\n         return\n      end\n   end\n\n   self:drawImage(m.bias:view(1, m.bias:size(1)), y_zero)\nend\nVisualizer.drawModule['nn.SpatialConvolution'] = function (self, m)\n   local weight = m.weight:view(m.nOutputPlane, m.nInputPlane, m.kH, m.kW)\n   local height = m.kH\n   local width = m.kW\n   local y_zero = self.win.y\n   local max = weight:max()\n   local min = weight:min()\n   \n   if m.nInputPlane == 3 then\n      for i = 1, m.nOutputPlane do\n         local w = weight[i]\n         if not self:drawImage(w, y_zero, max, min) then\n            return\n         end\n      end\n   else\n      for i = 1, m.nOutputPlane do\n         for j = 1, m.nInputPlane do\n            local w = weight[i][j]\n            if not self:drawImage(w, y_zero, max, min) then\n               return\n            end\n         end\n      end\n   end\n\n   self:drawImage(m.bias:view(1, m.nOutputPlane), y_zero)\nend\nVisualizer.drawModule['nn.SpatialConvolutionMM'] =\n   Visualizer.drawModule['nn.SpatialConvolution']\nVisualizer.drawModule['cudnn.SpatialConvolution'] =\n   Visualizer.drawModule['nn.SpatialConvolution']\nVisualizer.drawModule['nn.TemporalConvolutionMM'] = function (self, m)\n   local weight = m.weight:view(m.output_feature, m.input_feature, m.kernel)\n   local y_zero = self.win.y\n   local max = weight:max()\n   local min = weight:min()\n\n   for i = 1, m.output_feature do\n      local w = weight[i]:transpose(2, 1)\n      if not self:drawImage(w, y_zero, max, min) then\n         return\n      end\n   end\nend\nVisualizer.drawModule['cudnn.TemporalConvolutionCudnn'] = function (self, m)\n   local weight = m.weight:view(m.nOutputPlane, m.nInputPlane, m.kW)\n   local y_zero = self.win.y\n   local max = weight:max()\n   local min = weight:min()\n\n   for i = 1, m.nOutputPlane do\n      local w = weight[i]:transpose(2, 1)\n      if not self:drawImage(w, y_zero, max, min) then\n         return\n      
end\n   end\nend\n\nreturn Visualizer\n"
  },
  {
    "path": "linearnet/archive/11stbinary_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/charbag -train_data_file data/11st/sentiment/binary_train_charbag.t7b -test_data_file data/11st/sentiment/binary_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/charbagtfidf -train_data_file data/11st/sentiment/binary_train_charbagtfidf.t7b -test_data_file data/11st/sentiment/binary_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/chargram -train_data_file data/11st/sentiment/binary_train_chargram.t7b -test_data_file data/11st/sentiment/binary_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/chargramtfidf -train_data_file data/11st/sentiment/binary_train_chargramtfidf.t7b -test_data_file data/11st/sentiment/binary_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordbag -train_data_file data/11st/sentiment/binary_train_wordbag.t7b -test_data_file data/11st/sentiment/binary_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordbagroman -train_data_file data/11st/sentiment/binary_train_rr_wordbag.t7b -test_data_file data/11st/sentiment/binary_test_rr_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordbagtfidf -train_data_file data/11st/sentiment/binary_train_wordbagtfidf.t7b -test_data_file data/11st/sentiment/binary_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordbagtfidfroman -train_data_file data/11st/sentiment/binary_train_rr_wordbagtfidf.t7b -test_data_file data/11st/sentiment/binary_test_rr_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordgram -train_data_file data/11st/sentiment/binary_train_wordgram.t7b -test_data_file data/11st/sentiment/binary_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordgramroman -train_data_file data/11st/sentiment/binary_train_rr_wordgram.t7b -test_data_file data/11st/sentiment/binary_test_rr_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordgramtfidf -train_data_file data/11st/sentiment/binary_train_wordgramtfidf.t7b -test_data_file data/11st/sentiment/binary_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stbinary_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stbinary/wordgramtfidfroman -train_data_file data/11st/sentiment/binary_train_rr_wordgramtfidf.t7b -test_data_file data/11st/sentiment/binary_test_rr_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/charbag -train_data_file data/11st/sentiment/full_train_charbag.t7b -test_data_file data/11st/sentiment/full_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/charbagtfidf -train_data_file data/11st/sentiment/full_train_charbagtfidf.t7b -test_data_file data/11st/sentiment/full_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/chargram -train_data_file data/11st/sentiment/full_train_chargram.t7b -test_data_file data/11st/sentiment/full_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/chargramtfidf -train_data_file data/11st/sentiment/full_train_chargramtfidf.t7b -test_data_file data/11st/sentiment/full_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordbag -train_data_file data/11st/sentiment/full_train_wordbag.t7b -test_data_file data/11st/sentiment/full_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordbagroman -train_data_file data/11st/sentiment/full_train_rr_wordbag.t7b -test_data_file data/11st/sentiment/full_test_rr_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordbagtfidf -train_data_file data/11st/sentiment/full_train_wordbagtfidf.t7b -test_data_file data/11st/sentiment/full_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordbagtfidfroman -train_data_file data/11st/sentiment/full_train_rr_wordbagtfidf.t7b -test_data_file data/11st/sentiment/full_test_rr_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordgram -train_data_file data/11st/sentiment/full_train_wordgram.t7b -test_data_file data/11st/sentiment/full_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordgramroman -train_data_file data/11st/sentiment/full_train_rr_wordgram.t7b -test_data_file data/11st/sentiment/full_test_rr_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordgramtfidf -train_data_file data/11st/sentiment/full_train_wordgramtfidf.t7b -test_data_file data/11st/sentiment/full_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/11stfull_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/11stfull/wordgramtfidfroman -train_data_file data/11st/sentiment/full_train_rr_wordgramtfidf.t7b -test_data_file data/11st/sentiment/full_test_rr_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/charbag -train_data_file data/amazon/binary_train_charbag.t7b -test_data_file data/amazon/binary_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/charbagtfidf -train_data_file data/amazon/binary_train_charbagtfidf.t7b -test_data_file data/amazon/binary_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/chargram -train_data_file data/amazon/binary_train_chargram.t7b -test_data_file data/amazon/binary_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/chargramtfidf -train_data_file data/amazon/binary_train_chargramtfidf.t7b -test_data_file data/amazon/binary_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/wordbag -train_data_file data/amazon/binary_train_wordbag.t7b -test_data_file data/amazon/binary_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/wordbagtfidf -train_data_file data/amazon/binary_train_wordbagtfidf.t7b -test_data_file data/amazon/binary_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/wordgram -train_data_file data/amazon/binary_train_wordgram.t7b -test_data_file data/amazon/binary_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonbinary_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonbinary/wordgramtfidf -train_data_file data/amazon/binary_train_wordgramtfidf.t7b -test_data_file data/amazon/binary_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/charbag -train_data_file data/amazon/full_train_charbag.t7b -test_data_file data/amazon/full_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/charbagtfidf -train_data_file data/amazon/full_train_charbagtfidf.t7b -test_data_file data/amazon/full_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/chargram -train_data_file data/amazon/full_train_chargram.t7b -test_data_file data/amazon/full_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/chargramtfidf -train_data_file data/amazon/full_train_chargramtfidf.t7b -test_data_file data/amazon/full_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/wordbag -train_data_file data/amazon/full_train_wordbag.t7b -test_data_file data/amazon/full_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/wordbagtfidf -train_data_file data/amazon/full_train_wordbagtfidf.t7b -test_data_file data/amazon/full_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/wordgram -train_data_file data/amazon/full_train_wordgram.t7b -test_data_file data/amazon/full_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/amazonfull_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/amazonfull/wordgramtfidf -train_data_file data/amazon/full_train_wordgramtfidf.t7b -test_data_file data/amazon/full_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/charbag -train_data_file data/chinanews/topic/train_charbag.t7b -test_data_file data/chinanews/topic/test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/charbagtfidf -train_data_file data/chinanews/topic/train_charbagtfidf.t7b -test_data_file data/chinanews/topic/test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/chargram -train_data_file data/chinanews/topic/train_chargram.t7b -test_data_file data/chinanews/topic/test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/chargramtfidf -train_data_file data/chinanews/topic/train_chargramtfidf.t7b -test_data_file data/chinanews/topic/test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordbag -train_data_file data/chinanews/topic/train_wordbag.t7b -test_data_file data/chinanews/topic/test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordbagroman -train_data_file data/chinanews/topic/train_pinyin_wordbag.t7b -test_data_file data/chinanews/topic/test_pinyin_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordbagtfidf -train_data_file data/chinanews/topic/train_wordbagtfidf.t7b -test_data_file data/chinanews/topic/test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordbagtfidfroman -train_data_file data/chinanews/topic/train_pinyin_wordbagtfidf.t7b -test_data_file data/chinanews/topic/test_pinyin_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordgram -train_data_file data/chinanews/topic/train_wordgram.t7b -test_data_file data/chinanews/topic/test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordgramroman -train_data_file data/chinanews/topic/train_pinyin_wordgram.t7b -test_data_file data/chinanews/topic/test_pinyin_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordgramtfidf -train_data_file data/chinanews/topic/train_wordgramtfidf.t7b -test_data_file data/chinanews/topic/test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/chinanews_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/chinanews/wordgramtfidfroman -train_data_file data/chinanews/topic/train_pinyin_wordgramtfidf.t7b -test_data_file data/chinanews/topic/test_pinyin_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/charbagtfidf -train_data_file data/dianping/train_charbagtfidf.t7b -test_data_file data/dianping/test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/chargram -train_data_file data/dianping/train_chargram.t7b -test_data_file data/dianping/test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/chargramtfidf -train_data_file data/dianping/train_chargramtfidf.t7b -test_data_file data/dianping/test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordbag -train_data_file data/dianping/train_wordbag.t7b -test_data_file data/dianping/test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordbagroman -train_data_file data/dianping/train_pinyin_wordbag.t7b -test_data_file data/dianping/test_pinyin_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordbagtfidf -train_data_file data/dianping/train_wordbagtfidf.t7b -test_data_file data/dianping/test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordbagtfidfroman -train_data_file data/dianping/train_pinyin_wordbagtfidf.t7b -test_data_file data/dianping/test_pinyin_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordgram -train_data_file data/dianping/train_wordgram.t7b -test_data_file data/dianping/test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordgramroman -train_data_file data/dianping/train_pinyin_wordgram.t7b -test_data_file data/dianping/test_pinyin_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordgramtfidf -train_data_file data/dianping/train_wordgramtfidf.t7b -test_data_file data/dianping/test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/dianping_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/dianping/wordgramtfidfroman -train_data_file data/dianping/train_pinyin_wordgramtfidf.t7b -test_data_file data/dianping/test_pinyin_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/charbag -train_data_file data/ifeng/topic/train_charbag.t7b -test_data_file data/ifeng/topic/test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/charbagtfidf -train_data_file data/ifeng/topic/train_charbagtfidf.t7b -test_data_file data/ifeng/topic/test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/chargram -train_data_file data/ifeng/topic/train_chargram.t7b -test_data_file data/ifeng/topic/test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/chargramtfidf -train_data_file data/ifeng/topic/train_chargramtfidf.t7b -test_data_file data/ifeng/topic/test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordbag -train_data_file data/ifeng/topic/train_wordbag.t7b -test_data_file data/ifeng/topic/test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordbagroman -train_data_file data/ifeng/topic/train_pinyin_wordbag.t7b -test_data_file data/ifeng/topic/test_pinyin_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordbagtfidf -train_data_file data/ifeng/topic/train_wordbagtfidf.t7b -test_data_file data/ifeng/topic/test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordbagtfidfroman -train_data_file data/ifeng/topic/train_pinyin_wordbagtfidf.t7b -test_data_file data/ifeng/topic/test_pinyin_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordgram -train_data_file data/ifeng/topic/train_wordgram.t7b -test_data_file data/ifeng/topic/test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordgramroman -train_data_file data/ifeng/topic/train_pinyin_wordgram.t7b -test_data_file data/ifeng/topic/test_pinyin_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordgramtfidf -train_data_file data/ifeng/topic/train_wordgramtfidf.t7b -test_data_file data/ifeng/topic/test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/ifeng_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/ifeng/wordgramtfidfroman -train_data_file data/ifeng/topic/train_pinyin_wordgramtfidf.t7b -test_data_file data/ifeng/topic/test_pinyin_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/charbag -train_data_file data/jd/sentiment/binary_train_charbag.t7b -test_data_file data/jd/sentiment/binary_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/charbagtfidf -train_data_file data/jd/sentiment/binary_train_charbagtfidf.t7b -test_data_file data/jd/sentiment/binary_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/chargram -train_data_file data/jd/sentiment/binary_train_chargram.t7b -test_data_file data/jd/sentiment/binary_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/chargramtfidf -train_data_file data/jd/sentiment/binary_train_chargramtfidf.t7b -test_data_file data/jd/sentiment/binary_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordbag -train_data_file data/jd/sentiment/binary_train_wordbag.t7b -test_data_file data/jd/sentiment/binary_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordbagroman -train_data_file data/jd/sentiment/binary_train_pinyin_wordbag.t7b -test_data_file data/jd/sentiment/binary_test_pinyin_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordbagtfidf -train_data_file data/jd/sentiment/binary_train_wordbagtfidf.t7b -test_data_file data/jd/sentiment/binary_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordbagtfidfroman -train_data_file data/jd/sentiment/binary_train_pinyin_wordbagtfidf.t7b -test_data_file data/jd/sentiment/binary_test_pinyin_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordgram -train_data_file data/jd/sentiment/binary_train_wordgram.t7b -test_data_file data/jd/sentiment/binary_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordgramroman -train_data_file data/jd/sentiment/binary_train_pinyin_wordgram.t7b -test_data_file data/jd/sentiment/binary_test_pinyin_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordgramtfidf -train_data_file data/jd/sentiment/binary_train_wordgramtfidf.t7b -test_data_file data/jd/sentiment/binary_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdbinary_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdbinary/wordgramtfidfroman -train_data_file data/jd/sentiment/binary_train_pinyin_wordgramtfidf.t7b -test_data_file data/jd/sentiment/binary_test_pinyin_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/charbag -train_data_file data/jd/sentiment/full_train_charbag.t7b -test_data_file data/jd/sentiment/full_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/charbagtfidf -train_data_file data/jd/sentiment/full_train_charbagtfidf.t7b -test_data_file data/jd/sentiment/full_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/chargram -train_data_file data/jd/sentiment/full_train_chargram.t7b -test_data_file data/jd/sentiment/full_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/chargramtfidf -train_data_file data/jd/sentiment/full_train_chargramtfidf.t7b -test_data_file data/jd/sentiment/full_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordbag -train_data_file data/jd/sentiment/full_train_wordbag.t7b -test_data_file data/jd/sentiment/full_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordbagroman -train_data_file data/jd/sentiment/full_train_pinyin_wordbag.t7b -test_data_file data/jd/sentiment/full_test_pinyin_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordbagtfidf -train_data_file data/jd/sentiment/full_train_wordbagtfidf.t7b -test_data_file data/jd/sentiment/full_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordbagtfidfroman -train_data_file data/jd/sentiment/full_train_pinyin_wordbagtfidf.t7b -test_data_file data/jd/sentiment/full_test_pinyin_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordgram -train_data_file data/jd/sentiment/full_train_wordgram.t7b -test_data_file data/jd/sentiment/full_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordgramroman -train_data_file data/jd/sentiment/full_train_pinyin_wordgram.t7b -test_data_file data/jd/sentiment/full_test_pinyin_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordgramtfidf -train_data_file data/jd/sentiment/full_train_wordgramtfidf.t7b -test_data_file data/jd/sentiment/full_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jdfull_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jdfull/wordgramtfidfroman -train_data_file data/jd/sentiment/full_train_pinyin_wordgramtfidf.t7b -test_data_file data/jd/sentiment/full_test_pinyin_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/charbag -train_data_file data/joint/binary_train_charbag.t7b -test_data_file data/joint/binary_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/charbagtfidf -train_data_file data/joint/binary_train_charbagtfidf.t7b -test_data_file data/joint/binary_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/chargram -train_data_file data/joint/binary_train_chargram.t7b -test_data_file data/joint/binary_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/chargramtfidf -train_data_file data/joint/binary_train_chargramtfidf.t7b -test_data_file data/joint/binary_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordbag -train_data_file data/joint/binary_train_wordbag.t7b -test_data_file data/joint/binary_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordbagroman -train_data_file data/joint/binary_train_roman_wordbag.t7b -test_data_file data/joint/binary_test_roman_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordbagtfidf -train_data_file data/joint/binary_train_wordbagtfidf.t7b -test_data_file data/joint/binary_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordbagtfidfroman -train_data_file data/joint/binary_train_roman_wordbagtfidf.t7b -test_data_file data/joint/binary_test_roman_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordgram -train_data_file data/joint/binary_train_wordgram.t7b -test_data_file data/joint/binary_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordgramroman -train_data_file data/joint/binary_train_roman_wordgram.t7b -test_data_file data/joint/binary_test_roman_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordgramtfidf -train_data_file data/joint/binary_train_wordgramtfidf.t7b -test_data_file data/joint/binary_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointbinary_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointbinary/wordgramtfidfroman -train_data_file data/joint/binary_train_roman_wordgramtfidf.t7b -test_data_file data/joint/binary_test_roman_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/charbag -train_data_file data/joint/full_train_charbag.t7b -test_data_file data/joint/full_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/charbagtfidf -train_data_file data/joint/full_train_charbagtfidf.t7b -test_data_file data/joint/full_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/chargram -train_data_file data/joint/full_train_chargram.t7b -test_data_file data/joint/full_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/chargramtfidf -train_data_file data/joint/full_train_chargramtfidf.t7b -test_data_file data/joint/full_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordbag -train_data_file data/joint/full_train_wordbag.t7b -test_data_file data/joint/full_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordbagroman -train_data_file data/joint/full_train_roman_wordbag.t7b -test_data_file data/joint/full_test_roman_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordbagtfidf -train_data_file data/joint/full_train_wordbagtfidf.t7b -test_data_file data/joint/full_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordbagtfidfroman -train_data_file data/joint/full_train_roman_wordbagtfidf.t7b -test_data_file data/joint/full_test_roman_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordgram -train_data_file data/joint/full_train_wordgram.t7b -test_data_file data/joint/full_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordgramroman -train_data_file data/joint/full_train_roman_wordgram.t7b -test_data_file data/joint/full_test_roman_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordgramtfidf -train_data_file data/joint/full_train_wordgramtfidf.t7b -test_data_file data/joint/full_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/jointfull_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/jointfull/wordgramromantfidf -train_data_file data/joint/full_train_roman_wordgramtfidf.t7b -test_data_file data/joint/full_test_roman_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/charbag -train_data_file data/nytimes/topic/train_charbag.t7b -test_data_file data/nytimes/topic/test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/charbagtfidf -train_data_file data/nytimes/topic/train_charbagtfidf.t7b -test_data_file data/nytimes/topic/test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/chargram -train_data_file data/nytimes/topic/train_chargram.t7b -test_data_file data/nytimes/topic/test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/chargramtfidf -train_data_file data/nytimes/topic/train_chargramtfidf.t7b -test_data_file data/nytimes/topic/test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/wordbag -train_data_file data/nytimes/topic/train_wordbag.t7b -test_data_file data/nytimes/topic/test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/wordbagtfidf -train_data_file data/nytimes/topic/train_wordbagtfidf.t7b -test_data_file data/nytimes/topic/test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/wordgram -train_data_file data/nytimes/topic/train_wordgram.t7b -test_data_file data/nytimes/topic/test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/nytimes_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/nytimes/wordgramtfidf -train_data_file data/nytimes/topic/train_wordgramtfidf.t7b -test_data_file data/nytimes/topic/test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/charbag -train_data_file data/rakuten/sentiment/binary_train_charbag.t7b -test_data_file data/rakuten/sentiment/binary_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/charbagtfidf -train_data_file data/rakuten/sentiment/binary_train_charbagtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/chargram -train_data_file data/rakuten/sentiment/binary_train_chargram.t7b -test_data_file data/rakuten/sentiment/binary_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/chargramtfidf -train_data_file data/rakuten/sentiment/binary_train_chargramtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordbag -train_data_file data/rakuten/sentiment/binary_train_wordbag.t7b -test_data_file data/rakuten/sentiment/binary_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordbagroman -train_data_file data/rakuten/sentiment/binary_train_hepburn_wordbag.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordbagtfidf -train_data_file data/rakuten/sentiment/binary_train_wordbagtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordbagtfidfroman -train_data_file data/rakuten/sentiment/binary_train_hepburn_wordbagtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordgram -train_data_file data/rakuten/sentiment/binary_train_wordgram.t7b -test_data_file data/rakuten/sentiment/binary_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordgramroman -train_data_file data/rakuten/sentiment/binary_train_hepburn_wordgram.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordgramtfidf -train_data_file data/rakuten/sentiment/binary_train_wordgramtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenbinary_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenbinary/wordgramtfidfroman -train_data_file data/rakuten/sentiment/binary_train_hepburn_wordgramtfidf.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_charbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/charbag -train_data_file data/rakuten/sentiment/full_train_charbag.t7b -test_data_file data/rakuten/sentiment/full_test_charbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_charbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/charbagtfidf -train_data_file data/rakuten/sentiment/full_train_charbagtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_charbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_chargram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/chargram -train_data_file data/rakuten/sentiment/full_train_chargram.t7b -test_data_file data/rakuten/sentiment/full_test_chargram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_chargramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/chargramtfidf -train_data_file data/rakuten/sentiment/full_train_chargramtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_chargramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordbag.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordbag -train_data_file data/rakuten/sentiment/full_train_wordbag.t7b -test_data_file data/rakuten/sentiment/full_test_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordbagroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordbagroman -train_data_file data/rakuten/sentiment/full_train_hepburn_wordbag.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn_wordbag.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordbagtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordbagtfidf -train_data_file data/rakuten/sentiment/full_train_wordbagtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordbagtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordbagtfidfroman -train_data_file data/rakuten/sentiment/full_train_hepburn_wordbagtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn_wordbagtfidf.t7b \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordgram.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordgram -train_data_file data/rakuten/sentiment/full_train_wordgram.t7b -test_data_file data/rakuten/sentiment/full_test_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordgramroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordgramroman -train_data_file data/rakuten/sentiment/full_train_hepburn_wordgram.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn_wordgram.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordgramtfidf.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordgramtfidf -train_data_file data/rakuten/sentiment/full_train_wordgramtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/archive/rakutenfull_wordgramtfidfroman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nth main.lua -driver_location models/rakutenfull/wordgramtfidfroman -train_data_file data/rakuten/sentiment/full_train_hepburn_wordgramtfidf.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn_wordgramtfidf.t7b -model_size 1000001 \"$@\";\n"
  },
  {
    "path": "linearnet/config.lua",
    "content": "--[[\nConfiguration for LinearNet\nCopyright 2016 Xiang Zhang\n--]]\n\n-- Name space\nlocal config = {}\n\n-- Training data configuration\nconfig.train_data = {}\nconfig.train_data.file = 'data/dianping/train_charbag.t7b'\n\n-- Testing data configuration\nconfig.test_data = {}\nconfig.test_data.file = 'data/dianping/test_charbag.t7b'\n\n-- Model configuration\nconfig.model = {}\nconfig.model.size = 200001\nconfig.model.dimension = 2\nconfig.model.decay = 1e-5\n\n-- Trainer configuration\nconfig.train = {}\nconfig.train.rate = 1e-3\n\n-- Tester configuration\nconfig.test = {}\n\n-- Driver configuration\nconfig.driver = {}\nconfig.driver.loss = 'nn.ClassNLLCriterion'\nconfig.driver.threads = 10\nconfig.driver.buffer = 100\nconfig.driver.steps = 100000\nconfig.driver.epoches = 1000\nconfig.driver.interval = 5\nconfig.driver.location = 'models/dianping/charbag'\nconfig.driver.initialization = 1e-2\nconfig.driver.plot = true\nconfig.driver.debug = false\nconfig.driver.resume = false\n\nreturn config\n"
  },
  {
    "path": "linearnet/data.lua",
    "content": "--[[\nData class for LinearNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal torch = require('torch')\n\nlocal Data = class()\n\n-- Constructor for Data\n-- config: configuration table\n--   .file: the data file location\n-- data_table: if present, will use the data_table instead of load from file\nfunction Data:_init(config, data_table)\n   self.data = data_table or torch.load(config.file)\nend\n\nfunction Data:getClasses()\n   return #self.data.bag\nend\n\nfunction Data:getSample(sample, label)\n   local bag, bag_index, bag_value =\n      self.data.bag, self.data.bag_index, self.data.bag_value\n\n   -- Sample a non-empty example\n   local class = torch.random(#bag)\n   local item = torch.random(bag[class]:size(1))\n   while bag[class][item][2] == 0 do\n      class = torch.random(#bag)\n      item = torch.random(bag[class]:size(1))\n   end\n\n   local start = bag[class][item][1]\n   local length = bag[class][item][2]\n   local sample = sample or torch.Tensor(bag[class][item][2] ,2)\n   sample:resize(bag[class][item][2], 2)\n   sample:select(2, 1):copy(bag_index:narrow(1, start, length))\n   sample:select(2, 2):copy(bag_value:narrow(1, start, length))\n\n   local label = label or torch.Tensor(1)\n   label[1] = class\n\n   return sample, label\nend\n\n-- Iterator\nfunction Data:iterator(sample, label)\n   local bag, bag_index, bag_value =\n      self.data.bag, self.data.bag_index, self.data.bag_value\n   local sample = sample or torch.Tensor(1, 2)\n   local label = label or torch.Tensor(1)\n\n   local class = 1\n   local item = 0\n   local count = 0\n\n   return function()\n      item = item + 1\n      if item > bag[class]:size(1) then\n         class = class + 1\n         item = 1\n         if bag[class] == nil then return end\n      end\n      while bag[class][item][2] == 0 do\n         item = item + 1\n         if item > bag[class]:size(1) then\n            class = class + 1\n           
 item = 1\n            if bag[class] == nil then return end\n         end\n      end\n      local start = bag[class][item][1]\n      local length = bag[class][item][2]\n      sample:resize(length, 2)\n      sample:select(2, 1):copy(bag_index:narrow(1, start, length))\n      sample:select(2, 2):copy(bag_value:narrow(1, start, length))\n      label[1] = class\n      return sample, label\n   end\nend\n\n-- Get data table for share\nfunction Data:getTable()\n   return self.data\nend\n\nreturn Data\n"
  },
  {
    "path": "linearnet/driver.lua",
    "content": "--[[\nDriver for LinearNet using HogWILD!\nCopyright 2015 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal os = require('os')\nlocal paths = require('paths')\nlocal threads = require('threads')\nlocal torch = require('torch')\n\nlocal Data = require('data')\nlocal Model = require('model')\nlocal Qeueu = require('queue')\nlocal Train = require('train')\nlocal Test = require('test')\n\n-- Library configurations\nthreads.serialization('threads.sharedserialize')\n\nlocal Driver = class()\n\n-- Constructor for driver\n-- options: configuration table for others\n-- config: configuration table\n--    .loss: the loss used for classification task\n--    .threads: number of threads\n--    .buffer: buffer size for RPC queues\n--    .steps: steps for each training run\n--    .epoches: number of testing epoches before stopping\n--    .interval: print time interval\n--    .location: save location\n--    .initialization: initialization parameter for model\n--    .plot: whether to plot the output\n--    .debug: whether to debug\n--    .resume: whether to resume\nfunction Driver:_init(options, config)\n   local config = config or {}\n   self.loss = config.loss or 'nn.ClassNLLCriterion'\n   self.threads = config.threads or 10\n   self.buffer = config.buffer or 100\n   self.steps = config.steps or 100000\n   self.epoches = config.epoches or 1000\n   self.interval = config.interval or 5\n   self.location = config.location or '.'\n   self.initialization = config.initialization or 1e-2\n   self.plot = config.plot\n   self.debug = config.debug\n   self.resume = config.resume\n   self.options = options or {}\n   self.config = config\n\n   math.randomseed(os.time())\n   torch.manualSeed(os.time())\n\n   print('Driver loading training data')\n   self.train_data = Data(self.options.train_data)\n   print('Driver loading testing data')\n   self.test_data = Data(self.options.test_data)\n   self.options.model.dimension = 
self.train_data:getClasses()\n   print('Driver changed model output dimension to '..\n            self.options.model.dimension)\n\n   if self.resume then\n      local record_file = paths.concat(self.location, 'record.t7b')\n      print('Driver loading resumption record from '..record_file)\n      self.record = torch.load(record_file)\n      local model_file = paths.concat(\n         self.location, 'model_'..#self.record..'.t7b')\n      print('Driver loading model from '..model_file)\n      self.model = Model(self.options.model)\n      self.model:load(model_file)\n      if self.record[#self.record].progress then\n         if self.record[#self.record].progress:size(1) == self.threads then\n            self.progress = self.record[#self.record].progress:clone()\n         else\n            print('Driver resumption number of threads change.')\n            self.progress = torch.LongTensor(self.threads):zero()\n            local total = self.record[#self.record].progress:sum()\n            while self.progress:sum() < total do\n               local thread = math.random(self.threads)\n               self.progress[thread] = self.progress[thread] + self.steps \n            end\n         end\n      else\n         print('Driver resumption progress vector not found')\n         self.progress = torch.LongTensor(self.threads):zero()\n      end\n      print('Driver progress = '..self.progress:sum())\n      for i = 1, #self.record do\n         self:printResult(i)\n      end\n      if self.plot then\n         self:plotRecord()\n      end\n   else\n      self.record = {}\n      print('Driver loading model')\n      self.model = Model(self.options.model)\n      print('Driver initializing model')\n      self.model:reset(self.initialization)\n      self.progress = torch.LongTensor(self.threads):zero()\n      if self.plot then\n         require('gnuplot')\n      end\n   end\n\n   print('Driver loading tester for training data')\n   self.train_test = Test(\n      self.train_data, self.model, 
nn[self.loss:sub(4)](), self.options.test)\n   print('Driver loading tester for testing data')\n   self.test_test = Test(\n      self.test_data, self.model, nn[self.loss:sub(4)](), self.options.test)\n\n   print('Driver building RPC queues')\n   self.master_queue = Queue(self.buffer)\n   self.slave_queues = {}\n   for i = 1, self.threads do\n      self.slave_queues[i] = Queue(self.buffer)\n   end\n\n   print('Driver creating thread block')\n   local init_thread = self:initThread()\n   self.block = threads.Threads(self.threads, init_thread)\n   self.block:specific(true)\n\n   self.time = os.time()\n   self.step = self.progress:sum()\nend\n\n-- Run the training process\nfunction Driver:run()\n   self:deployThreads()\n\n   local begin_epoch = #self.record + 1\n   local end_epoch = #self.record + self.epoches\n   for i = begin_epoch, end_epoch do\n      print('Driver testing on training data for epoch '..i)\n      self.train_test:run(function (test, step) self:logTest(test, step) end)\n      print('Driver testing on testing data for epoch '..i)\n      self.test_test:run(function (test, step) self:logTest(test, step) end)\n      self:save()\n      self:printResult()\n      if self.plot then\n         self:plotRecord()\n      end\n   end\n\n   for i = 1, self.threads do\n      print('Driver sending RPC to exit thread '..i)\n      self.slave_queues[i]:push{func = 'exit', arg = {}}\n   end\n\n   self.block:synchronize()\n   self.block:terminate()\nend\n\n-- Deploy threads in sequential order to prevent io and memory jam\nfunction Driver:deployThreads()\n   for i = 1, self.threads do\n      print('Driver deploying job for threads '..i)\n      local thread_job = self:threadJob(i)\n      self.block:addjob(i, thread_job)\n      local rpc = self.master_queue:pop()\n      while rpc.func ~= 'notifyDeploy' do\n         self[rpc.func](self, unpack(rpc.arg))\n         rpc = self.master_queue:pop()\n      end\n      print('Driver rpc = notifyDeploy, thread = '..rpc.arg[1])\n   
end\nend\n\n-- Thread initialization callback\nfunction Driver:initThread()\n   return function ()\n      local math = require('math')\n      local nn = require('nn')\n      local os = require('os')\n      local torch = require('torch')\n\n      local Queue = require('queue')\n\n      math.randomseed(os.time() + __threadid)\n      torch.manualSeed(os.time() + __threadid)\n   end\nend\n\n-- Thread job callback\nfunction Driver:threadJob(id)\n   local options = self.options\n   local steps = self.steps\n   local data_table = self.train_data:getTable()\n   local modules = self.model:getModules()\n   local loss = self.loss\n   local master_queue = self.master_queue\n   local slave_queue = self.slave_queues[id]\n   local progress = self.progress[id]\n   return function()\n      local os = require('os')\n      local nn = require('nn')\n      local torch = require('torch')\n\n      local Data = require('data')\n      local Model = require('model')\n      local Train = require('train')\n\n      local train_data = Data(options.train_data, data_table)\n      local model = Model(options.model, modules)\n\n      options.train.step = progress\n      local train = Train(train_data, model, nn[loss:sub(4)](), options.train)\n      master_queue:push{func = 'notifyDeploy', arg = {__threadid}}\n\n      local exit = false\n      while not exit do\n         train:run(steps)\n         -- Tell main thread to update progress\n         master_queue:push{\n            func = 'updateProgress',\n            arg = {__threadid, train.step, train.objective}}\n         -- Handle RPC requests from main thread\n         local rpc = slave_queue:pop_async()\n         while rpc do\n            if rpc.func == 'exit' then\n               exit = true\n            end\n            rpc = slave_queue:pop_async()\n         end\n      end\n   end\nend\n\n-- Update progress\nfunction Driver:updateProgress(thread, step, objective)\n   self.progress[thread] = step\n   print('Driver rpc = updateProgress, thread = 
'..thread..', objective = '..\n            objective..', progress = '..self.progress[thread]..', total = '..\n            self.progress:sum())\nend\n\n-- Log for testing\nfunction Driver:logTest(test, step)\n   if os.difftime(os.time(), self.time) >= self.interval then\n      local message = 'Test step = '..step..\n         ', total_error = '..test.total_error..\n         ', total_objective = '..test.total_objective..\n         ', label = '..test.label[1]..\n         ', decision = '..test.decision[1]\n      if self.debug then\n         local weight = {\n            weight = test.model.linear.weight, bias = test.model.linear.bias}\n         for key, w in pairs(weight) do\n            message = message..', '..key..':mean() = '..w:mean()..', '..\n               key..':std() = '..w:std()\n         end\n      end\n      print(message)\n\n      -- Handle rpc\n      local rpc = self.master_queue:pop_async()\n      while rpc do\n         self[rpc.func](self, unpack(rpc.arg))\n         rpc = self.master_queue:pop_async()\n      end\n      self.time = os.time()\n   end\nend\n\n-- Save for model\nfunction Driver:save(epoch)\n   local epoch = epoch or #self.record + 1\n\n   -- Make a backup for the record\n   print('Driver backing up record.t7b')\n   local record_file = paths.concat(self.location, 'record.t7b')\n   os.rename(record_file, record_file..'.backup')\n\n   -- Save the new record\n   print('Driver saving new records to '..record_file)\n   self.record[epoch] = {\n      train_loss = self.train_test.total_objective,\n      test_loss = self.test_test.total_objective,\n      train_error = self.train_test.total_error,\n      test_error = self.test_test.total_error,\n      progress = self.progress:clone()\n   }\n   torch.save(record_file, self.record)\n\n   -- Save the model\n   local model_file = paths.concat(self.location, 'model_'..epoch..'.t7b')\n   print('Driver saving model to '..model_file)\n   self.model:save(model_file)\nend\n\n-- Print current result\nfunction 
Driver:printResult(epoch)\n   local epoch = epoch or #self.record\n   print('Driver epoch = '..epoch..\n            ', train_error = '..self.record[epoch].train_error..\n            ', test_error = '..self.record[epoch].test_error..\n            ', train_loss = '..self.record[epoch].train_loss..\n            ', test_loss = '..self.record[epoch].test_loss)\nend\n\n-- Plot the record\nfunction Driver:plotRecord()\n   require('gnuplot')\n   self.error_figure = self.error_figure or gnuplot.figure()\n   self.loss_figure = self.loss_figure or gnuplot.figure()\n\n   local epoch = torch.linspace(1, #self.record, #self.record)\n   local train_error = torch.Tensor(epoch:size())\n   local test_error = torch.Tensor(epoch:size())\n   local train_loss = torch.Tensor(epoch:size())\n   local test_loss = torch.Tensor(epoch:size())\n   for i = 1, #self.record do\n      train_error[i] = self.record[i].train_error\n      test_error[i] = self.record[i].test_error\n      train_loss[i] = self.record[i].train_loss\n      test_loss[i] = self.record[i].test_loss\n   end\n\n   gnuplot.figure(self.error_figure)\n   gnuplot.plot({'Training error', epoch, train_error},\n                {'Testing error', epoch, test_error})\n   gnuplot.title('Training and testing error')\n   gnuplot.figure(self.loss_figure)\n   gnuplot.plot({'Training loss', epoch, train_loss},\n                {'Testing loss', epoch, test_loss})\n   gnuplot.title('Training and testing loss')\nend\n\nreturn Driver\n"
  },
  {
    "path": "linearnet/model.lua",
    "content": "--[[\nModel class for LinearNet, using SparseLinear\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal nn = require('nn')\nlocal torch = require('torch')\n\nlocal Model = class()\n\n-- Constructor for model\n-- config: configuration table\n--   .size: size of input index\n--   .dimension: dimension of output\n--   .decay: weight decay. Optional.\n-- modules: share weights with the given modules. Optional.\nfunction Model:_init(config, modules)\n   self.size = config.size\n   self.dimension = config.dimension\n   self.decay = config.decay or 0\n\n   if modules then\n      self.linear = modules.linear:clone('weight', 'bias')\n   else\n      self.linear = nn.SparseLinear(self.size, self.dimension)\n   end\n\n   self.sequential = nn.Sequential()\n   self.sequential:add(self.linear)\n   self.sequential:add(nn.LogSoftMax())\nend\n\n-- Forward propagation\nfunction Model:forward(input)\n   return self.sequential:forward(input)\nend\n\n-- Backward propagation\nfunction Model:backward(input, grad_output)\n   local grad_input = self.sequential:backward(input, grad_output)\n   -- Apply weight decay to linear module\n   if self.decay > 0 then\n      self.linear_index = self.linear_index or torch.LongTensor(input:size(1))\n      self.linear_index:resize(input:size(1)):copy(input:select(2, 1))\n      self.linear_decay = self.linear_decay or self.linear.gradWeight:new()\n      self.linear_decay:index(self.linear.weight, 2, self.linear_index)\n      self.linear.gradWeight:indexAdd(\n         2, self.linear_index, self.linear_decay:mul(self.decay))\n      self.linear.gradBias:add(self.decay, self.linear.bias)\n   end\n   return grad_input\nend\n\n-- Update parameters\nfunction Model:updateParameters(rate)\n   return self.linear:updateParameters(rate)\nend\n\n-- Zero grad parameters\nfunction Model:zeroGradParameters()\n   return self.linear:zeroGradParameters()\nend\n\n-- Set the type\nfunction Model:type(tensortype)\n   local tensortype = 
tensortype or self.linear.weight:type()\n   if tensortype ~= self.linear.weight:type() then\n      self.linear:type(tensortype)\n   end\n   return tensortype\nend\n\n-- Reset the weights\nfunction Model:reset(sigma)\n   self.linear.weight:normal(0, sigma)\n   self.linear.bias:zero()\nend\n\n-- Get the modules\nfunction Model:getModules()\n   return {linear = self.linear}\nend\n\n-- Share given modules\nfunction Model:shareModules(modules)\n   self.linear:share(modules.linear, 'weight', 'bias')\nend\n\n-- Save to file\nfunction Model:save(file)\n   torch.save(file, self.linear)\nend\n\n-- Load from file\nfunction Model:load(file)\n   local linear = torch.load(file)\n   self.linear.weight:copy(linear.weight)\n   self.linear.bias:copy(linear.bias)\nend\n\nreturn Model\n"
  },
  {
    "path": "linearnet/queue.lua",
    "content": "--[[\nMultithreaded queue based on tds\nCopyright 2015 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal ffi = require('ffi')\nlocal serialize = require('threads.sharedserialize')\nlocal tds = require('tds')\nlocal threads = require('threads')\nlocal torch = require('torch')\n\n-- Append an underscore to distinguish between metatable and class name\nlocal Queue_ = torch.class('Queue')\n\n-- Constructor\n-- n: buffer size\nfunction Queue_:__init(size)\n   self.data = tds.hash()\n   self.pointer = torch.LongTensor(3):fill(1)\n   self.pointer[3] = 0\n   self.size = size or 10\n   self.mutex = threads.Mutex()\n   self.added_condition = threads.Condition()\n   self.removed_condition = threads.Condition()\nend\n\nfunction Queue_:push(item)\n   local storage = serialize.save(item)\n   self.mutex:lock()\n   while self.pointer[3] == self.size do\n      self.removed_condition:wait(self.mutex)\n   end\n   self.data[self.pointer[1]] = storage:string()\n   self.pointer[1] = math.fmod(self.pointer[1], self.size) + 1\n   self.pointer[3] = self.pointer[3] + 1\n   self.mutex:unlock()\n   self.added_condition:signal()\nend\n\nfunction Queue_:pop()\n   self.mutex:lock()\n   while self.pointer[3] == 0 do\n      self.added_condition:wait(self.mutex)\n   end\n   local storage = torch.CharStorage():string(self.data[self.pointer[2]])\n   self.pointer[2] = math.fmod(self.pointer[2], self.size) + 1\n   self.pointer[3] = self.pointer[3] - 1\n   self.mutex:unlock()\n   self.removed_condition:signal()\n   local item = serialize.load(storage)\n   return item\nend\n\nfunction Queue_:push_async(item)\n   if self.pointer[3] == self.size then\n      return\n   end\n   local storage = serialize.save(item)\n   self.mutex:lock()\n   if self.pointer[3] == self.size then\n      self.mutex:unlock()\n      return\n   end\n   self.data[self.pointer[1]] = storage:string()\n   self.pointer[1] = math.fmod(self.pointer[1], self.size) + 1\n   self.pointer[3] = self.pointer[3] + 1\n   
self.mutex:unlock()\n   self.added_condition:signal()\n   return item\nend\n\nfunction Queue_:pop_async()\n   if self.pointer[3] == 0 then\n      return\n   end\n   self.mutex:lock()\n   if self.pointer[3] == 0 then\n      self.mutex:unlock()\n      return\n   end\n   local storage = torch.CharStorage():string(self.data[self.pointer[2]])\n   self.pointer[2] = math.fmod(self.pointer[2], self.size) + 1\n   self.pointer[3] = self.pointer[3] - 1\n   self.mutex:unlock()\n   self.removed_condition:signal()\n   local item = serialize.load(storage)\n   return item\nend\n\nfunction Queue_:free()\n   self.mutex:free()\n   self.added_condition:free()\n   self.removed_condition:free()\nend\n\nfunction Queue_:__write(f)\n   local data = self.data\n   f:writeLong(torch.pointer(data))\n   tds.C.tds_hash_retain(data)\n\n   local pointer = self.pointer\n   f:writeLong(torch.pointer(pointer))\n   pointer:retain()\n\n   f:writeObject(self.size)\n   f:writeObject(self.mutex:id())\n   f:writeObject(self.added_condition:id())\n   f:writeObject(self.removed_condition:id())\nend\n\nfunction Queue_:__read(f)\n   local data = f:readLong()\n   data = ffi.cast('tds_hash&', data)\n   ffi.gc(data, tds.C.tds_hash_free)\n   self.data = data\n\n   local pointer = f:readLong()\n   pointer = torch.pushudata(pointer, 'torch.LongTensor')\n   self.pointer = pointer\n   \n   self.size = f:readObject()\n   self.mutex = threads.Mutex(f:readObject())\n   self.added_condition = threads.Condition(f:readObject())\n   self.removed_condition = threads.Condition(f:readObject())\nend\n\n-- Return class name, not the underscored metatable\nreturn Queue\n"
  },
  {
    "path": "linearnet/test.lua",
    "content": "--[[\nTester for LinearNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal torch = require('torch')\n\nlocal Test = class()\n\n-- Constructor\n-- data: the data instance\n-- model: the model instance\n-- loss: the loss instance\n-- config: configuration table\nfunction Test:_init(data, model, loss, config)\n   self.data = data\n   self.model = model\n   self.loss = loss\n\n   self.type = model:type()\nend\n\n-- Run the tester\n-- callback: a function to execute after each step\nfunction Test:run(callback)\n   self.total_objective = 0\n   self.total_error = 1\n   self.step = 0\n   for sample, label in self.data:iterator() do\n      self:runStep(sample, label)\n      self.step = self.step + 1\n      if callback then\n         callback(self, self.step)\n      end\n   end\nend\n\n-- Run for one step\nfunction Test:runStep(sample, label)\n   -- Get sample\n   self.sample, self.label = sample, label\n\n   -- Forward propagation\n   self.output = self.model:forward(self.sample)\n   self.objective = self.loss:forward(self.output, self.label)\n\n   -- Compute decision\n   self.max, self.decision = self.output:max(1)\n   self.error = (self.decision[1] == self.label[1]) and 0 or 1\n\n   -- Accumulate errors\n   self.total_objective = (self.total_objective * self.step + self.objective) /\n      (self.step + 1)\n   self.total_error = (self.total_error * self.step + self.error) /\n      (self.step + 1)\nend\n\nreturn Test\n"
  },
  {
    "path": "linearnet/train.lua",
    "content": "--[[\nTraining class for LinearNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal math = require('math')\nlocal nn = require('nn')\n\nlocal Train = class()\n\n-- Constructor\n-- data: the data instance\n-- model: the model instance\n-- loss: the loss instance\n-- config: the configuration table\n--   .rate: learning rate\n--   .step: current finished steps. Starting from 0\nfunction Train:_init(data, model, loss, config)\n   self.data = data\n   self.model = model\n   self.loss = loss\n\n   local config = config or {}\n   self.rate = config.rate or 1e-3\n   self.step = config.step or 0\n\n   self.type = model:type()\nend\n\n-- Run for a number of steps\n-- steps: number of steps to run\n-- callback: a function to execute after each step\nfunction Train:run(steps, callback)\n   for i = 1, steps do\n      self:runStep()\n      self.step = self.step + 1\n      if callback then\n         callback(self, i)\n      end\n   end\nend\n\n-- Run for one step\nfunction Train:runStep()\n   -- Get sample\n   self.sample, self.label = self.data:getSample(self.sample, self.label)\n\n   -- Forward propagation\n   self.output = self.model:forward(self.sample)\n   self.objective = self.loss:forward(self.output, self.label)\n\n   -- Backward propagation\n   self.grad_output = self.loss:backward(self.output, self.label)\n   self.grad_input = self.model:backward(self.sample, self.grad_output)\n\n   -- Update parameters\n   self.model:updateParameters(self.rate)\n   self.model:zeroGradParameters()\nend\n\nreturn Train\n"
  },
  {
    "path": "linearnet/unittest/data.lua",
    "content": "--[[\nUnit test for LinearNet data program\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Data = require('data')\n\nlocal math = require('math')\nlocal string = require('string')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n         and name:match('[%g]+Test') then\n            print('\\nExecuting '..name)\n            func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.file = 'data/dianping/unittest_charbag.t7b'\n   self.config = config\n   print('Loading data from '..config.train_data.file)\n   self.data = Data(config.train_data)\nend\n\nfunction joe:getSampleTest()\n   local data = self.data\n   print('Getting 10 samples')\n   for i = 1, 10 do\n      local sample, label = data:getSample(sample, label)\n      io.write(label[1], ' ', sample:size(1))\n      for j = 1, sample:size(1) do\n         io.write(' ', sample[j][1], ':', string.format('%.2g', sample[j][2]))\n      end\n      io.write('\\n')\n      io.flush()\n   end\nend\n\nfunction joe:iteratorTest()\n   local data = self.data\n   print('Iterating through data')\n   local count = 0\n   for sample, label in data:iterator() do\n      io.write(label[1], ' ', sample:size(1))\n      count = count + 1\n      if math.fmod(count, 16) == 0 then\n         io.write('\\n')\n         io.flush()\n      else\n         io.write(', ')\n      end\n   end\n   if math.fmod(count, 16) ~= 0 then\n      io.write('\\n')\n      io.flush()\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "linearnet/unittest/driver.lua",
    "content": "--[[\nUnit test for driver\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Driver = require('driver')\n\n--  A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n         and name:match('[%g]+Test') then\n            print('\\nExecuting '..name)\n            func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Creating driver')\n   config.train_data.file = 'data/dianping/unittest_charbag.t7b'\n   config.test_data.file = 'data/dianping/unittest_charbag.t7b'\n   config.driver.steps = 10000\n   config.driver.epoches = 30\n   config.driver.interval = 1\n   config.driver.location = '/tmp'\n   config.driver.debug = true\n   local driver = Driver(config, config.driver)\n\n   self.config = config\n   self.driver = driver\nend\n\nfunction joe:driverTest()\n   local driver = self.driver\n   print('Testing driver')\n   driver:run()\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "linearnet/unittest/model.lua",
    "content": "--[[\nUnit test for LinearNet model program\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal math = require('math')\nlocal string = require('string')\nlocal sys = require('sys')\n\nlocal Data = require('data')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.file = 'data/dianping/unittest_charbag.t7b'\n   print('Loading data from '..config.train_data.file)\n   self.data = Data(config.train_data)\n   print('Loading the model')\n   self.model = Model(config.model)\n   print(self.model.linear)\n   print('Resetting model')\n   self.model:reset(1e-3)\n   print(self.model.linear.weight:std())\nend\n\nfunction joe:propagationTest()\n   local data = self.data\n   local model = self.model\n   local weight = self.model.linear.weight\n   local bias = self.model.linear.bias\n\n   print('Testing forward and backward propagation for 10 times')\n   for i = 1, 10 do\n      print('Zero gradient of parameters')\n      sys.tic()\n      model:zeroGradParameters()\n      sys.toc(true)\n\n      local sample, label = data:getSample()\n      print(tostring(i)..', sample '..sample:size(1)..', label '..label[1])\n      print('Forward propagating')\n      sys.tic()\n      local output = model:forward(sample)\n      sys.toc(true)\n\n      print('output '..output:dim()..', '..output:size(1))\n      print('Backward propagating')\n      local grad_output = torch.rand(output:size())\n      sys.tic()\n      local grad_input = model:backward(sample, grad_output)\n      sys.toc(true)\n      print('grad_input '..tostring(grad_input))\n\n      
print('Update parameters')\n      sys.tic()\n      model:updateParameters(1e-3)\n      sys.toc(true)\n      print('weight mean '..weight:mean()..', std '..weight:std()..\n               ', bias mean '..bias:mean()..', std '..bias:std())\n   end\nend\n\nfunction joe:shareModuleTest()\n   local model = self.model\n   local linear = model.linear:clone()\n   print(torch.pointer(model.linear.weight:storage()),\n         torch.pointer(linear.weight:storage()),\n         torch.pointer(model.linear.bias:storage()),\n         torch.pointer(linear.bias:storage()))\n   model:shareModules({linear = linear})\n   print(torch.pointer(model.linear.weight:storage()),\n         torch.pointer(linear.weight:storage()),\n         torch.pointer(model.linear.bias:storage()),\n         torch.pointer(linear.bias:storage()))\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   local weight, bias = model.linear.weight, model.linear.bias\n   print('weight mean '..weight:mean()..', std '..weight:std()..\n            ', bias mean '..bias:mean()..', std '..bias:std())\n   print('Saving model to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n   print('Resetting model with sigma 1e-2')\n   model:reset(1e-2)\n   print('weight mean '..weight:mean()..', std '..weight:std()..\n            ', bias mean '..bias:mean()..', std '..bias:std())\n   print('Loading model from /tmp/model.t7b')\n   model:load('/tmp/model.t7b')\n   print('weight mean '..weight:mean()..', std '..weight:std()..\n            ', bias mean '..bias:mean()..', std '..bias:std())\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "linearnet/unittest/test.lua",
    "content": "--[[\nUnit test for LinearNet tester\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal math = require('math')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.file = 'data/dianping/unittest_charbag.t7b'\n   print('Loading data from '..config.train_data.file)\n   self.data = Data(config.train_data)\n   print('Loading the model')\n   self.model = Model(config.model)\n   print(self.model.linear)\n   print('Resetting model')\n   self.model:reset(1e-2)\n   print('Loading the loss')\n   self.loss = nn[config.driver.loss:sub(4)]()\n   print(self.loss)\n   print('Loading the tester')\n   self.test = Test(self.data, self.model, self.loss)\nend\n\nfunction joe:runTest()\n   local callback = function(test, step)\n      print('stp = '..step..\n               ', lss = '..test.total_objective..\n               ', err = '..test.total_error..\n               ', obj = '..test.objective..\n               ', lbl = '..test.label[1]..\n               ', dcs = '..test.decision[1])\n   end\n   print('Starting test')\n   self.test:run(callback)\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "linearnet/unittest/train.lua",
    "content": "--[[\nUnit test for LinearNet trainer\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal math = require('math')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.file = 'data/dianping/unittest_charbag.t7b'\n   print('Loading data from '..config.train_data.file)\n   self.data = Data(config.train_data)\n   print('Loading the model')\n   self.model = Model(config.model)\n   print(self.model.linear)\n   print('Resetting model')\n   self.model:reset(1e-2)\n   print('Loading the loss')\n   self.loss = nn[config.driver.loss:sub(4)]()\n   print(self.loss)\n   print('Loading the trainer')\n   self.train = Train(self.data, self.model, self.loss)\nend\n\nfunction joe:runTest()\n   local callback = function(train, step)\n      local model = train.model\n      if math.fmod(step, 1000) == 0 then\n         local max, decision = train.output:max(1)\n         print('stp = '..step..\n                  ', lbl = '..train.label[1]..\n                  ', dcs = '..decision[1]..\n                  ', obj = '..train.objective..\n                  ', wmn = '..model.linear.weight:mean()..\n                  ', wsd = '..model.linear.weight:std()..\n                  ', bmn = '..model.linear.bias:mean()..\n                  ', bsd = '..model.linear.bias:std())\n      end\n   end\n   local steps = 1000000\n   local train = self.train\n   print('Training for '..steps..' steps')\n   train:run(steps, callback)\nend\n\njoe.main()\nreturn joe\n\n"
  },
  {
    "path": "models/README.txt",
    "content": "This directory should contain trained models and checkpoints.\n"
  },
  {
    "path": "models/embednet/README.txt",
    "content": "This directory should contain trained models and checkpoints for embednet.\n"
  },
  {
    "path": "models/fasttext/README.txt",
    "content": "This directory should contain trained models and checkpoints for fasttext.\n"
  },
  {
    "path": "models/glyphnet/README.txt",
    "content": "This directory should contain trained models and checkpoints for glyphnet.\n"
  },
  {
    "path": "models/linearnet/README.txt",
    "content": "This directory should contain trained models and checkpoints for linearnet.\n"
  },
  {
    "path": "models/onehotnet/README.txt",
    "content": "This directory should contain trained models and checkpoints for onehotnet.\n"
  },
  {
    "path": "onehotnet/archive/11stbinary_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/11st/sentiment/binary_train.t7b -test_data_file data/11st/sentiment/binary_test.t7b -driver_location models/11stbinary/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stbinary_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/11st/sentiment/binary_train_rr.t7b -test_data_file data/11st/sentiment/binary_test_rr.t7b -driver_location models/11stbinary/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stbinary_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/11st/sentiment/binary_train.t7b -test_data_file data/11st/sentiment/binary_test.t7b -driver_location models/11stbinary/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stbinary_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/11st/sentiment/binary_train_rr.t7b -test_data_file data/11st/sentiment/binary_test_rr.t7b -driver_location models/11stbinary/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stfull_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/11st/sentiment/full_train.t7b -test_data_file data/11st/sentiment/full_test.t7b -driver_location models/11stfull/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stfull_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/11st/sentiment/full_train_rr.t7b -test_data_file data/11st/sentiment/full_test_rr.t7b -driver_location models/11stfull/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stfull_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/11st/sentiment/full_train.t7b -test_data_file data/11st/sentiment/full_test.t7b -driver_location models/11stfull/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/11stfull_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/11st/sentiment/full_train_rr.t7b -test_data_file data/11st/sentiment/full_test_rr.t7b -driver_location models/11stfull/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/amazonbinary_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/amazon/binary_train.t7b -test_data_file data/amazon/binary_test.t7b -driver_location models/amazonbinary/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/amazonbinary_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/amazon/binary_train.t7b -test_data_file data/amazon/binary_test.t7b -driver_location models/amazonbinary/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/amazonfull_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/amazon/full_train.t7b -test_data_file data/amazon/full_test.t7b -driver_location models/amazonfull/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/amazonfull_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/amazon/full_train.t7b -test_data_file data/amazon/full_test.t7b -driver_location models/amazonfull/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/chinanews_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/chinanews/topic/train.t7b -test_data_file data/chinanews/topic/test.t7b -driver_location models/chinanews/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/chinanews_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/chinanews/topic/train_pinyin.t7b -test_data_file data/chinanews/topic/test_pinyin.t7b -driver_location models/chinanews/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/chinanews_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/chinanews/topic/train.t7b -test_data_file data/chinanews/topic/test.t7b -driver_location models/chinanews/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/chinanews_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/chinanews/topic/train_pinyin.t7b -test_data_file data/chinanews/topic/test_pinyin.t7b -driver_location models/chinanews/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/dianping_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/dianping_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_location models/dianping/onehot4temporal12length2048feature256roman -train_data_file data/dianping/train_pinyin_string.t7b -test_data_file data/dianping/test_pinyin_string.t7b \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/dianping_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/dianping_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -driver_location models/dianping/onehot4temporal8length1944feature256roman -train_data_file data/dianping/train_pinyin_string.t7b -test_data_file data/dianping/test_pinyin_string.t7b \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/ifeng_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/ifeng/topic/train.t7b -test_data_file data/ifeng/topic/test.t7b -driver_location models/ifeng/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/ifeng_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/ifeng/topic/train_pinyin.t7b -test_data_file data/ifeng/topic/test_pinyin.t7b -driver_location models/ifeng/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/ifeng_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/ifeng/topic/train.t7b -test_data_file data/ifeng/topic/test.t7b -driver_location models/ifeng/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/ifeng_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/ifeng/topic/train_pinyin.t7b -test_data_file data/ifeng/topic/test_pinyin.t7b -driver_location models/ifeng/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdbinary_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/binary_train.t7b -test_data_file data/jd/sentiment/binary_test.t7b -driver_location models/jdbinary/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdbinary_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/binary_train_pinyin.t7b -test_data_file data/jd/sentiment/binary_test_pinyin.t7b -driver_location models/jdbinary/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdbinary_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/jd/sentiment/binary_train.t7b -test_data_file data/jd/sentiment/binary_test.t7b -driver_location models/jdbinary/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdbinary_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/jd/sentiment/binary_train_pinyin.t7b -test_data_file data/jd/sentiment/binary_test_pinyin.t7b -driver_location models/jdbinary/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdfull_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/full_train.t7b -test_data_file data/jd/sentiment/full_test.t7b -driver_location models/jdfull/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdfull_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/jd/sentiment/full_train_pinyin.t7b -test_data_file data/jd/sentiment/full_test_pinyin.t7b -driver_location models/jdfull/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdfull_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/jd/sentiment/full_train.t7b -test_data_file data/jd/sentiment/full_test.t7b -driver_location models/jdfull/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jdfull_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/jd/sentiment/full_train_pinyin.t7b -test_data_file data/jd/sentiment/full_test_pinyin.t7b -driver_location models/jdfull/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointbinary_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/binary_train.t7b -test_data_file data/joint/binary_test.t7b -driver_steps 400000 -driver_location models/jointbinary/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointbinary_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/binary_train_roman.t7b -test_data_file data/joint/binary_test_roman.t7b -driver_steps 400000 -driver_location models/jointbinary/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointbinary_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/binary_train.t7b -test_data_file data/joint/binary_test.t7b -driver_steps 400000 -driver_location models/jointbinary/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointbinary_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/binary_train_roman.t7b -test_data_file data/joint/binary_test_roman.t7b -driver_steps 400000 -driver_location models/jointbinary/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointfull_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/full_train.t7b -test_data_file data/joint/full_test.t7b -driver_steps 400000 -driver_location models/jointfull/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointfull_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/joint/full_train_roman.t7b -test_data_file data/joint/full_test_roman.t7b -driver_steps 400000 -driver_location models/jointfull/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointfull_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/full_train.t7b -test_data_file data/joint/full_test.t7b -driver_steps 400000 -driver_location models/jointfull/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/jointfull_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/joint/full_train_roman.t7b -test_data_file data/joint/full_test_roman.t7b -driver_steps 400000 -driver_location models/jointfull/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/nytimes_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/nytimes/topic/train.t7b -test_data_file data/nytimes/topic/test.t7b -driver_location models/nytimes/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/nytimes_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/nytimes/topic/train.t7b -test_data_file data/nytimes/topic/test.t7b -driver_location models/nytimes/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenbinary_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/rakuten/sentiment/binary_train.t7b -test_data_file data/rakuten/sentiment/binary_test.t7b -driver_location models/rakutenbinary/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenbinary_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/rakuten/sentiment/binary_train_hepburn.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn.t7b -driver_location models/rakutenbinary/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenbinary_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/rakuten/sentiment/binary_train.t7b -test_data_file data/rakuten/sentiment/binary_test.t7b -driver_location models/rakutenbinary/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenbinary_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/rakuten/sentiment/binary_train_hepburn.t7b -test_data_file data/rakuten/sentiment/binary_test_hepburn.t7b -driver_location models/rakutenbinary/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenfull_onehot4temporal12length2048feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/rakuten/sentiment/full_train.t7b -test_data_file data/rakuten/sentiment/full_test.t7b -driver_location models/rakutenfull/onehot4temporal12length2048feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenfull_onehot4temporal12length2048feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -train_data_file data/rakuten/sentiment/full_train_hepburn.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn.t7b -driver_location models/rakutenfull/onehot4temporal12length2048feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenfull_onehot4temporal8length1944feature256.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/rakuten/sentiment/full_train.t7b -test_data_file data/rakuten/sentiment/full_test.t7b -driver_location models/rakutenfull/onehot4temporal8length1944feature256 \"$@\";\n"
  },
  {
    "path": "onehotnet/archive/rakutenfull_onehot4temporal8length1944feature256roman.sh",
    "content": "#!/bin/bash\n\n# Archived program command-line for experiment\n# Copyright 2016 Xiang Zhang\n#\n# Usage: bash {this_file} [additional_options]\n\nset -x;\nset -e;\n\nqlua main.lua -driver_variation small -train_data_file data/rakuten/sentiment/full_train_hepburn.t7b -test_data_file data/rakuten/sentiment/full_test_hepburn.t7b -driver_location models/rakutenfull/onehot4temporal8length1944feature256roman \"$@\";\n"
  },
  {
    "path": "onehotnet/config.lua",
    "content": "--[[\nConfiguration for EmbedNet\nCopyright Xiang Zhang 2016\n--]]\n\n-- Name space\nlocal config = {}\n\n-- Training data configurations\nconfig.train_data = {}\nconfig.train_data.file = 'data/dianping/train_string.t7b'\nconfig.train_data.batch = 16\nconfig.train_data.size = 256\n\n-- Testing data configurations\nconfig.test_data = {}\nconfig.test_data.file = 'data/dianping/test_string.t7b'\nconfig.test_data.batch = 16\nconfig.test_data.size = 256\n\n-- Model configurations\nconfig.model = {}\nconfig.model.cudnn = true\n\n-- Model variations configuration\nconfig.variation = {}\n\n-- Large model configuration\nlocal onehot = {}\nonehot[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\nonehot[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v 
= 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[16] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[17] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[18] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[19] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[20] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\ntemporal[21] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[22] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[23] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[24] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[25] = {name = 'nn.TemporalMaxPoolingMM', 
kW = 2, dW = 2}\ntemporal[26] = {name = 'nn.Reshape', size = 4096, batchMode = true}\ntemporal[27] = {name = 'nn.Linear', inputSize = 4096, outputSize = 1024}\ntemporal[28] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[29] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[30] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[31] = {name = 'nn.LogSoftMax'}\nconfig.variation['large'] =\n   {onehot = onehot, temporal = temporal, length = 2048}\n\n-- Small model configuration\nlocal onehot = {}\nonehot[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\nonehot[6] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n             outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\nonehot[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\nonehot[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 2, dW = 2}\nlocal temporal = {}\ntemporal[1] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[2] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[3] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[4] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[5] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[6] = 
{name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[7] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[8] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n               outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[9] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[10] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[11] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[12] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[13] = {name = 'nn.TemporalConvolutionMM', inputFrameSize = 256,\n                outputFrameSize = 256, kW = 3, dW = 1, padW = 1}\ntemporal[14] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[15] = {name = 'nn.TemporalMaxPoolingMM', kW = 3, dW = 3}\ntemporal[16] = {name = 'nn.Reshape', size = 4608, batchMode = true}\ntemporal[17] = {name = 'nn.Linear', inputSize = 4608, outputSize = 1024}\ntemporal[18] = {name = 'nn.Threshold', th = 1e-6, v = 0, ip = true}\ntemporal[19] = {name = 'nn.Dropout', p = 0.5, v2 = true, inplace = true}\ntemporal[20] = {name = 'nn.Linear', inputSize = 1024, outputSize = 2}\ntemporal[21] = {name = 'nn.LogSoftMax'}\nconfig.variation['small'] =\n   {onehot = onehot, temporal = temporal, length = 1944}\n\n-- Trainer settings\nconfig.train = {}\nconfig.train.momentum = 0.9\nconfig.train.decay = 1e-5\n-- These are just multipliers to config.driver.rate\n-- For every config.driver.schedule * config.driver.steps\nconfig.train.rates =\n   {1/1, 1/2, 1/4, 1/8, 1/16, 1/32, 1/64, 1/128, 1/256, 1/512, 1/1024}\n\n-- Tester settings\nconfig.test = {}\n\n-- Visualizer settings\nconfig.visualizer = {}\nconfig.visualizer.width = 1200\nconfig.visualizer.scale = 4\nconfig.visualizer.height = 64\n\n-- Driver configurations\nconfig.driver = {}\nconfig.driver.type = 
'torch.CudaTensor'\nconfig.driver.device = 1\nconfig.driver.loss = 'nn.ClassNLLCriterion'\nconfig.driver.variation = 'large'\nconfig.driver.steps = 100000\nconfig.driver.epoches = 100\nconfig.driver.schedule = 8\nconfig.driver.rate = 1e-5\nconfig.driver.interval = 5\nconfig.driver.location = 'models/dianping/onehot4temporal12length2048feature256'\nconfig.driver.plot = true\nconfig.driver.visualize = true\nconfig.driver.debug = false\nconfig.driver.resume = false\n\n-- Main configuration\nconfig.joe = {}\n\nreturn config\n"
  },
  {
    "path": "onehotnet/data.lua",
    "content": "--[[\nData class for OnehotNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal torch = require('torch')\n\nlocal parent = require('glyphnet/data')\n\nlocal Data = class(parent)\n\n-- Constructor for Data\n-- config: configuration table\n--   .file: file for data\n--   .batch: batch size (default 16)\n--   .length: sample length (default 2048)\n--   .size: size of the quantization (default 256)\nfunction Data:_init(config)\n   local data = torch.load(config.file)\n   self.data = {code = data.index, code_value = data.content}\n   self.length = config.length or 2048\n   self.size = config.size or 256\n   self.batch = config.batch or 16\nend\n\nfunction Data:initSample(sample, label)\n   local sample = sample or torch.Tensor(self.batch, self.size, self.length)\n   local label = label or torch.Tensor(self.batch)\n   sample:zero()\n   return sample, label\nend\n\nfunction Data:index(sample, class, item)\n   local code, code_value = self.data.code, self.data.code_value\n   local position = 1\n\n   for field = 1, code[class][item]:size(1) do\n      -- Break if current position is larger than sample length\n      if position > sample:size(2) then\n         break\n      end\n      for char = 1, code[class][item][field][2] + 1 do\n         -- Break if current position is larger than sample length\n         if position > sample:size(2) then\n            break\n         end\n         local char_index = code[class][item][field][1] + char - 1\n         sample[code_value[char_index] + 1][position] = 1\n         position = position + 1\n      end\n   end\n\n   return sample\nend\n\nreturn Data\n"
  },
  {
    "path": "onehotnet/driver.lua",
    "content": "--[[\nDriver for OnehotNet training\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\n\nlocal parent = require('glyphnet/driver')\nlocal Driver = class(parent)\n\n-- Initialize variation\nfunction Driver:initVariation()\n   print('Driver using model variation '..self.variation)\n   self.options.model.onehot =\n      self.options.variation[self.variation].onehot\n   self.options.model.temporal = self.options.variation[self.variation].temporal\n\n   print('Driver adjusting data length to '..\n            self.options.variation[self.variation].length)\n   self.options.train_data.length =\n      self.options.variation[self.variation].length\n   self.options.test_data.length =\n      self.options.variation[self.variation].length\nend\n\n-- Visualize the model\nfunction Driver:visualizeModel()\n   local Visualizer = require('visualizer')\n   self.options.visualizer.title = 'Onehot model'\n   self.onehot_visualizer = self.onehot_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = 'Temporal model'\n   self.temporal_visualizer = self.temporal_visualizer or\n      Visualizer(self.options.visualizer)\n   self.options.visualizer.title = nil\n\n   self.onehot_visualizer:drawSequential(self.model.onehot)\n   self.temporal_visualizer:drawSequential(self.model.temporal)\nend\n\nreturn Driver\n"
  },
  {
    "path": "onehotnet/model.lua",
    "content": "--[[\nModel for OnehotNet\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal class = require('pl.class')\nlocal nn = require('nn')\n\nlocal parent = require('glyphnet/model')\n\nlocal Model = class(parent)\n\n-- Model constructor\n-- config: configuration table\n--   .onehot: configuration table of the onehot model\n--   .temporal: configuration table of the temporal model\n--   .file: the model file to load\n--   .cudnn: whether to use NVidia CUDNN\nfunction Model:_init(config)\n   -- Read or create model\n   if config.file then\n      local model = torch.load(config.file)\n      self.onehot = self:makeCleanSequential(model.onehot)\n      self.temporal = self:makeCleanSequential(model.temporal)\n   else\n      self.onehot = self:createCleanSequential(config.onehot)\n      self:initSequential(self.onehot)\n      self.temporal = self:createCleanSequential(config.temporal)\n      self:initSequential(self.temporal)\n   end\n\n   -- Saving configurations\n   self.cudnn = config.cudnn\n   self.config = config\n   self.tensortype = torch.getdefaulttensortype()\nend\n\nfunction Model:forward(input)\n   self.feature = self.onehot:forward(input)\n   self.output = self.temporal:forward(self.feature)\n   return self.output\nend\n\nfunction Model:backward(input, grad_output)\n   self.grad_feature = self.temporal:backward(self.feature, grad_output)\n   self.grad_input = self.onehot:backward(input, self.grad_feature)\n   return self.grad_input\nend\n\nfunction Model:getParameters()\n   return nn.Module.getParameters(self)\nend\n\nfunction Model:parameters()\n   local parameters, gradients = {}, {}\n\n   if not self.pretrain then\n      local onehot_parameters, onehot_gradients =\n         self.onehot:parameters()\n      for i = 1, #onehot_parameters do\n         parameters[#parameters + 1] = onehot_parameters[i]\n         gradients[#gradients + 1] = onehot_gradients[i]\n      end\n   end\n\n   local temporal_parameters, temporal_gradients = self.temporal:parameters()\n   
for i = 1, #temporal_parameters do\n      parameters[#parameters + 1] = temporal_parameters[i]\n      gradients[#gradients + 1] = temporal_gradients[i]\n   end\n\n   return parameters, gradients\nend\n\nfunction Model:type(tensortype)\n   if tensortype ~= nil and tensortype ~= self.tensortype then\n      if tensortype == 'torch.CudaTensor' then\n         require('cunn')\n         self.onehot = self:makeCudaSequential(self.onehot)\n         self.temporal = self:makeCudaSequential(self.temporal)\n      else\n         self.onehot = self:makeCleanSequential(self.onehot)\n         self.temporal = self:makeCleanSequential(self.temporal)\n      end\n      self.onehot:type(tensortype)\n      self.temporal:type(tensortype)\n      self.tensortype = tensortype\n   end\n\n   return self.tensortype\nend\n\nfunction Model:setMode(mode)\n   self:setModeSequential(self.onehot, mode)\n   self:setModeSequential(self.temporal, mode)\nend\n\n\nfunction Model:save(file)\n   local onehot = self:clearSequential(\n      self:makeCleanSequential(self.onehot))\n   local temporal = self:clearSequential(\n      self:makeCleanSequential(self.temporal))\n   torch.save(file, {onehot = onehot, temporal = temporal})\nend\n\nreturn Model\n"
  },
  {
    "path": "onehotnet/unittest/data.lua",
    "content": "--[[\nUnit test for OnehotNet data component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Data = require('data')\n\nlocal image = require('image')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.train_data.length = 2048\n   config.test_data.length = 2048\n\n   print('Creating testing data object')\n   local data = Data(config.test_data)\n\n   self.config = config\n   self.data = data\nend\n\nfunction joe:getBatchTest()\n   local data = self.data\n   print('Getting a batch')\n   local sample, label = data:getBatch()\n   local win = image.display{image = sample[1]:narrow(2, 1, 512), zoom = 3}\n   print('Getting a second batch')\n   sample, label = data:getBatch(sample, label)\n   win = image.display{\n      win = win, image = sample[1]:narrow(2, 1, 512), zoom = 3}\nend\n\nfunction joe:iteratorTest()\n   local data = self.data\n   local win\n   for sample, label, count in data:iterator() do\n      win = image.display{\n         win = win, image = sample[1]:narrow(2, 1, 512), zoom = 3}\n      io.write(count, ':')\n      for i = 1, count do\n         io.write(' ', label[i])\n      end\n      io.write('\\n')\n      io.flush()\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/driver.lua",
    "content": "--[[\nUnit test for OnehotNet driver component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Driver = require('driver')\n\n--  A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Creating driver')\n   config.train_data.file = 'data/dianping/unittest_string.t7b'\n   config.test_data.file = 'data/dianping/unittest_string.t7b'\n   config.driver.debug = true\n   config.driver.device = 3\n   config.driver.steps = 10\n   config.driver.epoches = 5\n   local driver = Driver(config, config.driver)\n\n   self.config = config\n   self.driver = driver\nend\n\nfunction joe:driverTest()\n   local driver = self.driver\n   print('Testing driver')\n   driver:run()\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/model.lua",
    "content": "--[[\nUnit Test for OnehotNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n\n   local model = Model(config.model)\n   print('Onehot model:')\n   print(model.onehot)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(2, 256, 2048)\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size())\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   
model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Onehot model')\n   print(loaded.onehot)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/model_cuda.lua",
    "content": "--[[\nUnit Test for OnehotNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   config.model.cudnn = false\n   local model = Model(config.model)\n   model:cuda()\n   print('Onehot model:')\n   print(model.onehot)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(2, 256, 2048):cuda()\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size()):cuda()\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m 
in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Onehot model')\n   print(loaded.onehot)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/model_cudnn.lua",
    "content": "--[[\nUnit Test for OnehotNet model\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Model = require('model')\n\nlocal cutorch = require('cutorch')\nlocal sys = require('sys')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   config.model.cudnn = true\n   local model = Model(config.model)\n   model:cuda()\n   print('Onehot model:')\n   print(model.onehot)\n   print('Temporal model:')\n   print(model.temporal)\n\n   self.config = config\n   self.model = model\nend\n\nfunction joe:modelTest()\n   local model = self.model\n\n   local params, grads = model:getParameters()\n   grads:zero()\n   print('Number of elements in parameters and gradients: '..\n            params:nElement()..', '..grads:nElement())\n\n   print('Creating input')\n   local input = torch.rand(2, 256, 2048):cuda()\n   print(input:size())\n\n   print('Forward propagating')\n   sys.tic()\n   local output = model:forward(input)\n   sys.toc(true)\n   print(output:size())\n\n   print('Creating output gradients')\n   local grad_output = torch.rand(output:size()):cuda()\n   print(grad_output:size())\n\n   print('Backward propagating')\n   sys.tic()\n   local grad_input = model:backward(input, grad_output)\n   sys.toc(true)\n   print(grad_input:size())\nend\n\nfunction joe:modeTest()\n   local model = self.model\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m 
in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to train')\n   model:setModeTrain()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\n\n   print('Setting model to test')\n   model:setModeTest()\n   for i, m in ipairs(model.temporal.modules) do\n      if torch.type(m) == 'nn.Dropout' then\n         print(i, torch.type(m), m.train)\n      end\n   end\nend\n\nfunction joe:saveTest()\n   local model = self.model\n   print('Saving to /tmp/model.t7b')\n   model:save('/tmp/model.t7b')\n\n   print('Loading from /tmp/model.t7b')\n   local config = self.config\n   config.model.file = '/tmp/model.t7b'\n   local loaded = Model(config.model)\n\n   print('Onehot model')\n   print(loaded.onehot)\n   print('Temporal model')\n   print(loaded.temporal)\n\n   config.model.file = nil\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/test.lua",
    "content": "--[[\nUnit test for OnehotNet test component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/test_cuda.lua",
    "content": "--[[\nUnit test for OnehotNet test component\nCopyright 2016 Xiang Zhang\n--]]\n\nlocal Test = require('test')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create tester')\n   local test = Test(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.test = test\n   self.config = config\nend\n\nfunction joe:testTest()\n   local test = self.test\n   local callback = self:callback()\n\n   print('Running tests')\n   test:run(callback)\nend\n\nfunction joe:callback()\n   return function (test, i)\n      print('cnt: '..test.total_count..', err: '..test.total_error..\n               ', lss: '..test.total_objective..', obj: '..test.objective..\n               ', crr: '..test.error..', dat: '..test.time.data..\n               ', fwd: '..test.time.forward..', upd: '..test.time.update)\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/train.lua",
    "content": "--[[\nUnit test for OnehotNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   config.test_data.batch = 2\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   print('Create trainer')\n   for i, v in pairs(config.train.rates) do\n      config.train.rates[i] = v * config.driver.rate\n   end\n   local train = Train(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 100 steps')\n   train:run(100, callback)\nend\n\nfunction joe:callback()\n   self.time = os.time()\n   return function (train, i)\n      if os.difftime(os.time(), self.time) >= 5 then\n         print('stp: '..train.step..', rat: '..train.rate..\n                  ', err: '..train.error..', obj: '..train.objective..\n                  ', dat: '..train.time.data..', fwd: '..train.time.forward..\n                  ', bwd: '..train.time.backward..', upd: '..train.time.update)\n         self.time = os.time()\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "onehotnet/unittest/train_cuda.lua",
    "content": "--[[\nUnit test for OnehotNet train component\nCopyright 2015-2016 Xiang Zhang\n--]]\n\nlocal Train = require('train')\n\nlocal cutorch = require('cutorch')\nlocal nn = require('nn')\nlocal os = require('os')\n\nlocal Data = require('data')\nlocal Model = require('model')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   if joe.init then\n      print('Initializing testing environment')\n      joe:init()\n   end\n   for name, func in pairs(joe) do\n      if type(name) == 'string' and type(func) == 'function'\n      and name:match('[%g]+Test') then\n         print('\\nExecuting '..name)\n         func(joe)\n      end\n   end\nend\n\nfunction joe:init()\n   local config = dofile('config.lua')\n   print('Setting device to '..config.driver.device)\n   cutorch.setDevice(config.driver.device)\n   print('Creating data')\n   config.test_data.length = config.variation['large'].length\n   local data = Data(config.test_data)\n   print('Create model')\n   config.model.onehot = config.variation['large'].onehot\n   config.model.temporal = config.variation['large'].temporal\n   local model = Model(config.model)\n   model:cuda()\n   print('Create loss')\n   local loss = nn[config.driver.loss:sub(4)]()\n   loss:cuda()\n   print('Create trainer')\n   for i, v in pairs(config.train.rates) do\n      config.train.rates[i] = v * config.driver.rate\n   end\n   local train = Train(data, model, loss, config.train)\n\n   self.data = data\n   self.model = model\n   self.loss = loss\n   self.train = train\n   self.config = config\nend\n\nfunction joe:trainTest()\n   local train = self.train\n   local callback = self:callback()\n\n   print('Running for 100000 steps')\n   train:run(100000, callback)\nend\n\nfunction joe:callback()\n   self.time = os.time()\n   return function (train, i)\n      if os.difftime(os.time(), self.time) >= 5 then\n         print('stp: '..train.step..', rat: '..train.rate..\n                  ', err: '..train.error..', obj: 
'..train.objective..\n                  ', dat: '..train.time.data..', fwd: '..train.time.forward..\n                  ', bwd: '..train.time.backward..', upd: '..train.time.update)\n         self.time = os.time()\n      end\n   end\nend\n\njoe.main()\nreturn joe\n"
  },
  {
    "path": "unifont/createunifont.lua",
    "content": "--[[\nCreate unifont database from png file\nCopyright 2015 Xiang Zhang\n\nUsage: qlua createunifont.lua [input] [output] [row] [startx] [starty] [width] [height] [num]\n--]]\n\nlocal image = require('image')\nlocal io = require('io')\nlocal math = require('math')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1] or 'unifont/unifont-8.0.01.png'\n   local output = arg[2] or 'unifont/unifont-8.0.01.t7b'\n   local row = arg[3] and tonumber(arg[3]) or 256\n   local startx = arg[4] and tonumber(arg[4]) or 33\n   local starty = arg[5] and tonumber(arg[5]) or 65\n   local width = arg[6] and tonumber(arg[6]) or 16\n   local height = arg[7] and tonumber(arg[7]) or width\n   local num = arg[8] and tonumber(arg[8]) or 65536\n\n   print('Loading data from '..input)\n   local im = image.load(input)\n   im = im[1]:double():mul(-1):add(1)\n   local data = torch.Tensor(num, height, width)\n\n   for i = 1, num do\n      local x = startx + math.fmod(i - 1, row) * width\n      local y = starty + math.floor((i - 1)/row) * height\n      data[i]:copy(im[{{y, y + height - 1},{x, x + width - 1}}])\n      if math.fmod(i, 1000) == 0 then\n         io.write('\\rProcessing character: '..i..'/'..num)\n         joe.win = image.display({image = data[i], win = joe.win, zoom = 8})\n      end\n   end\n   joe.win = image.display({image = data[num], win = joe.win, zoom = 8})\n   print('\\rProcessed characters: '..num..'/'..num)\n\n   print('Saving to '..output)\n   torch.save(output, data)\nend\n\njoe.main()\n"
  },
  {
    "path": "unifont/unifont/README.txt",
    "content": "This directory contains GNU Unifont data\n"
  },
  {
    "path": "unifont/visualize.lua",
    "content": "--[[\nVisualizing argument string using GNU Unifont\nCopyright 2015 Xiang Zhang\n\nUsage: qlua visualize.lua [string] [unifont]\n--]]\n\nlocal bit32 = require('bit32')\nlocal image = require('image')\nlocal torch = require('torch')\n\n-- A Logic Named Joe\nlocal joe = {}\n\nfunction joe.main()\n   local input = arg[1]\n   local unifont = arg[2] or 'unifont/unifont-8.0.01.t7b'\n\n   print('Loading unifont from '..unifont)\n   local data = torch.load(unifont)\n   local sequence = joe.utf8to32(input)\n   local im = torch.Tensor(data:size(2), data:size(3) * #sequence)\n   for i, c in ipairs(sequence) do\n      im:narrow(2, 1 + (i-1)*data:size(3), data:size(3)):copy(data[c + 1])\n   end\n   print('Visualizing')\n   image.display({image = im, zoom = 4})\nend\n\n-- Ref: http://lua-users.org/wiki/LuaUnicode\nfunction joe.utf8to32(utf8str)\n   assert(type(utf8str) == \"string\")\n   local res, seq, val = {}, 0, nil\n   for i = 1, #utf8str do\n      local c = string.byte(utf8str, i)\n      if seq == 0 then\n\t table.insert(res, val)\n\t seq = c < 0x80 and 1 or c < 0xE0 and 2 or c < 0xF0 and 3 or\n\t    c < 0xF8 and 4 or --c < 0xFC and 5 or c < 0xFE and 6 or\n\t    error(\"invalid UTF-8 character sequence\")\n\t val = bit32.band(c, 2^(8-seq) - 1)\n      else\n\t val = bit32.bor(bit32.lshift(val, 6), bit32.band(c, 0x3F))\n      end\n      seq = seq - 1\n   end\n   table.insert(res, val)\n   table.insert(res, 0)\n   return res\nend\n\njoe.main()\n"
  }
]