[
  {
    "path": ".gitattributes",
    "content": "*.cpp linguist-language=c++\n*.h linguist-language=c++"
  },
  {
    "path": "Makefile",
    "content": "\nCLA=clang++\nCXX=g++\nCXXFLAGS=-std=c++11 -Ofast -DFINAL_CHECK -DSPECIAL_HP -fpermissive\nDEPS=src/beam_cky_parser.cc src/beam_cky_parser.h src/backtrace_iter.cc src/Utils/reader.h src/Utils/network.h src/Utils/codon.h src/Utils/utility_v.h src/Utils/common.h src/Utils/base.h \nBIN=bin/LinearDesign_2D \nUNAME_S := $(shell uname -s)\nUNAME_M := $(shell uname -m)\n\nlineardesign_2D: $(DEPS)\n\t@echo \"Compiling\" $@ \"from\" $< \"...\"\n\tchmod +x lineardesign\n\tmkdir -p ./bin\n\texport LD_LIBRARY_PATH=.:$LD_LIBRARY_PATH\n\nifeq ($(UNAME_S), Linux)\n\tif $(CXX) $(CXXFLAGS) src/linear_design.cpp -o bin/LinearDesign_2D src/Utils/libraries/LinearDesign_linux64.so; then \\\n\t\techo \"Linux system; compiled with g++; finished.\"; \\\n\t\techo \"Compilation Succeed!\"; \\\n\telse \\\n\t\techo \"Try another .so file.\"; \\\n\t\tif $(CXX) $(CXXFLAGS) src/linear_design.cpp -o bin/LinearDesign_2D src/Utils/libraries/LinearDesign_linux64_old.so; then \\\n\t\t\techo \"Linux system; compiled with g++; finished.\"; \\\n\t\t\techo \"Compilation Succeed!\"; \\\n\t\telse \\\n\t\t\techo \"Compilation failed! Make sure it is either Linux-64 or Mac.\"; \\\n\t\tfi \\\n\tfi \nelse\n\tif [[ $(UNAME_M) == 'arm64' ]]; then \\\n\t\tif \t$(CLA) $(CXXFLAGS) src/linear_design.cpp -o bin/LinearDesign_2D src/Utils/libraries/LinearDesign_Mac_M1.so; then \\\n\t\t\techo \"Mac M1 system; compiled with clang++; finished.\"; \\\n\t\t\techo \"Compilation Succeed!\"; \\\n\t\t\techo \"You may encounter a pop-up message at the first run. If so, please go to System Preferences -> Security & Privacy -> General to allow LinearDesign_Mac_M1.so to open. See README.md for details.\"; \\\n\t\telse \\\n\t\t\techo \"Compilation failed! 
Make sure it is either Linux-64 or Mac.\"; \\\n\t\tfi \\\n\telse \\\n\t\tif \t$(CLA) $(CXXFLAGS) src/linear_design.cpp -o bin/LinearDesign_2D src/Utils/libraries/LinearDesign_Mac_x86.so; then \\\n\t\t\techo \"Mac x86_64 system; compiled with clang++; finished.\"; \\\n\t\t\techo \"Compilation Succeed!\"; \\\n\t\t\techo \"You may encounter a pop-up message at the first run. If so, please go to System Preferences -> Security & Privacy -> General to allow LinearDesign_Mac_x86.so to open. See README.md for details.\"; \\\n\t\telse \\\n\t\t\techo \"Compilation failed! Make sure it is either Linux-64 or Mac.\"; \\\n\t\tfi \\\n\tfi\nendif\n\n\n.PHONY : clean\t\n\nclean:\n\trm -f $(BIN)\n"
  },
  {
    "path": "README.md",
    "content": "<img src=\"/pic/baidu_research_logo.jpg\"  width=\"40%\" alt=\"Baidu Research Logo\">\n\n#  Algorithm for Optimized mRNA Design Improves Stability and Immunogenicity (LinearDesign)\n![GitHub all releases](https://img.shields.io/github/downloads/LinearDesignSoftware/LinearDesign/total)\n\n\nThis repository contains the source code for the LinearDesign project.\n\nHe Zhang†, Liang Zhang†, Ang Lin†, Congcong Xu†, Ziyu Li, Kaibo Liu, Boxiang Liu, Xiaopin Ma, Fanfan Zhao, Huiling Jiang, Chunxiu Chen, Haifa Shen, Hangwen Li*, David H. Mathews*, Yujian Zhang*, Liang Huang†*<sup>#</sup>. Algorithm for Optimized mRNA Design Improves Stability and Immunogenicity. Nature [https://doi.org/10.1038/s41586-023-06127-z](https://doi.org/10.1038/s41586-023-06127-z) (2023)\n\n† contributed equally, \n\\* corresponding authors, \n<sup>#</sup> lead corresponding author\n\nFor questions, please contact the lead corresponding author at <liang.huang.sh@gmail.com>.\n\n## Dependencies\nClang 11.0.0 (or above) or GCC 4.8.5 (or above)\n\npython2.7\n\n## To Compile\n```\nmake\n```\n\n## To Run\nThe LinearDesign program can be run with:\n```\necho SEQUENCE | ./lineardesign [OPTIONS]\n\nOR\n\ncat FASTA_FILE | ./lineardesign [OPTIONS]\n```\n\nOPTIONS:\n```\n--lambda LAMBDA or -l LAMBDA\n```\nSet LAMBDA, a hyperparameter balancing MFE and CAI. (default 0.0)\n```\n--codonusage FILE_NAME or -c FILE_NAME\n```\nImport a Codon Usage Frequency Table. See \"codon_usage_freq_table_human.csv\" for the format.\n(default: using human codon usage frequency table)\n```\n--verbose or -v\n```\nPrint out more details. 
(default False)\n\nOn macOS, users may encounter a pop-up message on the first run.\nOn Apple M1 systems, the message is:\n```\n\"LinearDesign_Mac_M1.so\" can't be opened because Apple cannot check it for malicious software.\n```\nOn Intel (x86_64) Mac systems, the message is:\n```\n\"LinearDesign_Mac_x86.so\" cannot be opened because it is from an unidentified developer.\n```\nIf so, please go to \"System Preferences -> Security & Privacy -> General\" to allow LinearDesign_Mac_M1.so (or LinearDesign_Mac_x86.so) to open.\n\n## Example: Single Sequence Design\n```\necho MNDTEAI | ./lineardesign\nmRNA sequence:  AUGAACGAUACGGAGGCGAUC\nmRNA structure: ......(((.((....)))))\nmRNA folding free energy: -1.10 kcal/mol; mRNA CAI: 0.695\n```\n\n## Example: Multiple Sequences Design with Option --lambda (-l)\n```\ncat testseq | ./lineardesign --lambda 3\n>seq1\nmRNA sequence:  AUGCCAAACACCCUGGCAUGCCCC\nmRNA structure: ((((((.......)))))).....\nmRNA folding free energy: -6.00 kcal/mol; mRNA CAI: 0.910\n\n>seq2\nmRNA sequence:  AUGCUGGAUCAGGUGAACAAGCUGAAGUACCCAGAGGUGAGCCUGACCUGA\nmRNA structure: .....((.((((((..((...(((.......)))..))..))))))))...\nmRNA folding free energy: -13.50 kcal/mol; mRNA CAI: 0.979\n```\n\n## Example: Option --codonusage (-c)\n```\necho MNDTEAI | ./lineardesign -l 0.3 --codonusage codon_usage_freq_table_yeast.csv\nmRNA sequence:  AUGAAUGAUACGGAAGCGAUC\nmRNA structure: ......(((.((....)))))\nmRNA folding free energy: -1.10 kcal/mol; mRNA CAI: 0.670\n```\n\n## Example: Option --verbose (-v)\n```\necho MNDTEAI | ./lineardesign --verbose\nInput protein: MNDTEAI\nUsing lambda = 0; Using codon frequency table = codon_usage_freq_table_human.csv\nmRNA sequence:  AUGAACGAUACGGAGGCGAUC\nmRNA structure: ......(((.((....)))))\nmRNA folding free energy: -1.10 kcal/mol; mRNA CAI: 0.695\nRuntime: 0.002 seconds\n```\n\n\n## Declarations\nBaidu Research has filed a patent for the LinearDesign algorithm that lists He Zhang, Liang Zhang, Ziyu Li, Kaibo Liu, Boxiang Liu, and 
Liang Huang as inventors.\n"
  },
  {
    "path": "coding_wheel.txt",
    "content": "Phe\tU U CU\nLeu\tC U GCUA\tU U GA\t\nSer\tU C GCUA\tA G CU\nTyr\tU A CU\nSTOP\tU A GA\tU G A\nCys\tU G CU\nTrp\tU G G\nPro\tC C GCUA\nHis\tC A CU\nGln\tC A GA\nArg\tC G GCUA\tA G GA\nIle\tA U CUA\nMet\tA U G\nThr\tA C GCUA\nAsn\tA A CU\nLys\tA A GA\nVal\tG U GCUA\nAsp\tG A CU\nGlu\tG A GA\nGly\tG G GCUA\nAla\tG C GCUA"
  },
  {
    "path": "codon_usage_freq_table_human.csv",
    "content": "﻿#,,\r\nUAA,*,0.28\r\nUAG,*,0.2\r\nUGA,*,0.52\r\nGCU,A,0.26\r\nGCC,A,0.4\r\nGCA,A,0.23\r\nGCG,A,0.11\r\nUGU,C,0.45\r\nUGC,C,0.55\r\nGAU,D,0.46\r\nGAC,D,0.54\r\nGAA,E,0.42\r\nGAG,E,0.58\r\nUUU,F,0.45\r\nUUC,F,0.55\r\nGGU,G,0.16\r\nGGC,G,0.34\r\nGGA,G,0.25\r\nGGG,G,0.25\r\nCAU,H,0.41\r\nCAC,H,0.59\r\nAUU,I,0.36\r\nAUC,I,0.48\r\nAUA,I,0.16\r\nAAA,K,0.42\r\nAAG,K,0.58\r\nUUA,L,0.07\r\nUUG,L,0.13\r\nCUU,L,0.13\r\nCUC,L,0.2\r\nCUA,L,0.07\r\nCUG,L,0.41\r\nAUG,M,1\r\nAAU,N,0.46\r\nAAC,N,0.54\r\nCCU,P,0.28\r\nCCC,P,0.33\r\nCCA,P,0.27\r\nCCG,P,0.11\r\nCAA,Q,0.25\r\nCAG,Q,0.75\r\nCGU,R,0.08\r\nCGC,R,0.19\r\nCGA,R,0.11\r\nCGG,R,0.21\r\nAGA,R,0.2\r\nAGG,R,0.2\r\nUCU,S,0.18\r\nUCC,S,0.22\r\nUCA,S,0.15\r\nUCG,S,0.06\r\nAGU,S,0.15\r\nAGC,S,0.24\r\nACU,T,0.24\r\nACC,T,0.36\r\nACA,T,0.28\r\nACG,T,0.12\r\nGUU,V,0.18\r\nGUC,V,0.24\r\nGUA,V,0.11\r\nGUG,V,0.47\r\nUGG,W,1\r\nUAU,Y,0.43\r\nUAC,Y,0.57\r\n"
  },
  {
    "path": "codon_usage_freq_table_yeast.csv",
    "content": "﻿#,,\r\nUAA,*,0.48\r\nUAG,*,0.24\r\nUGA,*,0.29\r\nGCU,A,0.38\r\nGCC,A,0.22\r\nGCA,A,0.29\r\nGCG,A,0.11\r\nUGU,C,0.63\r\nUGC,C,0.37\r\nGAU,D,0.65\r\nGAC,D,0.35\r\nGAA,E,0.71\r\nGAG,E,0.29\r\nUUU,F,0.59\r\nUUC,F,0.41\r\nGGU,G,0.47\r\nGGC,G,0.19\r\nGGA,G,0.22\r\nGGG,G,0.12\r\nCAU,H,0.64\r\nCAC,H,0.36\r\nAUU,I,0.46\r\nAUC,I,0.26\r\nAUA,I,0.27\r\nAAA,K,0.58\r\nAAG,K,0.42\r\nUUA,L,0.28\r\nUUG,L,0.29\r\nCUU,L,0.13\r\nCUC,L,0.06\r\nCUA,L,0.14\r\nCUG,L,0.11\r\nAUG,M,1\r\nAAU,N,0.59\r\nAAC,N,0.41\r\nCCU,P,0.31\r\nCCC,P,0.15\r\nCCA,P,0.41\r\nCCG,P,0.12\r\nCAA,Q,0.69\r\nCAG,Q,0.31\r\nCGU,R,0.15\r\nCGC,R,0.06\r\nCGA,R,0.07\r\nCGG,R,0.04\r\nAGA,R,0.48\r\nAGG,R,0.21\r\nUCU,S,0.26\r\nUCC,S,0.16\r\nUCA,S,0.21\r\nUCG,S,0.1\r\nAGU,S,0.16\r\nAGC,S,0.11\r\nACU,T,0.35\r\nACC,T,0.22\r\nACA,T,0.3\r\nACG,T,0.13\r\nGUU,V,0.39\r\nGUC,V,0.21\r\nGUA,V,0.21\r\nGUG,V,0.19\r\nUGG,W,1\r\nUAU,Y,0.56\r\nUAC,Y,0.44\r\n"
  },
  {
    "path": "gflags.py",
    "content": "#!/usr/bin/env python\n\n# Copyright (c) 2007, Google Inc.\n# All rights reserved.\n#\n# Redistribution and use in source and binary forms, with or without\n# modification, are permitted provided that the following conditions are\n# met:\n#\n#     * Redistributions of source code must retain the above copyright\n# notice, this list of conditions and the following disclaimer.\n#     * Redistributions in binary form must reproduce the above\n# copyright notice, this list of conditions and the following disclaimer\n# in the documentation and/or other materials provided with the\n# distribution.\n#     * Neither the name of Google Inc. nor the names of its\n# contributors may be used to endorse or promote products derived from\n# this software without specific prior written permission.\n#\n# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS\n# \"AS IS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT\n# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR\n# A PARTICULAR PURPOSE ARE DISCLAIMED. 
IN NO EVENT SHALL THE COPYRIGHT\n# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,\n# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT\n# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,\n# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY\n# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT\n# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE\n# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n#\n# ---\n# Author: Chad Lester\n# Design and style contributions by:\n#   Amit Patel, Bogdan Cocosel, Daniel Dulitz, Eric Tiedemann,\n#   Eric Veach, Laurence Gonsalves, Matthew Springer\n# Code reorganized a bit by Craig Silverstein\n\n\"\"\"This module is used to define and parse command line flags.\n\nThis module defines a *distributed* flag-definition policy: rather than\nan application having to define all flags in or near main(), each python\nmodule defines flags that are useful to it.  When one python module\nimports another, it gains access to the other's flags.  (This is\nimplemented by having all modules share a common, global registry object\ncontaining all the flag information.)\n\nFlags are defined through the use of one of the DEFINE_xxx functions.\nThe specific function used determines how the flag is parsed, checked,\nand optionally type-converted, when it's seen on the command line.\n\n\nIMPLEMENTATION: DEFINE_* creates a 'Flag' object and registers it with a\n'FlagValues' object (typically the global FlagValues FLAGS, defined\nhere).  The 'FlagValues' object can scan the command line arguments and\npass flag arguments to the corresponding 'Flag' objects for\nvalue-checking and type conversion.  The converted flag values are\navailable as attributes of the 'FlagValues' object.\n\nCode can access the flag through a FlagValues object, for instance\ngflags.FLAGS.myflag.  
Typically, the __main__ module passes the\ncommand line arguments to gflags.FLAGS for parsing.\n\nAt bottom, this module calls getopt(), so getopt functionality is\nsupported, including short- and long-style flags, and the use of -- to\nterminate flags.\n\nMethods defined by the flag module will throw 'FlagsError' exceptions.\nThe exception argument will be a human-readable string.\n\n\nFLAG TYPES: This is a list of the DEFINE_*'s that you can do.  All flags\ntake a name, default value, help-string, and optional 'short' name\n(one-letter name).  Some flags have other arguments, which are described\nwith the flag.\n\nDEFINE_string: takes any input, and interprets it as a string.\n\nDEFINE_bool or\nDEFINE_boolean: typically does not take an argument: say --myflag to\n                set FLAGS.myflag to true, or --nomyflag to set\n                FLAGS.myflag to false.  Alternately, you can say\n                   --myflag=true  or --myflag=t or --myflag=1  or\n                   --myflag=false or --myflag=f or --myflag=0\n\nDEFINE_float: takes an input and interprets it as a floating point\n              number.  Takes optional args lower_bound and upper_bound;\n              if the number specified on the command line is out of\n              range, it will raise a FlagError.\n\nDEFINE_integer: takes an input and interprets it as an integer.  Takes\n                optional args lower_bound and upper_bound as for floats.\n\nDEFINE_enum: takes a list of strings which represents legal values.  If\n             the command-line value is not in this list, raise a flag\n             error.  Otherwise, assign to FLAGS.flag as a string.\n\nDEFINE_list: Takes a comma-separated list of strings on the commandline.\n             Stores them in a python list object.\n\nDEFINE_spaceseplist: Takes a space-separated list of strings on the\n                     commandline.  
Stores them in a python list object.\n                     Example: --myspacesepflag \"foo bar baz\"\n\nDEFINE_multistring: The same as DEFINE_string, except the flag can be\n                    specified more than once on the commandline.  The\n                    result is a python list object (list of strings),\n                    even if the flag is only on the command line once.\n\nDEFINE_multi_int: The same as DEFINE_integer, except the flag can be\n                  specified more than once on the commandline.  The\n                  result is a python list object (list of ints), even if\n                  the flag is only on the command line once.\n\n\nSPECIAL FLAGS: There are a few flags that have special meaning:\n   --help          prints a list of all the flags in a human-readable fashion\n   --helpshort     prints a list of all key flags (see below).\n   --helpxml       prints a list of all flags, in XML format.  DO NOT parse\n                   the output of --help and --helpshort.  Instead, parse\n                   the output of --helpxml.  As we add new flags, we may\n                   add new XML elements.  Hence, make sure your parser\n                   does not crash when it encounters new XML elements.\n   --flagfile=foo  read flags from foo.\n   --undefok=f1,f2 ignore unrecognized option errors for f1,f2.\n                   For boolean flags, you should use --undefok=boolflag, and\n                   --boolflag and --noboolflag will be accepted.  
Do not use\n                   --undefok=noboolflag.\n   --              as in getopt(), terminates flag-processing\n\n\nNOTE ON --flagfile:\n\nFlags may be loaded from text files in addition to being specified on\nthe commandline.\n\nAny flags you don't feel like typing, throw them in a file, one flag per\nline, for instance:\n   --myflag=myvalue\n   --nomyboolean_flag\nYou then specify your file with the special flag '--flagfile=somefile'.\nYou CAN recursively nest flagfile= tokens OR use multiple files on the\ncommand line.  Lines beginning with a single hash '#' or a double slash\n'//' are comments in your flagfile.\n\nAny flagfile=<file> will be interpreted as having a relative path from\nthe current working directory rather than from the place the file was\nincluded from:\n   myPythonScript.py --flagfile=config/somefile.cfg\n\nIf somefile.cfg includes further --flagfile= directives, these will be\nreferenced relative to the original CWD, not from the directory the\nincluding flagfile was found in!\n\nThe caveat applies to people who are including a series of nested files\nin a different dir than they are executing out of.  Relative path names\nare always from CWD, not from the directory of the parent include\nflagfile. We do now support '~' expanded directory names.\n\nAbsolute path names ALWAYS work!\n\n\nEXAMPLE USAGE:\n\n  import gflags\n  FLAGS = gflags.FLAGS\n\n  # Flag names are globally defined!  So in general, we need to be\n  # careful to pick names that are unlikely to be used by other libraries.\n  # If there is a conflict, we'll get an error at import time.\n  gflags.DEFINE_string('name', 'Mr. 
President', 'your name')\n  gflags.DEFINE_integer('age', None, 'your age in years', lower_bound=0)\n  gflags.DEFINE_boolean('debug', False, 'produces debugging output')\n  gflags.DEFINE_enum('gender', 'male', ['male', 'female'], 'your gender')\n\n  def main(argv):\n    try:\n      argv = FLAGS(argv)  # parse flags\n    except gflags.FlagsError, e:\n      print '%s\\\\nUsage: %s ARGS\\\\n%s' % (e, sys.argv[0], FLAGS)\n      sys.exit(1)\n    if FLAGS.debug: print 'non-flag arguments:', argv\n    print 'Happy Birthday', FLAGS.name\n    if FLAGS.age is not None:\n      print 'You are a %s, who is %d years old' % (FLAGS.gender, FLAGS.age)\n\n  if __name__ == '__main__':\n    main(sys.argv)\n\n\nKEY FLAGS:\n\nAs we already explained, each module gains access to all flags defined\nby all the other modules it transitively imports.  In the case of\nnon-trivial scripts, this means a lot of flags ...  For documentation\npurposes, it is good to identify the flags that are key (i.e., really\nimportant) to a module.  Clearly, the concept of \"key flag\" is a\nsubjective one.  When trying to determine whether a flag is key to a\nmodule or not, assume that you are trying to explain your module to a\npotential user: which flags would you really like to mention first?\n\nWe'll describe shortly how to declare which flags are key to a module.\nFor the moment, assume we know the set of key flags for each module.\nThen, if you use the app.py module, you can use the --helpshort flag to\nprint only the help for the flags that are key to the main module, in a\nhuman-readable format.\n\nNOTE: If you need to parse the flag help, do NOT use the output of\n--help / --helpshort.  That output is meant for human consumption, and\nmay be changed in the future.  Instead, use --helpxml; flags that are\nkey for the main module are marked there with a <key>yes</key> element.\n\nThe set of key flags for a module M is composed of:\n\n1. Flags defined by module M by calling a DEFINE_* function.\n\n2. 
Flags that module M explicitly declares as key by using the function\n\n     DECLARE_key_flag(<flag_name>)\n\n3. Key flags of other modules that M specifies by using the function\n\n     ADOPT_module_key_flags(<other_module>)\n\n   This is a \"bulk\" declaration of key flags: each flag that is key for\n   <other_module> becomes key for the current module too.\n\nNotice that if you do not use the functions described at points 2 and 3\nabove, then --helpshort prints information only about the flags defined\nby the main module of our script.  In many cases, this behavior is good\nenough.  But if you move part of the main module code (together with the\nrelated flags) into a different module, then it is nice to use\nDECLARE_key_flag / ADOPT_module_key_flags and make sure --helpshort\nlists all relevant flags (otherwise, your code refactoring may confuse\nyour users).\n\nNote: each of DECLARE_key_flag / ADOPT_module_key_flags has its own\npluses and minuses: DECLARE_key_flag is more targeted and may lead to\nmore focused --helpshort documentation.  ADOPT_module_key_flags is good\nfor cases when an entire module is considered key to the current script.\nAlso, it does not require updates to client scripts when a new flag is\nadded to the module.\n\n\nEXAMPLE USAGE 2 (WITH KEY FLAGS):\n\nConsider an application that contains the following three files (two\nauxiliary modules and a main module):\n\nFile libfoo.py:\n\n  import gflags\n\n  gflags.DEFINE_integer('num_replicas', 3, 'Number of replicas to start')\n  gflags.DEFINE_boolean('rpc2', True, 'Turn on the usage of RPC2.')\n\n  ... 
some code ...\n\nFile libbar.py:\n\n  import gflags\n\n  gflags.DEFINE_string('bar_gfs_path', '/gfs/path',\n                       'Path to the GFS files for libbar.')\n  gflags.DEFINE_string('email_for_bar_errors', 'bar-team@google.com',\n                       'Email address for bug reports about module libbar.')\n  gflags.DEFINE_boolean('bar_risky_hack', False,\n                        'Turn on an experimental and buggy optimization.')\n\n  ... some code ...\n\nFile myscript.py:\n\n  import gflags\n  import libfoo\n  import libbar\n\n  gflags.DEFINE_integer('num_iterations', 0, 'Number of iterations.')\n\n  # Declare that all flags that are key for libfoo are\n  # key for this module too.\n  gflags.ADOPT_module_key_flags(libfoo)\n\n  # Declare that the flag --bar_gfs_path (defined in libbar) is key\n  # for this module.\n  gflags.DECLARE_key_flag('bar_gfs_path')\n\n  ... some code ...\n\nWhen myscript is invoked with the flag --helpshort, the resulting help\nmessage lists information about all the key flags for myscript:\n--num_iterations, --num_replicas, --rpc2, and --bar_gfs_path (in\naddition to the special flags --help and --helpshort).\n\nOf course, myscript uses all the flags declared by it (in this case,\njust --num_iterations) or by any of the modules it transitively imports\n(e.g., the modules libfoo, libbar).  E.g., it can access the value of\nFLAGS.bar_risky_hack, even if --bar_risky_hack is not declared as a key\nflag for myscript.\n\"\"\"\n\n#import cgi\nimport getopt\nimport os\nimport re\nimport string\nimport sys\n\n# Are we running at least python 2.2?\ntry:\n  if tuple(sys.version_info[:3]) < (2,2,0):\n    raise NotImplementedError(\"requires python 2.2.0 or later\")\nexcept AttributeError:   # a very old python, that lacks sys.version_info\n  raise NotImplementedError(\"requires python 2.2.0 or later\")\n\n# If we're not running at least python 2.2.1, define True, False, and bool.
\n# Thanks, Guido, for the code.                                                  \ntry:\n  True, False, bool\nexcept NameError:\n  False = 0\n  True = 1\n  def bool(x):\n    if x:\n      return True\n    else:\n      return False\n\n# Are we running under pychecker?\n_RUNNING_PYCHECKER = 'pychecker.python' in sys.modules\n\n\ndef _GetCallingModule():\n  \"\"\"Returns the name of the module that's calling into this module.\n\n  We generally use this function to get the name of the module calling a\n  DEFINE_foo... function.\n  \"\"\"\n  # Walk down the stack to find the first globals dict that's not ours.\n  for depth in range(1, sys.getrecursionlimit()):\n    if not sys._getframe(depth).f_globals is globals():\n      module_name = __GetModuleName(sys._getframe(depth).f_globals)\n      if module_name is not None:\n        return module_name\n  raise AssertionError(\"No module was found\")\n\n\n# module exceptions:\nclass FlagsError(Exception):\n  \"\"\"The base class for all flags errors.\"\"\"\n  pass\n\n\nclass DuplicateFlag(FlagsError):\n  \"\"\"Raised if there is a flag naming conflict.\"\"\"\n  pass\n\n\n# A DuplicateFlagError conveys more information than a\n# DuplicateFlag. 
Since there are external modules that create\n# DuplicateFlags, the interface to DuplicateFlag shouldn't change.\nclass DuplicateFlagError(DuplicateFlag):\n\n  def __init__(self, flagname, flag_values):\n    self.flagname = flagname\n    message = \"The flag '%s' is defined twice.\" % self.flagname\n    flags_by_module = flag_values.FlagsByModuleDict()\n    for module in flags_by_module:\n      for flag in flags_by_module[module]:\n        if flag.name == flagname or flag.short_name == flagname:\n          message = message + \" First from \" + module + \",\"\n          break\n    message = message + \" Second from \" + _GetCallingModule()\n    DuplicateFlag.__init__(self, message)\n\n\nclass IllegalFlagValue(FlagsError):\n  \"\"\"The flag command line argument is illegal.\"\"\"\n  pass\n\n\nclass UnrecognizedFlag(FlagsError):\n  \"\"\"Raised if a flag is unrecognized.\"\"\"\n  pass\n\n\n# An UnrecognizedFlagError conveys more information than an\n# UnrecognizedFlag. Since there are external modules that create\n# UnrecognizedFlags, the interface to UnrecognizedFlag shouldn't change.\nclass UnrecognizedFlagError(UnrecognizedFlag):\n  def __init__(self, flagname):\n    self.flagname = flagname\n    UnrecognizedFlag.__init__(\n        self, \"Unknown command line flag '%s'\" % flagname)\n\n\n# Global variable used by expvar\n_exported_flags = {}\n_help_width = 80  # width of help output\n\n\ndef GetHelpWidth():\n  \"\"\"Returns: an integer, the width of help lines that is used in TextWrap.\"\"\"\n  return _help_width\n\n\ndef CutCommonSpacePrefix(text):\n  \"\"\"Removes a common space prefix from the lines of a multiline text.\n\n  If the first line does not start with a space, it is left as it is and\n  only in the remaining lines a common space prefix is being searched\n  for. That means the first line will stay untouched. This is especially\n  useful to turn doc strings into help texts. 
This is because some\n  people prefer to have the doc comment start already after the\n  apostrophe and then align the following lines while others have the\n  apostrophes on a separate line.\n\n  The function also drops trailing empty lines and ignores empty lines\n  following the initial content line while calculating the initial\n  common whitespace.\n\n  Args:\n    text: text to work on\n\n  Returns:\n    the resulting text\n  \"\"\"\n  text_lines = text.splitlines()\n  # Drop trailing empty lines\n  while text_lines and not text_lines[-1]:\n    text_lines = text_lines[:-1]\n  if text_lines:\n    # We got some content, is the first line starting with a space?\n    if text_lines[0] and text_lines[0][0].isspace():\n      text_first_line = []\n    else:\n      text_first_line = [text_lines.pop(0)]\n    # Calculate length of common leading whitespace (only over content lines)\n    common_prefix = os.path.commonprefix([line for line in text_lines if line])\n    space_prefix_len = len(common_prefix) - len(common_prefix.lstrip())\n    # If we have a common space prefix, drop it from all lines\n    if space_prefix_len:\n      for index in xrange(len(text_lines)):\n        if text_lines[index]:\n          text_lines[index] = text_lines[index][space_prefix_len:]\n    return '\\n'.join(text_first_line + text_lines)\n  return ''\n\n\ndef TextWrap(text, length=None, indent='', firstline_indent=None, tabs='    '):\n  \"\"\"Wraps a given text to a maximum line length and returns it.\n\n  We turn lines that only contain whitespaces into empty lines.  
We keep\n  new lines and tabs (e.g., we do not treat tabs as spaces).\n\n  Args:\n    text:             text to wrap\n    length:           maximum length of a line, includes indentation\n                      if this is None then use GetHelpWidth()\n    indent:           indent for all but first line\n    firstline_indent: indent for first line; if None, fall back to indent\n    tabs:             replacement for tabs\n\n  Returns:\n    wrapped text\n\n  Raises:\n    FlagsError: if indent not shorter than length\n    FlagsError: if firstline_indent not shorter than length\n  \"\"\"\n  # Get defaults where caller used None\n  if length is None:\n    length = GetHelpWidth()\n  if indent is None:\n    indent = ''\n  if len(indent) >= length:\n    raise FlagsError('Indent must be shorter than length')\n  # In line we will be holding the current line which is to be started\n  # with indent (or firstline_indent if available) and then appended\n  # with words.\n  if firstline_indent is None:\n    firstline_indent = ''\n    line = indent\n  else:\n    line = firstline_indent\n    if len(firstline_indent) >= length:\n      raise FlagsError('First line indent must be shorter than length')\n\n  # If the caller does not care about tabs we simply convert them to\n  # spaces. If the caller wanted tabs to be a single space then we do\n  # that already here.\n  if not tabs or tabs == ' ':\n    text = text.replace('\\t', ' ')\n  else:\n    tabs_are_whitespace = not tabs.strip()\n\n  line_regex = re.compile('([ ]*)(\\t*)([^ \\t]+)', re.MULTILINE)\n\n  # Split the text into lines and the lines with the regex above. The\n  # resulting lines are collected in result[]. For each split we get the\n  # spaces, the tabs and the next non white space (e.g. 
next word).\n  result = []\n  for text_line in text.splitlines():\n    # Store result length so we can find out whether processing the next\n    # line gave any new content\n    old_result_len = len(result)\n    # Process next line with line_regex. For optimization we do an rstrip().\n    # - process tabs (changes either line or word, see below)\n    # - process word (first try to squeeze on line, then wrap or force wrap)\n    # Spaces found on the line are ignored, they get added while wrapping as\n    # needed.\n    for spaces, current_tabs, word in line_regex.findall(text_line.rstrip()):\n      # If tabs weren't converted to spaces, handle them now\n      if current_tabs:\n        # If the last thing we added was a space anyway then drop\n        # it. But let's not get rid of the indentation.\n        if (((result and line != indent) or\n             (not result and line != firstline_indent)) and line[-1] == ' '):\n          line = line[:-1]\n        # Add the tabs, if that means adding whitespace, just add it at\n        # the line, the rstrip() code while shorten the line down if\n        # necessary\n        if tabs_are_whitespace:\n          line += tabs * len(current_tabs)\n        else:\n          # if not all tab replacement is whitespace we prepend it to the word\n          word = tabs * len(current_tabs) + word\n      # Handle the case where word cannot be squeezed onto current last line\n      if len(line) + len(word) > length and len(indent) + len(word) <= length:\n        result.append(line.rstrip())\n        line = indent + word\n        word = ''\n        # No space left on line or can we append a space?\n        if len(line) + 1 >= length:\n          result.append(line.rstrip())\n          line = indent\n        else:\n          line += ' '\n      # Add word and shorten it up to allowed line length. 
Restart next\n      # line with indent and repeat, or add a space if we're done (word\n      # finished). This deals with words that cannot fit on one line\n      # (e.g. indent + word longer than allowed line length).\n      while len(line) + len(word) >= length:\n        line += word\n        result.append(line[:length])\n        word = line[length:]\n        line = indent\n      # Default case, simply append the word and a space\n      if word:\n        line += word + ' '\n    # End of input line. If we have content we finish the line. If the\n    # current line is just the indent but we had content during this\n    # original line then we need to add an empty line.\n    if (result and line != indent) or (not result and line != firstline_indent):\n      result.append(line.rstrip())\n    elif len(result) == old_result_len:\n      result.append('')\n    line = indent\n\n  return '\\n'.join(result)\n\n\ndef DocToHelp(doc):\n  \"\"\"Takes a __doc__ string and reformats it as help.\"\"\"\n\n  # Get rid of starting and ending white space. 
Using lstrip() or even\n  # strip() could drop more than maximum of first line and right space\n  # of last line.\n  doc = doc.strip()\n\n  # Turn whitespace-only lines into empty lines\n  whitespace_only_line = re.compile('^[ \\t]+$', re.M)\n  doc = whitespace_only_line.sub('', doc)\n\n  # Cut out common space at line beginnings\n  doc = CutCommonSpacePrefix(doc)\n\n  # Just like this module's comment, comments tend to be aligned somehow.\n  # In other words they all start with the same amount of white space\n  # 1) keep double new lines\n  # 2) keep ws after new lines if not empty line\n  # 3) all other new lines shall be changed to a space\n  # Solution: Match new lines between non white space and replace with space.\n  # Note: the fourth positional argument of re.sub() is count, not flags, so\n  # re.M must not be passed there; the pattern has no anchors and needs no flag.\n  doc = re.sub('(?<=\\S)\\n(?=\\S)', ' ', doc)\n\n  return doc\n\n\ndef __GetModuleName(globals_dict):\n  \"\"\"Given a globals dict, returns the name of the module that defines it.\n\n  Args:\n    globals_dict: A dictionary that should correspond to an environment\n      providing the values of the globals.\n\n  Returns:\n    A string (the name of the module) or None (if the module could not\n    be identified).\n  \"\"\"\n  for name, module in sys.modules.iteritems():\n    if getattr(module, '__dict__', None) is globals_dict:\n      if name == '__main__':\n        return sys.argv[0]\n      return name\n  return None\n\n\ndef _GetMainModule():\n  \"\"\"Returns the name of the module from which execution started.\"\"\"\n  for depth in range(1, sys.getrecursionlimit()):\n    try:\n      globals_of_main = sys._getframe(depth).f_globals\n    except ValueError:\n      return __GetModuleName(globals_of_main)\n  raise AssertionError(\"No module was found\")\n\n# lhuang: main entry here\nclass FlagValues:\n  \"\"\"Registry of 'Flag' objects.\n\n  A 'FlagValues' can then scan command line arguments, passing flag\n  arguments through to the 'Flag' objects that it owns.  It also\n  provides easy access to the flag values.  
Typically only one\n  'FlagValues' object is needed by an application: gflags.FLAGS\n\n  This class is heavily overloaded:\n\n  'Flag' objects are registered via __setitem__:\n       FLAGS['longname'] = x   # register a new flag\n\n  The .value attribute of the registered 'Flag' objects can be accessed\n  as attributes of this 'FlagValues' object, through __getattr__.  Both\n  the long and short name of the original 'Flag' objects can be used to\n  access its value:\n       FLAGS.longname          # parsed flag value\n       FLAGS.x                 # parsed flag value (short name)\n\n  Command line arguments are scanned and passed to the registered 'Flag'\n  objects through the __call__ method.  Unparsed arguments, including\n  argv[0] (e.g. the program name) are returned.\n       argv = FLAGS(sys.argv)  # scan command line arguments\n\n  The original registered Flag objects can be retrieved through the use\n  of the dictionary-like operator, __getitem__:\n       x = FLAGS['longname']   # access the registered Flag object\n\n  The str() operator of a 'FlagValues' object provides help for all of\n  the registered 'Flag' objects.\n  \"\"\"\n\n  def __init__(self):\n    # Since everything in this class is so heavily overloaded, the only\n    # way of defining and using fields is to access __dict__ directly.\n\n    # Dictionary: flag name (string) -> Flag object.\n    self.__dict__['__flags'] = {}\n    # Dictionary: module name (string) -> list of Flag objects that are defined\n    # by that module.\n    self.__dict__['__flags_by_module'] = {}\n    # Dictionary: module name (string) -> list of Flag objects that are\n    # key for that module.\n    self.__dict__['__key_flags_by_module'] = {}\n\n  def FlagDict(self):\n    return self.__dict__['__flags']\n\n  def FlagsByModuleDict(self):\n    \"\"\"Returns the dictionary of module_name -> list of defined flags.\n\n    Returns:\n      A dictionary.  Its keys are module names (strings).  
Its values\n      are lists of Flag objects.\n    \"\"\"\n    return self.__dict__['__flags_by_module']\n\n  def KeyFlagsByModuleDict(self):\n    \"\"\"Returns the dictionary of module_name -> list of key flags.\n\n    Returns:\n      A dictionary.  Its keys are module names (strings).  Its values\n      are lists of Flag objects.\n    \"\"\"\n    return self.__dict__['__key_flags_by_module']\n\n  def _RegisterFlagByModule(self, module_name, flag):\n    \"\"\"Records the module that defines a specific flag.\n\n    We keep track of which flag is defined by which module so that we\n    can later sort the flags by module.\n\n    Args:\n      module_name: A string, the name of a Python module.\n      flag: A Flag object, a flag that is defined by the module.\n    \"\"\"\n    flags_by_module = self.FlagsByModuleDict()\n    flags_by_module.setdefault(module_name, []).append(flag)\n\n  def _RegisterKeyFlagForModule(self, module_name, flag):\n    \"\"\"Specifies that a flag is a key flag for a module.\n\n    Args:\n      module_name: A string, the name of a Python module.\n      flag: A Flag object, a flag that is key to the module.\n    \"\"\"\n    key_flags_by_module = self.KeyFlagsByModuleDict()\n    # The list of key flags for the module named module_name.\n    key_flags = key_flags_by_module.setdefault(module_name, [])\n    # Add flag, but avoid duplicates.\n    if flag not in key_flags:\n      key_flags.append(flag)\n\n  def _GetFlagsDefinedByModule(self, module):\n    \"\"\"Returns the list of flags defined by a module.\n\n    Args:\n      module: A module object or a module name (a string).\n\n    Returns:\n      A new list of Flag objects.  
Caller may update this list as he\n      wishes: none of those changes will affect the internals of this\n      FlagValues object.\n    \"\"\"\n    if not isinstance(module, str):\n      module = module.__name__\n\n    return list(self.FlagsByModuleDict().get(module, []))\n\n  def _GetKeyFlagsForModule(self, module):\n    \"\"\"Returns the list of key flags for a module.\n\n    Args:\n      module: A module object or a module name (a string).\n\n    Returns:\n      A new list of Flag objects.  Caller may update this list as he\n      wishes: none of those changes will affect the internals of this\n      FlagValues object.\n    \"\"\"\n    if not isinstance(module, str):\n      module = module.__name__\n\n    # Any flag is a key flag for the module that defined it.  NOTE:\n    # key_flags is a fresh list: we can update it without affecting the\n    # internals of this FlagValues object.\n    key_flags = self._GetFlagsDefinedByModule(module)\n\n    # Take into account flags explicitly declared as key for a module.\n    for flag in self.KeyFlagsByModuleDict().get(module, []):\n      if flag not in key_flags:\n        key_flags.append(flag)\n    return key_flags\n\n  def AppendFlagValues(self, flag_values):\n    \"\"\"Appends flags registered in another FlagValues instance.\n\n    Args:\n      flag_values: registry to copy from\n    \"\"\"\n    for flag_name, flag in flag_values.FlagDict().iteritems():\n      # Each flag with a short name appears here twice (once under its\n      # normal name, and again with its short name).  
To prevent\n      # problems (DuplicateFlagError) with double flag registration, we\n      # perform a check to make sure that the entry we're looking at is\n      # for its normal name.\n      if flag_name == flag.name:\n        self[flag_name] = flag\n\n  def __setitem__(self, name, flag):\n    \"\"\"Registers a new flag variable.\"\"\"\n    fl = self.FlagDict()\n    if not isinstance(flag, Flag):\n      raise IllegalFlagValue(flag)\n    if not isinstance(name, type(\"\")):\n      raise FlagsError(\"Flag name must be a string\")\n    if len(name) == 0:\n      raise FlagsError(\"Flag name cannot be empty\")\n    # If running under pychecker, duplicate keys are likely to be\n    # defined.  Disable check for duplicate keys when pycheck'ing.\n    if (fl.has_key(name) and not flag.allow_override and\n        not fl[name].allow_override and not _RUNNING_PYCHECKER):\n      raise DuplicateFlagError(name, self)\n    short_name = flag.short_name\n    if short_name is not None:\n      if (fl.has_key(short_name) and not flag.allow_override and\n          not fl[short_name].allow_override and not _RUNNING_PYCHECKER):\n        raise DuplicateFlagError(short_name, self)\n      fl[short_name] = flag\n    fl[name] = flag\n    global _exported_flags\n    _exported_flags[name] = flag\n\n  def __getitem__(self, name):\n    \"\"\"Retrieves the Flag object for the flag --name.\"\"\"\n    return self.FlagDict()[name]\n\n  def __getattr__(self, name):\n    \"\"\"Retrieves the 'value' attribute of the flag --name.\"\"\"\n    fl = self.FlagDict()\n    if not fl.has_key(name):\n      raise AttributeError(name)\n    return fl[name].value\n\n  def __setattr__(self, name, value):\n    \"\"\"Sets the 'value' attribute of the flag --name.\"\"\"\n    fl = self.FlagDict()\n    fl[name].value = value\n    return value\n\n  def _FlagIsRegistered(self, flag_obj):\n    \"\"\"Checks whether a Flag object is registered under some name.\n\n    Note: this is non trivial: in addition to its normal name, 
a flag\n    may have a short name too.  In self.FlagDict(), both the normal and\n    the short name are mapped to the same flag object.  E.g., calling\n    only \"del FLAGS.short_name\" is not unregistering the corresponding\n    Flag object (it is still registered under the longer name).\n\n    Args:\n      flag_obj: A Flag object.\n\n    Returns:\n      A boolean: True iff flag_obj is registered under some name.\n    \"\"\"\n    flag_dict = self.FlagDict()\n    # Check whether flag_obj is registered under its long name.\n    name = flag_obj.name\n    if flag_dict.get(name, None) == flag_obj:\n      return True\n    # Check whether flag_obj is registered under its short name.\n    short_name = flag_obj.short_name\n    if (short_name is not None and\n        flag_dict.get(short_name, None) == flag_obj):\n      return True\n    # The flag cannot be registered under any other name, so we do not\n    # need to do a full search through the values of self.FlagDict().\n    return False\n\n  def __delattr__(self, flag_name):\n    \"\"\"Deletes a previously-defined flag from a flag object.\n\n    This method makes sure we can delete a flag by using\n\n      del flag_values_object.<flag_name>\n\n    E.g.,\n\n      flags.DEFINE_integer('foo', 1, 'Integer flag.')\n      del flags.FLAGS.foo\n\n    Args:\n      flag_name: A string, the name of the flag to be deleted.\n\n    Raises:\n      AttributeError: When there is no registered flag named flag_name.\n    \"\"\"\n    fl = self.FlagDict()\n    if flag_name not in fl:\n      raise AttributeError(flag_name)\n\n    flag_obj = fl[flag_name]\n    del fl[flag_name]\n\n    if not self._FlagIsRegistered(flag_obj):\n      # If the Flag object indicated by flag_name is no longer\n      # registered (please see the docstring of _FlagIsRegistered), then\n      # we delete the occurences of the flag object in all our internal\n      # dictionaries.\n      self.__RemoveFlagFromDictByModule(self.FlagsByModuleDict(), flag_obj)\n      
self.__RemoveFlagFromDictByModule(self.KeyFlagsByModuleDict(), flag_obj)\n\n  def __RemoveFlagFromDictByModule(self, flags_by_module_dict, flag_obj):\n    \"\"\"Removes a flag object from a module -> list of flags dictionary.\n\n    Args:\n      flags_by_module_dict: A dictionary that maps module names to lists of\n        flags.\n      flag_obj: A flag object.\n    \"\"\"\n    for unused_module, flags_in_module in flags_by_module_dict.iteritems():\n      # while (as opposed to if) takes care of multiple occurrences of a\n      # flag in the list for the same module.\n      while flag_obj in flags_in_module:\n        flags_in_module.remove(flag_obj)\n\n  def SetDefault(self, name, value):\n    \"\"\"Changes the default value of the named flag object.\"\"\"\n    fl = self.FlagDict()\n    if not fl.has_key(name):\n      raise AttributeError(name)\n    fl[name].SetDefault(value)\n\n  def __contains__(self, name):\n    \"\"\"Returns True if name is the name of a registered flag.\"\"\"\n    return name in self.FlagDict()\n\n  has_key = __contains__  # a synonym for __contains__()\n\n  def __iter__(self):\n    return self.FlagDict().iterkeys()\n\n  # lhuang: my stealthy entry point\n  def __call__(self, argv):\n    try:\n      # N.B.: return the rest of the command-line! (non-flag arguments)\n      return self.__call2__(argv)\n    except FlagsError, e:\n        # lhuang: write to stderr (2>) instead of stdout (>)\n        import sys\n        print >> sys.stderr, 'Error: %s\\nUsage: %s [flags]\\n%s' % (e, list(argv)[0], FLAGS)\n        sys.exit(1)\n\n\n  # lhuang: external entry FLAGS(sys.argv) here\n  def __call2__(self, argv):\n    \"\"\"Parses flags from argv; stores parsed flags into this FlagValues object.\n\n    All unparsed arguments are returned.  Flags are parsed using the GNU\n    Program Argument Syntax Conventions, using getopt:\n\n    http://www.gnu.org/software/libc/manual/html_mono/libc.html#Getopt\n\n    Args:\n       argv: argument list. 
Can be of any type that may be converted to a list.\n\n    Returns:\n       The list of arguments not parsed as options, including argv[0]\n\n    Raises:\n       FlagsError: on any parsing error\n    \"\"\"\n    # Support any sequence type that can be converted to a list\n    argv = list(argv)\n\n    shortopts = \"\"\n    longopts = []\n\n    fl = self.FlagDict()\n\n    # This pre-parses the argv list for --flagfile=<> options.\n    argv = self.ReadFlagsFromFiles(argv)\n\n    # Correct the argv to support the Google style of passing boolean\n    # parameters.  Boolean parameters may be passed by using --mybool,\n    # --nomybool, --mybool=(true|false|1|0).  getopt does not support\n    # having options that may or may not have a parameter.  We replace\n    # instances of the short form --mybool and --nomybool with their\n    # full forms: --mybool=(true|false).\n    original_argv = list(argv)  # list() makes a copy\n    shortest_matches = None\n    for name, flag in fl.items():\n      if not flag.boolean:\n        continue\n      if shortest_matches is None:\n        # Determine the smallest allowable prefix for all flag names\n        shortest_matches = self.ShortestUniquePrefixes(fl)\n      no_name = 'no' + name\n      prefix = shortest_matches[name]\n      no_prefix = shortest_matches[no_name]\n\n      # Replace all occurrences of this boolean with extended forms\n      for arg_idx in range(1, len(argv)):\n        arg = argv[arg_idx]\n        if arg.find('=') >= 0: continue\n        if arg.startswith('--'+prefix) and ('--'+name).startswith(arg):\n          argv[arg_idx] = ('--%s=true' % name)\n        elif arg.startswith('--'+no_prefix) and ('--'+no_name).startswith(arg):\n          argv[arg_idx] = ('--%s=false' % name)\n\n    # Loop over all of the flags, building up the lists of short options\n    # and long options that will be passed to getopt.  
Short options are\n    # specified as a string of letters, each letter followed by a colon\n    # if it takes an argument.  Long options are stored in an array of\n    # strings.  Each string ends with an '=' if it takes an argument.\n    for name, flag in fl.items():\n      longopts.append(name + \"=\")\n      if len(name) == 1:  # one-letter option: allow short flag type also\n        shortopts += name\n        if not flag.boolean:\n          shortopts += \":\"\n\n    longopts.append('undefok=')\n    undefok_flags = []\n\n    # In case --undefok is specified, loop to pick up unrecognized\n    # options one by one.\n    unrecognized_opts = []\n    args = argv[1:]\n    while True:\n      try:\n        optlist, unparsed_args = getopt.gnu_getopt(args, shortopts, longopts)\n        break\n      except getopt.GetoptError, e:\n        if not e.opt or e.opt in fl:\n          # Not an unrecognized option, reraise the exception as a FlagsError\n          raise FlagsError(e)\n        # Handle an unrecognized option.\n        unrecognized_opts.append(e.opt)\n        # Remove offender from args and try again\n        for arg_index in range(len(args)):\n          if ((args[arg_index] == '--' + e.opt) or\n              (args[arg_index] == '-' + e.opt) or\n              args[arg_index].startswith('--' + e.opt + '=')):\n            args = args[0:arg_index] + args[arg_index+1:]\n            break\n        else:\n          # We should have found the option, so we don't expect to get\n          # here.  
We could assert, but raising the original exception\n          # might work better.\n          raise FlagsError(e)\n\n    for name, arg in optlist:\n      if name == '--undefok':\n        flag_names = arg.split(',')\n        undefok_flags.extend(flag_names)\n        # For boolean flags, if --undefok=boolflag is specified, then we should\n        # also accept --noboolflag, in addition to --boolflag.\n        # Since we don't know the type of the undefok'd flag, this will affect\n        # non-boolean flags as well.\n        # NOTE: You shouldn't use --undefok=noboolflag, because then we will\n        # accept --nonoboolflag here.  We are choosing not to do the conversion\n        # from noboolflag -> boolflag because of the ambiguity that flag names\n        # can start with 'no'.\n        undefok_flags.extend('no' + name for name in flag_names)\n        continue\n      if name.startswith('--'):\n        # long option\n        name = name[2:]\n        short_option = 0\n      else:\n        # short option\n        name = name[1:]\n        short_option = 1\n      if fl.has_key(name):\n        flag = fl[name]\n        if flag.boolean and short_option: arg = 1\n        flag.Parse(arg)\n\n    # If there were unrecognized options, raise an exception unless\n    # the options were named via --undefok.\n    for opt in unrecognized_opts:\n      if opt not in undefok_flags:\n        raise UnrecognizedFlagError(opt)\n\n    if unparsed_args:\n      # unparsed_args becomes the first non-flag detected by getopt to\n      # the end of argv.  
Because argv may have been modified above,\n      # return original_argv for this region.\n      return argv[:1] + original_argv[-len(unparsed_args):]\n    else:\n      return argv[:1]\n\n  def Reset(self):\n    \"\"\"Resets the values to the point before FLAGS(argv) was called.\"\"\"\n    for f in self.FlagDict().values():\n      f.Unparse()\n\n  def RegisteredFlags(self):\n    \"\"\"Returns: a list of the names and short names of all registered flags.\"\"\"\n    return self.FlagDict().keys()\n\n  def FlagValuesDict(self):\n    \"\"\"Returns: a dictionary that maps flag names to flag values.\"\"\"\n    flag_values = {}\n\n    for flag_name in self.RegisteredFlags():\n      flag = self.FlagDict()[flag_name]\n      flag_values[flag_name] = flag.value\n\n    return flag_values\n\n  def __str__(self):\n    \"\"\"Generates a help string for all known flags.\"\"\"\n    return self.GetHelp()\n\n  def GetHelp(self, prefix=''):\n    \"\"\"Generates a help string for all known flags.\"\"\"\n    helplist = []\n\n    flags_by_module = self.FlagsByModuleDict()\n    if flags_by_module:\n\n      modules = flags_by_module.keys()\n      modules.sort()\n\n      # Print the help for the main module first, if possible.\n      main_module = _GetMainModule()\n      if main_module in modules:\n        modules.remove(main_module)\n        modules = [main_module] + modules\n\n      for module in modules:\n        self.__RenderOurModuleFlags(module, helplist)\n\n      self.__RenderModuleFlags('gflags',\n                               _SPECIAL_FLAGS.FlagDict().values(),\n                               helplist)\n\n    else:\n      # Just print one long list of flags.\n      self.__RenderFlagList(\n          self.FlagDict().values() + _SPECIAL_FLAGS.FlagDict().values(),\n          helplist, prefix)\n\n    return '\\n'.join(helplist)\n\n  def __RenderModuleFlags(self, module, flags, output_lines, prefix=\"\"):\n    \"\"\"Generates a help string for a given module.\"\"\"\n    # 
output_lines.append('\\n%s%s:' % (prefix, module))\n    self.__RenderFlagList(flags, output_lines, prefix + \"  \")\n\n  def __RenderOurModuleFlags(self, module, output_lines, prefix=\"\"):\n    \"\"\"Generates a help string for a given module.\"\"\"\n    flags = self._GetFlagsDefinedByModule(module)\n    if flags:\n      self.__RenderModuleFlags(module, flags, output_lines, prefix)\n\n  def __RenderOurModuleKeyFlags(self, module, output_lines, prefix=\"\"):\n    \"\"\"Generates a help string for the key flags of a given module.\n\n    Args:\n      module: A module object or a module name (a string).\n      output_lines: A list of strings.  The generated help message\n        lines will be appended to this list.\n      prefix: A string that is prepended to each generated help line.\n    \"\"\"\n    key_flags = self._GetKeyFlagsForModule(module)\n    if key_flags:\n      self.__RenderModuleFlags(module, key_flags, output_lines, prefix)\n\n  def MainModuleHelp(self):\n    \"\"\"Returns: A string describing the key flags of the main module.\"\"\"\n    helplist = []\n    self.__RenderOurModuleKeyFlags(_GetMainModule(), helplist)\n    return '\\n'.join(helplist)\n\n  def __RenderFlagList(self, flaglist, output_lines, prefix=\"  \"):\n    fl = self.FlagDict()\n    special_fl = _SPECIAL_FLAGS.FlagDict()\n    flaglist = [(flag.name, flag) for flag in flaglist]\n    flaglist.sort()\n    flagset = {}\n    for (name, flag) in flaglist:\n      # It's possible this flag got deleted or overridden since being\n      # registered in the per-module flaglist.  
Check now against the\n      # canonical source of current flag information, the FlagDict.\n      if fl.get(name, None) != flag and special_fl.get(name, None) != flag:\n        # a different flag is using this name now\n        continue\n      # only print help once\n      if flagset.has_key(flag): continue\n      flagset[flag] = 1\n      flaghelp = \"\"\n      # lhuang:\n      if flag.name in [\"help\", \"helpshort\"]:\n\t\t\t  continue\n\t\t\t\n      if flag.short_name:\n        flaghelp += \"-\" if len(flag.short_name) == 1 else \"--\"  # lhuang: shortname can be long\n        flaghelp += \"%s,\" % flag.short_name\n      if flag.boolean:\n        flaghelp += \"--[no]%s\" % flag.name + \":\"\n      else:\n        flaghelp += \"--%s\" % flag.name + \":\"\n      flaghelp += \"  \"\n      if flag.help:\n        flaghelp += flag.help\n      flaghelp = TextWrap(flaghelp, indent=prefix+\"  \",\n                          firstline_indent=prefix)\n      if flag.default_as_str:\n        flaghelp += \"\\n\" # lhuang\n        flaghelp += TextWrap(\"(default: %s)\" % flag.default_as_str,\n                             indent=prefix+\"  \")\n      if flag.parser.syntactic_help:\n        flaghelp += \"\\t\" # lhuang\n        flaghelp += TextWrap(\"(%s)\" % flag.parser.syntactic_help,\n                             indent=prefix+\"  \")\n      output_lines.append(flaghelp)\n\n  def get(self, name, default):\n    \"\"\"Returns the value of a flag (if not None) or a default value.\n\n    Args:\n      name: A string, the name of a flag.\n      default: Default value to use if the flag value is None.\n    \"\"\"\n\n    value = self.__getattr__(name)\n    if value is not None:  # Can't do if not value, b/c value might be '0' or \"\"\n      return value\n    else:\n      return default\n\n  def ShortestUniquePrefixes(self, fl):\n    \"\"\"Returns: dictionary; maps flag names to their shortest unique prefix.\"\"\"\n    # Sort the list of flag names\n    sorted_flags = []\n    for name, 
flag in fl.items():\n      sorted_flags.append(name)\n      if flag.boolean:\n        sorted_flags.append('no%s' % name)\n    sorted_flags.sort()\n\n    # For each name in the sorted list, determine the shortest unique\n    # prefix by comparing itself to the next name and to the previous\n    # name (the latter check uses cached info from the previous loop).\n    shortest_matches = {}\n    prev_idx = 0\n    for flag_idx in range(len(sorted_flags)):\n      curr = sorted_flags[flag_idx]\n      if flag_idx == (len(sorted_flags) - 1):\n        next = None\n      else:\n        next = sorted_flags[flag_idx+1]\n        next_len = len(next)\n      for curr_idx in range(len(curr)):\n        if (next is None\n            or curr_idx >= next_len\n            or curr[curr_idx] != next[curr_idx]):\n          # curr longer than next or no more chars in common\n          shortest_matches[curr] = curr[:max(prev_idx, curr_idx) + 1]\n          prev_idx = curr_idx\n          break\n      else:\n        # curr shorter than (or equal to) next\n        shortest_matches[curr] = curr\n        prev_idx = curr_idx + 1  # next will need at least one more char\n    return shortest_matches\n\n  def __IsFlagFileDirective(self, flag_string):\n    \"\"\"Checks whether flag_string contain a --flagfile=<foo> directive.\"\"\"\n    if isinstance(flag_string, type(\"\")):\n      if flag_string.startswith('--flagfile='):\n        return 1\n      elif flag_string == '--flagfile':\n        return 1\n      elif flag_string.startswith('-flagfile='):\n        return 1\n      elif flag_string == '-flagfile':\n        return 1\n      else:\n        return 0\n    return 0\n\n  def ExtractFilename(self, flagfile_str):\n    \"\"\"Returns filename from a flagfile_str of form -[-]flagfile=filename.\n\n    The cases of --flagfile foo and -flagfile foo shouldn't be hitting\n    this function, as they are dealt with in the level above this\n    function.\n    \"\"\"\n    if flagfile_str.startswith('--flagfile='):\n 
     return os.path.expanduser((flagfile_str[(len('--flagfile=')):]).strip())\n    elif flagfile_str.startswith('-flagfile='):\n      return os.path.expanduser((flagfile_str[(len('-flagfile=')):]).strip())\n    else:\n      raise FlagsError('Hit illegal --flagfile type: %s' % flagfile_str)\n\n  def __GetFlagFileLines(self, filename, parsed_file_list):\n    \"\"\"Returns the useful (!=comments, etc) lines from a file with flags.\n\n    Args:\n      filename: A string, the name of the flag file.\n      parsed_file_list: A list of the names of the files we have\n        already read.  MUTATED BY THIS FUNCTION.\n\n    Returns:\n      List of strings. See the note below.\n\n    NOTE(springer): This function checks for a nested --flagfile=<foo>\n    tag and handles the lower file recursively. It returns a list of\n    all the lines that _could_ contain command flags. This is\n    EVERYTHING except whitespace lines and comments (lines starting\n    with '#' or '//').\n    \"\"\"\n    line_list = []  # All line from flagfile.\n    flag_line_list = []  # Subset of lines w/o comments, blanks, flagfile= tags.\n    try:\n      file_obj = open(filename, 'r')\n    except IOError, e_msg:\n      print e_msg\n      print 'ERROR:: Unable to open flagfile: %s' % (filename)\n      return flag_line_list\n\n    line_list = file_obj.readlines()\n    file_obj.close()\n    parsed_file_list.append(filename)\n\n    # This is where we check each line in the file we just read.\n    for line in line_list:\n      if line.isspace():\n        pass\n      # Checks for comment (a line that starts with '#').\n      elif line.startswith('#') or line.startswith('//'):\n        pass\n      # Checks for a nested \"--flagfile=<bar>\" flag in the current file.\n      # If we find one, recursively parse down into that file.\n      elif self.__IsFlagFileDirective(line):\n        sub_filename = self.ExtractFilename(line)\n        # We do a little safety check for reparsing a file we've already done.\n        
if sub_filename not in parsed_file_list:\n          included_flags = self.__GetFlagFileLines(sub_filename,\n                                                   parsed_file_list)\n          flag_line_list.extend(included_flags)\n        else:  # Case of hitting a circularly included file.\n          print >>sys.stderr, ('Warning: Hit circular flagfile dependency: %s'\n                               % sub_filename)\n      else:\n        # Any line that's not a comment or a nested flagfile should get\n        # copied into 2nd position.  This leaves earlier arguments\n        # further back in the list, thus giving them higher priority.\n        flag_line_list.append(line.strip())\n    return flag_line_list\n\n  def ReadFlagsFromFiles(self, argv):\n    \"\"\"Processes command line args, but also allows args to be read from file.\n\n    Args:\n      argv: A list of strings, usually sys.argv, which may contain one\n        or more flagfile directives of the form --flagfile=\"./filename\".\n\n    Returns:\n      A new list which has the original list combined with what we read\n      from any flagfile(s).\n\n    References: Global gflags.FLAGS class instance.\n\n    This function should be called before the normal FLAGS(argv) call.\n    This function scans the input list for a flag that looks like:\n    --flagfile=<somefile>. 
Then it opens <somefile>, reads all valid key\n    and value pairs and inserts them into the input list between the\n    first item of the list and any subsequent items in the list.\n\n    Note that your application's flags are still defined the usual way\n    using gflags DEFINE_flag() type functions.\n\n    Notes (assuming we're getting a commandline of some sort as our input):\n    --> Flags from the command line argv _should_ always take precedence!\n    --> A further \"--flagfile=<otherfile.cfg>\" CAN be nested in a flagfile.\n        It will be processed after the parent flag file is done.\n    --> For duplicate flags, first one we hit should \"win\".\n    --> In a flagfile, a line beginning with # or // is a comment.\n    --> Entirely blank lines _should_ be ignored.\n    \"\"\"\n    parsed_file_list = []\n    rest_of_args = argv\n    new_argv = []\n    while rest_of_args:\n      current_arg = rest_of_args[0]\n      rest_of_args = rest_of_args[1:]\n      if self.__IsFlagFileDirective(current_arg):\n        # This handles the case of -(-)flagfile foo.  In this case the\n        # next arg really is part of this one.\n        if current_arg == '--flagfile' or current_arg == '-flagfile':\n          if not rest_of_args:\n            raise IllegalFlagValue('--flagfile with no argument')\n          flag_filename = os.path.expanduser(rest_of_args[0])\n          rest_of_args = rest_of_args[1:]\n        else:\n          # This handles the case of (-)-flagfile=foo.\n          flag_filename = self.ExtractFilename(current_arg)\n        new_argv = (new_argv[:1] +\n                    self.__GetFlagFileLines(flag_filename, parsed_file_list) +\n                    new_argv[1:])\n      else:\n        new_argv.append(current_arg)\n\n    return new_argv\n\n  def FlagsIntoString(self):\n    \"\"\"Returns a string with the flags assignments from this FlagValues object.\n\n    This function ignores flags whose value is None.  
Each flag\n    assignment is separated by a newline.\n\n    NOTE: MUST mirror the behavior of the C++ function\n    CommandlineFlagsIntoString from google3/base/commandlineflags.cc.\n    \"\"\"\n    s = ''\n    for flag in self.FlagDict().values():\n      if flag.value is not None:\n        s += flag.Serialize() + '\\n'\n    return s\n\n  def AppendFlagsIntoFile(self, filename):\n    \"\"\"Appends all flag assignments from this FlagValues object to a file.\n\n    Output will be in the format of a flagfile.\n\n    NOTE: MUST mirror the behavior of the C++ version of\n    AppendFlagsIntoFile from google3/base/commandlineflags.cc.\n    \"\"\"\n    out_file = open(filename, 'a')\n    out_file.write(self.FlagsIntoString())\n    out_file.close()\n\n  def WriteHelpInXMLFormat(self, outfile=None):\n    \"\"\"Outputs flag documentation in XML format.\n\n    NOTE: We use element names that are consistent with those used by\n    the C++ command-line flag library, from\n    google3/base/commandlineflags_reporting.cc.  We also use a few new\n    elements (e.g., <key>), but we do not interfere / overlap with\n    existing XML elements used by the C++ library.  Please maintain this\n    consistency.\n\n    Args:\n      outfile: File object we write to.  
Default None means sys.stdout.\n    \"\"\"\n    outfile = outfile or sys.stdout\n\n    outfile.write('<?xml version=\\\"1.0\\\"?>\\n')\n    outfile.write('<AllFlags>\\n')\n    indent = '  '\n    _WriteSimpleXMLElement(outfile, 'program', os.path.basename(sys.argv[0]),\n                           indent)\n\n    usage_doc = sys.modules['__main__'].__doc__\n    if not usage_doc:\n      usage_doc = '\\nUSAGE: %s [flags]\\n' % sys.argv[0]\n    else:\n      usage_doc = usage_doc.replace('%s', sys.argv[0])\n    _WriteSimpleXMLElement(outfile, 'usage', usage_doc, indent)\n\n    # Get list of key flags for the main module.\n    key_flags = self._GetKeyFlagsForModule(_GetMainModule())\n\n    # Sort flags by declaring module name and next by flag name.\n    flags_by_module = self.FlagsByModuleDict()\n    all_module_names = list(flags_by_module.keys())\n    all_module_names.sort()\n    for module_name in all_module_names:\n      flag_list = [(f.name, f) for f in flags_by_module[module_name]]\n      flag_list.sort()\n      for unused_flag_name, flag in flag_list:\n        is_key = flag in key_flags\n        flag.WriteInfoInXMLFormat(outfile, module_name,\n                                  is_key=is_key, indent=indent)\n\n    outfile.write('</AllFlags>\\n')\n    outfile.flush()\n# end of FlagValues definition\n\n\n# The global FlagValues instance //lhuang\nFLAGS = FlagValues()\n\n\ndef _MakeXMLSafe(s):\n  \"\"\"Escapes <, >, and & from s, and removes XML 1.0-illegal chars.\"\"\"\n  # lhuang: avoid _md5\n  s = s #cgi.escape(s)  # Escape <, >, and &\n  # Remove characters that cannot appear in an XML 1.0 document\n  # (http://www.w3.org/TR/REC-xml/#charsets).\n  #\n  # NOTE: if there are problems with current solution, one may move to\n  # XML 1.1, which allows such chars, if they're entity-escaped (&#xHH;).\n  s = re.sub(r'[\\x00-\\x08\\x0b\\x0c\\x0e-\\x1f]', '', s)\n  return s\n\n\ndef _WriteSimpleXMLElement(outfile, name, value, indent):\n  \"\"\"Writes a simple XML 
element.\n\n  Args:\n    outfile: File object we write the XML element to.\n    name: A string, the name of XML element.\n    value: A Python object, whose string representation will be used\n      as the value of the XML element.\n    indent: A string, prepended to each line of generated output.\n  \"\"\"\n  value_str = str(value)\n  if isinstance(value, bool):\n    # Display boolean values as the C++ flag library does: no caps.\n    value_str = value_str.lower()\n  outfile.write('%s<%s>%s</%s>\\n' %\n                (indent, name, _MakeXMLSafe(value_str), name))\n\n\nclass Flag:\n  \"\"\"Information about a command-line flag.\n\n  'Flag' objects define the following fields:\n    .name  - the name for this flag\n    .default - the default value for this flag\n    .default_as_str - default value as repr'd string, e.g., \"'true'\" (or None)\n    .value  - the most recent parsed value of this flag; set by Parse()\n    .help  - a help string or None if no help is available\n    .short_name  - the single letter alias for this flag (or None)\n    .boolean  - if 'true', this flag does not accept arguments\n    .present  - true if this flag was parsed from command line flags.\n    .parser  - an ArgumentParser object\n    .serializer - an ArgumentSerializer object\n    .allow_override - the flag may be redefined without raising an error\n\n  The only public method of a 'Flag' object is Parse(), but it is\n  typically only called by a 'FlagValues' object.  The Parse() method is\n  a thin wrapper around the 'ArgumentParser' Parse() method.  The parsed\n  value is saved in .value, and the .present attribute is updated.  If\n  this flag was already present, a FlagsError is raised.\n\n  Parse() is also called during __init__ to parse the default value and\n  initialize the .value attribute.  This enables other python modules to\n  safely use flags even if the __main__ module neglects to parse the\n  command line arguments.  
The .present attribute is cleared after\n  __init__ parsing.  If the default value is set to None, then the\n  __init__ parsing step is skipped and the .value attribute is\n  initialized to None.\n\n  Note: The default value is also presented to the user in the help\n  string, so it is important that it be a legal value for this flag.\n  \"\"\"\n\n  def __init__(self, parser, serializer, name, default, help_string,\n               short_name=None, boolean=0, allow_override=0):\n    self.name = name\n\n    if not help_string:\n      help_string = '(no help available)'\n\n    self.help = help_string\n    self.short_name = short_name\n    self.boolean = boolean\n    self.present = 0\n    self.parser = parser\n    self.serializer = serializer\n    self.allow_override = allow_override\n    self.value = None\n\n    self.SetDefault(default)\n\n  def __GetParsedValueAsString(self, value):\n    if value is None:\n      return None\n    if self.serializer:\n      return repr(self.serializer.Serialize(value))\n    if self.boolean:\n      if value:\n        return repr('true')\n      else:\n        return repr('false')\n    return repr(str(value))\n\n  def Parse(self, argument):\n    try:\n      self.value = self.parser.Parse(argument)\n    except ValueError, e:  # recast ValueError as IllegalFlagValue\n      raise IllegalFlagValue(\"flag --%s: %s\" % (self.name, e))\n    self.present += 1\n\n  def Unparse(self):\n    if self.default is None:\n      self.value = None\n    else:\n      self.Parse(self.default)\n    self.present = 0\n\n  def Serialize(self):\n    if self.value is None:\n      return ''\n    if self.boolean:\n      if self.value:\n        return \"--%s\" % self.name\n      else:\n        return \"--no%s\" % self.name\n    else:\n      if not self.serializer:\n        raise FlagsError(\"Serializer not present for flag %s\" % self.name)\n      return \"--%s=%s\" % (self.name, self.serializer.Serialize(self.value))\n\n  def SetDefault(self, value):\n    
\"\"\"Changes the default value (and current value too) for this Flag.\"\"\"\n    # We can't allow a None override because it may end up not being\n    # passed to C++ code when we're overriding C++ flags.  So we\n    # cowardly bail out until someone fixes the semantics of trying to\n    # pass None to a C++ flag.  See swig_flags.Init() for details on\n    # this behavior.\n    if value is None and self.allow_override:\n      raise DuplicateFlag(self.name)\n\n    self.default = value\n    self.Unparse()\n    self.default_as_str = self.__GetParsedValueAsString(self.value)\n\n  def Type(self):\n    \"\"\"Returns: a string that describes the type of this Flag.\"\"\"\n    # NOTE: we use strings, and not the types.*Type constants because\n    # our flags can have more exotic types, e.g., 'comma separated list\n    # of strings', 'whitespace separated list of strings', etc.\n    return self.parser.Type()\n\n  def WriteInfoInXMLFormat(self, outfile, module_name, is_key=False, indent=''):\n    \"\"\"Writes common info about this flag, in XML format.\n\n    This is information that is relevant to all flags (e.g., name,\n    meaning, etc.).  
If you defined a flag that has some other pieces of\n    info, then please override _WriteCustomInfoInXMLFormat.\n\n    Please do NOT override this method.\n\n    Args:\n      outfile: File object we write to.\n      module_name: A string, the name of the module that defines this flag.\n      is_key: A boolean, True iff this flag is key for main module.\n      indent: A string that is prepended to each generated line.\n    \"\"\"\n    outfile.write(indent + '<flag>\\n')\n    inner_indent = indent + '  '\n    if is_key:\n      _WriteSimpleXMLElement(outfile, 'key', 'yes', inner_indent)\n    _WriteSimpleXMLElement(outfile, 'file', module_name, inner_indent)\n    # Print flag features that are relevant for all flags.\n    _WriteSimpleXMLElement(outfile, 'name', self.name, inner_indent)\n    if self.short_name:\n      _WriteSimpleXMLElement(outfile, 'short_name', self.short_name,\n                             inner_indent)\n    if self.help:\n      _WriteSimpleXMLElement(outfile, 'meaning', self.help, inner_indent)\n    _WriteSimpleXMLElement(outfile, 'default', self.default, inner_indent)\n    _WriteSimpleXMLElement(outfile, 'current', self.value, inner_indent)\n    _WriteSimpleXMLElement(outfile, 'type', self.Type(), inner_indent)\n    # Print extra flag features this flag may have.\n    self._WriteCustomInfoInXMLFormat(outfile, inner_indent)\n    outfile.write(indent + '</flag>\\n')\n\n  def _WriteCustomInfoInXMLFormat(self, outfile, indent):\n    \"\"\"Writes extra info about this flag, in XML format.\n\n    \"Extra\" means \"not already printed by WriteInfoInXMLFormat above.\"\n\n    Args:\n      outfile: File object we write to.\n      indent: A string that is prepended to each generated line.\n    \"\"\"\n    # Usually, the parser knows the extra details about the flag, so\n    # we just forward the call to it.\n    self.parser.WriteCustomInfoInXMLFormat(outfile, indent)\n# End of Flag definition\n\n\nclass ArgumentParser:\n  \"\"\"Base class used to parse and 
convert arguments.\n\n  The Parse() method checks to make sure that the string argument is a\n  legal value and converts it to a native type.  If the value cannot be\n  converted, it should raise a 'ValueError' exception with a human\n  readable explanation of why the value is illegal.\n\n  Subclasses should also define a syntactic_help string which may be\n  presented to the user to describe the form of the legal values.\n  \"\"\"\n  syntactic_help = \"\"\n\n  def Parse(self, argument):\n    \"\"\"Default implementation: always returns its argument unmodified.\"\"\"\n    return argument\n\n  def Type(self):\n    return 'string'\n\n  def WriteCustomInfoInXMLFormat(self, outfile, indent):\n    pass\n\n\nclass ArgumentSerializer:\n  \"\"\"Base class for generating string representations of a flag value.\"\"\"\n\n  def Serialize(self, value):\n    return str(value)\n\n\nclass ListSerializer(ArgumentSerializer):\n\n  def __init__(self, list_sep):\n    self.list_sep = list_sep\n\n  def Serialize(self, value):\n    return self.list_sep.join([str(x) for x in value])\n\n\n# The DEFINE functions are explained in more detail in the module doc string.\n\n\ndef DEFINE(parser, name, default, help, flag_values=FLAGS, serializer=None,\n           **args):\n  \"\"\"Registers a generic Flag object.\n\n  NOTE: in the docstrings of all DEFINE* functions, \"registers\" is short\n  for \"creates a new flag and registers it\".\n\n  Auxiliary function: clients should use the specialized DEFINE_<type>\n  function instead.\n\n  Args:\n    parser: ArgumentParser that is used to parse the flag arguments.\n    name: A string, the flag name.\n    default: The default value of the flag.\n    help: A help string.\n    flag_values: FlagValues object the flag will be registered with.\n    serializer: ArgumentSerializer that serializes the flag value.\n    args: Dictionary with extra keyword args that are passed to the\n      Flag __init__.\n  \"\"\"\n  DEFINE_flag(Flag(parser, serializer, name, 
default, help, **args),\n              flag_values)\n\n\ndef DEFINE_flag(flag, flag_values=FLAGS):\n  \"\"\"Registers a 'Flag' object with a 'FlagValues' object.\n\n  By default, the global FLAGS 'FlagValue' object is used.\n\n  Typical users will use one of the more specialized DEFINE_xxx\n  functions, such as DEFINE_string or DEFINE_integer.  But developers\n  who need to create Flag objects themselves should use this function\n  to register their flags.\n  \"\"\"\n  # copying the reference to flag_values prevents pychecker warnings\n  fv = flag_values\n  fv[flag.name] = flag\n  # Tell flag_values who's defining the flag.\n  if isinstance(flag_values, FlagValues):\n    # Regarding the above isinstance test: some users pass funny\n    # values of flag_values (e.g., {}) in order to avoid the flag\n    # registration (in the past, there used to be a flag_values ==\n    # FLAGS test here) and redefine flags with the same name (e.g.,\n    # debug).  To avoid breaking their code, we perform the\n    # registration only if flag_values is a real FlagValues object.\n    flag_values._RegisterFlagByModule(_GetCallingModule(), flag)\n\n\ndef _InternalDeclareKeyFlags(flag_names, flag_values=FLAGS):\n  \"\"\"Declares a flag as key for the calling module.\n\n  Internal function.  User code should call DECLARE_key_flag or\n  ADOPT_module_key_flags instead.\n\n  Args:\n    flag_names: A list of strings that are names of already-registered\n      Flag objects.\n    flag_values: A FlagValue object.  
This should almost never need\n      to be overridden.\n\n  Raises:\n    UnrecognizedFlagError: when we refer to a flag that has not been\n      defined yet.\n  \"\"\"\n  module = _GetCallingModule()\n\n  for flag_name in flag_names:\n    if flag_name not in flag_values:\n      raise UnrecognizedFlagError(flag_name)\n    flag = flag_values.FlagDict()[flag_name]\n    flag_values._RegisterKeyFlagForModule(module, flag)\n\n\ndef DECLARE_key_flag(flag_name, flag_values=FLAGS):\n  \"\"\"Declares one flag as key to the current module.\n\n  Key flags are flags that are deemed really important for a module.\n  They are important when listing help messages; e.g., if the\n  --helpshort command-line flag is used, then only the key flags of the\n  main module are listed (instead of all flags, as in the case of\n  --help).\n\n  Sample usage:\n\n    flags.DECLARE_key_flag('flag_1')\n\n  Args:\n    flag_name: A string, the name of an already declared flag.\n      (Redeclaring flags as key, including flags implicitly key\n      because they were declared in this module, is a no-op.)\n    flag_values: A FlagValues object.  This should almost never\n      need to be overridden.\n  \"\"\"\n  _InternalDeclareKeyFlags([flag_name], flag_values=flag_values)\n\n\ndef ADOPT_module_key_flags(module, flag_values=FLAGS):\n  \"\"\"Declares that all flags key to a module are key to the current module.\n\n  Args:\n    module: A module object.\n    flag_values: A FlagValues object.  
This should almost never need\n      to be overridden.\n\n  Raises:\n    FlagsError: When given an argument that is a module name (a\n    string), instead of a module object.\n  \"\"\"\n  # NOTE(salcianu): an even better test would be if not\n  # isinstance(module, types.ModuleType) but I didn't want to import\n  # types for such a tiny use.\n  if isinstance(module, str):\n    raise FlagsError('Received module name %s; expected a module object.'\n                     % module)\n  _InternalDeclareKeyFlags(\n      [f.name for f in flag_values._GetKeyFlagsForModule(module.__name__)],\n      flag_values=flag_values)\n\n\n#\n# STRING FLAGS\n#\n\n\ndef DEFINE_string(name, default, help, flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value can be any string.\"\"\"\n  parser = ArgumentParser()\n  serializer = ArgumentSerializer()\n  DEFINE(parser, name, default, help, flag_values, serializer, **args)\n\n\n#\n# BOOLEAN FLAGS\n#\n# and the special HELP flags.\n\nclass BooleanParser(ArgumentParser):\n  \"\"\"Parser of boolean values.\"\"\"\n\n  def Convert(self, argument):\n    \"\"\"Converts the argument to a boolean; raise ValueError on errors.\"\"\"\n    if type(argument) == str:\n      if argument.lower() in ['true', 't', '1']:\n        return True\n      elif argument.lower() in ['false', 'f', '0']:\n        return False\n\n    bool_argument = bool(argument)\n    if argument == bool_argument:\n      # The argument is a valid boolean (True, False, 0, or 1), and not just\n      # something that always converts to bool (list, string, int, etc.).\n      return bool_argument\n\n    raise ValueError('Non-boolean argument to boolean flag', argument)\n\n  def Parse(self, argument):\n    val = self.Convert(argument)\n    return val\n\n  def Type(self):\n    return 'bool'\n\n\nclass BooleanFlag(Flag):\n  \"\"\"Basic boolean flag.\n\n  Boolean flags do not take any arguments, and their value is either\n  True (1) or False (0).  
The false value is specified on the command\n  line by prepending the word 'no' to either the long or the short flag\n  name.\n\n  For example, if a Boolean flag was created whose long name was\n  'update' and whose short name was 'x', then this flag could be\n  explicitly unset through either --noupdate or --nox.\n  \"\"\"\n\n  def __init__(self, name, default, help, short_name=None, **args):\n    p = BooleanParser()\n    Flag.__init__(self, p, None, name, default, help, short_name, 1, **args)\n    if not self.help: self.help = \"a boolean value\"\n\n\ndef DEFINE_boolean(name, default, help, flag_values=FLAGS, **args):\n  \"\"\"Registers a boolean flag.\n\n  Such a boolean flag does not take an argument.  If a user wants to\n  specify a false value explicitly, the long option beginning with 'no'\n  must be used: i.e. --noflag\n\n  This flag will have a value of None, True or False.  None is possible\n  if default=None and the user does not specify the flag on the command\n  line.\n  \"\"\"\n  DEFINE_flag(BooleanFlag(name, default, help, **args), flag_values)\n\n# Match C++ API to unconfuse C++ people.\nDEFINE_bool = DEFINE_boolean\n\nclass HelpFlag(BooleanFlag):\n  \"\"\"\n  HelpFlag is a special boolean flag that prints usage information and\n  raises a SystemExit exception if it is ever found in the command\n  line arguments.  
Note this is called with allow_override=1, so other\n  apps can define their own --help flag, replacing this one, if they want.\n  \"\"\"\n  def __init__(self):\n    BooleanFlag.__init__(self, \"help\", 0, \"show this help\",\n                         short_name=\"h\", allow_override=1)\n  def Parse(self, arg):\n    if arg:\n      doc = sys.modules[\"__main__\"].__doc__\n      flags = str(FLAGS)\n      print doc or (\"\\nUSAGE: echo SEQUENCE | %s [flags]\\n       or\\n       cat FASTA_FILE | %s [flags]\\n\" % (sys.argv[0], sys.argv[0]))\n      if flags:\n        print \"flags:\"\n        print flags\n        print \"\"\n      sys.exit(1)\n\n\nclass HelpXMLFlag(BooleanFlag):\n  \"\"\"Similar to HelpFlag, but generates output in XML format.\"\"\"\n\n  def __init__(self):\n    BooleanFlag.__init__(self, 'helpxml', False,\n                         'like --help, but generates XML output',\n                         allow_override=1)\n\n  def Parse(self, arg):\n    if arg:\n      FLAGS.WriteHelpInXMLFormat(sys.stdout)\n      sys.exit(1)\n\n\nclass HelpshortFlag(BooleanFlag):\n  \"\"\"\n  HelpshortFlag is a special boolean flag that prints usage\n  information for the \"main\" module, and raises a SystemExit exception\n  if it is ever found in the command line arguments.  
Note this is\n  called with allow_override=1, so other apps can define their own\n  --helpshort flag, replacing this one, if they want.\n  \"\"\"\n  def __init__(self):\n    BooleanFlag.__init__(self, \"helpshort\", 0,\n                         \"show usage only for this module\", allow_override=1)\n  def Parse(self, arg):\n    if arg:\n      doc = sys.modules[\"__main__\"].__doc__\n      flags = FLAGS.MainModuleHelp()\n      print doc or (\"\\nUSAGE: %s [flags]\\n\" % sys.argv[0])\n      if flags:\n        print \"flags:\"\n        print flags\n      sys.exit(1)\n\n\n#\n# FLOAT FLAGS\n#\n\nclass FloatParser(ArgumentParser):\n  \"\"\"Parser of floating point values.\n\n  Parsed value may be bounded to a given upper and lower bound.\n  \"\"\"\n  number_article = \"a\"\n  number_name = \"number\"\n  syntactic_help = \" \".join((number_article, number_name))\n\n  def __init__(self, lower_bound=None, upper_bound=None):\n    self.lower_bound = lower_bound\n    self.upper_bound = upper_bound\n    sh = self.syntactic_help\n    if lower_bound != None and upper_bound != None:\n      sh = (\"%s in the range [%s, %s]\" % (sh, lower_bound, upper_bound))\n    elif lower_bound == 1:\n      sh = \"a positive %s\" % self.number_name\n    elif upper_bound == -1:\n      sh = \"a negative %s\" % self.number_name\n    elif lower_bound == 0:\n      sh = \"a non-negative %s\" % self.number_name\n    elif upper_bound != None:\n      sh = \"%s <= %s\" % (self.number_name, upper_bound)\n    elif lower_bound != None:\n      sh = \"%s >= %s\" % (self.number_name, lower_bound)\n    self.syntactic_help = sh\n\n  def Convert(self, argument):\n    \"\"\"Converts argument to a float; raises ValueError on errors.\"\"\"\n    return float(argument)\n\n  def Parse(self, argument):\n    val = self.Convert(argument)\n    if ((self.lower_bound != None and val < self.lower_bound) or\n        (self.upper_bound != None and val > self.upper_bound)):\n      raise ValueError(\"%s is not %s\" % (val, 
self.syntactic_help))\n    return val\n\n  def Type(self):\n    return 'float'\n\n  def WriteCustomInfoInXMLFormat(self, outfile, indent):\n    if self.lower_bound is not None:\n      _WriteSimpleXMLElement(outfile, 'lower_bound', self.lower_bound, indent)\n    if self.upper_bound is not None:\n      _WriteSimpleXMLElement(outfile, 'upper_bound', self.upper_bound, indent)\n# End of FloatParser\n\n\ndef DEFINE_float(name, default, help, lower_bound=None, upper_bound=None,\n                 flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value must be a float.\n\n  If lower_bound or upper_bound are set, then this flag must be\n  within the given range.\n  \"\"\"\n  parser = FloatParser(lower_bound, upper_bound)\n  serializer = ArgumentSerializer()\n  DEFINE(parser, name, default, help, flag_values, serializer, **args)\n\n\n#\n# INTEGER FLAGS\n#\n\n\nclass IntegerParser(FloatParser):\n  \"\"\"Parser of an integer value.\n\n  Parsed value may be bounded to a given upper and lower bound.\n  \"\"\"\n  number_article = \"an\"\n  number_name = \"integer\"\n  syntactic_help = \" \".join((number_article, number_name))\n\n  def Convert(self, argument):\n    __pychecker__ = 'no-returnvalues'\n    if type(argument) == str:\n      base = 10\n      if len(argument) > 2 and argument[0] == \"0\" and argument[1] == \"x\":\n        base = 16\n      try:\n        return int(argument, base)\n      # ValueError is thrown when argument is a string, and overflows an int.\n      except ValueError:\n        return long(argument, base)\n    else:\n      try:\n        return int(argument)\n      # OverflowError is thrown when argument is numeric, and overflows an int.\n      except OverflowError:\n        return long(argument)\n\n  def Type(self):\n    return 'int'\n\n\ndef DEFINE_integer(name, default, help, lower_bound=None, upper_bound=None,\n                   flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value must be an integer.\n\n  If lower_bound, or 
upper_bound are set, then this flag must be\n  within the given range.\n  \"\"\"\n  parser = IntegerParser(lower_bound, upper_bound)\n  serializer = ArgumentSerializer()\n  DEFINE(parser, name, default, help, flag_values, serializer, **args)\n\n\n#\n# ENUM FLAGS\n#\n\n\nclass EnumParser(ArgumentParser):\n  \"\"\"Parser of a string enum value (a string value from a given set).\n\n  If enum_values (see below) is not specified, any string is allowed.\n  \"\"\"\n\n  def __init__(self, enum_values=None):\n    self.enum_values = enum_values\n\n  def Parse(self, argument):\n    if self.enum_values and argument not in self.enum_values:\n      raise ValueError(\"value should be one of <%s>\" %\n                       \"|\".join(self.enum_values))\n    return argument\n\n  def Type(self):\n    return 'string enum'\n\n\nclass EnumFlag(Flag):\n  \"\"\"Basic enum flag; its value can be any string from list of enum_values.\"\"\"\n\n  def __init__(self, name, default, help, enum_values=None,\n               short_name=None, **args):\n    enum_values = enum_values or []\n    p = EnumParser(enum_values)\n    g = ArgumentSerializer()\n    Flag.__init__(self, p, g, name, default, help, short_name, **args)\n    if not self.help: self.help = \"an enum string\"\n    self.help = \"<%s>: %s\" % (\"|\".join(enum_values), self.help)\n\n  def _WriteCustomInfoInXMLFormat(self, outfile, indent):\n    for enum_value in self.parser.enum_values:\n      _WriteSimpleXMLElement(outfile, 'enum_value', enum_value, indent)\n\n\ndef DEFINE_enum(name, default, enum_values, help, flag_values=FLAGS,\n                **args):\n  \"\"\"Registers a flag whose value can be any string from enum_values.\"\"\"\n  DEFINE_flag(EnumFlag(name, default, help, enum_values, ** args),\n              flag_values)\n\n\n#\n# LIST FLAGS\n#\n\n\nclass BaseListParser(ArgumentParser):\n  \"\"\"Base class for a parser of lists of strings.\n\n  To extend, inherit from this class; from the subclass __init__, call\n\n    
BaseListParser.__init__(self, token, name)\n\n  where token is a character used to tokenize, and name is a description\n  of the separator.\n  \"\"\"\n\n  def __init__(self, token=None, name=None):\n    assert name\n    self._token = token\n    self._name = name\n    self.syntactic_help = \"a %s separated list\" % self._name\n\n  def Parse(self, argument):\n    if argument == '':\n      return []\n    else:\n      return [s.strip() for s in argument.split(self._token)]\n\n  def Type(self):\n    return '%s separated list of strings' % self._name\n\n\nclass ListParser(BaseListParser):\n  \"\"\"Parser for a comma-separated list of strings.\"\"\"\n\n  def __init__(self):\n    BaseListParser.__init__(self, ',', 'comma')\n\n  def WriteCustomInfoInXMLFormat(self, outfile, indent):\n    BaseListParser.WriteCustomInfoInXMLFormat(self, outfile, indent)\n    _WriteSimpleXMLElement(outfile, 'list_separator', repr(','), indent)\n\n\nclass WhitespaceSeparatedListParser(BaseListParser):\n  \"\"\"Parser for a whitespace-separated list of strings.\"\"\"\n\n  def __init__(self):\n    BaseListParser.__init__(self, None, 'whitespace')\n\n  def WriteCustomInfoInXMLFormat(self, outfile, indent):\n    BaseListParser.WriteCustomInfoInXMLFormat(self, outfile, indent)\n    # Emit the separators in sorted order so the XML output is deterministic.\n    separators = list(string.whitespace)\n    separators.sort()\n    for ws_char in separators:\n      _WriteSimpleXMLElement(outfile, 'list_separator', repr(ws_char), indent)\n\n\ndef DEFINE_list(name, default, help, flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value is a comma-separated list of strings.\"\"\"\n  parser = ListParser()\n  serializer = ListSerializer(',')\n  DEFINE(parser, name, default, help, flag_values, serializer, **args)\n\n\ndef DEFINE_spaceseplist(name, default, help, flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value is a whitespace-separated list of strings.\n\n  Any whitespace can be used as a separator.\n  \"\"\"\n  parser = WhitespaceSeparatedListParser()\n  
serializer = ListSerializer(' ')\n  DEFINE(parser, name, default, help, flag_values, serializer, **args)\n\n\n#\n# MULTI FLAGS\n#\n\n\nclass MultiFlag(Flag):\n  \"\"\"A flag that can appear multiple times on the command-line.\n\n  The value of such a flag is a list that contains the individual values\n  from all the appearances of that flag on the command-line.\n\n  See the __doc__ for Flag for most behavior of this class.  Only\n  differences in behavior are described here:\n\n    * The default value may be either a single value or a list of values.\n      A single value is interpreted as the [value] singleton list.\n\n    * The value of the flag is always a list, even if the option was\n      only supplied once, and even if the default value is a single\n      value.\n  \"\"\"\n\n  def __init__(self, *args, **kwargs):\n    Flag.__init__(self, *args, **kwargs)\n    self.help += ';\\n    repeat this option to specify a list of values'\n\n  def Parse(self, arguments):\n    \"\"\"Parses one or more arguments with the installed parser.\n\n    Args:\n      arguments: a single argument or a list of arguments (typically a\n        list of default values); a single argument is converted\n        internally into a list containing one item.\n    \"\"\"\n    if not isinstance(arguments, list):\n      # Default value may be a list of values.  
Most other arguments\n      # will not be, so convert them into a single-item list to make\n      # processing simpler below.\n      arguments = [arguments]\n\n    if self.present:\n      # keep a backup reference to list of previously supplied option values\n      values = self.value\n    else:\n      # \"erase\" the defaults with an empty list\n      values = []\n\n    for item in arguments:\n      # have Flag superclass parse argument, overwriting self.value reference\n      Flag.Parse(self, item)  # also increments self.present\n      values.append(self.value)\n\n    # put list of option values back in the 'value' attribute\n    self.value = values\n\n  def Serialize(self):\n    if not self.serializer:\n      raise FlagsError(\"Serializer not present for flag %s\" % self.name)\n    if self.value is None:\n      return ''\n\n    s = ''\n\n    multi_value = self.value\n\n    for self.value in multi_value:\n      if s: s += ' '\n      s += Flag.Serialize(self)\n\n    self.value = multi_value\n\n    return s\n\n  def Type(self):\n    return 'multi ' + self.parser.Type()\n\n\ndef DEFINE_multi(parser, serializer, name, default, help, flag_values=FLAGS,\n                 **args):\n  \"\"\"Registers a generic MultiFlag that parses its args with a given parser.\n\n  Auxiliary function.  Normal users should NOT use it directly.\n\n  Developers who need to create their own 'Parser' classes for options\n  which can appear multiple times can call this module function to\n  register their flags.\n  \"\"\"\n  DEFINE_flag(MultiFlag(parser, serializer, name, default, help, **args),\n              flag_values)\n\n\ndef DEFINE_multistring(name, default, help, flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value can be a list of any strings.\n\n  Use the flag on the command line multiple times to place multiple\n  string values into the list.  
The 'default' may be a single string\n  (which will be converted into a single-element list) or a list of\n  strings.\n  \"\"\"\n  parser = ArgumentParser()\n  serializer = ArgumentSerializer()\n  DEFINE_multi(parser, serializer, name, default, help, flag_values, **args)\n\n\ndef DEFINE_multi_int(name, default, help, lower_bound=None, upper_bound=None,\n                     flag_values=FLAGS, **args):\n  \"\"\"Registers a flag whose value can be a list of arbitrary integers.\n\n  Use the flag on the command line multiple times to place multiple\n  integer values into the list.  The 'default' may be a single integer\n  (which will be converted into a single-element list) or a list of\n  integers.\n  \"\"\"\n  parser = IntegerParser(lower_bound, upper_bound)\n  serializer = ArgumentSerializer()\n  DEFINE_multi(parser, serializer, name, default, help, flag_values, **args)\n\n\n# Now register the flags that we want to exist in all applications.\n# These are all defined with allow_override=1, so user-apps can use\n# these flagnames for their own purposes, if they want.\nDEFINE_flag(HelpFlag())\nDEFINE_flag(HelpshortFlag())\n\n# lhuang\n#DEFINE_flag(HelpXMLFlag())\n\n# Define special flags here so that help may be generated for them.\n_SPECIAL_FLAGS = FlagValues()\n\n\n# lhuang\n\n##DEFINE_string(\n##    'flagfile', \"\",\n##    \"Insert flag definitions from the given file into the command line.\",\n##    _SPECIAL_FLAGS)\n\n##DEFINE_string(\n##    'undefok', \"\",\n##    \"comma-separated list of flag names that it is okay to specify \"\n##    \"on the command line even if the program does not define a flag \"\n##    \"with that name.  IMPORTANT: flags in this list that have \"\n##    \"arguments MUST use the --flag=value format.\", _SPECIAL_FLAGS)"
  },
  {
    "path": "license.txt",
    "content": "The LinearDesign code is freely accessible to all interested parties. \nIt is free for academic, non-profit, and research use, and can be licensed for commercial use.\n\nTo use this software for the development of a commercial product, including but not limited to software, service, or pharmaceuticals, please contact the lead corresponding author.\n\nRedistribution of the code with or without modification is not permitted without explicit written permission by the lead corresponding author.\n"
  },
  {
    "path": "lineardesign",
    "content": "#!/usr/bin/env python2\n\nimport gflags as flags\nimport subprocess\nimport sys\nimport os\n\nFLAGS = flags.FLAGS\n\ndef setgflags():\n    flags.DEFINE_float('lambda', 0.0, \"set lambda\", short_name='l')\n    flags.DEFINE_boolean('verbose', False, \"print out more details\", short_name='v')\n    flags.DEFINE_string('codonusage', 'codon_usage_freq_table_human.csv', \"import a Codon Usage Frequency Table\", short_name='c')\n    argv = FLAGS(sys.argv)\n\ndef main():\n\n    lambda_ = str(FLAGS.l)\n    verbose_ = '1' if FLAGS.verbose else '0'\n    codon_usage = str(FLAGS.codonusage)\n\n    path = os.path.dirname(os.path.abspath(__file__))\n    cmd = [\"%s/%s\" % (path, ('bin/LinearDesign_2D')), lambda_, verbose_, codon_usage]\n    subprocess.call(cmd, stdin=sys.stdin)\n    \nif __name__ == '__main__':\n    setgflags()\n    main()\n\n"
  },
  {
    "path": "src/Utils/base.h",
    "content": "#ifndef base_h\n#define base_h\n\n#include <type_traits>\n#include <utility>\n#include <memory>\n#include <string>\n#include <vector>\n#include <sstream>\n\n#if defined(__GNUC__) || defined(__clang__)\n#define LINEAR_DESIGN_DEPRECATED __attribute__((deprecated))\n#elif defined(_MSC_VER)\n#define LINEAR_DESIGN_DEPRECATED __declspec(deprecated)\n#else\n#pragma message(\"WARNING: function deprecated\")\n#define LINEAR_DESIGN_DEPRECATED\n#endif\n\n#if defined(__GNUC__) || defined(__clang__)\n#define LINEAR_DESIGN_INLINE inline __attribute__((always_inline))\n#else\n#define LINEAR_DESIGN_INLINE inline\n#endif\n\n#define LINEAR_DESIGN_CACHELINE 64\n\ntemplate <bool B, typename T=void>\nusing enable_if_t = typename std::enable_if<B, T>::type;\n\ntemplate <typename T, enable_if_t<std::is_integral<T>::value, int> = 0>\nstd::ostream& operator<< (std::ostream& out, const std::pair<T, T>& rhs) {\n    out << \"(\" << rhs.first << \", \" << rhs.second << \")\";\n    return out;\n}\n\ntemplate <typename T, enable_if_t<std::is_integral<T>::value, int> = 0>\nstd::ostream& operator<< (std::ostream& out, const std::vector<std::pair<T, T>>& rhs) {\n    out << \"[\";\n    for (size_t i = 0; i < rhs.size(); ++i) {\n        out << rhs[i];\n        if (i < rhs.size() - 1) out << \",\";\n    }\n    out << \"]\";\n    return out;\n}\n\nnamespace LinearDesign {\n\nnamespace util {\n    std::vector<std::string> split(const std::string &s, char delim) {\n        std::vector<std::string> result;\n        std::stringstream ss(s);\n        std::string item;\n        while (getline(ss, item, delim)) \n            result.push_back(item);\n        return result;\n    }\n\n    template <typename T>\n    constexpr T value_min() {\n        static_assert(std::is_integral<T>::value ||\n            std::is_floating_point<T>::value, \"Int or float required.\");\n        return std::numeric_limits<T>::lowest();\n    }\n\n    template <typename T>\n    constexpr T value_max() {\n        
static_assert(std::is_integral<T>::value ||\n            std::is_floating_point<T>::value, \"Int or float required.\");\n        return std::numeric_limits<T>::max();\n    }\n} /* util */\n\n\n// template <bool...> struct is_any;\n// template <> struct is_any<> : std::false_type {};\n// template <bool First, bool... Rest> struct is_any<First, Rest...> {\n//     constexpr static bool value = First || is_any<Rest...>::value;\n// };\n\nstruct hash_pair_pair {\n    template <class T1, class T2, class T3>\n    size_t operator()(const std::pair<std::pair<T1, T2>, T3>& p) const {\n        auto hash1 = std::hash<T1>{}(p.first.first);\n        auto hash2 = std::hash<T2>{}(p.first.second);\n        auto hash3 = std::hash<T3>{}(p.second);\n        return hash1 ^ hash2 ^ hash3;\n    }\n};\n\nstruct hash_pair {\n    template <class T1, class T2>\n    size_t operator()(const std::pair<T1, T2>& p) const {\n        auto hash1 = std::hash<T1>{}(p.first);\n        auto hash2 = std::hash<T2>{}(p.second);\n        return hash1 ^ hash2;\n    }\n};\n\n}\n\n\nnamespace Hash {\n    template <class T>\n    LINEAR_DESIGN_INLINE size_t hash_combine(size_t left_seed, const T& right) {\n        return left_seed ^ (std::hash<T>{}(right) << 1);\n    }\n\n    template <class Tuple, size_t Index = std::tuple_size<Tuple>::value - 1>\n    struct TupleHashImpl {\n        static size_t impl(size_t seed, const Tuple& tuple) {\n            size_t h = hash_combine(seed, std::get<Index>(tuple));\n            return TupleHashImpl<Tuple, Index-1>::impl(h, tuple);\n        }\n    };\n\n    template <class Tuple>\n    struct TupleHashImpl<Tuple, 0> {\n        static size_t impl(size_t seed, const Tuple& tuple) {\n            return hash_combine(seed, std::get<0>(tuple));\n        }\n    };\n}\n\n\ntemplate <class... 
Ts>\nstruct std::hash<std::tuple<Ts...>> {\n    size_t operator()(const std::tuple<Ts...>& ts) const {\n        return Hash::TupleHashImpl<std::tuple<Ts...>>::impl(0, ts);\n    }\n};\n\ntemplate <class T1, class T2>\nstruct std::hash<std::pair<T1, T2>> {\n    size_t operator()(const std::pair<T1, T2>& p) const {\n        size_t h = std::hash<T1>{}(p.first);\n        return Hash::hash_combine(h, p.second);\n    }\n};\n\n\n#endif"
  },
  {
    "path": "src/Utils/codon.h",
    "content": "#ifndef codon_h\n#define codon_h\n\n#include <algorithm>\n#include <cctype>\n#include <cmath>\n#include <cstdlib>\n#include <exception>\n#include <fstream>\n#include <iostream>\n#include <map>\n#include <regex>\n#include <stdexcept>\n#include <string>\n#include <utility>\n#include <vector>\n\n#include <typeinfo>\n\n#include \"base.h\"\n#include \"constants.h\"\n\nnamespace LinearDesign {\n\n// trim from end (in place)\nstatic inline void rtrim(std::string &s) {\n    s.erase(std::find_if(s.rbegin(), s.rend(), [](unsigned char ch) {\n        return !std::isspace(ch);\n    }).base(), s.end());\n}\n\n\nclass Codon {\npublic:\n\tCodon(const std::string& path) : codon_table_(), aa_table_() {\n        std::ifstream codon_file;\n        codon_file.open(path);\n        if (codon_file.is_open()) {\n            int index = 0;\n            for (std::string line; getline(codon_file, line);){\n\n                rtrim(line);\n\n                // skip blank lines\n                if (line.empty())\n                    continue;\n\n                // skip the header row (first non-blank line)\n                if (index++ == 0)\n                    continue;\n                \n                // each row is: codon,amino acid,fraction\n                const auto line_split = util::split(line, ',');\n                if(line_split.size() != 3){\n                    std::cerr << \"Wrong format of codon frequency file!\" << std::endl;\n                    exit(1);\n                }\n                const std::string codon = line_split[0];\n                const std::string aa = line_split[1];\n                const float fraction = std::stof(line_split[2]);\n\n                codon_table_[codon] = make_pair(aa, fraction);\n                aa_table_[aa].push_back(make_pair(codon, fraction));\n                // track the highest codon fraction seen for each amino acid\n                if (!max_aa_table_.count(aa))\n                    max_aa_table_[aa] = fraction;\n                else\n                    max_aa_table_[aa] = fmax(max_aa_table_[aa], fraction);\n            }\n            codon_file.close();\n            if (codon_table_.size() != 64){\n                std::cerr << \"Codon frequency file needs to contain 64 codons!\" << std::endl;\n                exit(1);\n            
}\n\n        } else {\n            std::cerr << \"The input codon frequency file does not exist!\" << std::endl;\n            exit(1);\n        }\n    }\n\n    // Codon Adaptation Index: geometric mean of the relative adaptiveness w_i over all codons\n    float calc_cai(const std::string& rna_seq) const {\n        if (rna_seq.length() % 3)\n            throw std::runtime_error(\"invalid rna seq\");\n        \n        const int protein_length = static_cast<int>(rna_seq.length() / 3);\n        float cai = 0.0f;\n        \n        for (int index = 3; index < rna_seq.length() + 1; index += 3) {\n            const std::string tri_letter = rna_seq.substr(index - 3, 3);\n            const auto f_ci_aa = codon_table_.at(tri_letter);\n            const auto f_c_max = max_aa_table_.at(f_ci_aa.first);\n            \n            float w_i = f_ci_aa.second / f_c_max;\n            cai += log2f(w_i);\n        }\n        \n        return exp2f(cai / protein_length);\n    }\n\n    // return the highest-fraction codon of amino acid aa that matches the regex match\n    std::string find_max_codon(const char aa, \n    \t\t\t\t\t\t   const std::string& match) const {\n    \tauto candidate_codons = aa_table_.at(std::string(1, aa));\n    \tconst std::regex pattern(match); // compile once instead of once per candidate\n\n    \tfloat max_score = 0;\n    \tstd::string max_codon;\n    \tfor (auto& candidate : candidate_codons) {\n    \t\tif (std::regex_match(candidate.first, pattern) && \n    \t\t\t\tcandidate.second > max_score) {\n    \t\t\tmax_codon = candidate.first;\n    \t\t\tmax_score = candidate.second;\n    \t\t}\n    \t}\n\n    \tif (max_codon.empty())\n    \t\tthrow std::runtime_error(\"invalid search\");\n\n        // assert(codon_table_.at(max_codon).first == std::string(1, aa));\n    \treturn max_codon;\n    }\n\n    std::string cvt_rna_seq_to_aa_seq(const std::string& rna_seq) const {\n        if (rna_seq.length() % 3)\n            throw std::runtime_error(\"invalid rna seq\");\n\n        std::string aa_seq;\n        aa_seq.reserve(rna_seq.length());\n        for (int index = 3; index < rna_seq.length() + 1; index += 3) {\n            const std::string tri_letter = rna_seq.substr(index - 3, 3);\n            auto aa = 
codon_table_.at(tri_letter).first;\n            if (aa == \"STOP\") {\n                aa_seq.append(\"*\");\n                return aa_seq;\n            }\n            aa_seq.append(codon_table_.at(tri_letter).first);\n        }\n        return aa_seq;\n    }\n\n    float get_weight(const std::string& aa_tri, const std::string& codon) const {\n\n        if (k_map_3_1.count(aa_tri)) {\n            auto codons = aa_table_.at(std::string(1, k_map_3_1[aa_tri]));\n            auto it = std::find_if(codons.begin(), codons.end(), [codon](const std::pair<std::string, float>& e){\n                // std::cout << typeid(e).name() << '\\n';\n                return e.first == codon;\n            });\n            if (it == codons.end()) {\n                throw std::runtime_error(\"invalid codon\");\n            }\n            return it->second;\n        } else if (three_prime_aa_table_.count(aa_tri)) {\n            return three_prime_aa_table_.at(aa_tri).second;\n        }\n\n        return 0.0f;\n    }\n\n\n\n// private:\n    std::vector<std::string> aux_aa_;\n    std::map<std::string, std::pair<std::string, float>> three_prime_codon_table_;\n    std::map<std::string, std::pair<std::string, float>> three_prime_aa_table_;\n\n    \n    std::map<std::string, float> max_aa_table_;\n\tstd::map<std::string, std::pair<std::string, float>> codon_table_;\n\tstd::map<std::string, std::vector<std::pair<std::string, float>>> aa_table_;\n};\n\n}\n\n#endif \n"
  },
  {
    "path": "src/Utils/common.h",
    "content": "#ifndef common_h\n#define common_h\n\n#include <array>\n#include <cstddef>\n#include <cstdint>\n#include <exception>\n#include <functional>\n#include <list>\n#include <map>\n#include <set>\n#include <string>\n#include <tuple>\n#include <utility>\n#include \"base.h\"\n\nnamespace LinearDesign {\n\n\nusing SizeType               = size_t;\nusing ScoreType              = int32_t;\nusing IndexType              = int32_t; // int16_t would suffice if indices stay below 10000\nusing NucType                = int8_t;\nusing NumType                = int32_t;\nusing NucPairType            = int8_t;\nusing PairType               = int8_t;\nusing FinalScoreType         = double;\nusing NodeType               = std::pair<IndexType, NumType>;\nusing NodeNucType            = std::pair<NodeType, NucType>;\nusing NodeNucWType           = std::tuple<NodeType, NucType, double>;\n\n\n\n\nenum class Manner : std::uint8_t {\n    NONE = 0,              // 0: empty\n    H,                     // 1: hairpin candidate\n    HAIRPIN,               // 2: hairpin\n    SINGLE,                // 3: single\n    HELIX,                 // 4: helix\n    MULTI,                 // 5: multi = ..M2. 
[30 restriction on the left and jump on the right]\n    MULTI_eq_MULTI_plus_U, // 6: multi = multi + U\n    P_eq_MULTI,            // 7: P = (multi)\n    M2_eq_M_plus_P,        // 8: M2 = M + P\n    M_eq_M2,               // 9: M = M2\n    M_eq_M_plus_U,         // 10: M = M + U\n    M_eq_P,                // 11: M = P\n    C_eq_C_plus_U,         // 12: C = C + U\n    C_eq_C_plus_P,         // 13: C = C + P\n};\n\nenum class Beam_type : std::uint8_t {\n    BEAM_C = 0,\n    BEAM_P,\n    BEAM_MULTI,\n    BEAM_M2,\n    BEAM_M1\n\n};\n\n\ntemplate <typename ScoreType>\nstruct State {\n    ScoreType score = util::value_min<ScoreType>();\n    double cai_score = util::value_min<double>();\n    NodeType pre_node;\n    double pre_left_cai;\n};\n\nstruct BacktraceResult {\n    std::string seq;\n    std::string structure;\n};\n\ntemplate <typename ScoreType,\n          typename IndexType,\n          typename NodeType = std::pair<IndexType, IndexType>>\nstruct DecoderResult {\n    std::string sequence;\n    std::string structure;\n    ScoreType score;\n    ScoreType cai;\n    ScoreType old_cai;\n    IndexType num_states;\n};\n\ntemplate <typename ScoreType, \n          typename IndexType, \n          typename NodeType = std::pair<IndexType, IndexType>>\nstruct ScoreInnerDate {\n    ScoreType newscore;\n    NodeType j_node;\n    NodeType i_node;\n    int nuc_pair;\n};\n\n\nstruct NodeNucpair {\n    IndexType node_first;\n    NumType node_second;\n    NucPairType nucpair;\n};\n\n\n}\n\n#endif /* common_h */\n"
  },
  {
    "path": "src/Utils/constants.h",
    "content": "#ifndef constants_h\n#define constants_h\n\n#include <cstdint>\n#include <map>\n#include <string>\n\nnamespace LinearDesign {\n\n// sentinel meaning \"no nucleotide\"\nconstexpr uint8_t k_void_nuc = 127;\n\n// static std::map<char, std::string> k_map_1_3 = {\n//     {'F',\"Phe\"},\n//     {'L',\"Leu\"},\n//     {'S',\"Ser\"},\n//     {'Y',\"Tyr\"},\n//     {'*',\"STOP\"},\n//     {'C',\"Cys\"},\n//     {'W',\"Trp\"},\n//     {'P',\"Pro\"},\n//     {'H',\"His\"},\n//     {'Q',\"Gln\"},\n//     {'R',\"Arg\"},\n//     {'I',\"Ile\"},\n//     {'M',\"Met\"},\n//     {'T',\"Thr\"},\n//     {'N',\"Asn\"},\n//     {'K',\"Lys\"},\n//     {'V',\"Val\"},\n//     {'D',\"Asp\"},\n//     {'E',\"Glu\"},\n//     {'G',\"Gly\"},\n//     {'A',\"Ala\"}\n// };\n\n// three-letter amino acid name -> one-letter code\nstatic std::map<std::string, char> k_map_3_1 = {\n    {\"Phe\", 'F'},\n    {\"Leu\", 'L'},\n    {\"Ser\", 'S'},\n    {\"Tyr\", 'Y'},\n    {\"STOP\", '*'},\n    {\"Cys\", 'C'},\n    {\"Trp\", 'W'},\n    {\"Pro\", 'P'},\n    {\"His\", 'H'},\n    {\"Gln\", 'Q'},\n    {\"Arg\", 'R'},\n    {\"Ile\", 'I'},\n    {\"Met\", 'M'},\n    {\"Thr\", 'T'},\n    {\"Asn\", 'N'},\n    {\"Lys\", 'K'},\n    {\"Val\", 'V'},\n    {\"Asp\", 'D'},\n    {\"Glu\", 'E'},\n    {\"Gly\", 'G'},\n    {\"Ala\", 'A'}\n};\n\n}\n\n#endif /* constants_h */\n"
  },
  {
    "path": "src/Utils/flat.h",
    "content": "#ifndef flat_h\n#define flat_h\n\n#include <type_traits>\n#include <vector>\n\n#include \"base.h\"\n\nnamespace detail {\n    template <class Key>\n    struct DefaultIndex {\n        inline size_t operator()(const Key key) const {\n            return static_cast<size_t>(key);\n        }\n    };\n}\n\ntemplate <class Key, class T, class IndexFn = detail::DefaultIndex<Key>>\nclass Flat {\npublic:\n    using self_type       = Flat;\n    using storage_type    = std::vector<T>;\n    using key_type        = Key;\n    using reference       = T&;\n    using const_reference = const T&;\n    using iterator        = typename storage_type::iterator;\n\n\n    LINEAR_DESIGN_INLINE iterator begin() {\n        return data_.begin();\n    }\n\n    LINEAR_DESIGN_INLINE iterator end() {\n        return data_.end();\n    }\n\n    LINEAR_DESIGN_INLINE bool empty() const {\n        return false;\n    }\n    \n    LINEAR_DESIGN_INLINE void reserve(const size_t n) {\n        data_.reserve(n);\n    }\n\n    LINEAR_DESIGN_INLINE void resize(const size_t n) {\n        data_.resize(n);\n    }\n\n    template <enable_if_t<!std::is_integral<key_type>::value, int> = 0>\n\tLINEAR_DESIGN_INLINE reference operator[](size_t index) {\n        return data_[index];\n    }\n\n    template <enable_if_t<!std::is_integral<key_type>::value, int> = 0>\n    LINEAR_DESIGN_INLINE const_reference operator[](size_t index) const {\n        return data_[index];\n    }\n\n    LINEAR_DESIGN_INLINE reference operator[](key_type key) {\n        return data_[index_(key)];\n    }\n    \n    LINEAR_DESIGN_INLINE const_reference operator[](key_type key) const {\n        return data_[index_(key)];\n    }\n\n    LINEAR_DESIGN_INLINE size_t size() const {\n        return data_.size();\n    }\n    \nprivate:\n    IndexFn index_;\n    storage_type data_;\n};\n\n#endif /* flat_h */\n"
  },
  {
    "path": "src/Utils/network.h",
    "content": "\n#include <map>\n#include <unordered_map>\n#include <vector>\n#include <fstream>\n#include <string>\n#include <sstream>\n#include <algorithm>\n#include <iterator>\n#include <iostream>\n#include <memory>\n#include <string>\n#include <limits>\n#include <vector>\n#include <unordered_map>\n#include <algorithm>\n#include <array>\n#include \"utility_v.h\"\n#include \"common.h\"\n#include \"codon.h\"\n\nusing namespace std;\n\n// #define is_verbose\n\nnamespace LinearDesign {\n\ntemplate <typename IndexType,\n          typename IndexWType = tuple<IndexType, double>,\n          typename NodeType = pair<IndexType, NumType>,\n          typename NodeNucWType = tuple<NodeType, NucType, double>>\nclass Lattice {\npublic:\n    unordered_map<IndexType, vector<NodeType>> nodes;\n    unordered_map<NodeType, vector<NodeNucWType>, hash_pair> left_edges;\n    unordered_map<NodeType, vector<NodeNucWType>, hash_pair> right_edges;\n    \n    Lattice(): nodes(), left_edges(), right_edges() {};\n\n    void add_edge(NodeType n1, NodeType n2, NucType nuc, double weight = 0.0f){\n        right_edges[n1].push_back(make_tuple(n2, nuc, weight));\n        left_edges[n2].push_back(make_tuple(n1, nuc, weight));\n    }\n\n    void add_node(NodeType n1){\n        IndexType pos = get<0>(n1);\n        nodes[pos].push_back(n1);\n    }\n};\n\ntemplate <typename IndexType,\n          typename IndexWType = pair<IndexType, double>,\n          typename NodeType = pair<IndexType, NumType>,\n          typename NodeNucWType = tuple<NodeType, NucType, double>>\nclass DFA {\npublic:\n    unordered_map<IndexType, vector<NodeType>> nodes;\n    unordered_map<NodeType, vector<NodeNucWType>, hash_pair> left_edges;\n    unordered_map<NodeType, vector<NodeNucWType>, hash_pair> right_edges;\n    unordered_map<NodeType, unordered_map<NodeType, vector<IndexWType>, hash_pair>, hash_pair> auxiliary_left_edges;\n    unordered_map<NodeType, unordered_map<NodeType, vector<IndexWType>, hash_pair>, hash_pair> 
auxiliary_right_edges;\n    unordered_map<NodeType, unordered_map<IndexType, double>, hash_pair> node_rightedge_weights;\n\n    DFA(): nodes(), left_edges(), right_edges(), auxiliary_left_edges(), auxiliary_right_edges() {};\n\n    void add_edge(NodeType n1, NodeType n2, IndexType nuc, double weight = 0.0f){\n        right_edges[n1].push_back(make_tuple(n2, nuc, weight));\n        left_edges[n2].push_back(make_tuple(n1, nuc, weight));\n        auxiliary_right_edges[n1][n2].push_back(make_pair(nuc, weight));\n        auxiliary_left_edges[n2][n1].push_back(make_pair(nuc, weight));\n        node_rightedge_weights[n1][nuc] = weight;\n    }\n\n    void add_node(NodeType n1){\n        IndexType pos = get<0>(n1);\n        nodes[pos].push_back(n1);\n    }    \n};\n\ntemplate <typename IndexType,\n          typename NodeType = pair<IndexType, NumType>,\n          typename LatticeType = Lattice<IndexType>>\nunordered_map<string, LatticeType> read_wheel(string const &filename) {\n    unordered_map<string, LatticeType> aa_graphs;\n    ifstream inFile;\n    inFile.open(filename);\n    if (!inFile) {\n        printf(\"Unable to open coding_wheel file\\n\");\n        exit(1);   // call system to stop\n    }\n\n    vector<string> stuff;\n    vector<string> option_splited;\n    string aa;\n    IndexType i;\n\n    for (string line; getline(inFile, line);) {\n        stuff = util::split(line, '\\t');\n\n        aa = stuff[0];\n        LatticeType graph = LatticeType();\n        graph.add_node(make_pair(0,0)); // always initialize with node (0,0)\n\n        char last_first = 0;\n        vector<string>::iterator iter = stuff.begin();\n        ++iter; // position 0 is aa name\n        i = 0;\n        while(iter != stuff.end()){\n            string option = *iter;\n            option_splited = util::split(option, ' ');\n            char first = option_splited[0][0];\n            char second = option_splited[1][0];\n            string thirds = option_splited[2];\n            NodeType n2 = 
make_pair(2, i);\n            graph.add_node(n2);\n            NodeType n1;\n            if (first != last_first) {\n                n1 = make_pair(1, i);\n                graph.add_node(n1);\n                graph.add_edge(make_pair(0, 0), n1, GET_ACGU_NUC(first));\n            }\n            else {\n                n1 = make_pair(1, i-1);\n            }\n            last_first = first;\n            graph.add_edge(n1, n2, GET_ACGU_NUC(second));\n            for (auto& third : thirds) {\n                graph.add_edge(n2, make_pair(0,0), GET_ACGU_NUC(third));\n            }\n            i++; iter++;\n        }\n        aa_graphs[aa] = graph;\n#ifdef is_verbose\n        printf(\"-----------------Lattice------------------------\\n\");\n        for(IndexType pos = 0; pos <= 2; pos++){\n            for(auto &node : graph.nodes[pos]){\n                IndexType p = get<0>(node);\n                IndexType num = get<1>(node);\n                printf(\"node, (%d, %d)\\n\", p, num);\n                for(auto &item : graph.right_edges[node]){\n                    NodeType n2 = get<0>(item);\n                    IndexType p2 = get<0>(n2); IndexType num2 = get<1>(n2);\n                    IndexType nuc = get<1>(item);\n                    double weight = get<2>(item);\n                    printf(\"              (%d, %d) -(%d,%lf)-> (%d, %d)\\n\", p, num, nuc,  weight, p2, num2);\n                }\n                for(auto &item : graph.left_edges[node]){\n                    NodeType n1 = get<0>(item);\n                    IndexType p1 = get<0>(n1); IndexType num1 = get<1>(n1);\n                    IndexType nuc = get<1>(item);\n                    double weight = get<2>(item);\n                    printf(\"  (%d, %d) <-(%d,%lf)- (%d, %d)\\n\", p1, num1, nuc, weight, p, num);\n                }\n            }\n        }        \n#endif\n    }\n    inFile.close();\n    return aa_graphs;\n}\n\n\ntemplate <typename IndexType,\n          typename NodeType = pair<IndexType, 
NumType>,\n          typename LatticeType = Lattice<IndexType>,\n          typename NucType = IndexType,\n          typename NodeNucNodeType = std::tuple<NodeType, NucType, NodeType>>\nunordered_map<string, LatticeType> read_wheel_with_weights(const std::string& filename,\n        std::unordered_map<std::string, std::unordered_map<NodeType, double, hash_pair>>& nodes_with_best_weight,\n        std::unordered_map<std::string, std::unordered_map<NodeNucNodeType, double, std::hash<NodeNucNodeType>>>& edges_with_best_weight,\n        const Codon& codon) {\n    unordered_map<string, LatticeType> aa_graphs;\n    ifstream inFile;\n    inFile.open(filename);\n    if (!inFile) \n        throw std::runtime_error(\"Unable to open coding_wheel file\\n\");\n\n    vector<string> stuff;\n    vector<string> option_splited;\n    string aa;\n    IndexType i;\n\n    for (string line; getline(inFile, line);) {\n        stuff = util::split(line, '\\t');\n        aa = stuff[0];\n        LatticeType graph = LatticeType();\n        graph.add_node(make_pair(0,0)); // always initialize with node (0,0)\n\n        char last_first = 0;\n        vector<string>::iterator iter = stuff.begin();\n        ++iter; // position 0 is aa name\n        i = 0;\n        while(iter != stuff.end()){\n            string option = *iter;\n            option_splited = util::split(option, ' ');\n            char first = option_splited[0][0];\n            char second = option_splited[1][0];\n            string thirds = option_splited[2];\n            NodeType n2 = make_pair(2, i);\n            graph.add_node(n2);\n            NodeType n1;\n            if (first != last_first) {\n                n1 = make_pair(1, i);\n                graph.add_node(n1);\n                auto first_num = GET_ACGU_NUC(first);\n\n                double weight = 0.0f;\n                if (nodes_with_best_weight[aa].count(make_pair(0, 0))) {\n                    weight = edges_with_best_weight[aa][make_tuple(make_pair(0, 0), first_num, 
n1)] / nodes_with_best_weight[aa][make_pair(0, 0)];\n                }\n\n                graph.add_edge(make_pair(0, 0), n1, first_num, weight);\n            }\n            else {\n                n1 = make_pair(1, i-1);\n            }\n            \n            last_first = first;\n\n            auto second_num = GET_ACGU_NUC(second);\n\n            double weight = 0.0f;\n            if (nodes_with_best_weight[aa].count(n1)) {\n                weight = edges_with_best_weight[aa][make_tuple(n1, second_num, n2)] / nodes_with_best_weight[aa][n1];\n            }\n\n            graph.add_edge(n1, n2, second_num, weight);\n\n            for (auto& third : thirds) {\n\n                std::string three_nums = std::string(1, first) + std::string(1, second) + std::string(1, third);\n\n                double weight = 0.0f;\n                if (nodes_with_best_weight[aa].count(n2)) {\n                    weight = codon.get_weight(aa, three_nums) / nodes_with_best_weight[aa][n2];\n                } else {\n                    weight = codon.get_weight(aa, three_nums);\n                }\n\n                graph.add_edge(n2, make_pair(0,0), GET_ACGU_NUC(third), weight);\n            }\n            i++; iter++;\n        }\n        aa_graphs[aa] = graph;\n    }\n\n    inFile.close();\n    return aa_graphs;\n}\n\n\ntemplate <typename IndexType,\n          typename NodeType = pair<IndexType, NumType>,\n          typename LatticeType = Lattice<IndexType>,\n          typename NucType = IndexType,\n          typename NodeNucNodeType = std::tuple<NodeType, NucType, NodeType>>\nunordered_map<string, LatticeType> read_wheel_with_weights_log(const std::string& filename,\n        std::unordered_map<std::string, std::unordered_map<NodeType, double, hash_pair>>& nodes_with_best_weight,\n        std::unordered_map<std::string, std::unordered_map<NodeNucNodeType, double, std::hash<NodeNucNodeType>>>& edges_with_best_weight,\n        const Codon& codon, double lambda_) {\n    
unordered_map<string, LatticeType> aa_graphs;\n    ifstream inFile;\n    inFile.open(filename);\n    if (!inFile) \n        throw std::runtime_error(\"Unable to open coding_wheel file\\n\");\n\n    vector<string> stuff;\n    vector<string> option_splited;\n    string aa;\n    IndexType i;\n\n    for (string line; getline(inFile, line);) {\n        stuff = util::split(line, '\\t');\n        aa = stuff[0];\n        LatticeType graph = LatticeType();\n        graph.add_node(make_pair(0,0)); // always initialize with node (0,0)\n\n        char last_first = 0;\n        vector<string>::iterator iter = stuff.begin();\n        ++iter; // position 0 is aa name\n        i = 0;\n        while(iter != stuff.end()){\n            string option = *iter;\n            option_splited = util::split(option, ' ');\n            char first = option_splited[0][0];\n            char second = option_splited[1][0];\n            string thirds = option_splited[2];\n            NodeType n2 = make_pair(2, i);\n            graph.add_node(n2);\n            NodeType n1;\n            if (first != last_first) {\n                n1 = make_pair(1, i);\n                graph.add_node(n1);\n                auto first_num = GET_ACGU_NUC(first);\n\n                double weight = 1.0f;\n                if (nodes_with_best_weight[aa].count(make_pair(0, 0))) {\n                    weight = lambda_ * log(edges_with_best_weight[aa][make_tuple(make_pair(0, 0), first_num, n1)] / nodes_with_best_weight[aa][make_pair(0, 0)]);\n                }\n\n                graph.add_edge(make_pair(0, 0), n1, first_num, weight);\n            }\n            else {\n                n1 = make_pair(1, i-1);\n            }\n            \n            last_first = first;\n\n            auto second_num = GET_ACGU_NUC(second);\n\n            double weight = 1.0f;\n            if (nodes_with_best_weight[aa].count(n1)) {\n                weight = lambda_ * log(edges_with_best_weight[aa][make_tuple(n1, second_num, n2)] / 
nodes_with_best_weight[aa][n1]);\n            }\n\n            graph.add_edge(n1, n2, second_num, weight);\n\n            for (auto& third : thirds) {\n\n                std::string three_nums = std::string(1, first) + std::string(1, second) + std::string(1, third);\n\n                double weight = 1.0f;\n                if (nodes_with_best_weight[aa].count(n2)) {\n                    weight = lambda_ *  log(codon.get_weight(aa, three_nums) / nodes_with_best_weight[aa][n2]);\n                } else {\n                    weight = lambda_ *  log(codon.get_weight(aa, three_nums));\n                }\n\n                graph.add_edge(n2, make_pair(0,0), GET_ACGU_NUC(third), weight);\n            }\n            i++; iter++;\n        }\n        aa_graphs[aa] = graph;\n    }\n\n    inFile.close();\n    return aa_graphs;\n}\n\ntemplate <typename IndexType,\n          typename NucType = IndexType,\n          typename NodeType = pair<IndexType, NumType>,\n          typename NodeNucNodeType = std::tuple<NodeType, NucType, NodeType>,\n          typename WeightType = double,\n          typename LatticeType = Lattice<IndexType>>\nvoid prepare_codon_unit_lattice(const std::string& wheel_path, const Codon& codon,\n        std::unordered_map<string, LatticeType>& aa_graphs_with_ln_weights_ret,\n        std::unordered_map<std::string, std::unordered_map<std::tuple<NodeType, NodeType>, std::tuple<double, NucType, NucType>, std::hash<std::tuple<NodeType, NodeType>>>>&\n                best_path_in_one_codon_unit_ret,\n        std::unordered_map<std::string, std::string>& aa_best_path_in_a_whole_codon_ret, double lambda_) {\n\n    std::unordered_map<std::string, std::unordered_map<NodeType, WeightType, hash_pair>> nodes_with_best_weight;\n    std::unordered_map<std::string, std::unordered_map<NodeNucNodeType, WeightType, std::hash<NodeNucNodeType>>> edges_with_best_weight;\n\n    unordered_map<string, LatticeType> aa_graphs_with_ln_weights;\n    unordered_map<string, LatticeType> 
aa_graphs_with_weights = read_wheel_with_weights<IndexType>(wheel_path, nodes_with_best_weight, edges_with_best_weight, codon);\n\n    for (auto& aa_aa_elem : aa_graphs_with_weights) {\n        auto& aa = aa_aa_elem.first;\n        auto& aa_elem = aa_aa_elem.second;\n        for (auto& node_at_2 : aa_elem.nodes[2]) {\n            for (auto& node_at_3_nuc_weight : aa_elem.right_edges[node_at_2]) {\n                auto node_at_3 = std::get<0>(node_at_3_nuc_weight);\n                auto nuc = std::get<1>(node_at_3_nuc_weight);\n                auto weight = std::get<2>(node_at_3_nuc_weight);\n                nodes_with_best_weight[aa][node_at_2] = max(nodes_with_best_weight[aa][node_at_2], weight);\n                edges_with_best_weight[aa][make_tuple(node_at_2,nuc,node_at_3)] = weight;\n            }\n        }\n\n        for (auto& node_at_1 : aa_elem.nodes[1]) {\n            for (auto& node_at_2_nuc_weight : aa_elem.right_edges[node_at_1]) {\n                auto node_at_2 = std::get<0>(node_at_2_nuc_weight);\n                auto nuc = std::get<1>(node_at_2_nuc_weight);\n                nodes_with_best_weight[aa][node_at_1] = max(nodes_with_best_weight[aa][node_at_1], nodes_with_best_weight[aa][node_at_2]);\n                edges_with_best_weight[aa][make_tuple(node_at_1,nuc,node_at_2)] = nodes_with_best_weight[aa][node_at_2];\n            }\n        }\n\n        for (auto& node_at_0 : aa_elem.nodes[0]) {\n            for (auto& node_at_1_nuc_weight : aa_elem.right_edges[node_at_0]) {\n                auto node_at_1 = std::get<0>(node_at_1_nuc_weight);\n                auto nuc = std::get<1>(node_at_1_nuc_weight);\n                nodes_with_best_weight[aa][node_at_0] = max(nodes_with_best_weight[aa][node_at_0], nodes_with_best_weight[aa][node_at_1]);\n                edges_with_best_weight[aa][make_tuple(node_at_0,nuc,node_at_1)] = nodes_with_best_weight[aa][node_at_1];\n            }\n        }\n    }\n\n    aa_graphs_with_ln_weights = 
read_wheel_with_weights_log<IndexType>(wheel_path,  nodes_with_best_weight, edges_with_best_weight, codon, lambda_);\n\n    std::unordered_map<std::string, \n                       std::unordered_map<std::tuple<NodeType, NodeType>, \n                       std::tuple<double, NucType, NucType>,\n                       std::hash<std::tuple<NodeType, NodeType>>>>\n                       best_path_in_one_codon_unit;\n\n\n    for (auto& aa_graph : aa_graphs_with_ln_weights) {\n        auto& aa = aa_graph.first;\n        auto& graph = aa_graph.second;\n        for (auto& node_0 : graph.nodes[0]) {\n            for (auto& node_1_nuc_log_w : graph.right_edges[node_0]) {\n                auto node_1 = std::get<0>(node_1_nuc_log_w);\n                auto nuc = std::get<1>(node_1_nuc_log_w);\n                auto log_weight = std::get<2>(node_1_nuc_log_w);\n\n                if (!best_path_in_one_codon_unit[aa].count(make_tuple(node_0,node_1)))\n                    best_path_in_one_codon_unit[aa][make_tuple(node_0,node_1)] = make_tuple(util::value_min<double>(),k_void_nuc,k_void_nuc);\n\n                double current_log_weight = std::get<0>(best_path_in_one_codon_unit[aa][make_tuple(node_0,node_1)]);\n                if (current_log_weight < log_weight) {\n                    best_path_in_one_codon_unit[aa][make_tuple(node_0,node_1)] = make_tuple(log_weight,nuc,k_void_nuc);\n                }\n            }\n        }\n\n        for (auto& node_1 : graph.nodes[1]) {\n            for (auto& node_2_nuc_log_w : graph.right_edges[node_1]) {\n                auto node_2 = std::get<0>(node_2_nuc_log_w);\n                auto nuc = std::get<1>(node_2_nuc_log_w);\n                auto log_weight = std::get<2>(node_2_nuc_log_w);\n\n                if (!best_path_in_one_codon_unit[aa].count(make_tuple(node_1,node_2)))\n                    best_path_in_one_codon_unit[aa][make_tuple(node_1,node_2)] = make_tuple(util::value_min<double>(),k_void_nuc,k_void_nuc);\n\n                double 
current_log_weight = std::get<0>(best_path_in_one_codon_unit[aa][make_tuple(node_1,node_2)]);\n                if (current_log_weight < log_weight) {\n                    best_path_in_one_codon_unit[aa][make_tuple(node_1,node_2)] = make_tuple(log_weight,nuc,k_void_nuc);\n                }\n\n                auto temp = best_path_in_one_codon_unit[aa][make_tuple(node_1,node_2)];\n            }\n        }\n\n        for (auto& node_2 : graph.nodes[2]) {\n            for (auto& node_3_nuc_log_w : graph.right_edges[node_2]) {\n                auto node_3 = std::get<0>(node_3_nuc_log_w);\n                auto nuc = std::get<1>(node_3_nuc_log_w);\n                auto log_weight = std::get<2>(node_3_nuc_log_w);\n\n                if (!best_path_in_one_codon_unit[aa].count(make_tuple(node_2,node_3)))\n                    best_path_in_one_codon_unit[aa][make_tuple(node_2,node_3)] = make_tuple(util::value_min<double>(),k_void_nuc,k_void_nuc);\n\n                double current_log_weight = std::get<0>(best_path_in_one_codon_unit[aa][make_tuple(node_2,node_3)]);\n                if (current_log_weight < log_weight) {\n                    best_path_in_one_codon_unit[aa][make_tuple(node_2,node_3)] = make_tuple(log_weight,nuc,k_void_nuc);\n                }\n            }\n        }\n\n        for (auto& node_0 : graph.nodes[0]) {\n            for (auto& node_1_nuc_0_log_weight_0 : graph.right_edges[node_0]) {\n                auto& node_1 = std::get<0>(node_1_nuc_0_log_weight_0);\n                auto& nuc_0 = std::get<1>(node_1_nuc_0_log_weight_0);\n                auto log_weight_0 = std::get<2>(node_1_nuc_0_log_weight_0);\n                for (auto& node_2_nuc_1_log_weight_1 : graph.right_edges[node_1]) {\n                    auto& node_2 = std::get<0>(node_2_nuc_1_log_weight_1);\n                    auto& nuc_1 = std::get<1>(node_2_nuc_1_log_weight_1);\n                    auto log_weight_1 = std::get<2>(node_2_nuc_1_log_weight_1);\n\n                    if 
(!best_path_in_one_codon_unit[aa].count(make_tuple(node_0,node_2)))\n                        best_path_in_one_codon_unit[aa][make_tuple(node_0,node_2)] = make_tuple(util::value_min<double>(),k_void_nuc,k_void_nuc);\n\n                    if (std::get<0>(best_path_in_one_codon_unit[aa][make_tuple(node_0,node_2)]) < log_weight_0 + log_weight_1)\n                        best_path_in_one_codon_unit[aa][make_tuple(node_0,node_2)] = make_tuple(log_weight_0 + log_weight_1, nuc_0, nuc_1);\n                }\n            }\n        }\n\n        for (auto& node_1 : graph.nodes[1]) {\n            for (auto& node_2_nuc_1_log_weight_1 : graph.right_edges[node_1]) {\n                auto& node_2 = std::get<0>(node_2_nuc_1_log_weight_1);\n                auto& nuc_1 = std::get<1>(node_2_nuc_1_log_weight_1);\n                auto log_weight_1 = std::get<2>(node_2_nuc_1_log_weight_1);\n                for (auto& node_3_nuc_2_log_weight_2 : graph.right_edges[node_2]) {\n                    auto& node_3 = std::get<0>(node_3_nuc_2_log_weight_2);\n                    auto& nuc_2 = std::get<1>(node_3_nuc_2_log_weight_2);\n                    auto log_weight_2 = std::get<2>(node_3_nuc_2_log_weight_2);\n\n                    if (!best_path_in_one_codon_unit[aa].count(make_tuple(node_1,node_3)))\n                        best_path_in_one_codon_unit[aa][make_tuple(node_1,node_3)] = make_tuple(util::value_min<double>(),k_void_nuc,k_void_nuc);\n\n                    if (std::get<0>(best_path_in_one_codon_unit[aa][make_tuple(node_1,node_3)]) < log_weight_1 + log_weight_2)\n                        best_path_in_one_codon_unit[aa][make_tuple(node_1,node_3)] = make_tuple(log_weight_1 + log_weight_2, nuc_1, nuc_2);\n                }\n            }\n        }\n    }\n\n    std::unordered_map<std::string, double> max_path;\n    std::unordered_map<std::string, std::string> aa_best_path_in_a_whole_codon;\n\n    for (auto& aa_path_weight : codon.aa_table_) {\n        auto& aa = aa_path_weight.first; // 
char\n        for (auto& path_weight : aa_path_weight.second) {\n            if (max_path[aa] < path_weight.second) {\n                max_path[aa] = path_weight.second;\n                aa_best_path_in_a_whole_codon[aa] = path_weight.first;\n            }\n        }\n    }\n\n    aa_graphs_with_ln_weights_ret = aa_graphs_with_ln_weights;\n    best_path_in_one_codon_unit_ret = best_path_in_one_codon_unit;\n    aa_best_path_in_a_whole_codon_ret = aa_best_path_in_a_whole_codon;\n}\n\n\n\ntemplate <typename IndexType,\n          typename NodeType = pair<IndexType, NumType>,\n          typename LatticeType = Lattice<IndexType>,\n          typename DFAType = DFA<IndexType>>\nDFAType get_dfa(unordered_map<string, LatticeType> aa_graphs, vector<string> aa_seq) {\n    DFAType dfa = DFAType();\n    NodeType newnode = make_pair(3 * static_cast<IndexType>(aa_seq.size()), 0);\n    dfa.add_node(newnode);\n    IndexType i = 0;\n    IndexType i3;\n    string aa;\n    LatticeType graph;\n    for(auto& item : aa_seq) {\n        i3 = i * 3;\n        aa = aa_seq[i];\n        graph = aa_graphs[aa];\n        for (IndexType pos = 0; pos <= 2; pos++) {\n            for(auto& node : graph.nodes[pos]) {\n                IndexType num = get<1>(node);\n                newnode = make_pair(i3 + pos, num);\n                dfa.add_node(newnode);\n                for (auto& edge : graph.right_edges[node]) {\n                    NodeType n2 = get<0>(edge);\n                    IndexType nuc = get<1>(edge);\n                    num = get<1>(n2);\n                    NodeType newn2 = make_pair(i3 + pos + 1, num);\n                    dfa.add_edge(newnode, newn2, nuc, get<2>(edge));\n                }\n            }\n        }\n        i++;\n    }\n#ifdef is_verbose\n    printf(\"-----------------DFA------------------------\\n\");\n    for(IndexType pos = 0; pos < 3 * static_cast<IndexType>(aa_seq.size()) + 1; pos++){\n        for(auto& node : dfa.nodes[pos]) {\n            IndexType p = 
get<0>(node);\n            IndexType num = get<1>(node);\n            printf(\"node, (%d, %d)\\n\", p, num);\n            for(auto &n2 : dfa.auxiliary_right_edges[node]){\n                IndexType p2 = get<0>(n2.first);\n                IndexType num2 = get<1>(n2.first);\n                for(auto nuc : n2.second){\n                    printf(\"              (%d, %d) -(%d,%lf)-> (%d, %d)\\n\", p, num, get<0>(nuc),get<1>(nuc), p2, num2);\n                }\n            }\n            for(auto &n1 : dfa.auxiliary_left_edges[node]){\n                IndexType p1 = get<0>(n1.first); IndexType num1 = get<1>(n1.first);\n                for(auto nuc : n1.second){\n                    printf(\"  (%d, %d) <-(%d,%lf)- (%d, %d)\\n\", p1, num1, get<0>(nuc),get<1>(nuc), p, num);\n                }\n            }\n        }\n    }\n#endif\n    return dfa;\n}\n\n}\n"
  },
  {
    "path": "src/Utils/reader.h",
    "content": "#ifndef fasta_h\n#define fasta_h\n\n#include <exception>\n#include <iostream>\n#include <map>\n#include <fstream>\n#include <string>\n\n#include \"base.h\"\n\nnamespace LinearDesign {\n\n// Base reader: performs no conversion and always reports failure.\nstruct Reader {\n\tstatic bool cvt_to_seq(const string& from, string& to) {\n\t\treturn false;\n\t}\n};\n\nstruct Fasta : public Reader {\n\tstatic map<char, string> map_fasta;\n\n\t// Converts one-letter amino acid codes into space-separated three-letter names.\n\tstatic bool cvt_to_seq(const string& fasta, string& nucs) {\n\t    nucs.reserve(4 * fasta.length());\n\t    for(auto aa : fasta) {\n\t        if (map_fasta.count(aa)) {\n\t            nucs.append(map_fasta[aa] + \" \");\n\t        } else {\n\t            cerr << \"invalid protein sequence!\\n\" << endl;\n\t            return false;\n\t        }\n\t    }\n\t    // Drop the trailing separator; nothing was appended for an empty input.\n\t    if (!fasta.empty()) nucs.pop_back();\n\t    return true;\n\t}\n};\n\nmap<char, string> Fasta::map_fasta = {\n\t{'F',\"Phe\"}, \n\t{'L',\"Leu\"}, \n\t{'S',\"Ser\"}, \n\t{'Y',\"Tyr\"}, \n\t{'*',\"STOP\"}, \n\t{'C',\"Cys\"}, \n\t{'W',\"Trp\"}, \n\t{'P',\"Pro\"}, \n\t{'H',\"His\"}, \n\t{'Q',\"Gln\"}, \n\t{'R',\"Arg\"}, \n\t{'I',\"Ile\"}, \n\t{'M',\"Met\"}, \n\t{'T',\"Thr\"}, \n\t{'N',\"Asn\"}, \n\t{'K',\"Lys\"}, \n\t{'V',\"Val\"}, \n\t{'D',\"Asp\"}, \n\t{'E',\"Glu\"}, \n\t{'G',\"Gly\"}, \n\t{'A',\"Ala\"}\n};\n\ntemplate <class T>\nstruct ReaderTraits {\n\tstatic bool cvt_to_seq(const string& from, string& to) {\n\t\treturn T::cvt_to_seq(from, to);\n\t}\n};\n\n}\n\n#endif /* fasta_h */"
  },
  {
    "path": "src/Utils/utility_v.h",
    "content": "\n#ifndef FASTCKY_UTILITY_V_H\n#define FASTCKY_UTILITY_V_H\n\n#include <string>\n#include <vector>\n#include <cmath>\n#include <assert.h>\n\n#define NTP(x,y) (x==1? (y==4?5:0) : (x==2? (y==3?1:0) : (x==3 ? (y==2?2:(y==4?3:0)) : (x==4 ? (y==3?4:(y==1?6:0)) : 0))))\n#define PTLN(x) (x==1? 2:((x==2 || x==3)? 3:(x==5)? 1:4))\n#define PTRN(x) (x==2? 2:((x==1 || x==4)? 3:(x==6)? 1:4))\n\n#define NOTON 5 // NUM_OF_TYPE_OF_NUCS\n#define NOTOND 25\n#define NOTONT 125\n\n#define EXPLICIT_MAX_LEN 4\n#define SINGLE_MIN_LEN 0\n#define SINGLE_MAX_LEN 20  // NOTE: *must* <= sizeof(char), otherwise modify State::TraceInfo accordingly\n\n#define HAIRPIN_MAX_LEN 30\n#define BULGE_MAX_LEN SINGLE_MAX_LEN\n#define INTERNAL_MAX_LEN SINGLE_MAX_LEN\n#define SYMMETRIC_MAX_LEN 15\n#define ASYMMETRY_MAX_LEN 28\n#define SPECIAL_HAIRPIN_SCORE_BASELINE -10000\n\nextern bool _allowed_pairs[NOTON][NOTON];\n\n#define MAXLOOP 30\n\n#define GET_ACGU(x) ((x==1? 'A' : (x==2? 'C' : (x==3? 'G' : (x==4?'U': 'X')))))\n\n#define GET_ACGU_NUC(x) ((x=='A'? 1 : (x=='C'? 2 : (x=='G'? 
3 : (x=='U'?4: 0)))))\n\n#define HAIRPINTYPE(x) ((x==5?0 : (x==6?1 : (x==8?2 : 3))))\n\nextern int func1(std::string& a, int8_t b);\n\nextern void func2(std::string& a, int b, std::vector<int>& c, std::vector<int>& d, std::vector<int>& e);\n\nextern int func3(int a, int b, int c, int d, int e);\n\nextern int func4(int a, int b, int c, int d, int e, int f, int g);\n\nextern int func5(int a, int b, int c);\n\nextern int func6(int a, int b, int c, int d, int e, int f, int g, int h);\n\nextern int func7(int a, int b, int c, int d, int e, int h, int i);\n\nextern int func8(int a, int b);\n\nextern void func9(int a, int b);\n\nextern int func10(int a, int b, int c);\n\nextern int func11(int a, int b, int c, int d, int e, int f, int g, int h);\n\nextern int func12(int a, int b, int c, int d, int e, int f, int g = -1);\n\nextern int func13(int a, int b);\n\nextern int func14(int a, int b, int c, int d, int e, int f, int g, int h, int i, int j, int k, int l);\n\nextern int func15(int a, int b, int c, int d, int e, int f, int g);\n\n#endif \n\n"
  },
  {
    "path": "src/backtrace_iter.cc",
    "content": "\n#include \"beam_cky_parser.h\"\n\nusing namespace std;\n\n#define tetra_hex_tri -1\n\nnamespace LinearDesign {\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nBacktraceResult BeamCKYParser<ScoreType, IndexType, NodeType>::backtrace(DFA_t& dfa, const State_t& state, NodeType end_node){\n\n    char sequence[seq_length+1];\n    memset(sequence, '.', seq_length);\n    sequence[seq_length] = 0;\n\n    char structure[seq_length+1];\n    memset(structure, '.', seq_length);\n    structure[seq_length] = 0;\n\n    bool no_backpointer;\n\n    stack<tuple<NodeType, NodeType, State_t, Beam_type, PairType>> stk;\n    NodeType start_node = make_pair(0, 0);\n    stk.push(make_tuple(start_node, end_node, state, Beam_type::BEAM_C, -1));\n\n    double epsilon = 1e-8;\n\n    while(!stk.empty()) {\n        tuple<NodeType, NodeType, State_t, Beam_type, PairType> top = stk.top();\n        NodeType i_node = get<0>(top), j_node = get<1>(top);\n        State_t& state = get<2>(top);\n        Beam_type beam_type = get<3>(top);\n        PairType curr_pair_nuc = get<4>(top);\n        stk.pop();\n\n        IndexType i, j, p, q, hairpin_length;\n        j = j_node.first;\n        NucType nuci, nucj, nuci1, nucj_1;\n        no_backpointer = true;\n        \n        int left_start, left_end, right_start, right_end;\n\n        switch (beam_type) {\n            case Beam_type::BEAM_C:\n                if (j <= 0) continue;\n                for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                    auto j_1_node = std::get<0>(j_1_node_nucj_1);\n                    auto& c_state = bestC[j_1_node];\n                    auto weight_nucj_1 = std::get<2>(j_1_node_nucj_1);\n                    auto cai_score = c_state.cai_score + weight_nucj_1;\n\n                    if (state.score == c_state.score && abs(state.cai_score - cai_score) < epsilon){\n                        NucType nucj_1 = std::get<1>(j_1_node_nucj_1);\n                        
stk.push(make_tuple(i_node, j_1_node, c_state, Beam_type::BEAM_C, curr_pair_nuc));\n                        sequence[j-1] = GET_ACGU(nucj_1);\n                        no_backpointer = false;\n                        break;\n                    }\n                }\n\n                // C = C + P\n                if(no_backpointer) {\n                    for (size_t c_node_nucpair_ = 0; c_node_nucpair_ < 16 * seq_length; ++c_node_nucpair_){\n                        auto& p_state = bestP[j_node][c_node_nucpair_];\n\n                        if (p_state.score == util::value_min<ScoreType>()) continue;\n                        auto c_node_nucpair = reverse_index(c_node_nucpair_);\n\n                        auto c = c_node_nucpair.node_first;\n                        auto c_num = c_node_nucpair.node_second;\n                        auto pair_nuc = c_node_nucpair.nucpair;\n                        auto c_node = make_pair(c, c_num);\n\n                        auto nucc = PTLN(pair_nuc);\n                        auto nucj_1 = PTRN(pair_nuc);                         \n\n\n                        auto newscore = - func3(c, j-1, nucc, nucj_1, seq_length) + p_state.score;\n\n                        if (c > 0){\n                            auto& c_state = bestC[c_node];\n                            auto cai_score = c_state.cai_score + p_state.cai_score;\n\n                            if (state.score == c_state.score + newscore && abs(state.cai_score - cai_score) < epsilon){\n                                stk.push(make_tuple(i_node, c_node, c_state, Beam_type::BEAM_C, curr_pair_nuc));\n                                stk.push(make_tuple(c_node, j_node, p_state, Beam_type::BEAM_P, pair_nuc));\n                                no_backpointer = false;\n                                break;\n                            }\n                        } else{\n                            if (state.score == newscore && abs(state.cai_score - p_state.cai_score) < epsilon){\n                  
              stk.push(make_tuple(c_node, j_node, p_state, Beam_type::BEAM_P, pair_nuc));\n                                no_backpointer = false;\n                                break;\n                            }\n                        }\n                    }\n                    if (!no_backpointer) break;\n                }\n                assert(no_backpointer == false); // something wrong if no path matches\n                break;\n\n            case Beam_type::BEAM_P:\n                i = i_node.first;\n                j = j_node.first;\n                nuci = PTLN(curr_pair_nuc);\n                nucj_1 = PTRN(curr_pair_nuc);\n\n\n                hairpin_length = j - i;\n\n                for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                    NucType new_nucj_1 = std::get<1>(j_1_node_nucj_1);\n                    if (new_nucj_1 != nucj_1) continue;\n\n                    auto j_1_node = std::get<0>(j_1_node_nucj_1);\n                    auto weight_nucj_1 = std::get<2>(j_1_node_nucj_1);\n\n\n\n#ifdef SPECIAL_HP\n                    if (hairpin_length == 5 or hairpin_length == 6 or hairpin_length == 8){\n                        for(auto & seq_score_weight : hairpin_seq_score_cai[i_node][j_1_node][NTP(nuci, nucj_1)]){\n                            auto seq =  get<0>(seq_score_weight);\n                            auto pre_cal_score =  get<1>(seq_score_weight);\n                            auto pre_cal_cai_score = get<2>(seq_score_weight);\n\n\n                            if (state.score == pre_cal_score && abs(state.cai_score - pre_cal_cai_score) < epsilon){\n                                for(int c=0; c<seq.size(); c++)\n                                    sequence[i+c] = seq[c];\n\n                                structure[i] = '(';\n                                structure[j-1] = ')';\n\n                                no_backpointer = false;\n                                break;\n                            }\n                
        }if (!no_backpointer) break;\n                    }if (!no_backpointer) break;\n#endif\n\n\n\n                    for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n                        NucType new_nuci = std::get<1>(i1_node_nuci);\n                        if (new_nuci != nuci) continue;\n                        auto i1_node = std::get<0>(i1_node_nuci);\n                        auto weight_nuci = std::get<2>(i1_node_nuci);\n\n                        // helix \n                        for (auto& j_2_node_nucj_2 : dfa.left_edges[j_1_node]){\n                            NucType nucj_2 = std::get<1>(j_2_node_nucj_2);\n\n                            for (auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                NucType nuci1 = std::get<1>(i2_node_nuci1);\n                                auto pair_nuc = NTP(nuci1, nucj_2);\n\n                                NodeNucpair temp = {i1_node.first, i1_node.second, static_cast<NucPairType>(pair_nuc)};\n\n                                auto& p_state = bestP[j_1_node][temp];\n                                auto newscore = - func14(i, j-1, i+1, j-2,\n                                                                    nuci, nuci1, nucj_2, nucj_1,\n                                                                    nuci, nuci1, nucj_2, nucj_1)\n                                                        + p_state.score;\n\n                                auto cai_score = p_state.cai_score + (weight_nuci + weight_nucj_1);\n\n                                if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                    stk.push(make_tuple(i1_node, j_1_node, p_state, Beam_type::BEAM_P, pair_nuc));\n                                    sequence[i] = GET_ACGU(nuci);\n                                    sequence[j-1] = GET_ACGU(nucj_1);\n                                    structure[i] = '(';\n                                    structure[j-1] = ')';\n\n     
                               no_backpointer = false;\n                                    break;\n                                } \n                            }if (!no_backpointer) break;\n                        }if (!no_backpointer) break;\n                    }\n                    if (!no_backpointer) break;\n\n                    // hairpin\n                    NodeNucpair temp = {i_node.first, LinearDesign::NumType(i_node.second), curr_pair_nuc};\n\n                    if (state.score == bestH[j_1_node][temp].score){ //no need to check CAI score here\n\n                        for (auto& j_2_node_nucj_2 : dfa.left_edges[j_1_node]){\n                            NucType nucj_2 = std::get<1>(j_2_node_nucj_2);\n                            auto j_2_node = std::get<0>(j_2_node_nucj_2);\n                            auto j_2 = j_2_node.first;\n\n                            auto weight_nucj_2 = std::get<2>(j_2_node_nucj_2);\n\n                            for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n                                NucType new_nuci = std::get<1>(i1_node_nuci);\n                                if (new_nuci != nuci) continue;\n\n                                auto i1_node = std::get<0>(i1_node_nuci);\n                                auto weight_nuci = get<2>(i1_node_nuci);\n                                \n\n                                for(auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                    NucType nuci1 = std::get<1>(i2_node_nuci1);\n                                    auto i2_node = std::get<0>(i2_node_nuci1);\n                                    auto i2 = i2_node.first;\n                                    auto weight_nuci1 = std::get<2>(i2_node_nuci1);\n\n                                    if (j - 1 - i == 4 and (j_2_node.second != i2_node.second and dfa.nodes[i+2].size() == dfa.nodes[j-2].size())) continue;\n\n                                    auto newscore = - func12(i, j-1, nuci, nuci1, 
nucj_2, nucj_1, tetra_hex_tri);\n                                    auto cai_score = weight_nuci + weight_nuci1 + get_broken_codon_score(i2_node, j_2_node) + weight_nucj_2 + weight_nucj_1;\n\n                                    if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                        sequence[i] = GET_ACGU(nuci);\n                                        sequence[i+1] = GET_ACGU(nuci1);\n                                        sequence[j-2] = GET_ACGU(nucj_2);\n                                        sequence[j-1] = GET_ACGU(nucj_1);\n                                        structure[i] = '(';\n                                        structure[j-1] = ')';\n\n                                        auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, i2_node, j_2_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                        int count = i2;\n                                        for (auto & nuc : temp_string ){\n                                            sequence[count] = nuc;\n                                            count++;\n                                        }\n                                        assert(count == j_2);\n\n                                        no_backpointer = false;\n                                        break;\n                                    }\n                                }if (!no_backpointer) break;\n                            }if (!no_backpointer) break;\n                        }if (!no_backpointer) break;\n                    }\n                }\n\n                // single branch\n                if (no_backpointer) {\n                    vector<pair<IndexType, NucType>> right_seq;\n                    vector<tuple<NodeType, NodeType, NucType, NucType, NucType, vector<pair<IndexType, NucType>>, int, int, NodeType, NodeType, double, double, double, double, bool, NodeType>> q_node_nucs_list;\n      
              for (IndexType q = j-1; q >= std::max(j - SINGLE_MAX_LEN - 1, i + 5); --q){    \n                        int right_start = -1;\n                        int right_end = -1;\n                        q_node_nucs_list.clear();\n\n                        if (q == j-1){\n                            for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                                if (get<1>(j_1_node_nucj_1) != nucj_1) continue;\n                                auto q_node = get<0>(j_1_node_nucj_1);\n                                auto weight_nucj_1 = get<2>(j_1_node_nucj_1);\n                                for (auto& q_1_node_nucq_1 : dfa.left_edges[q_node]){\n                                    NodeType q_1_node = get<0>(q_1_node_nucq_1);\n                                    auto nucq_1 = get<1>(q_1_node_nucq_1);\n                                    double weight_nucq_1 = get<2>(q_1_node_nucq_1);\n                                    right_seq.push_back(make_pair(j-1, nucj_1));\n                                    q_node_nucs_list.push_back(make_tuple(q_1_node, q_node, nucq_1, nucj_1, nucq_1, right_seq, right_start, right_end, make_pair(-1,0), make_pair(-1,0), weight_nucq_1, 0., 0., weight_nucj_1, true, make_pair(-1,0)));\n                                    right_seq.clear();\n                                }\n                            }\n                        }else if(q == j-2){\n                            for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                                if (get<1>(j_1_node_nucj_1) != nucj_1) continue;\n                                auto j_1_node = get<0>(j_1_node_nucj_1);\n                                auto weight_nucj_1 = get<2>(j_1_node_nucj_1);\n                                for (auto& q_node_nucq : dfa.left_edges[j_1_node]){\n                                    auto q_node = get<0>(q_node_nucq);\n                                    auto nucq = get<1>(q_node_nucq);\n                               
     auto weight_nucq = get<2>(q_node_nucq);\n                                    for (auto& q_1_node_nucq_1 : dfa.left_edges[q_node]){\n                                        NodeType q_1_node = get<0>(q_1_node_nucq_1);\n                                        auto nucq_1 = get<1>(q_1_node_nucq_1);\n                                        double weight_nucq_1 = get<2>(q_1_node_nucq_1);\n                                        right_seq.push_back(make_pair(q, nucq));\n                                        right_seq.push_back(make_pair(j-1, nucj_1));\n                                        q_node_nucs_list.push_back(make_tuple(q_1_node, q_node, nucq_1, nucq, nucq, right_seq, right_start, right_end, make_pair(-1,0), make_pair(-1,0), weight_nucq_1, weight_nucq, 0., weight_nucj_1, false, j_1_node));\n                                        right_seq.clear();\n                                    }\n                                }\n                            }\n                        }else if(q == j-3){\n                            for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                                if (get<1>(j_1_node_nucj_1) != nucj_1) continue;\n                                auto j_1_node = get<0>(j_1_node_nucj_1);\n                                auto weight_nucj_1 = get<2>(j_1_node_nucj_1);\n                                for(auto& j_2_node_nucj_2 : dfa.left_edges[j_1_node]){\n                                    auto j_2_node = get<0>(j_2_node_nucj_2);\n                                    auto nucj_2 = get<1>(j_2_node_nucj_2);\n                                    auto weight_nucj_2 = get<2>(j_2_node_nucj_2);\n                                    for (auto& q_node_nucq : dfa.left_edges[j_2_node]){\n                                        auto q_node = get<0>(q_node_nucq);\n                                        auto nucq = get<1>(q_node_nucq);\n                                        auto weight_nucq = get<2>(q_node_nucq);\n               
                         for (auto& q_1_node_nucq_1 : dfa.left_edges[q_node]){\n                                            NodeType q_1_node = get<0>(q_1_node_nucq_1);\n                                            auto nucq_1 = get<1>(q_1_node_nucq_1);\n                                            double weight_nucq_1 = get<2>(q_1_node_nucq_1);\n                                            right_seq.push_back(make_pair(q, nucq));\n                                            right_seq.push_back(make_pair(j-2, nucj_2));\n                                            right_seq.push_back(make_pair(j-1, nucj_1));\n                                            q_node_nucs_list.push_back(make_tuple(q_1_node, q_node, nucq_1, nucq, nucj_2, right_seq, right_start, right_end, make_pair(-1,0), make_pair(-1,0), weight_nucq_1, weight_nucq, weight_nucj_2, weight_nucj_1, false, j_1_node));\n                                            right_seq.clear();\n                                        }\n                                    }\n                                }\n                            }\n                        }\n                        else if(q == j-4){\n                            for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                                if (get<1>(j_1_node_nucj_1) != nucj_1) continue;\n                                auto j_1_node = get<0>(j_1_node_nucj_1);\n                                auto weight_nucj_1 = get<2>(j_1_node_nucj_1);\n                                for(auto& j_2_node_nucj_2 : dfa.left_edges[j_1_node]){\n                                    auto j_2_node = get<0>(j_2_node_nucj_2);\n                                    auto nucj_2 = get<1>(j_2_node_nucj_2);\n                                    auto weight_nucj_2 = get<2>(j_2_node_nucj_2);\n                                    for(auto& j_3_node_nucj_3 : dfa.left_edges[j_2_node]){\n                                        auto j_3_node = get<0>(j_3_node_nucj_3);\n                  
                      for (auto& q_node_nucq : dfa.left_edges[j_3_node]){\n                                            auto q_node = get<0>(q_node_nucq);\n                                            auto nucq = get<1>(q_node_nucq);\n                                            auto weight_nucq = get<2>(q_node_nucq);\n                                            for (auto& q_1_node_nucq_1 : dfa.left_edges[q_node]){\n                                                NodeType q_1_node = get<0>(q_1_node_nucq_1);\n                                                auto nucq_1 = get<1>(q_1_node_nucq_1);\n                                                double weight_nucq_1 = get<2>(q_1_node_nucq_1);\n                                                right_seq.push_back(make_pair(q, nucq));\n                                                right_seq.push_back(make_pair(j-2, nucj_2));\n                                                right_seq.push_back(make_pair(j-1, nucj_1));\n                                                right_start = q+1;\n                                                right_end = j-2;\n                                                q_node_nucs_list.push_back(make_tuple(q_1_node, q_node, nucq_1, nucq, nucj_2, right_seq, right_start, right_end, j_3_node, j_2_node, weight_nucq_1, weight_nucq, weight_nucj_2, weight_nucj_1, false, j_1_node));\n                                                right_seq.clear();\n                                            }\n                                        }\n                                    }\n                                }\n                            }\n                        }\n                        else{\n                            for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                                if (get<1>(j_1_node_nucj_1) != nucj_1) continue;\n                                auto j_1_node = get<0>(j_1_node_nucj_1);\n                                auto weight_nucj_1 = 
get<2>(j_1_node_nucj_1);\n                                for(auto& j_2_node_nucj_2 : dfa.left_edges[j_1_node]){\n                                    auto j_2_node = get<0>(j_2_node_nucj_2);\n                                    auto nucj_2 = get<1>(j_2_node_nucj_2);\n                                    auto weight_nucj_2 = get<2>(j_2_node_nucj_2);\n                                    for (auto& q_node : dfa.nodes[q]){\n                                        for (auto& q1_node_nucq : dfa.right_edges[q_node]){\n                                            auto q1_node = get<0>(q1_node_nucq);\n                                            auto nucq = get<1>(q1_node_nucq);\n                                            auto weight_nucq = get<2>(q1_node_nucq);\n                                            for (auto& q_1_node_nucq_1 : dfa.left_edges[q_node]){\n                                                NodeType q_1_node = get<0>(q_1_node_nucq_1);\n                                                auto nucq_1 = get<1>(q_1_node_nucq_1);\n                                                double weight_nucq_1 = get<2>(q_1_node_nucq_1);\n                                                right_seq.push_back(make_pair(q, nucq));\n                                                right_seq.push_back(make_pair(j-2, nucj_2));\n                                                right_seq.push_back(make_pair(j-1, nucj_1));\n                                                right_start = q+1;\n                                                right_end = j-2;\n                                                q_node_nucs_list.push_back(make_tuple(q_1_node, q_node, nucq_1, nucq, nucj_2, right_seq, right_start, right_end, q1_node, j_2_node, weight_nucq_1, weight_nucq, weight_nucj_2, weight_nucj_1,false, j_1_node));\n                                                right_seq.clear();\n                                            }\n                                        }\n                                
    }\n                                }\n                            }\n                        }\n\n                        for (auto& q_node_nucs : q_node_nucs_list){\n                            auto q_1_node = get<0>(q_node_nucs);\n                            auto q_node = get<1>(q_node_nucs);\n                            auto nucq_1 = get<2>(q_node_nucs);\n                            auto nucq = get<3>(q_node_nucs);\n                            auto nucj_2 = get<4>(q_node_nucs);\n                            auto right_seq = get<5>(q_node_nucs);\n                            auto right_start = get<6>(q_node_nucs);\n                            auto right_end = get<7>(q_node_nucs);\n                            auto right_start_node = get<8>(q_node_nucs);\n                            auto right_end_node = get<9>(q_node_nucs);\n                            auto weight_nucq_1 = get<10>(q_node_nucs);\n                            auto weight_nucq = get<11>(q_node_nucs);\n                            auto weight_nucj_2 = get<12>(q_node_nucs);\n                            auto weight_nucj_1 = get<13>(q_node_nucs);\n                            bool q_equ_j_1 = get<14>(q_node_nucs);\n                            auto j_1_node = get<15>(q_node_nucs);\n\n                            double weight_right = 0.0;\n                            double weight_left = 0.0;\n\n                            if(q_equ_j_1){\n                                for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n                                    NucType new_nuci = get<1>(i1_node_nuci);\n                                    if (new_nuci != nuci) continue;\n                                    auto i1_node = get<0>(i1_node_nuci);\n                                    auto weight_nuci = get<2>(i1_node_nuci);\n\n                                    auto p_list = next_list[nucq_1][i1_node];\n                                    for (auto &p_node_nucp : p_list){\n                                        
auto p_node = get<0>(p_node_nucp);\n                                        auto nucp = get<1>(p_node_nucp);\n                                        auto p = p_node.first; \n                                        PairType pair_nuc = NTP(nucp, nucq_1);\n\n                                        if (p == i + 1) continue; // stack\n                                        if (p - i + j - q - 2 > SINGLE_MAX_LEN) continue;\n                                        \n                                        NodeNucpair temp = {p_node.first, p_node.second, static_cast<NucPairType>(pair_nuc)};\n\n                                        auto& p_state = bestP[q_node][temp];\n\n                                        auto newscore = - func14(i, j-1, p, q-1, nuci, nucp, nucq_1, nucj_1, nuci, nucp, nucq_1, nucj_1) + p_state.score;\n                                        auto weight_left = weight_nuci + get_broken_codon_score(i1_node, p_node);\n                                        auto cai_score = p_state.cai_score + (weight_left + weight_nucj_1);\n\n                                        if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                            stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                                            sequence[i] = GET_ACGU(nuci);\n\n                                            auto temp_i1_to_p_nucs = get_nuc_from_dfa_cai<IndexType>(dfa, i1_node, p_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                            assert(temp_i1_to_p_nucs.size() == p - (i+1));\n                                            auto count = i+1;\n                                            for (auto& nuc : temp_i1_to_p_nucs){\n                                                sequence[count] = nuc;\n                                                count++;\n                                            }\n                   
                         assert(count == p);\n\n                                            assert(right_seq.size() == 1);\n                                            sequence[j-1] = GET_ACGU(right_seq[0].second);\n\n                                            structure[i] = '(';\n                                            structure[j-1] = ')';\n\n                                            no_backpointer = false;\n                                            break;\n                                        }if (!no_backpointer) break;\n                                    }if (!no_backpointer) break;\n                                }\n                            }else{\n                                for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n                                    NucType new_nuci = get<1>(i1_node_nuci);\n                                    if (new_nuci != nuci) continue;\n                                    auto i1_node = get<0>(i1_node_nuci);\n                                    auto weight_nuci = get<2>(i1_node_nuci);\n\n                                    auto p_list = next_list[nucq_1][i1_node];\n                                        \n                                    for (auto &p_node_nucp : p_list){\n                                        auto p_node = get<0>(p_node_nucp);\n                                        auto nucp = get<1>(p_node_nucp);\n                                        auto weight_nucp = get<2>(p_node_nucp);\n                                        auto p = p_node.first; \n                                        PairType pair_nuc = NTP(nucp, nucq_1);\n\n                                        if (p - i + j - q - 2 > SINGLE_MAX_LEN) continue;\n                                        \n                                        NodeNucpair temp = {p_node.first, p_node.second, static_cast<NucPairType>(pair_nuc)};\n\n                                        auto& p_state = bestP[q_node][temp];\n                        
                auto newscore = 0;\n                                        if (p == i+1){\n                                            newscore = - func14(i, j-1, p, q-1, nuci, nucp, nucj_2, nucj_1, nuci, nucp, nucq_1, nucq) \n                                                            + p_state.score;\n\n                                            weight_right = get_broken_codon_score(q_node,j_1_node) + weight_nucj_1;       \n\n                                            auto cai_score = p_state.cai_score + (weight_nuci + weight_right);\n\n                                            if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                                stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                                                sequence[i] = GET_ACGU(nuci);\n                                                for(auto& idx_nucidx : right_seq){\n                                                    IndexType idx = idx_nucidx.first;\n                                                    NucType nucidx = idx_nucidx.second;\n                                                    sequence[idx] = GET_ACGU(nucidx);\n                                                }\n\n                                                structure[i] = '(';\n                                                structure[j-1] = ')';\n\n                                                auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, q_node, j_1_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                int count = q;\n                                                for (auto & nuc : temp_string ){\n                                                    sequence[count] = nuc;\n                                                    count++;\n                                                }\n\n                                                
assert(count == j-1);\n                                                no_backpointer = false;\n                                                break;\n                                            }\n                                        }else if(p == i+2){\n                                            for (auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                                auto i2_node = get<0>(i2_node_nuci1);\n                                                if (p_node != i2_node) continue;\n\n                                                NucType nuci1 = get<1>(i2_node_nuci1);\n                                                auto weight_nuci1 = get<2>(i2_node_nuci1);\n                                                newscore = - func14(i, j-1, p, q-1, nuci, nuci1, nucj_2, nucj_1, nuci1, nucp, nucq_1, nucq) \n                                                                + p_state.score;\n     \n                                                weight_left = weight_nuci + weight_nuci1;\n                                                weight_right = weight_nucq + get_broken_codon_score(right_start_node, right_end_node) + weight_nucj_2 + weight_nucj_1;\n\n                                                auto cai_score = p_state.cai_score + (weight_left + weight_right);\n\n                                                if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                                    stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                                                    sequence[i] = GET_ACGU(nuci);\n                                                    sequence[i+1] = GET_ACGU(nuci1);\n                                                    for(auto& idx_nucidx : right_seq){\n                                                        IndexType idx = idx_nucidx.first;\n                                                        NucType nucidx = 
idx_nucidx.second;\n                                                        sequence[idx] = GET_ACGU(nucidx);\n                                                    }\n\n                                                    structure[i] = '(';\n                                                    structure[j-1] = ')';\n                                                    auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, right_start_node, right_end_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                    int count = right_start;\n                                                    for (auto & nuc : temp_string ){\n                                                        sequence[count] = nuc;\n                                                        count++;\n                                                    }\n                                                    assert(count == right_end);\n                                                    no_backpointer = false;\n                                                    break;\n                                                }        \n                                            }  \n                                        }else if(p == i+3){\n                                            for (auto& p_1_node_nucp_1 : dfa.left_edges[p_node]){\n                                                auto p_1_node = get<0>(p_1_node_nucp_1);\n                                                auto nucp_1 = get<1>(p_1_node_nucp_1);\n                                                auto weight_nucp_1 = get<2>(p_1_node_nucp_1);\n                                                for (auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                                    auto i2_node = get<0>(i2_node_nuci1);\n                                                    if (p_1_node != i2_node) continue;\n\n                                                    NucType 
nuci1 = get<1>(i2_node_nuci1);\n                                                    auto weight_nuci1 = get<2>(i2_node_nuci1);\n                                                    newscore = - func14(i, j-1, p, q-1, nuci, nuci1, nucj_2, nucj_1, nucp_1, nucp, nucq_1, nucq) \n                                                            + p_state.score;\n\n                                                    weight_left = weight_nuci + weight_nuci1 + weight_nucp_1;\n                                                    weight_right = weight_nucq + get_broken_codon_score(right_start_node,right_end_node) + weight_nucj_2 + weight_nucj_1;\n\n                                                    auto cai_score = p_state.cai_score + (weight_left + weight_right);\n\n                                                    if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                                        stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n                                                        \n                                                        sequence[i] = GET_ACGU(nuci);\n                                                        sequence[i+1] = GET_ACGU(nuci1);\n                                                        sequence[p-1] = GET_ACGU(nucp_1);\n\n                                                        for(auto& idx_nucidx : right_seq){\n                                                            IndexType idx = idx_nucidx.first;\n                                                            NucType nucidx = idx_nucidx.second;\n                                                            sequence[idx] = GET_ACGU(nucidx);\n                                                        }\n                                                        structure[i] = '(';\n                                                        structure[j-1] = ')';\n                                                       
 auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, right_start_node, right_end_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                        int count = right_start;\n                                                        for (auto & nuc : temp_string ){\n                                                            sequence[count] = nuc;\n                                                            count++;\n                                                        }\n                                                        assert(count == right_end);\n\n                                                        no_backpointer = false;\n                                                        break;\n                                                    }          \n                                                }if (!no_backpointer) break;\n                                            }\n                                        }else if(p == i+4){\n                                            for (auto& p_1_node_nucp_1 : dfa.left_edges[p_node]){\n                                                auto p_1_node = get<0>(p_1_node_nucp_1);\n                                                auto nucp_1 = get<1>(p_1_node_nucp_1);\n                                                auto weight_nucp_1 = get<2>(p_1_node_nucp_1);\n                                                for (auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                                    auto i2_node = get<0>(i2_node_nuci1);\n                                                    NucType nuci1 = get<1>(i2_node_nuci1);\n                                                    auto weight_nuci1 = get<2>(i2_node_nuci1);\n                                                    for (auto& i3_node_nuci2 : dfa.right_edges[i2_node]){\n                                                        auto i3_node = get<0>(i3_node_nuci2);\n              
                                          if (i3_node != p_1_node) continue;\n                                                        auto nuci2 = get<1>(i3_node_nuci2);\n                                                        auto weight_nuci2 = get<2>(i3_node_nuci2);\n                                                        \n                                                        newscore = - func14(i, j-1, p, q-1, nuci, nuci1, nucj_2, nucj_1, nucp_1, nucp, nucq_1, nucq) \n                                                                + p_state.score;\n\n                                                        weight_left = weight_nuci + weight_nuci1 + weight_nuci2 + weight_nucp_1;\n                                                        weight_right = weight_nucq + get_broken_codon_score(right_start_node,right_end_node) + weight_nucj_2 + weight_nucj_1;\n\n                                                        auto cai_score = p_state.cai_score + (weight_left + weight_right);\n\n                                                        if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                                            stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                                                            sequence[i] = GET_ACGU(nuci);\n                                                            sequence[i+1] = GET_ACGU(nuci1);\n                                                            sequence[i+2] = GET_ACGU(nuci2);\n                                                            sequence[p-1] = GET_ACGU(nucp_1);\n\n                                                            for(auto& idx_nucidx : right_seq){\n                                                                IndexType idx = idx_nucidx.first;\n                                                                NucType nucidx = idx_nucidx.second;\n                                                           
     sequence[idx] = GET_ACGU(nucidx);\n                                                            }\n                                                            structure[i] = '(';\n                                                            structure[j-1] = ')';\n                                                            auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, right_start_node, right_end_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                            auto count = right_start;\n                                                            for (auto & nuc : temp_string ){\n                                                                sequence[count] = nuc;\n                                                                count++;\n                                                            }\n                                                            assert(count == right_end);\n\n                                                            no_backpointer = false;\n                                                            break;\n                                                        }\n                                                    }if (!no_backpointer) break;   \n                                                }if (!no_backpointer) break;\n                                            }\n                                        }else{\n                                            for (auto& i2_node_nuci1 : dfa.right_edges[i1_node]){\n                                                NucType nuci1 = get<1>(i2_node_nuci1);\n                                                auto i2_node = get<0>(i2_node_nuci1);\n                                                auto weight_nuci1 = get<2>(i2_node_nuci1);\n\n                                                for (auto& p_1_node_nucp_1 : dfa.left_edges[p_node]){\n                                                    auto nucp_1 = 
get<1>(p_1_node_nucp_1);\n                                                    auto p_1_node = get<0>(p_1_node_nucp_1);\n                                                    auto weight_nucp_1 = get<2>(p_1_node_nucp_1);\n\n                                                    newscore = - func14(i, j-1, p, q-1, nuci, nuci1, nucj_2, nucj_1, nucp_1, nucp, nucq_1, nucq) \n                                                                    + p_state.score;\n\n                                                    weight_left = weight_nuci + weight_nuci1 + get_broken_codon_score(i2_node, p_1_node) + weight_nucp_1;\n                                                    weight_right = weight_nucq + get_broken_codon_score(right_start_node,right_end_node) + weight_nucj_2 + weight_nucj_1;\n\n                                                    auto cai_score = p_state.cai_score + (weight_left + weight_right);\n                                                    if (state.score == newscore && abs(state.cai_score - cai_score) < epsilon){\n                                                        stk.push(make_tuple(p_node, q_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                                                        sequence[i] = GET_ACGU(nuci);\n                                                        sequence[i+1] = GET_ACGU(nuci1);\n                                                        sequence[p-1] = GET_ACGU(nucp_1);\n                                                        auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, i2_node, p_1_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                        int count = i+2;\n                                                        for (auto & nuc : temp_string ){\n                                                            sequence[count] = nuc;\n                                                            count++;\n                                            
            }\n                                                        assert(count == p-1);\n\n                                                        for(auto& idx_nucidx : right_seq){\n                                                            IndexType idx = idx_nucidx.first;\n                                                            NucType nucidx = idx_nucidx.second;\n                                                            sequence[idx] = GET_ACGU(nucidx);\n                                                        }\n                                                        structure[i] = '(';\n                                                        structure[j-1] = ')';\n\n                                                        temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, right_start_node, right_end_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                                        count = right_start;\n                                                        for (auto & nuc : temp_string ){\n                                                            sequence[count] = nuc;\n                                                            count++;\n                                                        }\n                                                        assert(count == right_end);\n                                                        no_backpointer = false;\n                                                        break;\n                                                    }   \n                                                }if (!no_backpointer) break;\n                                            }\n                                        }if (!no_backpointer) break;\n\n                                    }if (!no_backpointer) break;\n                                }if (!no_backpointer) break;\n                            }\n                        }if (!no_backpointer) break;\n                 
   }\n                }\n                \n                // Multi\n                if (no_backpointer){\n\n                    NodeNucpair temp = {i_node.first, i_node.second, static_cast<NucPairType>(curr_pair_nuc)};\n\n\n                    auto& multi_state = bestMulti[j_node][temp];\n                    auto newscore = multi_state.score - func15(i, j, nuci, -1, -1, nucj_1, seq_length);\n\n                    if (state.score == newscore && abs(state.cai_score - multi_state.cai_score) < epsilon){\n                        stk.push(make_tuple(i_node, j_node, multi_state, Beam_type::BEAM_MULTI, curr_pair_nuc));\n                        \n                        sequence[i] = GET_ACGU(nuci);\n                        sequence[j-1] = GET_ACGU(nucj_1);\n                        structure[i] = '(';\n                        structure[j-1] = ')';\n\n                        no_backpointer = false;\n                    }\n                }\n\n                assert(no_backpointer == false);\n                break;\n\n            case Beam_type::BEAM_MULTI:\n                nuci = PTLN(curr_pair_nuc);\n                nucj_1 = PTRN(curr_pair_nuc);\n                j = j_node.first;\n                i = i_node.first;\n\n                \n                for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                    auto j_1_node = get<0>(j_1_node_nucj_1);\n                    auto weight_nucj_1 = get<2>(j_1_node_nucj_1);\n                    NodeType q_node = state.pre_node;\n                    q = q_node.first;\n\n                    if(q == j - 1 and q_node != j_1_node) continue;\n                    if(q == j - 2 and dfa.nodes[q].size() == dfa.nodes[j-1].size() and q_node.second != j_1_node.second) continue;\n\n                    \n\n                    for (size_t p_node_ = 0; p_node_ < 2 * q; ++p_node_) {\n\n                        auto& temp_state = bestM2[q_node][p_node_];\n                        if (temp_state.score == util::value_min<ScoreType>())\n    
                        continue;\n\n                        auto p_node = reverse_index2(p_node_);\n                        auto p = p_node.first;\n\n                        if(p <= i) continue;\n\n                        for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n\n                            auto i1_node = get<0>(i1_node_nuci);\n\n                            if(p == i + 1 and p_node != i1_node) continue;\n                            if(p == i + 2 and dfa.nodes[p].size() == dfa.nodes[i+1].size() and p_node.second != i1_node.second) continue;\n\n\n                            double weight_nuci = double(get<2>(i1_node_nuci));\n\n                            auto& m2_state = bestM2[q_node][p_node];\n\n                            auto cai_score = m2_state.cai_score + (weight_nuci + get_broken_codon_score(i1_node, p_node) + get_broken_codon_score(q_node, j_1_node) + weight_nucj_1);\n\n                            if (state.score == m2_state.score && abs(state.cai_score - cai_score) < epsilon){\n                                stk.push(make_tuple(p_node, q_node, m2_state, Beam_type::BEAM_M2, -1));\n\n                                auto temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, i1_node, p_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                auto count = i+1;\n                                for (auto & nuc : temp_string ){\n                                    sequence[count] = nuc;\n                                    count++;\n                                }\n                                assert(count == p);\n\n                                temp_string.clear();\n                                temp_string = get_nuc_from_dfa_cai<IndexType>(dfa, q_node, j_1_node, protein, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon);\n                                count = q;\n                                for (auto & nuc : temp_string ){\n                                    sequence[count] 
= nuc;\n                                    count++;\n                                }\n\n                                assert(count == j-1);\n                                \n                                no_backpointer = false;\n                            }if (!no_backpointer) break;\n                        }if (!no_backpointer) break;\n                    }if (!no_backpointer) break;\n                }\n                assert(no_backpointer == false);\n                break;\n\n            case Beam_type::BEAM_M2:\n                // M2 = M + P\n                i = i_node.first;\n                j = j_node.first;\n\n                for (size_t m_node_nucpair_ = 0; m_node_nucpair_ < 16 * j; ++m_node_nucpair_){\n                \n                    auto& p_state = bestP[j_node][m_node_nucpair_];\n                    if (p_state.score == util::value_min<ScoreType>())\n                        continue;\n\n                    auto m_node_nucpair = reverse_index(m_node_nucpair_);\n                    auto m = m_node_nucpair.node_first;\n                    auto m_num = m_node_nucpair.node_second;\n\n                    auto m_node = make_pair(m, m_num);\n\n                    auto pair_nuc = m_node_nucpair.nucpair;\n\n                    if (m <= i+4) continue; // no sharpturn\n\n                    auto nucm = PTLN(pair_nuc);\n                    auto nucj_1 = PTRN(pair_nuc); \n                    auto newscore = - func6(-1, -1, -1, -1, nucm, nucj_1, -1, seq_length) + p_state.score;\n\n                    auto& m_state = bestM[m_node][i_node];\n                    auto cai_score = m_state.cai_score + p_state.cai_score;\n\n                    if (state.score == m_state.score + newscore && state.cai_score == cai_score){\n                        stk.push(make_tuple(i_node, m_node, m_state, Beam_type::BEAM_M1, -1));\n                        stk.push(make_tuple(m_node, j_node, p_state, Beam_type::BEAM_P, pair_nuc));\n\n                        no_backpointer = 
false;\n                        break;\n                    }\n                    if (!no_backpointer) break;\n                }\n                assert(no_backpointer == false);\n                break;\n\n            case Beam_type::BEAM_M1: \n                // M = M + U\n                for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                    auto j_1_node = std::get<0>(j_1_node_nucj_1);\n                    auto weight_nucj_1 = std::get<2>(j_1_node_nucj_1);\n                    auto& m_state = bestM[j_1_node][i_node];\n                    auto cai_score = m_state.cai_score + weight_nucj_1;\n\n                    if (state.score == m_state.score && abs(state.cai_score - cai_score) < epsilon) {\n                        NucType nucj_1 = std::get<1>(j_1_node_nucj_1);\n                        stk.push(make_tuple(i_node, j_1_node, m_state, Beam_type::BEAM_M1, -1));\n\n                        sequence[j-1] = GET_ACGU(nucj_1);\n\n                        no_backpointer = false;\n                        break;\n                    }\n                }\n\n                // M = P\n                if(no_backpointer){\n                    for (auto& j_1_node_nucj_1 : dfa.left_edges[j_node]){\n                        NucType nucj_1 = std::get<1>(j_1_node_nucj_1);\n\n                        for (auto& i1_node_nuci : dfa.right_edges[i_node]){\n                            NucType nuci = std::get<1>(i1_node_nuci);\n                            PairType pair_nuc = NTP(nuci, nucj_1);\n                            \n                            NodeNucpair temp = {i_node.first, i_node.second, static_cast<NucPairType>(pair_nuc)};\n\n                            auto& p_state = bestP[j_node][temp];\n                            auto newscore = - func6(-1, -1, -1, -1, nuci, nucj_1, -1, seq_length) + p_state.score;\n\n                            if (state.score == newscore && abs(state.cai_score - p_state.cai_score) < epsilon) {\n                                
stk.push(make_tuple(i_node, j_node, p_state, Beam_type::BEAM_P, pair_nuc));\n                                no_backpointer = false;\n                                break;\n                            }\n                        }if(!no_backpointer) break;\n                    }\n                }\n\n                // M = M2\n                if(no_backpointer){\n                    auto& m2_state = bestM2[j_node][i_node];\n\n\n\n                    if (state.score == m2_state.score && state.cai_score == m2_state.cai_score) {\n                        stk.push(make_tuple(i_node, j_node, m2_state, Beam_type::BEAM_M2, -1));\n                        no_backpointer = false;\n                    }\n                }\n\n                assert(no_backpointer == false);\n                break;\n            default:  // MANNER_NONE or other cases\n                \n                printf(\"wrong beam_type at %d, %d\\n\", i, j); fflush(stdout);\n                assert(false);\n        }\n\n    }\n    assert(string(sequence).size() == string(structure).size());\n    return {string(sequence), string(structure)};\n}\n}\n"
  },
  {
    "path": "src/beam_cky_parser.cc",
    "content": "\n#include <fstream>\n#include <iostream>\n#include <sys/time.h>\n#include <stack>\n#include <tuple>\n#include <cassert>\n#include <unordered_map>\n#include <algorithm>\n#include <string>\n#include <map>\n#include <set>\n#include <unordered_set>\n#include <chrono>\n#include <string>\n\n#include \"beam_cky_parser.h\"\n#include \"Utils/utility_v.h\"\n#include \"backtrace_iter.cc\"\n#include \"Utils/common.h\"\n\nusing namespace std;\n\nusing NodeType = std::pair<LinearDesign::IndexType, LinearDesign::NumType>;\n\n#define tetra_hex_tri -1\n\nnamespace LinearDesign {\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ndouble BeamCKYParser<ScoreType, IndexType, NodeType>::get_broken_codon_score(\n        const NodeType& start_node, const NodeType& end_node) {\n\n    IndexType s_index = start_node.first;\n    IndexType t_index = end_node.first;\n\n    if (s_index >= t_index)\n        return 0.0;\n\n    auto aa_left = protein[s_index / 3]; // tri letter\n\n    auto aa_right = protein[(int)(s_index / 3)];\n    if (t_index / 3 < protein.size()){\n        aa_right = protein[(int)(t_index / 3)];\n    }\n        \n    auto start_node_re_index = make_pair(s_index % 3, start_node.second);\n    auto end_node_re_index = make_pair(t_index % 3, end_node.second);\n\n    double ret = 0.0;\n\n    if (t_index - s_index < 3) {\n        if (s_index / 3 == t_index / 3) {\n            ret = std::get<0>(best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index,end_node_re_index)]);\n        }else{\n            double left_ln_cai = 0.0, right_ln_cai = 0.0;\n            if (s_index % 3 != 0) \n                left_ln_cai = std::get<0>(best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index,make_pair(0, 0))]);\n            if (t_index % 3 != 0) \n                right_ln_cai = std::get<0>(best_path_in_one_codon_unit[aa_right][make_tuple(make_pair(0, 0), end_node_re_index)]);\n            ret = left_ln_cai + right_ln_cai;\n        }\n    
}else{\n        double left_ln_cai = 0.0, right_ln_cai = 0.0;\n        if (s_index % 3 != 0) \n            left_ln_cai = std::get<0>(best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index,make_pair(0, 0))]);\n        if (t_index % 3 != 0) \n            right_ln_cai = std::get<0>(best_path_in_one_codon_unit[aa_right][make_tuple(make_pair(0, 0), end_node_re_index)]);\n        ret = left_ln_cai + right_ln_cai;\n    }\n    return ret;\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::hairpin_beam(IndexType j, DFA_t& dfa) {\n \n    auto j_node = make_pair(j,j_num);\n\n    for (auto &j1_node_nucj : dfa.right_edges[j_node]) { // right_edges[j][j_num][j1_num][nuc]: false/true\n        \n        auto j1_node = std::get<0>(j1_node_nucj);\n        auto nucj = std::get<1>(j1_node_nucj);\n\n        auto weight_nucj = std::get<2>(j1_node_nucj);\n\n\n        for (auto &j4_node : dfa.nodes[j+4]){\n            const auto& jnext_list = next_pair[nucj][j4_node];\n\n            if (jnext_list.empty())\n                continue;\n\n            for (auto &jnext_node_nucjnext : jnext_list){\n                auto jnext_node = std::get<0>(jnext_node_nucjnext);\n                auto nucjnext = std::get<1>(jnext_node_nucjnext);\n                auto weight_nucjnext = std::get<2>(jnext_node_nucjnext);\n                auto jnext = jnext_node.first;\n\n                IndexType hairpin_length = jnext + 1 - j; //special hairpin\n                NodeNucpair temp = {j, j_num, static_cast<NucPairType>(NTP(nucj, nucjnext))};\n\n\n#ifdef SPECIAL_HP\n                if (hairpin_length == 5 or hairpin_length == 6 or hairpin_length == 8){\n                    for(auto & seq_score_weight : hairpin_seq_score_cai[j_node][jnext_node][NTP(nucj, nucjnext)]){\n                            auto seq =  get<0>(seq_score_weight);\n                            auto pre_cal_score =  
get<1>(seq_score_weight);\n                            auto pre_cal_cai_score = get<2>(seq_score_weight);\n                            update_if_better(bestH[jnext_node][temp], pre_cal_score, pre_cal_cai_score);\n                    }\n\n                    continue;\n                }\n#endif\n\n                for (auto &j2_node_nucj1 : dfa.right_edges[j1_node]) {\n                    auto j2_node = std::get<0>(j2_node_nucj1);\n                    auto j2_num = j2_node.second;\n                    auto nucj1 = std::get<1>(j2_node_nucj1);\n                    auto weight_nucj1 = std::get<2>(j2_node_nucj1);\n\n                    for (auto& jnext_1_node_list : dfa.auxiliary_left_edges[jnext_node]){\n\n                        NodeType jnext_1_node = jnext_1_node_list.first;\n                        NumType jnext_1_num = jnext_1_node.second;\n                        if (jnext - j == 4 and (jnext_1_num != j2_num and dfa.nodes[j+2].size() == dfa.nodes[jnext-1].size())) continue;\n\n                        for (auto& nucjnext_1_weight : jnext_1_node_list.second){\n\n                            IndexType nucjnext_1 = get<0>(nucjnext_1_weight);\n                            auto weight_nucjnext_1 = get<1>(nucjnext_1_weight);\n\n                            auto newscore = - func12(j, jnext, nucj, nucj1, nucjnext_1, nucjnext, tetra_hex_tri);\n                            \n                            FinalScoreType cai_score =  weight_nucj + weight_nucj1 + weight_nucjnext_1 + weight_nucjnext; //ZL need to add weight_nucjnext\n\n                            if ((jnext_1_node.first - j2_node.first) <= SINGLE_MAX_LEN)\n                                cai_score += get_broken_codon_score_map[j2_node][jnext_1_node];\n                            else\n                                cai_score += get_broken_codon_score(j2_node,jnext_1_node);\n\n                            update_if_better(bestH[jnext_node][temp], newscore, cai_score); \n\n                        }\n                    
}\n                }\n            }\n        }\n    }\n\n    // for every state h in H[j]\n    //    1. extend h(i, j) to h(i, jnext)\n    //    2. generate p(i, j)\n    for (size_t i_node_nucpair_ = 0; i_node_nucpair_ < 16 * j; ++i_node_nucpair_) {\n\n        if (bestH[j_node][i_node_nucpair_].score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node_nucpair = reverse_index(i_node_nucpair_);\n        \n        auto i = i_node_nucpair.node_first;\n        auto i_num = i_node_nucpair.node_second;\n        auto pair_nuc = i_node_nucpair.nucpair;\n        auto i_node = make_pair(i,i_num);\n\n        auto nuci = PTLN(pair_nuc);\n        auto nucj = PTRN(pair_nuc);\n\n\n        for (const auto& item : dfa.auxiliary_right_edges[j_node]){\n            auto j1_node = item.first;\n            auto jnext_list = next_pair[nuci][j1_node];\n\n            if (jnext_list.empty()) continue;\n\n            for (auto &jnext_node_nucjnext : jnext_list){\n                auto jnext_node = std::get<0>(jnext_node_nucjnext);\n                auto nucjnext = std::get<1>(jnext_node_nucjnext);\n                auto jnext = jnext_node.first;\n                auto weight_nucjnext = std::get<2>(jnext_node_nucjnext);\n                auto hairpin_length = jnext + 1 - i;\n\n                NodeNucpair temp = {i, i_num, static_cast<NucPairType>(NTP(nuci, nucjnext))};\n\n#ifdef SPECIAL_HP\n\n                if (hairpin_length == 5 or hairpin_length == 6 or hairpin_length == 8){\n                    for(auto & seq_score_weight : hairpin_seq_score_cai[i_node][jnext_node][NTP(nuci, nucjnext)]){\n                            auto seq =  get<0>(seq_score_weight);\n                            auto pre_cal_score =  get<1>(seq_score_weight);\n                            auto pre_cal_cai_score = get<2>(seq_score_weight);\n                            update_if_better(bestH[jnext_node][temp], pre_cal_score, pre_cal_cai_score);\n                    }\n\n                    
continue;\n                }\n\n#endif          \n\n                for (auto &i1_node_newnuci : dfa.right_edges[i_node]){\n                    NucType newnuci = get<1>(i1_node_newnuci);\n                    if (nuci != newnuci) continue;\n                    NodeType i1_node = get<0>(i1_node_newnuci);\n                    double weight_newnuci = get<2>(i1_node_newnuci);\n\n\n                    for (auto &i2_node_nuci1 : dfa.right_edges[i1_node]) {\n                        auto i2_node = get<0>(i2_node_nuci1);\n                        auto nuci1 = get<1>(i2_node_nuci1);\n                        auto weight_nuci1 = get<2>(i2_node_nuci1);\n\n                        for (auto &jnext_1_node_nucjnext_1 : dfa.left_edges[jnext_node]) {\n                            auto jnext_1_node = get<0>(jnext_1_node_nucjnext_1);\n                            auto nucjnext_1 = get<1>(jnext_1_node_nucjnext_1);\n                            auto weight_nucjnext_1 = get<2>(jnext_1_node_nucjnext_1);\n\n                            auto newscore = - func12(i, jnext, nuci, nuci1, nucjnext_1, nucjnext, tetra_hex_tri);\n\n                            FinalScoreType cai_score =  weight_newnuci + weight_nuci1 + weight_nucjnext_1 + weight_nucjnext; //move weight_nucjnext from H to P to here. 
Since we added SH here, so it must be here.\n\n                            if ((jnext_1_node.first - i2_node.first) <= SINGLE_MAX_LEN)\n                                cai_score += get_broken_codon_score_map[i2_node][jnext_1_node];\n                            else\n                                cai_score += get_broken_codon_score(i2_node,jnext_1_node);\n\n                            update_if_better(bestH[jnext_node][temp], newscore, cai_score);\n\n                        }\n                    }\n                }\n            }\n        }\n\n        auto& state = bestH[j_node][i_node_nucpair_];\n\n        for (auto &j1_node_newnucj : dfa.right_edges[j_node]){\n            NucType newnucj = get<1>(j1_node_newnucj);\n            if (nucj != newnucj) continue;\n            NodeType j1_node = get<0>(j1_node_newnucj);\n            update_if_better(bestP[j1_node][i_node_nucpair_], state.score, state.cai_score);\n        }\n    }\n\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::Multi_beam(IndexType j, DFA_t& dfa){\n    \n    NodeType j_node = make_pair(j, j_num);\n\n    for (size_t i_node_nucpair_ = 0; i_node_nucpair_ < 16 * j; ++i_node_nucpair_){\n\n        auto& new_state_score = bestMulti[j_node][i_node_nucpair_];\n\n        if (new_state_score.score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node_nucpair = reverse_index(i_node_nucpair_);\n        auto i = i_node_nucpair.node_first;\n        auto i_num = i_node_nucpair.node_second;\n        auto pair_nuc = i_node_nucpair.nucpair;\n        auto nuci = PTLN(pair_nuc);\n        auto nucj_1 = PTRN(pair_nuc);\n\n        auto& jnext_list = next_pair[nuci][j_node];\n\n        if (!jnext_list.empty()){\n\n            for (auto &jnext_node_nucjnext : jnext_list){\n                auto jnext_node = std::get<0>(jnext_node_nucjnext);\n                auto nucjnext = 
std::get<1>(jnext_node_nucjnext);\n                auto weight_nucjnext = std::get<2>(jnext_node_nucjnext);\n                auto jnext = jnext_node.first;\n\n                for (auto &jnext1_node_newnucjnext : dfa.right_edges[jnext_node]){\n                    auto jnext1_node = std::get<0>(jnext1_node_newnucjnext);\n                    auto newnucjnext = std::get<1>(jnext1_node_newnucjnext);\n                    if (newnucjnext == nucjnext){\n                        double cai_score;\n\n                        if ((jnext_node.first - new_state_score.pre_node.first) <= SINGLE_MAX_LEN)\n                            cai_score = new_state_score.pre_left_cai + (get_broken_codon_score_map[new_state_score.pre_node][jnext_node] + weight_nucjnext);\n                        else    \n                            cai_score = new_state_score.pre_left_cai + (get_broken_codon_score(new_state_score.pre_node, jnext_node) + weight_nucjnext);\n\n                        NodeNucpair temp = {i, i_num, static_cast<NucPairType>(NTP(nuci, nucjnext))};\n\n                        update_if_better(bestMulti[jnext1_node][temp], new_state_score.score, cai_score, new_state_score.pre_node, new_state_score.pre_left_cai);\n                    }\n                }\n            }\n        }\n        //  2. 
generate multi(i, j) -> p(i, j)\n        auto newscore = new_state_score.score - func15(i, j, nuci, -1, -1, nucj_1, seq_length); // hzhang: TODO\n        update_if_better(bestP[j_node][i_node_nucpair_], newscore, new_state_score.cai_score);\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::P_beam(IndexType j, DFA_t& dfa){\n    \n    auto j_node = make_pair(j, j_num);\n\n    if (j < seq_length){\n        for (size_t i_node_nucpair_ = 0; i_node_nucpair_ < 16 * j; ++i_node_nucpair_){\n        \n            auto& state = bestP[j_node][i_node_nucpair_];\n            if (state.score == util::value_min<ScoreType>())\n                continue;\n\n            auto i_node_nucpair = reverse_index(i_node_nucpair_);\n            auto i = i_node_nucpair.node_first;\n            \n            if (i <= 0) continue;\n\n            auto i_num = i_node_nucpair.node_second;\n            auto pair_nuc = i_node_nucpair.nucpair;\n            auto i_node = make_pair(i, i_num);\n\n            auto nuci = PTLN(pair_nuc);\n            auto nucj_1 = PTRN(pair_nuc);  \n\n            // stacking\n            for (auto &j1_node_nucj : dfa.right_edges[j_node]){\n                auto j1_node = std::get<0>(j1_node_nucj);\n                auto nucj = std::get<1>(j1_node_nucj);\n                auto weight_nucj = std::get<2>(j1_node_nucj);\n\n                for (auto &i_1_node_nuci_1 : dfa.left_edges[i_node]){\n                    auto i_1_node = std::get<0>(i_1_node_nuci_1);\n                    auto nuci_1 = std::get<1>(i_1_node_nuci_1);\n                    auto weight_nuci_1 = std::get<2>(i_1_node_nuci_1);\n                    auto outer_pair = NTP(nuci_1, nucj);\n                    if (_allowed_pairs[nuci_1][nucj]){\n                        auto newscore = stacking_score[outer_pair-1][pair_nuc-1] + state.score;\n                        double cai_score = state.cai_score + 
(weight_nuci_1 + weight_nucj);\n                        NodeNucpair temp = {i_1_node.first, i_1_node.second, static_cast<NucPairType>(NTP(nuci_1, nucj))};\n                        update_if_better(bestP[j1_node][temp], newscore, cai_score);\n                    }\n                }\n            }\n\n            // right bulge: ((...)..) \n            for (auto &j1_node_list : dfa.auxiliary_right_edges[j_node]){\n                auto j1_node = j1_node_list.first;\n\n                for (auto &i_1_node_nuci_1 : dfa.left_edges[i_node]){\n                    auto i_1_node = std::get<0>(i_1_node_nuci_1);\n                    auto nuci_1 = std::get<1>(i_1_node_nuci_1);\n                    auto weight_nuci_1 = std::get<2>(i_1_node_nuci_1);\n\n                    auto q_list = next_list[nuci_1][j1_node]; \n\n                    for (auto& q_node_nucq : q_list){\n\n                        auto q_node = std::get<0>(q_node_nucq);\n\n                        auto q_num = q_node.second;\n                        auto q = q_node.first;\n\n                        if (q-j > SINGLE_MAX_LEN) break;\n                        \n                        auto nucq = std::get<1>(q_node_nucq);\n                        auto weight_nucq = std::get<2>(q_node_nucq);\n                        auto outer_pair = NTP(nuci_1, nucq);\n                        \n                        for(auto& q1_node_list : dfa.auxiliary_right_edges[q_node]){\n                            NodeType q1_node = q1_node_list.first;\n                            if(dfa.nodes[q].size() == 1 and dfa.nodes[q+1].size() == 2 and ((q1_node_list.second)[0]).first != nucq) continue;\n                            \n                            auto newscore = bulge_score[outer_pair-1][pair_nuc-1][q-j-1]\n                                            + state.score;\n\n                            double cai_score;\n                            if ((q_node.first - j_node.first) <= SINGLE_MAX_LEN)\n                                cai_score = 
state.cai_score + (weight_nuci_1 + get_broken_codon_score_map[j_node][q_node] + weight_nucq);\n                            else\n                                cai_score = state.cai_score + (weight_nuci_1 + get_broken_codon_score(j_node, q_node) + weight_nucq);\n\n                            NodeNucpair temp = {i_1_node.first, i_1_node.second, static_cast<NucPairType>(outer_pair)};\n\n                            update_if_better(bestP[q1_node][temp], newscore, cai_score);\n                            break;\n                        }\n                    }                        \n                }\n            }\n\n            // left bulge: (..(...)) \n            for (auto &j1_node_nucj : dfa.right_edges[j_node]){\n                auto j1_node = std::get<0>(j1_node_nucj);\n                auto nucj = std::get<1>(j1_node_nucj);\n                auto weight_nucj = std::get<2>(j1_node_nucj);\n\n                for (auto &i_1_node_list : dfa.auxiliary_left_edges[i_node]){\n                    auto i_1_node = i_1_node_list.first;\n                    auto p_list = prev_list[nucj][i_1_node]; \n\n                    for (auto &p_node_nucp_1 : p_list){\n\n                        auto p_node = std::get<0>(p_node_nucp_1);\n                        auto p_num = p_node.second;\n                        auto p = p_node.first;\n\n                        if (i-p > SINGLE_MAX_LEN) break;\n\n                        auto nucp_1 = std::get<1>(p_node_nucp_1);\n                        auto outer_pair = NTP(nucp_1, nucj);\n\n                        for(auto& p_1_node_new_nucp_1 : dfa.left_edges[p_node]){\n                            \n                            NucType new_nucp_1 = std::get<1>(p_1_node_new_nucp_1);\n                            if(nucp_1 != new_nucp_1) continue;\n                            NodeType p_1_node = std::get<0>(p_1_node_new_nucp_1);\n                            auto weight_nucp_1 = std::get<2>(p_1_node_new_nucp_1);\n\n                            auto 
newscore = bulge_score[outer_pair-1][pair_nuc-1][i-p-1]\n                                            + state.score;\n                            \n                            double cai_score;\n\n                            if ((i_node.first - p_node.first) <= SINGLE_MAX_LEN)\n                                cai_score = state.cai_score + (weight_nucp_1 + get_broken_codon_score_map[p_node][i_node] + weight_nucj);\n                            else\n                                cai_score = state.cai_score + (weight_nucp_1 + get_broken_codon_score(p_node, i_node) + weight_nucj);\n\n                            NodeNucpair temp = {p_1_node.first, (NumType)p_1_node.second, static_cast<NucPairType>(outer_pair)};\n                            update_if_better(bestP[j1_node][temp], newscore, cai_score);\n                        }\n                    }                 \n                }\n            }\n\n            // internal loop\n            for (auto &j1_node_dict : dfa.auxiliary_right_edges[j_node]){\n                auto j1_node = j1_node_dict.first;\n                auto j1_num = j1_node.second;\n\n                for (auto &i_1_node_nuci_1 : dfa.left_edges[i_node]){\n                    auto i_1_node = std::get<0>(i_1_node_nuci_1);\n                    auto i_1_num = i_1_node.second;\n                    auto nuci_1 = std::get<1>(i_1_node_nuci_1);\n                    auto weight_nuci_1 = std::get<2>(i_1_node_nuci_1);\n\n                    for (IndexType p = i-1; p > max(i - SINGLE_MAX_LEN, 0); --p) {//ZL, i-(p-1)<=len => i - len < p\n                        vector<pair<int, NumType>> p_node_list;\n                        \n                        if (p == i - 1)\n                            p_node_list.push_back(i_1_node);\n                        else if (p == i - 2) // hzhang: N.B. 
add this p, i-1, i o--o--o\n                            for (auto &p_node_dict : dfa.auxiliary_left_edges[i_1_node])\n                                p_node_list.push_back(p_node_dict.first);\n                        else\n                            p_node_list = dfa.nodes[p];\n                        \n                        for (auto &p_node : p_node_list){\n                            for (auto &p1_node_nucp : dfa.right_edges[p_node]){\n\n                                auto p1_node = std::get<0>(p1_node_nucp);\n                                auto p1_num = p1_node.second;\n                                auto nucp = std::get<1>(p1_node_nucp);\n                                auto weight_nucp = std::get<2>(p1_node_nucp);\n\n                                if (p == i - 1 and nucp != nuci_1) continue;\n                                else if (p == i - 2 and p1_num != i_1_num) continue;\n                                else if (p == i - 3 and p1_num != i_1_num and dfa.nodes[p+1].size() == dfa.nodes[i-1].size()) continue; \n\n                                for (auto &p_1_node_nucp_1 : dfa.left_edges[p_node]){\n                                    auto p_1_node = std::get<0>(p_1_node_nucp_1);\n                                    auto nucp_1 = std::get<1>(p_1_node_nucp_1);\n                                    auto weight_nucp_1 = std::get<2>(p_1_node_nucp_1);\n\n                                    auto q_list = next_list[nucp_1][j1_node]; \n\n                                    for (auto &q_node_nucq : q_list){\n\n                                        auto q_node = std::get<0>(q_node_nucq);\n                                        auto q_num = q_node.second;\n                                        auto q = q_node.first;\n\n                                        if (i-p+q-j > SINGLE_MAX_LEN) //check if q is still in the internal loop limit boundary.\n                                            break;\n\n                                        auto nucq = 
std::get<1>(q_node_nucq);\n                                        auto weight_nucq = std::get<2>(q_node_nucq);\n\n                                        for(auto& q1_node_list : dfa.auxiliary_right_edges[q_node]){\n                                            NodeType q1_node = q1_node_list.first;\n                                            if(dfa.nodes[q].size() == 1 and dfa.nodes[q+1].size() == 2 and ((q1_node_list.second)[0]).first != nucq) continue;\n                                            NodeNucpair temp = {p_1_node.first, p_1_node.second, static_cast<NucPairType>(NTP(nucp_1, nucq))};\n                                            auto& BestP_val = bestP[q1_node][temp];\n\n                                            for(auto & nucj_weightj: j1_node_dict.second){\n                                                auto nucj = nucj_weightj.first;\n                                                auto weight_nucj = nucj_weightj.second;\n                                                if (q == j+1){\n                                                    auto newscore = - func14(p-1, q, i, j-1, nucp_1, nucp, nucj, nucq, nuci_1, nuci, nucj_1, nucj) + state.score;\n                                                    \n                                                    double weight_left;\n                                                    if (p == i-1){\n                                                        weight_left = weight_nucp_1 + weight_nucp;\n                                                    }\n                                                    else{\n                                                        if (i_1_node.first - p1_node.first <= SINGLE_MAX_LEN)\n                                                            weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score_map[p1_node][i_1_node] + weight_nuci_1;\n                                                        else\n                                                            
weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score(p1_node, i_1_node) + weight_nuci_1;\n                                                    }\n                                                    double cai_score = state.cai_score + (weight_left + weight_nucj + weight_nucq); //j+1 == q\n\n                                                    update_if_better(BestP_val, newscore, cai_score);\n                                                }else if (q == j+2){\n                                                    for(auto& q_1_node_list : dfa.auxiliary_left_edges[q_node]){\n                                                        auto q_1_node = q_1_node_list.first;\n                                                        NumType q_1_num = q_1_node.second;\n                                                        if (q_1_num != j1_num) continue;\n                                                        for(auto & nucq_1_weight : q_1_node_list.second){\n                                                            auto nucq_1 = nucq_1_weight.first;\n                                                            auto weight_nucq_1 = nucq_1_weight.second;\n                                                            auto newscore = - func14(p-1, q, i, j-1, nucp_1, nucp, nucq_1, nucq, nuci_1, nuci, nucj_1, nucj) + state.score;\n                                                            \n                                                            double weight_left;\n                                                            if (p == i-1){\n                                                                weight_left = weight_nucp_1 + weight_nucp;\n                                                            }\n                                                            else{\n                                                                // assert(p < i-1);\n                                                                if (i_1_node.first - p1_node.first <= 
SINGLE_MAX_LEN)\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score_map[p1_node][i_1_node] + weight_nuci_1;\n                                                                else\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score(p1_node, i_1_node) + weight_nuci_1;\n                                                            }\n\n                                                            auto cai_score = state.cai_score + (weight_left + weight_nucj + weight_nucq_1 + weight_nucq);\n\n                                                            update_if_better(BestP_val, newscore, cai_score);\n                                                        }\n                                                        if(dfa.nodes[q-1].size() == 2) break;\n                                                    }\n                                                }else if (q == j + 3){\n                                                    for(auto& q_1_node_list : dfa.auxiliary_left_edges[q_node]){\n                                                        auto q_1_node = q_1_node_list.first;\n                                                        NumType q_1_num = q_1_node.second;\n                                                        if (q_1_num != j1_num and dfa.nodes[q-1].size() == dfa.nodes[j+1].size()) continue;\n                                                        for(auto & nucq_1_weight : q_1_node_list.second){\n                                                            auto nucq_1 = nucq_1_weight.first;\n                                                            auto weight_nucq_1 = nucq_1_weight.second;\n                                                            auto newscore = - func14(p-1, q, i, j-1, nucp_1, nucp, nucq_1, nucq, nuci_1, nuci, nucj_1, nucj) + state.score;\n                              
                              \n                                                            double weight_left;\n                                                            if (p == i-1){\n                                                                weight_left = weight_nucp_1 + weight_nucp;\n                                                            }\n                                                            else{\n                                                                // assert(p < i-1);\n                                                                if (i_1_node.first - p1_node.first <= SINGLE_MAX_LEN)\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score_map[p1_node][i_1_node] + weight_nuci_1;\n                                                                else\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score(p1_node, i_1_node) + weight_nuci_1;\n                                                            }\n\n                                                            double cai_score;\n                                                            if (q_1_node.first - j1_node.first <= SINGLE_MAX_LEN)\n                                                                cai_score = state.cai_score + (weight_left + weight_nucj + get_broken_codon_score_map[j1_node][q_1_node] + weight_nucq_1 + weight_nucq);\n                                                            else\n                                                                cai_score = state.cai_score + (weight_left + weight_nucj + get_broken_codon_score(j1_node, q_1_node) + weight_nucq_1 + weight_nucq);\n\n                                                            update_if_better(BestP_val, newscore, cai_score);\n                                                        }\n                                                      
  if(dfa.nodes[q-1].size() == 2) break;\n                                                    }\n                                                }else{\n                                                    for(auto& q_1_node_list : dfa.auxiliary_left_edges[q_node]){\n                                                        auto q_1_node = q_1_node_list.first;\n                                                        for(auto & nucq_1_weight : q_1_node_list.second){\n                                                            auto nucq_1 = nucq_1_weight.first;\n                                                            auto weight_nucq_1 = nucq_1_weight.second;\n                                                            auto newscore = - func14(p-1, q, i, j-1, nucp_1, nucp, nucq_1, nucq, nuci_1, nuci, nucj_1, nucj) + state.score;\n\n\n                                                            double weight_left;\n                                                            if (p == i-1){\n                                                                weight_left = weight_nucp_1 + weight_nucp;\n                                                            }\n                                                            else{\n                                                                // assert(p < i-1);\n                                                                if (i_1_node.first - p1_node.first <= SINGLE_MAX_LEN){\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score_map[p1_node][i_1_node] + weight_nuci_1;\n                                                                }\n                                                                else\n                                                                    weight_left = weight_nucp_1 + weight_nucp + get_broken_codon_score(p1_node, i_1_node) + weight_nuci_1;\n                                                            }\n\n      
                                                      double cai_score;\n                                                            if (q_1_node.first - j1_node.first <= SINGLE_MAX_LEN){\n                                                                cai_score = state.cai_score + (weight_left + weight_nucj + get_broken_codon_score_map[j1_node][q_1_node] + weight_nucq_1 + weight_nucq);\n                                                            }\n                                                            else\n                                                                cai_score = state.cai_score + (weight_left + weight_nucj + get_broken_codon_score(j1_node, q_1_node) + weight_nucq_1 + weight_nucq);\n                                                            \n                                                            update_if_better(BestP_val, newscore, cai_score);\n                                                        }\n                                                        if(dfa.nodes[q-1].size() == 2) break;\n                                                    }\n                                                }\n                                            }\n                                        }\n                                    }\n                                }\n                            }\n                        }\n                    }\n                }\n            }\n        }\n    }\n    // M = P and M_P = P\n    for (size_t i_node_nucpair_ = 0; i_node_nucpair_ < 16 * j; ++i_node_nucpair_){\n        auto& state = bestP[j_node][i_node_nucpair_];\n        if (state.score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node_nucpair = reverse_index(i_node_nucpair_);\n        auto i = i_node_nucpair.node_first;\n        auto i_num = i_node_nucpair.node_second;\n        auto pair_nuc = i_node_nucpair.nucpair;\n        auto i_node = make_pair(i, i_num);\n\n        auto nuci = PTLN(pair_nuc);\n        auto 
nucj_1 = PTRN(pair_nuc);\n\n        if (i > 0 and j < seq_length){\n\n            auto M1_score = - func6(i, j-1, j-1, -1, nuci, nucj_1, -1, seq_length) + state.score;\n            \n            update_if_better(bestM[j_node][i_node], M1_score, state.cai_score);\n            update_if_better(bestM_P[j_node][i_node], M1_score, state.cai_score);\n        }\n    }\n\n    // M2 = M + M_P\n    for (size_t i_node_ = 0; i_node_ < 2 * j; ++i_node_) {\n        auto& state = bestM_P[j_node][i_node_];\n        auto i_node = reverse_index2(i_node_);\n        auto i = i_node.first;\n\n        if (state.score == util::value_min<ScoreType>())\n            continue;\n\n        if (i > 0 and j < seq_length){\n            for (size_t m_node = 0; m_node < 2 * i; ++m_node){\n                auto& m_new_state_score = bestM[i_node][m_node];\n\n                if (m_new_state_score.score == util::value_min<ScoreType>())\n                    continue;\n\n                auto newscore = m_new_state_score.score + state.score;\n                auto cai_score = m_new_state_score.cai_score + state.cai_score;\n                update_if_better(bestM2[j_node][m_node], newscore, cai_score);\n            }\n        }\n    }\n\n    // C = C + P\n    for (size_t i_node_nucpair_ = 0; i_node_nucpair_ < 16 * j; ++i_node_nucpair_){\n        auto& state = bestP[j_node][i_node_nucpair_];\n        if (state.score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node_nucpair = reverse_index(i_node_nucpair_);\n        auto i = i_node_nucpair.node_first;\n        auto i_num = i_node_nucpair.node_second;\n        auto pair_nuc = i_node_nucpair.nucpair;\n        auto i_node = make_pair(i, i_num);\n\n        auto nuci = PTLN(pair_nuc);\n        auto nucj_1 = PTRN(pair_nuc); \n        \n        if (i > 0){\n            auto& prefix_C = bestC[i_node];\n\n            if (prefix_C.score != util::value_min<ScoreType>()){\n                auto newscore = - func3(i, j-1, nuci, nucj_1, 
seq_length) + prefix_C.score + state.score;\n\n                auto cai_score = prefix_C.cai_score + state.cai_score;\n                update_if_better(bestC[j_node], newscore, cai_score);\n            }\n        }\n        else{\n            auto newscore = - func3(0, j-1, nuci, nucj_1, seq_length) + state.score;\n\n            update_if_better(bestC[j_node], newscore, state.cai_score);\n        }\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::M2_beam(IndexType j, DFA_t& dfa){\n    \n    auto j_node = make_pair(j, j_num);\n    for (size_t i_node_ = 0; i_node_ < 2 * j; ++i_node_) {\n        auto& state = bestM2[j_node][i_node_];\n        if (state.score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node = reverse_index2(i_node_);\n        auto i = i_node.first;\n\n        // 1. multi-loop\n        for (IndexType p = i-1; p >= max(i - SINGLE_MAX_LEN, 0); --p){\n            vector<pair<int, NumType>> p_node_list;\n            if (p == i - 1)\n                for(auto& p_node_dict : dfa.auxiliary_left_edges[i_node])\n                    p_node_list.push_back(p_node_dict.first);\n            else p_node_list = dfa.nodes[p];\n\n            for (auto &p_node : p_node_list){\n                for (auto &p1_node_nucp : dfa.right_edges[p_node]){\n                    auto p1_node = std::get<0>(p1_node_nucp);\n                    auto nucp = std::get<1>(p1_node_nucp);\n                    auto weight_nucp = std::get<2>(p1_node_nucp);\n\n                    if(p == i - 1 and p1_node != i_node) continue;\n                    if(p == i - 2 and dfa.nodes[p+1].size() == dfa.nodes[i].size() and p1_node.second != i_node.second) continue;\n\n                    auto q_list = next_pair[nucp][j_node];\n\n                    for (auto &q_node_nucq : q_list){\n                        auto q_node = std::get<0>(q_node_nucq);\n                        
auto nucq = std::get<1>(q_node_nucq);\n                        auto weight_nucq = std::get<2>(q_node_nucq);\n                        auto q = q_node.first;\n\n                        if (i - p + q - j - 1 > SINGLE_MAX_LEN) continue; //ZL, i-p-1+q-j\n                        auto outer_pair = NTP(nucp, nucq);\n                        for (auto &q1_node_newnucq : dfa.right_edges[q_node]){\n                            auto newnucq = std::get<1>(q1_node_newnucq);\n                            if (newnucq == nucq) {\n                                auto q1_node = std::get<0>(q1_node_newnucq);\n\n                                double cai_score = state.cai_score + (weight_nucp + get_broken_codon_score_map[p1_node][i_node] + get_broken_codon_score_map[j_node][q_node] + weight_nucq);\n                                double temp_left_cai = state.cai_score + (weight_nucp + get_broken_codon_score_map[p1_node][i_node]);\n                                NodeNucpair temp = {p_node.first, p_node.second, static_cast<NucPairType>(NTP(nucp, nucq))};\n                                update_if_better(bestMulti[q1_node][temp], state.score, cai_score, j_node, temp_left_cai);\n                                break;\n                            }\n                        }\n                    }\n                }\n            }\n        }\n        //  2. 
M = M2\n        update_if_better(bestM[j_node][i_node], state.score, state.cai_score);\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::M_beam(IndexType j, DFA_t& dfa)\n{\n    \n    auto j_node = make_pair(j, j_num);\n\n    for (size_t i_node_ = 0; i_node_ < 2 * j; ++i_node_) {\n        auto& state = bestM[j_node][i_node_];\n\n        if (state.score == util::value_min<ScoreType>())\n            continue;\n\n        auto i_node = reverse_index2(i_node_);\n        for (auto &j1_node_nucj : dfa.right_edges[j_node]){\n            auto j1_node = std::get<0>(j1_node_nucj);\n            auto nucj = std::get<1>(j1_node_nucj);\n            auto weight_nucj = std::get<2>(j1_node_nucj);\n\n            double cai_score = state.cai_score + weight_nucj;\n            update_if_better(bestM[j1_node][i_node], state.score, cai_score);\n        }\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\ntemplate <IndexType j_num>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::C_beam(IndexType j, DFA_t& dfa)\n{\n    //  beam of C\n    //  C = C + U\n    auto j_node = make_pair(j, j_num);\n\n    auto& state = bestC[j_node];\n\n\n    for (auto &j1_node_nucj : dfa.right_edges[j_node]){\n        NodeType j1_node = std::get<0>(j1_node_nucj);\n        IndexType nucj = std::get<1>(j1_node_nucj);\n        auto weight_nucj = std::get<2>(j1_node_nucj);\n\n        double cai_score = state.cai_score + (double)weight_nucj;\n        update_if_better(bestC[j1_node], state.score, cai_score);\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::get_next_pair(DFA_t& dfa) {\n    vector<tuple<NodeType, NucType, double>> temp_vector; \n    for (NucType nuci = 0; nuci < NOTON; nuci++) {\n        for (IndexType j = seq_length; j > 0; j--) {\n\n            for (auto& j_node : 
dfa.nodes[j]) {\n                for (auto& item : dfa.auxiliary_left_edges[j_node]) {\n                    NodeType j_1_node = item.first;\n                    temp_vector.clear();\n                    for (auto& nuc_weight : item.second){\n                        auto nuc = std::get<0>(nuc_weight);\n                        auto weight_nuc = std::get<1>(nuc_weight);\n                        if (_allowed_pairs[nuci][nuc])\n                            temp_vector.push_back(make_tuple(j_1_node, nuc, weight_nuc));\n                    }\n                    if(temp_vector.size() == 0){\n                        if (next_pair[nuci][j_1_node].size() > 0 and next_pair[nuci][j_node].size() > 0) {\n                            // merge\n                            IndexType index1 = std::get<0>(next_pair[nuci][j_1_node][0]).first;\n                            IndexType index2 = std::get<0>(next_pair[nuci][j_node][0]).first;\n                            if(index1/3 == index2/3)\n                                next_pair[nuci][j_1_node].insert(next_pair[nuci][j_1_node].end(),\n                                                                     next_pair[nuci][j_node].begin(),\n                                                                     next_pair[nuci][j_node].end());\n                            else if(index1 > index2){\n                                next_pair[nuci][j_1_node].clear();\n                                next_pair[nuci][j_1_node].insert(next_pair[nuci][j_1_node].end(),\n                                                                     next_pair[nuci][j_node].begin(),\n                                                                     next_pair[nuci][j_node].end());\n                            }\n                        }else if (next_pair[nuci][j_node].size() > 0)\n                            next_pair[nuci][j_1_node].insert(next_pair[nuci][j_1_node].end(),\n                                                                     
next_pair[nuci][j_node].begin(),\n                                                                     next_pair[nuci][j_node].end());\n                    }\n                    else\n                        next_pair[nuci][j_1_node].insert(next_pair[nuci][j_1_node].end(),\n                                                                     temp_vector.begin(),\n                                                                     temp_vector.end());\n                }\n            }\n        }\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::get_next_pair_set() {\n\n    for(NucType nuci=0; nuci<5; nuci++){\n        for (auto& j_node_vnuc : next_pair[nuci]) {\n            NodeType j_node = j_node_vnuc.first;\n            next_pair_set[nuci][j_node] = set<tuple<NodeType, NucType, double>>(j_node_vnuc.second.begin(), j_node_vnuc.second.end());\n        }\n    }\n    for(NucType nuci=0; nuci<5; nuci++){\n        for (auto& j_node_vnuc : next_pair_set[nuci]) {\n            NodeType j_node = j_node_vnuc.first;\n            next_pair[nuci][j_node].clear();\n            for(auto& item : next_pair_set[nuci][j_node]){\n                next_pair[nuci][j_node].push_back(item);\n            }\n        }\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::get_prev_pair(DFA_t& dfa) {\n    vector<tuple<NodeType, NucType, double>> temp_vector;\n    for (NucType nuci = 0; nuci < NOTON; nuci++) {\n        for (IndexType j = 0; j < seq_length; j++) {\n            for (auto& j_node : dfa.nodes[j]) {\n                for (auto& item : dfa.auxiliary_right_edges[j_node]) {\n                    NodeType j1_node = item.first;\n                    temp_vector.clear();\n                    for (auto& nuc_weight : item.second){\n                        auto nuc = std::get<0>(nuc_weight);\n                        auto 
weight_nuc = std::get<1>(nuc_weight);    \n                        if (_allowed_pairs[nuci][nuc])\n                            temp_vector.push_back(make_tuple(j1_node, nuc, weight_nuc));\n                    }\n                    if(temp_vector.size() == 0){\n                        if (prev_pair[nuci][j1_node].size() > 0 and prev_pair[nuci][j_node].size() > 0) {\n                            // merge\n                            IndexType index1 = std::get<0>(prev_pair[nuci][j1_node][0]).first-1;\n                            IndexType index2 = std::get<0>(prev_pair[nuci][j_node][0]).first-1;\n                            if(index1/3 == index2/3)\n                                prev_pair[nuci][j1_node].insert(prev_pair[nuci][j1_node].end(),\n                                                                     prev_pair[nuci][j_node].begin(),\n                                                                     prev_pair[nuci][j_node].end());\n                            else if(index1 < index2){\n                                prev_pair[nuci][j1_node].clear();\n                                prev_pair[nuci][j1_node].insert(prev_pair[nuci][j1_node].end(),\n                                                                     prev_pair[nuci][j_node].begin(),\n                                                                     prev_pair[nuci][j_node].end());\n                            }\n                        }else if (prev_pair[nuci][j_node].size() > 0)\n                            prev_pair[nuci][j1_node].insert(prev_pair[nuci][j1_node].end(),\n                                                                     prev_pair[nuci][j_node].begin(),\n                                                                     prev_pair[nuci][j_node].end());\n                    }\n                    else\n                        prev_pair[nuci][j1_node].insert(prev_pair[nuci][j1_node].end(),\n                                                                     
temp_vector.begin(),\n                                                                     temp_vector.end());\n                }\n            }\n        }\n    }\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::get_prev_pair_set() {\n\n    for(NucType nuci=0; nuci<5; nuci++){\n        for (auto& j_node_vnuc : prev_pair[nuci]) {\n            NodeType j_node = j_node_vnuc.first;\n            prev_pair_set[nuci][j_node] =\n                set<tuple<NodeType, NucType, double>>(j_node_vnuc.second.begin(), j_node_vnuc.second.end());\n        }\n    }\n    for(NucType nuci=0; nuci<5; nuci++){\n        for (auto& j_node_vnuc : prev_pair_set[nuci]) {\n            NodeType j_node = j_node_vnuc.first;\n            prev_pair[nuci][j_node].clear();\n            for(auto& item : prev_pair_set[nuci][j_node]){\n                prev_pair[nuci][j_node].push_back(item);\n            }\n        }\n    }\n}\n\n#ifdef SPECIAL_HP\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::special_hp(DFA_t& dfa, int8_t hairpin_length) {\n    int8_t hairpin_type = HAIRPINTYPE(hairpin_length);\n    vector<tuple<NodeType, string, double, NodeType>> queue;\n    vector<tuple<NodeType, string, double, NodeType>> frontier; \n    // vector\n    for(IndexType i=0; i<=seq_length - hairpin_length; i++){\n        for(NodeType i_node : dfa.nodes[i]){\n            int count = hairpin_length;\n            queue.clear();\n            queue.push_back(make_tuple(i_node, \"\", double(0.), i_node));\n            while(count > 0){\n                count --;\n                frontier.clear();\n                for(auto& node_str : queue){\n                    NodeType cur_node = std::get<0>(node_str);\n                    string cur_str = std::get<1>(node_str);\n                    double cur_lncai = std::get<2>(node_str);\n                    for(auto& node_nuc : 
dfa.right_edges[cur_node]){\n                        NodeType new_node = std::get<0>(node_nuc);\n                        string new_str = cur_str + GET_ACGU(std::get<1>(node_nuc));\n                        double new_total_lncai = cur_lncai + std::get<2>(node_nuc);\n                        frontier.push_back(make_tuple(new_node, new_str, new_total_lncai, cur_node));\n                    }\n                }\n                queue.swap(frontier);\n            }\n            for(auto node_str : queue){\n                auto j_node = std::get<3>(node_str);\n                auto temp_seq = std::get<1>(node_str);\n                auto cai_score = std::get<2>(node_str);\n                auto hairpin_length = temp_seq.size();\n                int8_t hairpin_type = HAIRPINTYPE(hairpin_length);\n                NucType nuci = GET_ACGU_NUC(temp_seq[0]);\n                NucType nucj = GET_ACGU_NUC(temp_seq[temp_seq.size() - 1]);\n                auto temp_nucpair = NTP(nuci, nucj);\n\n                ScoreType special_hairpin_score = func1(temp_seq, hairpin_type);\n                if(special_hairpin_score == SPECIAL_HAIRPIN_SCORE_BASELINE){\n\n                    auto newscore = - func12(0, hairpin_length - 1, GET_ACGU_NUC(temp_seq[0]), GET_ACGU_NUC(temp_seq[1]), GET_ACGU_NUC(temp_seq[hairpin_length-2]), GET_ACGU_NUC(temp_seq[hairpin_length-1]), tetra_hex_tri);\n                    hairpin_seq_score_cai[i_node][j_node][temp_nucpair].push_back(make_tuple(temp_seq, newscore, cai_score));\n                }\n\n                else{\n                    hairpin_seq_score_cai[i_node][j_node][temp_nucpair].push_back(make_tuple(temp_seq, special_hairpin_score, cai_score));\n                }\n            }\n        }\n    }\n}\n#endif\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nvoid BeamCKYParser<ScoreType, IndexType, NodeType>::preprocess(DFA_t& dfa) {\n\n    vector<tuple<NodeType, NucType, double>> new_q_list, new_p_list;\n    set<NodeType> visited; 
\n\n    // next_list\n    NodeType init_node = make_pair(0, 0);\n    for (NucType nuci=1; nuci<NOTON; nuci++) {\n        visited.clear();\n        auto q_list = next_pair[nuci][init_node]; \n\n        while (!q_list.empty()){\n            new_q_list.clear();\n\n            for (auto& q_node_nucq : q_list){\n\n                auto q_node = std::get<0>(q_node_nucq);\n                auto q_num = q_node.second;\n                auto q = q_node.first; \n                auto nucq = std::get<1>(q_node_nucq);\n\n                // q_node\n                next_list[nuci][q_node].push_back(q_node_nucq);\n                // q-1 is special\n                for(auto& q_1_node_dict : dfa.auxiliary_left_edges[q_node]){\n                    NodeType q_1_node = q_1_node_dict.first;\n                    next_list[nuci][q_1_node].push_back(q_node_nucq);\n                }\n                for(IndexType j=q-2; j>=max(0, q-SINGLE_MAX_LEN-1); j--)\n                    for(NodeType j_node : dfa.nodes[j])\n                        next_list[nuci][j_node].push_back(q_node_nucq);\n\n                for(auto& q1_node_list : dfa.auxiliary_right_edges[q_node]){\n                    NodeType q1_node = q1_node_list.first;\n\n                    if(dfa.nodes[q].size() == 1 and dfa.nodes[q+1].size() == 2 and ((q1_node_list.second)[0]).first != nucq) continue;\n                \n                    if(visited.find(q1_node) == visited.end()){\n                        visited.insert(q1_node);\n                        new_q_list.insert(new_q_list.end(), next_pair[nuci][q1_node].cbegin(), next_pair[nuci][q1_node].cend());\n                    }\n                    break;\n                }\n            }\n            q_list.swap(new_q_list);\n        }\n    }\n\n    // prev_list\n    init_node = make_pair(seq_length, 0);\n    for (NucType nucj=1; nucj<NOTON; nucj++) {\n        visited.clear();\n        auto p_list = prev_pair[nucj][init_node]; \n\n        while (!p_list.empty()){\n            
new_p_list.clear();\n\n            for (auto& p_node_nucp_1 : p_list){\n                auto p_node = std::get<0>(p_node_nucp_1);\n                auto p_num = p_node.second;\n                auto p = p_node.first; \n                auto nucp_1 = std::get<1>(p_node_nucp_1); \n\n                // p_node\n                prev_list[nucj][p_node].push_back(p_node_nucp_1);\n                // p+1 is special\n                for(auto& p1_node_dict : dfa.auxiliary_right_edges[p_node]){\n                    NodeType p1_node = p1_node_dict.first;\n                    prev_list[nucj][p1_node].push_back(p_node_nucp_1);\n                }\n                for(IndexType i=p+2; i<=min(seq_length, p+SINGLE_MAX_LEN+1); i++)\n                    for(NodeType i_node : dfa.nodes[i])\n                        prev_list[nucj][i_node].push_back(p_node_nucp_1);\n\n                for(auto& p_1_node_new_nucp_1 : dfa.left_edges[p_node]){\n                                    \n                    NucType new_nucp_1 = std::get<1>(p_1_node_new_nucp_1);\n                    if(nucp_1 != new_nucp_1) continue;\n                    NodeType p_1_node = std::get<0>(p_1_node_new_nucp_1);\n                \n                    if(visited.find(p_1_node) == visited.end()){\n                        visited.insert(p_1_node);\n                        new_p_list.insert(new_p_list.end(), prev_pair[nucj][p_1_node].cbegin(), prev_pair[nucj][p_1_node].cend());\n                    }\n                }\n            }\n            p_list.swap(new_p_list);\n        }\n    }\n\n    // stacking energy computation\n    int newscore;\n    for(int8_t outer_pair=1; outer_pair<=6; outer_pair++){\n        auto nuci_1 = PTLN(outer_pair);\n        auto nucq = PTRN(outer_pair);\n        for(int8_t inner_pair=1; inner_pair<=6; inner_pair++){\n            auto nuci = PTLN(inner_pair);\n            auto nucj_1 = PTRN(inner_pair);\n            newscore = - func14(0, 1, 1, 0,\n                                nuci_1, nuci, 
nucj_1, nucq,\n                                nuci_1, nuci, nucj_1, nucq);\n            stacking_score[outer_pair-1][inner_pair-1] = newscore;\n\n            for (IndexType l=0; l<=SINGLE_MAX_LEN; l++){\n                newscore = - func14(0, l+2, 1, 0,\n                              nuci_1, nuci, nucj_1, nucq,\n                              nuci_1, nuci, nucj_1, nucq); \n\n                bulge_score[outer_pair-1][inner_pair-1][l] = newscore;\n            }\n        }   \n    }\n\n#ifdef SPECIAL_HP\n    // Triloops\n    special_hp(dfa, 5);\n    // Tetraloop37\n    special_hp(dfa, 6);\n    // Hexaloops\n    special_hp(dfa, 8);\n#endif\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nDecoderResult<double, IndexType> BeamCKYParser<ScoreType, IndexType, NodeType>::parse(\n        DFA_t& dfa, Codon& codon, std::string& aa_seq, std::vector<std::string>& p, \n        std::unordered_map<std::string, std::string>& aa_best_in_codon,\n        std::unordered_map<std::string, std::unordered_map<std::tuple<NodeType, NodeType>, std::tuple<double, NucType, NucType>, \n            std::hash<std::tuple<NodeType, NodeType>>>>& best_path_in_one_codon,\n            std::unordered_map<string, Lattice<IndexType>>& aa_graphs_with_ln_weights) {\n    \n\n    protein = p;\n    aa_graphs_with_ln_w = aa_graphs_with_ln_weights;\n    aa_best_path_in_a_whole_codon = aa_best_in_codon;\n    best_path_in_one_codon_unit = best_path_in_one_codon;\n\n    seq_length = 3 * static_cast<IndexType>(aa_seq.size());\n    next_pair.resize(5);\n    next_pair_set.resize(5);\n    get_next_pair(dfa);\n    get_next_pair_set();\n\n    prev_pair.resize(5);\n    prev_pair_set.resize(5);\n    get_prev_pair(dfa);\n    get_prev_pair_set();\n\n    next_list.resize(5);\n    prev_list.resize(5);\n    stacking_score.resize(6, vector<int>(6));\n    bulge_score.resize(6, vector<vector<int>>(6, vector<int>(SINGLE_MAX_LEN+1)));\n\n    preprocess(dfa);\n\n\n    int reserved_size = (seq_length + 1) * 
16; //node,nucpair\n    int reserved_size2 = (seq_length + 1) * 2; //node\n\n    bestH.resize(reserved_size2);\n    bestP.resize(reserved_size2);\n    bestM2.resize(reserved_size2); //slim signature, Liang Zhang\n    bestMulti.resize(reserved_size2);\n    bestM.resize(reserved_size2); //slim signature, Liang Zhang\n    bestM_P.resize(reserved_size2); // hzhang: inter-state: P -> M\n\n    bestC.resize(reserved_size2); //slim signature, Liang Zhang\n\n    get_broken_codon_score_map.resize(reserved_size2);\n    for (auto& e : get_broken_codon_score_map){ //slim signature, Liang Zhang\n        e.resize(reserved_size2);\n        for (auto& ee : e){\n            ee = util::value_min<double>();\n        }\n    }\n\n    for (auto& ee : bestC){\n        ee.score = util::value_min<ScoreType>();\n        ee.cai_score = util::value_min<double>();\n    }\n\n\n    for (auto& e : bestH){\n        e.resize(reserved_size);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n            ee.cai_score = util::value_min<double>();\n        }\n\n    }\n\n    for (auto& e : bestP){\n        e.resize(reserved_size);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n            ee.cai_score = util::value_min<double>();\n        }\n    }\n\n    for (auto& e : bestMulti){ //slim signature, Liang Zhang\n        e.resize(reserved_size);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n            ee.cai_score = util::value_min<double>();\n        }        \n    }\n\n    for (auto& e : bestM2){ //slim signature, Liang Zhang\n        e.resize(reserved_size2);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n            ee.cai_score = util::value_min<double>();\n        }        \n    }\n\n    for (auto& e : bestM){ //slim signature, Liang Zhang\n        e.resize(reserved_size2);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n 
           ee.cai_score = util::value_min<double>();\n        }\n    }\n\n    for (auto& e : bestM_P){ // hzhang\n        e.resize(reserved_size2);\n        for (auto& ee : e){\n            ee.score = util::value_min<ScoreType>();\n            ee.cai_score = util::value_min<double>();\n        }\n    }\n\n    for (IndexType i = 0; i <= seq_length; ++i) {\n        for (auto & node_i : dfa.nodes[i]){\n            for (IndexType l = 0; l <= SINGLE_MAX_LEN; ++l){\n                auto j = i + l;\n                if (j > seq_length)\n                    break;\n\n                for (auto & node_j : dfa.nodes[j]){\n                    get_broken_codon_score_map[node_i][node_j] = get_broken_codon_score(node_i, node_j);\n                }\n\n            }\n        }\n\n    }\n\n    bestC[make_pair(0,0)].score = 0;\n    bestC[make_pair(0,0)].cai_score = double(0.);\n\n    for(const auto& node_nue_weight : dfa.right_edges[make_pair(0,0)]) {\n        auto node = std::get<0>(node_nue_weight);\n        auto weight_nue = std::get<2>(node_nue_weight);\n        update_if_better(bestC[node], 0, weight_nue);\n    }\n    \n    for (IndexType j = 0; j <= seq_length; ++j) {\n        cout << \"j=\" << j << \"\\r\" << flush;\n        \n        hairpin_beam<0>(j, dfa);\n        hairpin_beam<1>(j, dfa);\n        \n        if (j == 0)\n            continue;\n        \n        Multi_beam<0>(j, dfa);\n        Multi_beam<1>(j, dfa);\n        P_beam<0>(j, dfa);\n        P_beam<1>(j, dfa);\n        M2_beam<0>(j, dfa);\n        M2_beam<1>(j, dfa);\n        \n        if (j < seq_length) {\n            M_beam<0>(j, dfa);\n            M_beam<1>(j, dfa);\n            C_beam<0>(j, dfa);\n            C_beam<1>(j, dfa);\n        }\n    }\n\n    auto end_node = make_pair(seq_length, 0);\n    auto viterbi = bestC[end_node];\n\n    auto backtrace_result = backtrace(dfa, viterbi, end_node);\n\n    return DecoderResult<double, IndexType>{backtrace_result.seq, backtrace_result.structure, viterbi.score / 
-100.0, 0., viterbi.cai_score, seq_length};\n}\n\ntemplate <typename ScoreType, typename IndexType, typename NodeType>\nBeamCKYParser<ScoreType, IndexType, NodeType>::BeamCKYParser(const double lambda_value, const bool verbose)\n        : lambda(lambda_value), is_verbose(verbose) {\n    func9(0, 0);\n}\n\n}"
  },
  {
    "path": "src/beam_cky_parser.h",
    "content": "\n#pragma once\n\n#include <memory>\n#include <string>\n#include <cstring>\n#include <limits>\n#include <vector>\n#include <unordered_map>\n#include <thread>\n#include <mutex>\n\n#include \"Utils/network.h\"\n#include \"Utils/codon.h\"\n#include \"Utils/flat.h\"\n\n\nnamespace LinearDesign {\n\nnamespace detail {\n\n    struct NodeNucIndex {\n        LINEAR_DESIGN_INLINE size_t operator()(const NodeNucpair& node_nucpair) const {\n            return (node_nucpair.node_first << 4) | (node_nucpair.node_second << 3) | node_nucpair.nucpair;\n        }\n    };\n\n    struct NodeNucReverseIndex {\n        LINEAR_DESIGN_INLINE NodeNucpair operator()(const size_t index) const {\n            NodeNucpair node_nucpair = {IndexType(index >> 4), NumType((index & 0xf) >> 3), NucPairType(index & 0x7)};\n            return node_nucpair;\n        }\n    };\n\n    struct NodeIndex {\n        LINEAR_DESIGN_INLINE size_t operator()(const NodeType& node) const {\n            return (node.first << 1) | node.second;\n        }\n    };\n\n    struct NodeNucReverseIndex2 {\n        LINEAR_DESIGN_INLINE NodeType operator()(const size_t index) const {\n            return {index >> 1, (index & 0x1)};\n        }\n    };\n\n} /* detail */    \n\ntemplate <typename IndexType,\n          typename NucType = IndexType,\n          typename NodeType = std::pair<IndexType, NumType>,\n          typename DFAType = DFA<IndexType>>\nstring get_nuc_from_dfa_cai(DFAType& dfa, const NodeType& start_node, const NodeType& end_node,\n        const std::vector<std::string>& protein, std::unordered_map<std::string, std::unordered_map<std::tuple<NodeType, NodeType>, \n        std::tuple<double, NucType, NucType>, std::hash<std::tuple<NodeType, NodeType>>>>&\n        best_path_in_one_codon_unit, std::unordered_map<std::string, std::string>& aa_best_path_in_a_whole_codon) {\n\n    IndexType s_index = start_node.first;\n    IndexType t_index = end_node.first;\n\n    if (s_index >= t_index)\n        
return \"\";\n\n    auto aa_left = protein[s_index / 3]; // tri letter\n    auto aa_right = protein[t_index / 3];\n    auto start_node_re_index = make_pair(s_index % 3, start_node.second);\n    auto end_node_re_index = make_pair(t_index % 3, end_node.second);\n    if (t_index - s_index < 3) {\n        if (s_index / 3 == t_index / 3) {\n            std::string temp_seq = \"\";\n            auto& nucs = best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index, end_node_re_index)];\n            temp_seq.append(1, GET_ACGU(std::get<1>(nucs)));\n            if (std::get<2>(nucs) != k_void_nuc) \n                temp_seq.append(1, GET_ACGU(std::get<2>(nucs)));\n\n            if (temp_seq.length() != end_node.first - start_node.first) {\n                assert(false);\n            }\n            return temp_seq;\n        } else {\n            std::string temp_left = \"\";\n            std::string temp_right = \"\";\n            if (s_index % 3 != 0) {\n                auto& nucs = best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index, make_pair(0, 0))];\n                temp_left.append(1, GET_ACGU(std::get<1>(nucs)));\n                if (std::get<2>(nucs) != k_void_nuc) \n                    temp_left.append(1, GET_ACGU(std::get<2>(nucs)));\n            }\n\n            if (t_index % 3 != 0) {\n                auto& nucs = best_path_in_one_codon_unit[aa_right][make_tuple(make_pair(0, 0), end_node_re_index)];\n                temp_right.append(1, GET_ACGU(std::get<1>(nucs)));\n                if (std::get<2>(nucs) != k_void_nuc) \n                    temp_right.append(1, GET_ACGU(std::get<2>(nucs)));\n            }\n\n            assert((temp_left + temp_right).length() == end_node.first - start_node.first);\n\n            return temp_left + temp_right;\n        }\n\n    } else {\n\n        std::string temp_left = \"\";\n        std::string temp_mid = \"\";\n        std::string temp_right = \"\";\n\n        if (s_index % 3 != 0) {\n            
auto& nucs = best_path_in_one_codon_unit[aa_left][make_tuple(start_node_re_index, make_pair(0, 0))];\n            temp_left.append(1, GET_ACGU(std::get<1>(nucs)));\n            if (std::get<2>(nucs) != k_void_nuc) \n                temp_left.append(1, GET_ACGU(std::get<2>(nucs)));\n        }\n\n        IndexType protein_start_index = s_index / 3;\n        if (s_index % 3 != 0)\n            protein_start_index++;\n\n        IndexType protein_end_index = t_index / 3;\n\n        if (protein_start_index != protein_end_index) {\n            for (IndexType protein_index = protein_start_index; protein_index < protein_end_index; ++protein_index) {\n                \n                std::string nucs;\n                auto aa_tri = protein[protein_index];\n                if (k_map_3_1.count(aa_tri)) {\n                    nucs = aa_best_path_in_a_whole_codon[std::string(1, k_map_3_1[aa_tri])];\n                } else if (aa_best_path_in_a_whole_codon.count(aa_tri)) {\n                    nucs = aa_best_path_in_a_whole_codon[aa_tri];\n                } else {\n                    assert(false);\n                }\n\n                for (auto nuc : nucs) {\n                    temp_mid.append(1, nuc);\n                }\n            }\n        }\n\n        if (t_index % 3 != 0) {\n            auto& nucs = best_path_in_one_codon_unit[aa_right][make_tuple(make_pair(0, 0), end_node_re_index)];\n            temp_right.append(1, GET_ACGU(std::get<1>(nucs)));\n            if (std::get<2>(nucs) != k_void_nuc) \n                temp_right.append(1, GET_ACGU(std::get<2>(nucs)));\n        }\n\n        assert((temp_left + temp_mid + temp_right).length() == end_node.first - start_node.first);\n\n        return temp_left + temp_mid + temp_right;\n    }\n}\n\ntemplate <typename ScoreType,\n          typename IndexType,\n          typename NodeType = pair<IndexType, NumType>>\nclass BeamCKYParser {\npublic:\n    using State_t = State<ScoreType>;\n    using DFA_t = DFA<IndexType>;\n    using 
ScoreInnerDate_t = ScoreInnerDate<ScoreType, IndexType>;\n    using NextPair_t = vector<unordered_map<NodeType, vector<tuple<NodeType, NucType, double>>, hash_pair>>;\n    using NextPairSet_t = vector<unordered_map<NodeType, set<tuple<NodeType, NucType, double>>, hash_pair>>;\n    using PrefixScore_t = unordered_map<NodeType, ScoreType, hash_pair>;\n    using BestX_t_CAI = Flat<NodeType, Flat<NodeNucpair, State_t, detail::NodeNucIndex>, detail::NodeIndex>;\n    using BestM_t_CAI = Flat<NodeType, Flat<NodeType, State_t, detail::NodeIndex>, detail::NodeIndex>;\n    using BestC_t_CAI = Flat<NodeType, State_t, detail::NodeIndex>;\n    using Broken_codon_t_CAI = Flat<NodeType, Flat<NodeType, double, detail::NodeIndex>, detail::NodeIndex>;\n\n    BeamCKYParser(const double lambda_value, const bool verbose);\n\n    DecoderResult<double, IndexType> parse(DFA_t& dfa, \n        Codon& codon, \n        std::string& aa_seq, \n        std::vector<std::string>& p, \n        std::unordered_map<std::string, std::string>& aa_best_in_codon,\n        std::unordered_map<std::string, std::unordered_map<std::tuple<NodeType, NodeType>, \n        std::tuple<double, NucType, NucType>, std::hash<std::tuple<NodeType, NodeType>>>>& best_path_in_one_codon,\n        std::unordered_map<string, Lattice<IndexType>>& aa_graphs_with_ln_weights);\n\nprivate:\n    \n    template <IndexType j_num>\n    void hairpin_beam(IndexType j, DFA_t& dfa);\n    \n    template <IndexType j_num>\n    void Multi_beam(IndexType j, DFA_t& dfa);\n    \n    template <IndexType j_num>\n    void P_beam(IndexType j, DFA_t& dfa);\n    \n    template <IndexType j_num>\n    void M2_beam(IndexType j, DFA_t& dfa);\n    \n    template <IndexType j_num>\n    void M_beam(IndexType j, DFA_t& dfa);\n    \n    template <IndexType j_num>\n    void C_beam(IndexType j, DFA_t& dfa);\n\n    void update_if_better(State_t &state, const ScoreType newscore, const double cai_score) {\n        if (state.score + state.cai_score < newscore + 
cai_score) {\n            state.score = newscore;\n            state.cai_score = cai_score;\n        }\n    }\n\n    void update_if_better(State_t &state, const ScoreType newscore, const double cai_score, const NodeType pre_node, const double pre_left_cai) {\n        if (state.score + state.cai_score < newscore + cai_score) {\n            state.score = newscore;\n            state.cai_score = cai_score;\n            state.pre_node = pre_node;\n            state.pre_left_cai = pre_left_cai;\n        }\n    }\n\n\n    void get_next_pair(DFA_t& dfa);\n    void get_next_pair_set();\n\n    void get_prev_pair(DFA_t& dfa);\n    void get_prev_pair_set();\n\n    void preprocess(DFA_t& dfa);    \n\n    BacktraceResult backtrace(DFA_t& dfa, const State_t& state, NodeType end_node);\n\n    ScoreType quickselect_partition(std::vector<ScoreInnerDate_t>& scores,\n        ScoreType lower, ScoreType upper);\n\n    ScoreType quickselect(std::vector<ScoreInnerDate_t>& scores,\n        const ScoreType lower, const ScoreType upper, const IndexType k);\n\n    double get_broken_codon_score(const NodeType& start_node, const NodeType& end_node);\n\n    double lambda;\n    bool is_verbose;\n\n    IndexType seq_length; \n\n    BestX_t_CAI bestH, bestP, bestMulti;\n    BestM_t_CAI bestM2, bestM, bestM_P; // hzhang: bestM_P\n    BestC_t_CAI bestC;\n\n    detail::NodeNucReverseIndex reverse_index;\n    detail::NodeNucReverseIndex2 reverse_index2;\n\n    NextPair_t next_pair;\n    NextPairSet_t next_pair_set;\n\n    NextPair_t prev_pair;\n    NextPairSet_t prev_pair_set;\n\n    NextPair_t next_list;\n    NextPair_t prev_list;\n\n    vector<vector<vector<ScoreType>>> bulge_score;\n    vector<vector<ScoreType>> stacking_score;\n\n    std::unordered_map<string, Lattice<IndexType>> aa_graphs_with_ln_w;\n\n    std::vector<std::string> protein;\n    std::unordered_map<std::string, std::string> aa_best_path_in_a_whole_codon;\n    std::unordered_map<std::string, \n                       
std::unordered_map<std::tuple<NodeType, NodeType>, std::tuple<FinalScoreType, NucType, NucType>, \n                       std::hash<std::tuple<NodeType, NodeType>>>> best_path_in_one_codon_unit;\n\n    Broken_codon_t_CAI get_broken_codon_score_map;\n\n#ifdef SPECIAL_HP\n    unordered_map<NodeType, unordered_map<NodeType, unordered_map<int8_t, vector<tuple<string, ScoreType, FinalScoreType>>>, hash_pair>, hash_pair> hairpin_seq_score_cai;\n    void special_hp(DFA_t& dfa, int8_t hairpin_length);\n#endif\n};\n\n}\n"
  },
  {
    "path": "src/linear_design.cpp",
    "content": "#include <iomanip>\n#include \"beam_cky_parser.h\"\n#include \"beam_cky_parser.cc\"\n#include \"Utils/reader.h\"\n#include \"Utils/common.h\"\n#include \"Utils/codon.h\"\n\n// #ifndef CODON_TABLE\n// #define CODON_TABLE \"./codon_usage_freq_table_human.csv\"\n// #endif\n\n#ifndef CODING_WHEEL\n#define CODING_WHEEL \"./coding_wheel.txt\"\n#endif\n\nusing namespace LinearDesign;\n\ntemplate <typename ScoreType, typename IndexType>\nbool output_result(const DecoderResult<ScoreType, IndexType>& result, \n        const double duration, const double lambda, const bool is_verbose, \n        const Codon& codon, string& CODON_TABLE) {\n\n    stringstream ss;\n    if (is_verbose)\n        ss << \"Using lambda = \" << (lambda / 100.) << \"; Using codon frequency table = \" << CODON_TABLE << endl;\n    ss << \"mRNA sequence:  \" << result.sequence << endl;\n    ss << \"mRNA structure: \" << result.structure << endl;\n    ss << \"mRNA folding free energy: \" << std::setprecision(2) << fixed << result.score \n                                        << \" kcal/mol; mRNA CAI: \" << std::setprecision(3) \n                                        << fixed << codon.calc_cai(result.sequence) << endl;\n    if (is_verbose)\n        ss << \"Runtime: \" << duration << \" seconds\" << endl;\n    cout << ss.str() << endl;\n\n    return true;\n}\n\nvoid show_usage() {\n    cerr << \"echo SEQUENCE | ./lineardesign -l [LAMBDA]\" << endl;\n    cerr << \"OR\" << endl;\n    cerr << \"cat SEQ_FILE_OR_FASTA_FILE | ./lineardesign -l [LAMBDA]\" << endl;\n}\n\n\nint main(int argc, char** argv) {\n\n    // default args\n    double lambda = 0.0f;\n    bool is_verbose = false;\n    string CODON_TABLE = \"./codon_usage_freq_table_human.csv\";\n\n    // parse args\n    if (argc != 4) {\n        show_usage();\n        return 1;\n    }else{\n        lambda = atof(argv[1]);\n        is_verbose = atoi(argv[2]) == 1;\n        if (string(argv[3]) != \"\"){\n            CODON_TABLE = argv[3];\n     
   }\n    } \n    lambda *= 100.;\n    \n    // load codon table and coding wheel\n    Codon codon(CODON_TABLE);\n    std::unordered_map<string, Lattice<IndexType>> aa_graphs_with_ln_weights;\n    std::unordered_map<std::string, std::unordered_map<std::tuple<NodeType, NodeType>, std::tuple<double, NucType, NucType>, std::hash<std::tuple<NodeType, NodeType>>>> best_path_in_one_codon_unit;\n    std::unordered_map<std::string, std::string> aa_best_path_in_a_whole_codon;\n    prepare_codon_unit_lattice<IndexType>(CODING_WHEEL, codon, aa_graphs_with_ln_weights, best_path_in_one_codon_unit, aa_best_path_in_a_whole_codon, lambda);\n\n    // main loop\n    string aa_seq, aa_tri_seq;\n    vector<string> aa_seq_list, aa_name_list;\n    // load input\n    for (string seq; getline(cin, seq);){\n        if (seq.empty()) continue;\n        if (seq[0] == '>'){\n            aa_name_list.push_back(seq); // sequence name\n            if (!aa_seq.empty())\n                aa_seq_list.push_back(aa_seq);\n            aa_seq.clear();\n            continue;\n        }else{\n            rtrim(seq);\n            aa_seq += seq;\n        }\n    }\n    if (!aa_seq.empty())\n        aa_seq_list.push_back(aa_seq);\n\n    // start design\n    // size_t index avoids signed/unsigned comparisons against vector::size()\n    for (size_t i = 0; i < aa_seq_list.size(); i++){\n        if (aa_name_list.size() > i)\n            cout << aa_name_list[i] << endl;\n        auto& aa_seq = aa_seq_list[i];\n        // convert to uppercase\n        transform(aa_seq.begin(), aa_seq.end(), aa_seq.begin(), ::toupper);\n        aa_tri_seq.clear();\n        if (is_verbose)\n            cout << \"Input protein: \" << aa_seq << endl;\n        if (!ReaderTraits<Fasta>::cvt_to_seq(aa_seq, aa_tri_seq)) \n            continue;\n\n        // init parser\n        BeamCKYParser<ScoreType, IndexType> parser(lambda, is_verbose);\n\n        auto protein = util::split(aa_tri_seq, ' ');\n        // parse\n        auto system_start = chrono::system_clock::now();\n        auto dfa = 
get_dfa<IndexType>(aa_graphs_with_ln_weights, util::split(aa_tri_seq, ' '));\n        auto result = parser.parse(dfa, codon, aa_seq, protein, aa_best_path_in_a_whole_codon, best_path_in_one_codon_unit, aa_graphs_with_ln_weights);\n        auto system_diff = chrono::system_clock::now() - system_start;\n        auto system_duration = chrono::duration<double>(system_diff).count();\n\n        // output\n        output_result(result, system_duration, lambda, is_verbose, codon, CODON_TABLE);\n\n#ifdef FINAL_CHECK\n        // sanity check: the designed mRNA must translate back to the input protein\n        const auto translated = codon.cvt_rna_seq_to_aa_seq(result.sequence);\n        if (translated != aa_seq) {\n            std::cerr << \"Final Check Failed:\" << std::endl;\n            std::cerr << translated << std::endl;\n            std::cerr << aa_seq << std::endl;\n            assert(false);\n        }\n#endif\n    }\n    return 0;\n}\n"
  },
  {
    "path": "testseq",
    "content": ">seq1\nMPNTLACP\n>seq2\nMLDQVNKLKYPEVSLT*\n"
  }
]