[
  {
    "path": ".Rbuildignore",
    "content": "^.*\\.Rproj$\n^\\.Rproj\\.user$\nMakefile\nREADME.md\nREADME_files\nREADME.Rmd\n^_pkgdown\\.yml$\n^docs$\n^pkgdown$\nlogo.png\nCONDUCT.md\n"
  },
  {
    "path": ".gitignore",
    "content": ".Rproj.user\n.Rhistory\n.RData\n.Renviron\n.DS_Store\ninst/doc\nggmsa.Rproj\nggmsa.Rcheck\n.git\ndocs/\npkgdown/\n"
  },
  {
    "path": "CONDUCT.md",
    "content": "# Contributor Code of Conduct\n\nAs contributors and maintainers of this project, we pledge to respect all people who \ncontribute through reporting issues, posting feature requests, updating documentation,\nsubmitting pull requests or patches, and other activities.\n\nWe are committed to making participation in this project a harassment-free experience for\neveryone, regardless of level of experience, gender, gender identity and expression,\nsexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.\n\nExamples of unacceptable behavior by participants include the use of sexual language or\nimagery, derogatory comments or personal attacks, trolling, public or private harassment,\ninsults, or other unprofessional conduct.\n\nProject maintainers have the right and responsibility to remove, edit, or reject comments,\ncommits, code, wiki edits, issues, and other contributions that are not aligned to this \nCode of Conduct. Project maintainers who do not follow the Code of Conduct may be removed \nfrom the project team.\n\nInstances of abusive, harassing, or otherwise unacceptable behavior may be reported by \nopening an issue or contacting one or more of the project maintainers.\n\nThis Code of Conduct is adapted from the Contributor Covenant \n(http://contributor-covenant.org), version 1.0.0, available at \nhttp://contributor-covenant.org/version/1/0/0/\n"
  },
  {
    "path": "DESCRIPTION",
    "content": "Package: ggmsa\nTitle: Plot Multiple Sequence Alignment using 'ggplot2'\nVersion: 1.19.0\nAuthors@R: c(person(\"Guangchuang\", \"Yu\", email = \"guangchuangyu@gmail.com\", role = c(\"aut\", \"cre\",\"ths\"), comment = c(ORCID = \"0000-0002-6485-8781\")),\n             person(\"Lang\", \"Zhou\",      email = \"nyzhoulang@gmail.com\",    role = \"aut\"),\n             person(\"Shuangbin\", \"Xu\",   email = \"xshuangbin@163.com\",      role = \"ctb\"),\n             person(\"Huina\", \"Huang\",    email = \"1185796994@qq.com\",       role = \"ctb\"))\nDescription: A visual exploration tool for multiple sequence alignment \n    and associated data. Supports MSA of DNA, RNA, and protein sequences \n    using 'ggplot2'. Multiple sequence alignment can easily be combined \n    with other 'ggplot2' plots, such as phylogenetic tree Visualized by \n    'ggtree', boxplot, genome map and so on. More features: visualization \n    of sequence logos, sequence bundles, RNA secondary structures and detection \n    of sequence recombinations.\nDepends: R (>= 4.1.0)\nImports:\n    Biostrings, \n    ggplot2,\n    magrittr,\n    tidyr,\n    utils,\n    stats,\n    aplot,\n    RColorBrewer,\n    ggfun (>= 0.2.0),\n    ggforce,\n    dplyr,\n    R4RNA,\n    grDevices,\n    seqmagick,\n    grid,\n    methods,\n    ggtree (>= 1.17.1)\nSuggests:\n    ggtreeExtra,\n    ape,\n    cowplot,\n    knitr,\n    rmarkdown,\n    readxl,\n    ggnewscale,\n    kableExtra,\n    gggenes,\n    statebins,\n    prettydoc,\n    testthat (>= 3.0.0),\n    yulab.utils\nLicense: Artistic-2.0\nEncoding: UTF-8\nURL: https://doi.org/10.1093/bib/bbac222(paper), https://www.amazon.com/Integration-Manipulation-Visualization-Phylogenetic-Computational-ebook/dp/B0B5NLZR1Z/ (book)\nBugReports: https://github.com/YuLab-SMU/ggmsa/issues\nbiocViews: Software, Visualization, Alignment, Annotation, MultipleSequenceAlignment\nRoxygenNote: 7.3.2\nVignetteBuilder: knitr\nConfig/testthat/edition: 3\n"
  },
  {
    "path": "Makefile",
    "content": "PKGNAME := $(shell sed -n \"s/Package: *\\([^ ]*\\)/\\1/p\" DESCRIPTION)\nPKGVERS := $(shell sed -n \"s/Version: *\\([^ ]*\\)/\\1/p\" DESCRIPTION)\nPKGSRC  := $(shell basename `pwd`)\nBIOCVER := RELEASE_3_23\n\n\nall: rd check clean\n\nalldocs: rd readme mkdocs\n\nrd:\n\tRscript -e 'roxygen2::roxygenise(\".\")'\n\nreadme:\n\tRscript -e 'rmarkdown::render(\"README.Rmd\")'\n\nreadme2:\n\tRscript -e 'rmarkdown::render(\"README.Rmd\", \"html_document\")'\n\nbuild:\n\t# cd ..;\\\n\t# R CMD build $(PKGSRC)\n\tRscript -e 'devtools::build()'\n\nbuild2:\n\tcd ..;\\\n\tR CMD build --no-build-vignettes $(PKGSRC)\n\ninstall:\n\tcd ..;\\\n\tR CMD INSTALL $(PKGNAME)_$(PKGVERS).tar.gz\n\ncheck: #build\n\t#cd ..;\\\n\t#Rscript -e 'rcmdcheck::rcmdcheck(\"$(PKGNAME)_$(PKGVERS).tar.gz\")'\n\tRscript -e 'devtools::check()'\n\ncheck2: build\n\tcd ..;\\\n\tR CMD check $(PKGNAME)_$(PKGVERS).tar.gz\n\nbioccheck:\n\tcd ..;\\\n\tRscript -e 'BiocCheck::BiocCheck(\"$(PKGNAME)_$(PKGVERS).tar.gz\")'\n\t\ngpcheck:\n\tRscript -e 'goodpractice::gp()'\n\nclean:\n\tcd ..;\\\n\t$(RM) -r $(PKGNAME).Rcheck/\n\ngitmaintain:\n\tgit gc --auto;\\\n\tgit prune -v;\\\n\tgit fsck --full\n\nrmrelease:\n\tgit branch -D $(BIOCVER)\n\nrelease:\n\tgit checkout $(BIOCVER);\\\n\tgit fetch --all\n\nupdate:\n\tgit fetch --all;\\\n\tgit checkout devel;\\\n\tgit merge upstream/devel;\\\n\tgit merge origin/devel;\\\n\n\npush: \n\tgit push upstream devel;\\\n\tgit push origin devel\n\nbiocinit:\n\tgit remote add upstream git@git.bioconductor.org:packages/$(PKGNAME).git;\\\n\tgit fetch --all"
  },
  {
    "path": "NAMESPACE",
    "content": "# Generated by roxygen2: do not edit by hand\n\nS3method(diff,SeqDiff)\nS3method(ggplot_add,GCcontent)\nS3method(ggplot_add,facet_msa)\nS3method(ggplot_add,msaBar)\nS3method(ggplot_add,nucleotideeHelix)\nS3method(ggplot_add,seed)\nS3method(ggplot_add,seqlogo)\nexport(adjust_ally)\nexport(assign_dms)\nexport(available_colors)\nexport(available_fonts)\nexport(available_msa)\nexport(extract_seq)\nexport(facet_msa)\nexport(geom_GC)\nexport(geom_helix)\nexport(geom_msa)\nexport(geom_msaBar)\nexport(geom_seed)\nexport(geom_seqlogo)\nexport(ggSeqBundle)\nexport(gghelix)\nexport(ggmaf)\nexport(ggmsa)\nexport(merge_seq)\nexport(readSSfile)\nexport(read_maf)\nexport(reset_pos)\nexport(seqdiff)\nexport(seqlogo)\nexport(simplify_hdata)\nexport(simplot)\nexport(theme_msa)\nexport(tidy_hdata)\nexport(tidy_maf_df)\nexport(tidy_msa)\nexport(treeMSA_plot)\nexportMethods(plot)\nexportMethods(show)\nimportClassesFrom(Biostrings,BStringSet)\nimportFrom(Biostrings,AAStringSet)\nimportFrom(Biostrings,DNAStringSet)\nimportFrom(Biostrings,RNAStringSet)\nimportFrom(Biostrings,readBStringSet)\nimportFrom(Biostrings,readDNAStringSet)\nimportFrom(Biostrings,toString)\nimportFrom(Biostrings,width)\nimportFrom(R4RNA,as.helix)\nimportFrom(R4RNA,collapseHelix)\nimportFrom(R4RNA,expandHelix)\nimportFrom(R4RNA,readBpseq)\nimportFrom(R4RNA,readConnect)\nimportFrom(R4RNA,readHelix)\nimportFrom(R4RNA,readVienna)\nimportFrom(RColorBrewer,brewer.pal)\nimportFrom(aplot,insert_top)\nimportFrom(aplot,plot_list)\nimportFrom(dplyr,group_by)\nimportFrom(dplyr,group_by_)\nimportFrom(dplyr,n)\nimportFrom(dplyr,select)\nimportFrom(dplyr,summarize)\nimportFrom(dplyr,summarize_)\nimportFrom(ggforce,geom_arc)\nimportFrom(ggfun,geom_xspline)\nimportFrom(ggplot2,Geom)\nimportFrom(ggplot2,aes)\nimportFrom(ggplot2,aes_)\nimportFrom(ggplot2,coord_cartesian)\nimportFrom(ggplot2,coord_fixed)\nimportFrom(ggplot2,draw_key_polygon)\nimportFrom(ggplot2,element_blank)\nimportFrom(ggplot2,element_line)\nimportFro
m(ggplot2,element_text)\nimportFrom(ggplot2,facet_wrap)\nimportFrom(ggplot2,geom_area)\nimportFrom(ggplot2,geom_blank)\nimportFrom(ggplot2,geom_col)\nimportFrom(ggplot2,geom_line)\nimportFrom(ggplot2,geom_point)\nimportFrom(ggplot2,geom_polygon)\nimportFrom(ggplot2,geom_ribbon)\nimportFrom(ggplot2,geom_segment)\nimportFrom(ggplot2,geom_smooth)\nimportFrom(ggplot2,geom_text)\nimportFrom(ggplot2,geom_tile)\nimportFrom(ggplot2,ggplot)\nimportFrom(ggplot2,ggplot_add)\nimportFrom(ggplot2,ggplot_build)\nimportFrom(ggplot2,ggplot_gtable)\nimportFrom(ggplot2,ggproto)\nimportFrom(ggplot2,ggtitle)\nimportFrom(ggplot2,labs)\nimportFrom(ggplot2,layer)\nimportFrom(ggplot2,scale_color_manual)\nimportFrom(ggplot2,scale_fill_gradientn)\nimportFrom(ggplot2,scale_fill_manual)\nimportFrom(ggplot2,scale_x_continuous)\nimportFrom(ggplot2,scale_y_continuous)\nimportFrom(ggplot2,theme)\nimportFrom(ggplot2,theme_bw)\nimportFrom(ggplot2,theme_minimal)\nimportFrom(ggplot2,theme_void)\nimportFrom(ggplot2,xlab)\nimportFrom(ggplot2,xlim)\nimportFrom(ggplot2,ylab)\nimportFrom(ggtree,geom_facet)\nimportFrom(ggtree,geom_tiplab)\nimportFrom(grDevices,colorRampPalette)\nimportFrom(grid,arrow)\nimportFrom(grid,gTree)\nimportFrom(grid,gpar)\nimportFrom(grid,polygonGrob)\nimportFrom(grid,unit)\nimportFrom(grid,unit.pmax)\nimportFrom(magrittr,\"%<>%\")\nimportFrom(magrittr,\"%>%\")\nimportFrom(methods,missingArg)\nimportFrom(methods,new)\nimportFrom(methods,show)\nimportFrom(seqmagick,fa_read)\nimportFrom(stats,setNames)\nimportFrom(tidyr,gather)\nimportFrom(utils,getFromNamespace)\nimportFrom(utils,globalVariables)\nimportFrom(utils,modifyList)\nimportFrom(utils,packageDescription)\nimportFrom(utils,read.delim)\n"
  },
  {
    "path": "NEWS.md",
    "content": "# ggmsa 1.18.0\n\n+ Bioconductor RELEASE_3_23 (2026-04-29, Wed)\n\n# ggmsa 1.16.0\n\n+ Bioconductor RELEASE_3_22 (2025-11-01, Sat)\n\n# ggmsa 1.15.1\n\n+ replace `ggalt::geom_xspline()` with `ggfun::geom_xspline()` (2017-07-12, Sat)\n\n# ggmsa 1.3.3\n\n+ calling `\\dontrun{}` for examples on `ggmsa()`\n\n# ggmsa 1.3.2\n+ bugfix: `geom_msaBar` conservation layer incorrectly aligned issues#34(2022-5-13, Fri)\n\n# ggmsa 1.3.1\n\n+ A new feature--selects ancestral sequence on Tree-MSA plot `treeMSA_plot` (2022-4-14, Thu)\n+ A new feature--visualization of genome alignment `ggmaf` (2022-4-14, Thu)\n+ A test feature--visualization protein-protein interactive (2022-4-14, Thu)\n+ updated the way smooth is invoked on simplot(2022-01-03, Mon)\n\n# ggmsa 1.1.4\nadded smoothed curve on simplot.(2021-12-17, Fri)\n\n# ggmsa 1.1.3\nfixed the typo in \"posHighligthed\", and changed it to \nsnake_case \"position_highlight\" from camelCase \"posHighligthed\" (2021-12-13, Mon)\n\n\n# ggmsa 1.1.2\nfixed the assignment error on line 155 'seqlogo.R'\n\n# ggmsa 1.1.1 \nfixed error: using `||` instead of `|` on 110 lines in geom_msa.R\n\n\n# ggmsa 0.99.0 or 0.99.x\n(Prepare for submission to `Bioconductor`, 2021-09-22 Wed)\n\n+ 0.99.1 update DESCRIPTION and NEWS files (2021-09-28, Tue)\n+ 0.99.2 add documentation for row data in extdata/inst and clean up code (2021-09-29, Wed)\n+ 0.99.3 remove some  vignettes from master (build on the gh-pages branch) (2021-10-1, Fri)\n+ 0.99.4 remove 'stringr' package from 'Imports' (2021-10-11, Mon)\n+ 0.99.5 make the consensus_views compatible ggtreeExtra and add package description. (2021-10-21, Thu)\n\n# ggmsa 0.0.10 \n\n+ update default color schemes in  lower part of the SeqDiff plot (2021-08-20, Fri)\n\n# ggmsa 0.0.9\n\n+ import R4RNA to fix R check (2021-08-03, Tue)\n\n# ggmsa 0.0.8\n\n+ bugfix: fix variable names error in color_scheme. 
(2021-07-29, Thu)\n+ The migration of sequence recombination functionality from `seqcombo` package. (2021-07-20, Tue)\n\n\n# ggmsa 0.0.7\n\n+ added `gghelix()` and `geom_helix()`.(2021-04-1, Thu)\n+ added option to show the fill legend.(2021-03-23, Tue)\n+ added an error message to remind that \"sequences must have unique names\".(2021-03-18, Thu)\n+ added `ggSeqBundle()` to plot Sequence Bundles for MSAs based on `ggplot2` (2021-03-18, Thu)\n\n# ggmsa 0.0.6\n\n+ supports linking `ggtreeExtra`. (2021-01-21, Thu)\n+ bugfix: reversed sequence in 'tree + geom_facet(font)' . (2021-01-21, Thu)\n+ bugfix: partitioning error when the sequence starting point is greater than 1. (2021-01-21, Thu)\n+ bugfix: generates continuous x-axis labels for each panel. (2021-01-21, Thu)\n+ supports custom colors via `custom_color`. (2020-12-28, Mon)\n\n# ggmsa 0.0.5\n\n+ added a new view called `by_conservation`.(2020-12-22, Tue)\n+ added a new color scheme `Hydrophobicity` and a new parameter `border`.(2020-12-21, Mon)\n+ rewrite the function `facet_msa()`.(2020-12-03, Thu)\n+ Debug: tree + geom_facet(geom_msa()) does not work.(2020-12-03, Thu)\n+ added a new function `geom_msaBar()`.(2020-12-03, Thu)\n+ added a new parameter `ignore_gaps` used in consensus views.(2020-10-09, Fri)\n+ debug in consensus views (2020-10-05, Mon)\n+ added consensus views (2020-9-30, Wed)\n+ added new colors `LETTER` and `CN6` provided by ShixiangWang.[issues#8](https://github.com/YuLab-SMU/ggmsa/issues/8)\n\n# ggmsa 0.0.4\n\n+ fixed warning message in **msa_data.R** (2020-4-26, Sun)\n+ added ggplot_add methods for `geom_*()` (2020-4-24, Fri)\n+ added a parameter `seq_name` in `ggmsa()` (2020-4-23, Thu)\n+ added a new function `facet_msa()` --> break down the MSA (2020-4-17, Fri)\n+ added a parameter `posHighlighted` in `ggmsa()` (2020-4-17, Fri)\n+ created a new layer `geom_asterisk()` to optimize `geom_seed()` (2020-4-11, Sat)\n+ added new functions `available_colors()`, `available_fonts()` and `available_msa()` 
(2020-3-30, Thu)\n+ added a new function `geom_seed()` --> highlight the seed region in miRNA sequences (2020-3-27, Fri)\n+ added a new function `ggmotif()`--> plot sequence motifs independently (2020-3-23, Tue)\n+ added a Monospaced Font `DroidSansMono` (2020-3-23, Mon)\n\n# ggmsa 0.0.3\n\n+ release of v=0.0.3 (2020-03-16, Mon)\n+ added a new function `geom_GC()` --> plot GC content in MSA (2020-02-28, Fri)\n+ added a new function `geom_seqlogo()` --> plot sequence motifs in MSA (2020-02-14, Fri)\n+ used a proportional scaling algorithm (2020-01-08, Wed)\n\n\n# ggmsa 0.0.2\n\n+ support plotting sequence logos (2019-12-25, Wed)\n+ added three fonts: `helvetical`, `times_new_roman`, `mono` (2019-12-21, Sat)\n+ ~~added three fonts: `serif_font`, `Montserrat_font`, `roboto_font` (2019-12-17, Tue)~~\n+ added internal outline polygons (2019-12-15, Sun)\n+ bug fix for `tidy_msa`\n+ import `seqmagick` for parsing fasta \n+ `tidy_msa` for converting msa file/object to tidy data frame (2019-12-09, Mon)\n\n \n# ggmsa 0.0.1\n\n+ initial CRAN release (2019-10-17, Thu) \n+ removed from CRAN on 2021-08-17\n"
  },
  {
    "path": "R/AllClasses.R",
    "content": "setClass(\"SeqDiff\",\n         representation = representation(\n                          file = \"character\",\n                          sequence = \"BStringSet\",\n                          reference = \"numeric\",\n                          diff = \"data.frame\"\n                          )\n        )\n"
  },
  {
    "path": "R/SeqBundles.R",
    "content": "##'  plot Sequence Bundles for MSA based 'ggolot2'\n##'\n##'\n##' @title ggSeqBundle\n##' @importFrom ggfun geom_xspline\n##' @param msa Multiple sequence alignment file(FASTA) or object for \n##' representing either nucleotide sequences or peptide sequences.Also receives\n##'  multiple MSA files.\n##'  eg:msa = c(\"Gram-negative_AKL.fasta\", \"Gram-positive_AKL.fasta\").\n##' @param line_width The width of bundles at each site, default is 0.3.\n##' @param line_thickness The thickness of bundles at each site, default is 0.3.\n##' @param line_high The high of bundles at each site, default is 0.\n##' @param spline_shape A numeric vector of values between -1 and 1, which \n##' control the shape of the spline relative to the control points.\n##' @param size A numeric vector of values between 0 and 1, \n##' which control the size of each lines.\n##' @param alpha A numeric vector of values between 0 and 1, \n##' which control the alpha of each lines.\n##' @param bundle_color The colors of each sequence bundles.\n##' eg: bundle_color = c(\"#2ba0f5\",\"#424242\").\n##' @param lev_molecule Reassigning the Y-axis and displaying \n##' letter-coded amino acids/nucleotides arranged by physiochemical \n##' properties or others.eg:amino acids hydrophobicity \n##' lev_molecule = c(\"-\",\"A\", \"V\", \"L\", \"I\", \"P\", \"F\", \"W\", \"M\", \n##'    \"G\", \"S\",\"T\", \"C\", \"Y\", \"N\", \"Q\", \"D\", \"E\", \"K\",\"R\", \"H\").\n##' @return ggplot object\n##' @export\n##' @examples\n##' aln <- system.file(\"extdata\", \"Gram-negative_AKL.fasta\", package = \"ggmsa\")\n##' ggSeqBundle(aln)\n##' @author Lang Zhou\nggSeqBundle <- function(msa,\n                        line_width = 0.3,\n                        line_thickness = 0.3,\n                        line_high = 0,\n                        spline_shape = 0.3,\n                        size = 0.5,\n                        alpha = 0.2,\n                        bundle_color = c(\"#2ba0f5\",\"#424242\"),\n        
                lev_molecule = c(\"-\", \"A\", \"V\", \"L\", \"I\", \"P\", \n                                         \"F\", \"W\", \"M\", \"G\", \"S\",\"T\", \n                                         \"C\", \"Y\", \"N\", \"Q\", \"D\", \"E\", \n                                         \"K\", \"R\", \"H\")\n                        ) {\n    if(length(msa) > length(bundle_color)) {\n      stop(\"Each MSA group should be assigned a bundle color!!\")\n    }\n\n    df <- lapply(seq_along(msa), function(i){\n        df_aa <- tidy_msa(msa[[i]])\n        df_aa$name <- as.character(df_aa$name)\n        df_aa$group <- i\n        df_aa\n    })%>% do.call(\"rbind\",.)\n\n    dd <- adjustMSA(df_msa = df,\n                    lev_molecule = lev_molecule,\n                    line_width = line_width,\n                    line_thickness = line_thickness,\n                    line_high = line_high,\n                    bundle_color = bundle_color\n                    )\n\n    mapping <- aes(x = position_adj, y = y_adj, \n                   group=name, color = I(bundle_color))\n    ggplot(data = dd, mapping = mapping) +\n        geom_xspline(shape = spline_shape, linewidth = size, alpha = alpha) +\n            theme_bundles(df = df, lev_molecule = lev_molecule)\n\n}\n\n\n\nadjustMSA <- function(df_msa, lev_molecule, line_width, \n                      line_thickness, bundle_color, line_high) {\n    data_scale <- lapply(nrow(df_msa) %>% seq_len(), function(i) {\n        d <- df_msa[i,]\n        d[2,] <-  d[1,]\n        d[1,\"position_adj\"] <- d[1,\"position\"] - line_width\n        d[2,\"position_adj\"] <- d[2,\"position\"] + line_width\n        d\n    }) %>% do.call(\"rbind\",.)\n\n    data_scale$y <- factor(data_scale$character, levels = lev_molecule) %>%\n      as.numeric()\n\n    data_adj <- lapply(data_scale$group %>% unique, function(g) {\n        data_group <- data_scale[data_scale$group == g,]\n        thickness <- line_thickness / factor(data_group$name) %>% \n          
as.numeric %>%\n          max\n        \n        dd_adj <- lapply(unique(data_group$position), function(i){\n            df_pos <- data_group[data_group$position == i,]\n            lapply(unique(df_pos$y), function(j){\n                df_y <- df_pos[df_pos$y == j,]\n                thick_lev <- df_y$name %>% factor %>% as.numeric - 1\n                df_y$y_adj <- df_y$y - 0.4 + line_high + thickness * \n                              thick_lev + line_thickness * (g - 1)\n                df_y\n            }) %>% do.call(\"rbind\",.)\n        }) %>% do.call(\"rbind\",.)\n    dd_adj$bundle_color <- bundle_color[[g]]\n    dd_adj\n    }) %>% do.call(\"rbind\",.)\n    return(data_adj)\n}\n\n##' @importFrom ggplot2 element_line\ntheme_bundles <- function(df, lev_molecule){\n    break_y <- factor(lev_molecule, levels = lev_molecule) %>% as.numeric\n    minor_y <- c(break_y + 0.5, break_y - 0.5) %>% unique\n    break_x <- max(df$position) %>% seq_len\n    minor_x <- c(break_x + 0.5, break_x - 0.5) %>% unique\n\n    list(\n        ylab(NULL),\n        xlab(\"Position number\"),\n        scale_x_continuous(breaks = break_x, \n                           labels = break_x, \n                           minor_breaks = minor_x),\n        scale_y_continuous(breaks = break_y, \n                           labels = lev_molecule, \n                           minor_breaks = minor_y),\n        theme(panel.grid.minor.y = element_line(color = \"#e8e0e0\", linewidth = 0.4),\n              axis.line.x = element_line(color = \"gray60\", linewidth = 0.8),\n              panel.grid.major = element_blank(),\n              axis.ticks.y = element_blank(),\n              panel.background = element_blank())\n  )\n}\n\n\n\n\n\n"
  },
  {
    "path": "R/ancestor_seq.R",
    "content": "##' plot Tree-MSA plot\n##'\n##'\n##' 'treeMSA_plot()' automatically re-arranges the MSA data according to \n##' the tree structure,\n##' @title treeMSA_plot\n##' @param p_tree tree view\n##' @param tidymsa_df tidy MSA data \n##' @param ancestral_node vector, internal node in tree. Assigning a internal \n##' node to display \"ancestral sequences\",If ancestral_node = \"none\" hides \n##' all ancestral sequences, if ancestral_node = \"all\" shows all ancestral \n##' sequences.\n##' @param sub logical value. Displaying a subset of ancestral sequences or not.\n##' @param panel panel name for plot of MSA data\n##' @param font font families, possible values are 'helvetical', 'mono', and \n##' 'DroidSansMono', 'TimesNewRoman'.  Defaults is 'helvetical'. \n##' If font = NULL, only plot the background tile.\n##' @param color a Color scheme. One of 'Clustal', 'Chemistry_AA', \n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n##' 'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.\n##' @param seq_colname the colname of MSA on tree$data\n##' @param ... additional parameters for 'geom_msa'\n##' @export\n##' @importFrom ggtree geom_facet\n##' @return ggplot object \n##' @author Lang Zhou\ntreeMSA_plot <- function(p_tree, \n                         tidymsa_df, \n                         ancestral_node = \"none\", \n                         sub = FALSE,\n                         panel = \"MSA\",\n                         font = NULL,\n                         color = \"Chemistry_AA\",\n                         seq_colname = NULL,\n                         ...) 
{\n  \n  if(!ancestral_node == \"none\" && is.null(seq_colname)) {\n    stop(\"pls assign the colname of MSA on tree$data by arguments 'seq_colname'!\")\n  } \n  \n  \n  if(!ancestral_node == \"none\") {\n    p_tree <- adjust_ally(p_tree, node = ancestral_node, \n                          sub = sub,\n                          seq_colname = seq_colname)\n    \n    tidymsa_df <- extract_seq(p_tree, \n                              seq_colname = seq_colname)\n  }\n  \n  p <- p_tree + geom_facet(geom = geom_msa, \n                      data = tidymsa_df,  \n                      panel = panel,\n                      font = font, \n                      color = color,\n                      ...)\n  \n  if(ancestral_node == \"none\") {\n    p <- p + geom_tiplab(offset = 0.002)\n  }\n  \n  p\n}\n\n##' adjust the tree branch position after assigning ancestor node\n##'\n##' @title adjust_ally\n##' @param tree ggtree object\n##' @param node internal node in tree\n##' @param sub logical value.\n##' @param seq_colname the colname of MSA on tree$data\n##' @importFrom ggtree geom_tiplab\n##' @importFrom ggplot2 aes_\n##' @importFrom utils getFromNamespace\n##' @return tree\n##' @export\n##' @author Lang Zhou\n\nadjust_ally <- function(tree, node, sub = FALSE, seq_colname = \"mol_seq\") {\n  getSubtree <- getFromNamespace(\"getSubtree\", \"ggtree\")\n  \n  if(node == \"all\"){\n    d <- tree$data\n    ancestor_n <- d[!d$isTip & !is.na(d[,seq_colname][[1]]),\"node\"][[1]]\n  }else {\n    \n    if(sub){\n      ancestor_n <- lapply(node, function(i) {\n        sub_tree <- getSubtree(tree,node = i)\n        sub_ancestor <- sub_tree[!sub_tree$isTip,]\n        ancestor_n <- sub_ancestor$node\n        return(ancestor_n)\n      })%>% unlist %>% unique\n    }else {\n      ancestor_n <- node\n    }\n    \n  }\n  \n  for (i in ancestor_n) {\n    tree <- adjust_treey(tree = tree, node = i)\n  }\n  \n  tree$data$node_color <- \"black\"\n  tree$data[tree$data$node %in% 
ancestor_n,\"node_color\"] <- \"red\"\n  tree <- tree + geom_tiplab(aes_(color = ~I(node_color)),offset = 0.002)\n  return(tree)\n}\n\n##' extract ancestor sequence from tree data\n##'\n##' @title extract_seq\n##' @param tree_adjust ggtree object\n##' @param seq_colname the colname of MSA on tree$data\n##' @return character\n##' @export\n##' @author Lang Zhou\nextract_seq <- function(tree_adjust, seq_colname = \"mol_seq\") {\n  data <- tree_adjust$data\n  seq <- data[data$isTip,seq_colname][[1]]\n  names(seq) <- data[data$isTip,]$label\n  tidy <- tidy_msa(seq)\n  return(tidy)\n}\n\n\nadjust_treey <- function(tree, node) {\n  tree$data$isTip[tree$data$node == node] <- TRUE\n  tree$data$label[tree$data$node == node] <- \n    tree$data$name[tree$data$node == node]\n  \n  y_ancenstor <- tree$data$y[tree$data$node == node]\n  tree$data$y[tree$data$y > y_ancenstor] <- \n    tree$data$y[tree$data$y > y_ancenstor] + 1\n  tree$data$y[tree$data$node == node] <- \n    tree$data$y[tree$data$node == node] %>% ceiling\n  return(tree)\n}\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/arc.R",
    "content": "##'  Plots nucleltide secondary structure as helices in arc diagram\n##'\n##' @title gghelix\n##' @param helix_data a data frame. The file of nucleltide secondary structure\n##' and then read by readSSfile().\n##' @param overlap Logicals. If TRUE, two structures data called predict \n##' and known must be given(eg:heilx_data = list(known = data1, \n##'                                              predicted = data2)), \n##' plots the predicted helices that are known on top, predicted helices that\n##'  are not known on the bottom, and finally plots unpredicted helices \n##'  on top in black.\n##' @param color_by generate colors for helices by various rules,\n##' including integer counts and value ranges one of \"length\" and \"value\"\n##' @return ggplot object\n##' @export\n##' @examples\n##' RF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\n##' helix_data <- readSSfile(RF03120, type = \"Vienna\")\n##' gghelix(helix_data)\n##' @author Lang Zhou\ngghelix <- function(helix_data, color_by = \"length\",overlap = FALSE){\n    if(is.data.frame(helix_data)) {\n        helix_tidy <- tidy_helix(helix_data, color_by = color_by)\n    }else {\n        helix_tidy <- tidy_list_helix(helix_data, color_by = color_by)\n    }\n    ly <- layer_helix(helix_data = helix_tidy, overlap = overlap)\n    p <- ggplot() + ly + theme_helix()\n    return(p)\n}\n\n##' The layer of helix plot\n##'\n##' @title geom_helix\n##' @param helix_data a data frame. The file of nucleltide secondary structure\n##' and then read by readSSfile().\n##' @param overlap Logicals. 
If TRUE, two structure data sets named 'known' \n##' and 'predicted' must be given (e.g. helix_data = list(known = data1, \n##'                                              predicted = data2)). \n##' Predicted helices that are known are plotted on top,\n##' predicted helices that are not known on the bottom, and finally \n##' unpredicted helices are plotted on top in black.\n##' @param color_by rule used to color the helices: one of \"length\" or\n##' \"value\"; any other value is used directly as the color.\n##' @param ... additional parameters (currently unused)\n##' @return ggplot2 layers\n##' @export\n##' @examples\n##' RF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\n##' RF03120_fas <- system.file(\"extdata/Rfam/RF03120.fasta\", package=\"ggmsa\")\n##' SS <- readSSfile(RF03120, type = \"Vienna\")\n##' ggmsa(RF03120_fas, font = NULL, border = NA, \n##'      color = \"Chemistry_NT\", seq_name = FALSE) +\n##' geom_helix(SS)\n##' @author Lang Zhou\ngeom_helix <- function(helix_data, color_by = \"length\", overlap = FALSE,  ...) {\n  structure(list(helix_data = helix_data,\n                 color_by = color_by,\n                 overlap = overlap),\n            class = \"nucleotideeHelix\")\n}\n\n##' Read secondary structure file\n##'\n##' @title readSSfile\n##' @importFrom utils read.delim\n##' @param file A secondary structure text file in the format given by 'type'\n##' @param type file type. 
one of \"Helix\", \"Connect\", \"Vienna\" and \"Bpseq\"\n##' @return data frame\n##' @importFrom R4RNA readHelix\n##' @importFrom R4RNA readConnect\n##' @importFrom R4RNA readVienna\n##' @importFrom R4RNA readBpseq\n##' @importFrom R4RNA expandHelix\n##' @importFrom R4RNA collapseHelix\n##' @export\n##' @examples\n##' RF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\n##' helix_data <- readSSfile(RF03120, type = \"Vienna\")\n##' @author Lang Zhou\nreadSSfile <- function(file, type = NULL) {\n    type <- match.arg(type, c(\"Helix\", \"Connect\", \"Vienna\", \"Bpseq\"))\n    # each file type is parsed by the matching R4RNA reader\n    load_data <- switch(type,\n                        Helix = readHelix(file),\n                        Connect = readConnect(file),\n                        Vienna = readVienna(file),\n                        Bpseq = readBpseq(file))\n\n    data <- collapseHelix(load_data)\n    return(data)\n\n}\n\ntidy_list_helix <- function(helix_data, color_by = \"length\"){\n  known <- tidy_helix(helix_data$known, color_by = color_by)\n  predicted <-  tidy_helix(helix_data$predicted, color_by = color_by)\n  return(list(known = known, predicted = predicted))\n}\n\ntidy_helix <- function(helix_data, color_by = \"length\"){\n    helix_data <- color_helix(helix_data, color = color_by)\n    names(helix_data)[c(1,2)] <- c(\"from\",\"to\")\n    helix_data$x0 <- (helix_data$to + helix_data$from)/2\n    helix_data$r <- (helix_data$to - helix_data$from)/2\n    return(helix_data)\n}\n\ncolor_helix <- function(helix_data, color){\n    #color <- match.arg(color, c(\"length\", \"value\"))\n    if(color == \"length\"){\n        data_color <- colorBy_length(helix_data)\n    }else if(color == \"value\") {\n        data_color <- colorBy_value(helix_data)\n    }else {\n      helix_data$col <- color\n      data_color <- helix_data\n    }\n      data <- expandHelix(data_color)\n      return(data)\n}\n\ncolorBy_length <- function(helix_data){\n    pal_lenght <- colorRampPalette(brewer.pal(name = \"Paired\", 
n = 12))\n    helix_data$col <- nrow(helix_data) %>% pal_lenght()\n    return(helix_data)\n}\n\ncolorBy_value <- function(helix_data){\n    pal_value <- colorRampPalette(rev(brewer.pal(name = \"Blues\", n = 4)))\n    helix_data$col <- nrow(helix_data) %>% pal_value()\n    return(helix_data)\n}\n\n##' @importFrom ggforce geom_arc\nlayer_helix <- function(helix_data, overlap = FALSE, seq_numbers = 0){\n    mapping_above <- aes_(x0 = ~x0, \n                          y0 = ~(seq_numbers + 0.5), \n                          r = ~r, start = ~1.5*pi, \n                          end = ~2.5*pi)\n    mapping_below <- aes_(x0 = ~x0, \n                          y0 = ~(-0.5), \n                          r = ~r, start = ~pi/2, \n                          end = ~1.5*pi)\n    if(seq_numbers > 0) {\n        mapping_below <- modifyList(mapping_below, aes_(y0 = ~0))\n    }\n    if(is.list(helix_data) && \"col\" %in% names(helix_data[[2]])) {\n        mapping_above <- modifyList(mapping_above, aes_(color = ~I(col)))\n        mapping_below <- modifyList(mapping_below, aes_(color = ~I(col)))\n      }\n\n    if(overlap) {\n        if(!is.list(helix_data) || length(helix_data) != 2){\n            stop(\"Overlapping structures require a list of\n                 2 helix data sets.\n                 (eg: helix_data = list(known = data1, predicted = data2))\")\n        }\n        if(!names(helix_data) %in% c(\"known\", \"predicted\") %>% all) {\n            stop(\"helix_data names must be 'known' and 'predicted'. 
\n                 (eg: helix_data = list(known = data1, predicted = data2))\")\n        }\n\n        overlap_data <- overlap_helix(known = helix_data[[\"known\"]],\n                                      predicted = helix_data[[\"predicted\"]])\n\n        if (overlap_data[[\"above_justknown\"]] %>% nrow == 0){\n            ly_up <- geom_arc(data = overlap_data[[\"above_both\"]],\n                              mapping = mapping_above)\n            ly_below <- geom_arc(data = overlap_data[[\"below\"]], \n                                 mapping = mapping_below)\n            return(list(ly_up, ly_below))\n\n        }else {\n            ly_up <- geom_arc(data = overlap_data[[\"above_both\"]],\n                              mapping = mapping_above)\n            ly_up_justknown <- \n              geom_arc(data = overlap_data[[\"above_justknown\"]], \n                       mapping = mapping_above, \n                       color = \"black\")\n            \n            ly_below <- geom_arc(data = overlap_data[[\"below\"]], \n                                 mapping = mapping_below)\n            return(list(ly_up, ly_up_justknown, ly_below))\n        }\n\n    }else {#overlap = FALSE\n        if(is.list(helix_data) && length(helix_data) == 2) {\n            if(!\"col\" %in% names(helix_data[[\"known\"]])) {\n                mapping_below <- modifyList(mapping_below, \n                                            aes_(color = I(\"#8fce5e\")))\n            }\n            ly_up <- geom_arc(data = helix_data[[\"known\"]], \n                              mapping = mapping_below)\n            ly_below <- geom_arc(data = helix_data[[\"predicted\"]], \n                                 mapping = mapping_above)\n            return(list(ly_up, ly_below))\n\n        }else if(is.data.frame(helix_data)){\n            if(\"col\" %in% names(helix_data)){\n                mapping_above <- modifyList(mapping_above, \n                                            aes_(color = ~I(col)))\n            
}\n            ly_arc <- geom_arc(data = helix_data, mapping = mapping_above)\n            return(ly_arc)\n        }else {\n            stop(\"Only a data frame or a list of two helix data frames is allowed.\n                 eg: helix_data = data or \n                 helix_data = list(known = data1, predicted = data2)\")\n        }\n    }\n}\n\noverlap_helix <- function(known, predicted){\n    if(!all(c(\"from\", \"to\") %in% names(known))) {\n        stop(\"'known' must be an output from 'readSSfile()'\")\n    }\n    if(!all(c(\"from\", \"to\") %in% names(predicted))) {\n        stop(\"'predicted' must be an output from 'readSSfile()'\")\n    }\n\n    known$heli <- paste0(known$from, \"t\",known$to)\n    predicted$heli <- paste0(predicted$from, \"t\", predicted$to)\n\n    below <- predicted[!predicted$heli %in% known$heli,] #predicted & not known\n    above_both <- predicted[predicted$heli %in% known$heli,] #predicted & known\n    above_justknown <- known[!known$heli %in% above_both$heli,] #known & not predicted\n\n    return(list(below = below,\n                above_both = above_both,\n                above_justknown = above_justknown))\n}\n\n##' @importFrom ggplot2 theme_void\n##' @importFrom ggplot2 element_text\n##' @importFrom grid arrow\ntheme_helix <- function(){\n    list(theme_void(),\n         scale_y_continuous(breaks = 0),\n         coord_fixed(),\n         theme(panel.grid.major.y = element_line(size = 1, arrow = arrow(length = unit(0.3, 'cm'))),\n               panel.grid.major.x = element_line(color = \"#eaeaea\", size = 0.4),\n               axis.text.x = element_text())\n         )\n  }\n\n\n\n\n"
  },
  {
    "path": "R/available.R",
    "content": "##' This function lists font families currently available \n##' that can be used by 'ggmsa'\n##'\n##'\n##' @title List Font Families currently available\n##' @return A character vector of available font family names\n##' @examples available_fonts()\n##' @export\n##' @author Lang Zhou\navailable_fonts <- function(){\n    message(\"font families currently available:\" )\n    font <- paste(names(font_fam), collapse = ' ')\n    message(font, \"\\n\")\n}\n\n##' This function lists color schemes currently available that\n##'  can be used by 'ggmsa'\n##'\n##'\n##' @title List Color Schemes currently available\n##' @return A character vector of available color schemes\n##' @examples available_colors()\n##' @export\n##' @author Lang Zhou\navailable_colors <- function(){\n    message(\"1.color schemes for nucleotide sequences currently available:\")\n    color_nt <- paste(names(scheme_NT), collapse = ' ')\n    message(color_nt, \"\\n\")\n    \n    message(\"2.color schemes for AA sequences currently available:\")\n    color_aa <- paste(names(scheme_AA), collapse = ' ')\n    message(\"Clustal\", color_aa, \"\\n\")\n}\n\n##' This function lists MSA objects currently available that\n##'  can be used by 'ggmsa'\n##'\n##'\n##' @title List MSA objects currently available\n##' @return A character vector of available objects\n##' @examples available_msa()\n##' @export\n##' @author Lang Zhou\navailable_msa <- function(){\n    message(\"1.files currently available:\")\n    message(\".fasta\",'\\n')\n\n    message(\"2.XStringSet objects from 'Biostrings' package:\")\n    mes <- paste(supported_msa_class[!grepl(\"bin\", supported_msa_class)],\n                 collapse = ' ')\n    message(mes, '\\n')\n\n    message(\"3.bin objects:\")\n    mes_bin <- paste(supported_msa_class[grepl(\"bin\", supported_msa_class)],\n                     collapse = ' ')\n    message(mes_bin, '\\n')\n\n}\n\n"
  },
  {
    "path": "R/clustal.R",
    "content": "##'  A color scheme of Culstal. The algorithm to assign colors\n##'   for Multiple Sequence.\n##'\n##' @param y sequence alignment with data frame, generated by tidy_msa().\n##' @keywords clustal\n##' @noRd\ncolor_Clustal <- function(y) {\n    char_freq <- lapply(split(y, y$position), function(x) table(x$character))\n    col_convert <- lapply(char_freq, function(seq_column) {\n        ##The white as the background\n        clustal <- rep(\"#ffffff\", length(seq_column)) \n        names(clustal) <- names(seq_column)\n        r <- seq_column/sum(seq_column)\n        for (pos in seq_along(seq_column)) {\n            char <- names(seq_column)[pos]\n            i <- grep(char, scheme_clustal$re_position)\n            for (j in i) {\n                if (scheme_clustal$type[j] == \"combined\"){\n                    rr <- sum(r[strsplit(scheme_clustal$re_gp[j], '')[[1]]], \n                              na.rm = TRUE)\n                    if (rr > scheme_clustal$thred[j]) {\n                        clustal[pos] <- scheme_clustal$colour[j]}\n                    } else{\n                        rr1<-r[strsplit(scheme_clustal$re_gp[j], ',')[[1]]]\n                        if (any(rr1> scheme_clustal$thred[j],na.rm = TRUE) ) {\n                            clustal[pos] <- scheme_clustal$colour[j]}\n                    }\n                break\n            }\n        }\n        return(clustal)\n    })\n\n    yy <- split(y, y$position)\n    lapply(names(yy), function(n) {\n        d <- yy[[n]]\n        col <- col_convert[[n]]\n        d$color <- col[d$character]\n        return(d)\n    }) %>% do.call('rbind', .)\n}\n"
  },
  {
    "path": "R/color_by_conservation.R",
    "content": "color_increment <- function(conservation_visibility){\n    lapply(seq_len(nrow(conservation_visibility)), function(i){\n        color_ramp <- \n            colorRampPalette(colors = \n                                 c(conservation_visibility[i,\"color\"], \n                                   \"#ffffff\"))\n        \n        color_change <- \n            rev(color_ramp(100))[conservation_visibility[i,\"visibility\"]]\n        return(color_change)\n        }) %>% unlist \n\n}\n\n\ncolor_visibility <- function(y){\n    #options(digits = 2)\n    #on.exit()\n    conser_data <- bar_data(y)\n    conser_data$visibility <- \n        conser_data$Freq / length(levels(y[[1]])) %>% round(2)\n    conser_data$visibility <- conser_data$visibility * 100\n    names(conser_data)[3] <- \"position\"\n    y_filter <- y[c(-1,-3)] \n    conser_ready <- merge(conser_data, y_filter)\n    y$color <- color_increment(conser_ready)\n    return(y)\n}\n"
  },
  {
    "path": "R/color_else.R",
    "content": "##'  Assigning colors to sequence alignment.\r\n##'\r\n##'\r\n##' @param y sequence alignment with data frame, generated by tidy_msa().\r\n##' @param color a Color scheme. One of 'Clustal', 'Chemistry_AA', \r\n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT',\r\n##'  'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.\r\n##' @param custom_color A data frame with two column called \"names\" \r\n##' and \"color\".Customize the color scheme.\r\n##' @noRd\r\ncolor_scheme <- function(y, color = \"Chemistry_AA\", custom_color = NULL) {\r\n    if (!is.null(custom_color)){\r\n        #Elimination factor interference\r\n        custom_color[[\"names\"]] <- as.character(custom_color[[\"names\"]]) \r\n        #Fuzzy matching the string \"colors\" or \"colours\"\r\n        custom_color[[\"color\"]] <- as.character(custom_color$col)\r\n        row.names(custom_color) <- custom_color[[\"names\"]]\r\n        scheme_AA$custom_color <- \r\n            custom_color[row.names(scheme_AA), \"color\"] %>% as.character()\r\n        y$color <- scheme_AA[y$character, \"custom_color\"]\r\n    }else{\r\n        if(grepl(\"NT\", color)){\r\n            y$color <- scheme_NT[y$character, color]\r\n        } else{\r\n            y$color <- scheme_AA[y$character, color]\r\n        }\r\n    }\r\n    return(y)\r\n}\r\n\r\n\r\n"
  },
  {
    "path": "R/cons.R",
    "content": "##' cleaning the needless sequences' color according to the \r\n##' consensus sequence (only used in the consensus views).\r\n##'\r\n##' @param y a data frame, sequence alignment with specified color.\r\n##' @param consensus the consensus sequence which can be called by \r\n##' get_consensus().\r\n##' @param disagreement a logical value. Displays characters that \r\n##' disagreement to consensus(excludes ambiguous disagreements).\r\n##' @param ref a character string. Specifying the reference sequence\r\n##'  which should be one of input sequences when 'consensus_views' is TRUE.\r\n##' @keywords tidy_color\r\n##' @noRd\r\ntidy_color <- function(y, consensus, disagreement, ref) {\r\n    c <- lapply(unique(y$position), function(i) {\r\n        msa_cloumn <- y[y$position == i, ]\r\n        if(!is.null(ref)) {\r\n            if ('label' %in% names(msa_cloumn)) { ##work for ggtreeExtra\r\n                msa_cloumn <- msa_cloumn[!msa_cloumn$label == ref, ]\r\n            }else{\r\n                msa_cloumn <- msa_cloumn[!msa_cloumn$name == ref, ]\r\n                }\r\n           \r\n        }\r\n        #Get consensus char.\r\n        cons_char <- consensus[consensus$position == i, \"character\"] \r\n        \r\n        #Compare the characters of the current position(i) \r\n        #to the consensus char.\r\n        logic <- msa_cloumn$character == cons_char \r\n        #Cleaning colors according to the 'logic'.\r\n        if(cons_char == \"X\") {\r\n            msa_cloumn$color <- NA\r\n        }\r\n        if(disagreement){\r\n            msa_cloumn[logic, \"color\"] <- NA\r\n        }else{\r\n            msa_cloumn[!logic, \"color\"] <- NA\r\n        }\r\n        msa_cloumn\r\n    }) %>% do.call(\"rbind\", .)\r\n    return(c)\r\n}\r\n\r\n##' calling the consensus sequence.\r\n##'\r\n##' @param tidy sequence alignment with data frame, generated by tidy_msa().\r\n##' @param ignore_gaps a logical value. 
When selected TRUE, gaps in \r\n##' column are treated as if that row didn't exist.\r\n##' @param ref a character string. Specifying the reference sequence \r\n##' which should be one of input sequences when 'consensus_views' is TRUE.\r\n##' @keywords get_consensus\r\n##' @noRd\r\nget_consensus <- function(tidy, ignore_gaps = FALSE, ref = NULL) {\r\n    if(!is.null(ref)) {\r\n        if(ignore_gaps) {\r\n            warning(\"The argument 'ignore_gaps' is \r\n                    invalid when 'ref' is specified!\")\r\n        }\r\n        if ('label' %in% names(tidy)) { ##work for ggtreeExtra\r\n            ref <- match.arg(ref, levels(factor(tidy$label)))\r\n            cons <- tidy[tidy$label == ref,]\r\n        }else {\r\n            ref <- match.arg(ref, levels(tidy$name))\r\n            cons <- tidy[tidy$name == ref,]\r\n        }\r\n        return(cons)\r\n    }\r\n    #Iterate through each columns\r\n    cons <- lapply(unique(tidy$position), function(i) { \r\n        msa_cloumn <- tidy[tidy$position == i, ]\r\n        cons <- data.frame(position = i)\r\n        if(ignore_gaps) {\r\n            msa_cloumn <- msa_cloumn[!msa_cloumn$character %in% \"-\",]\r\n        }\r\n        #Gets the highest frequency characters\r\n        fre <- table(msa_cloumn$character) %>% data.frame\r\n        max_element <- fre[fre[2] == max(fre[2]),]\r\n        max_number <-  max_element %>% nrow\r\n        if(max_number == 1) {\r\n            cons$character <- max_element[1,1]\r\n        }else {\r\n            cons$character <- \"X\"\r\n        }\r\n        cons\r\n        }) %>% do.call(\"rbind\", .)\r\n\r\n        cons$name = \"Consensus\"\r\n        cons$character <- as.character(cons$character) #debug 'as.character'\r\n        return(cons)\r\n}\r\n\r\n\r\norder_name <- function(name, order = NULL, \r\n                       consensus_views = FALSE,\r\n                       ref = NULL) {\r\n    name_uni <- unique(name)\r\n    if(is.null(ref)){\r\n        #placed 'consensus' at 
the top\r\n        name_expect <- name_uni[!name_uni %in% \"Consensus\"] %>%\r\n            rev %>% \r\n            as.character\r\n        name <- factor(name, levels = c(name_expect, \"Consensus\"))\r\n    }else {\r\n        name_expect <- name_uni[!name_uni %in% ref] %>%\r\n            rev %>%\r\n            as.character\r\n        name <- factor(name, levels = c(name_expect, ref))\r\n    }\r\n\r\n    return(name)\r\n}\r\n\r\n"
  },
  {
    "path": "R/data.R",
    "content": "#' A sample data used in ggmsa\n#'\n#' A dataset containing the alignment sequences of \n#' the phenylalanine hydroxylase protein (PH4H) \n#' within nine species\n#'\n#'\n#' @docType data\n#' @keywords datasets\n#' @name sample.fasta\n#' @format A MSA fasta with 9 sequences and 456 positions.\nNULL\n\n\n\n#' GVariation\n#'\n#' A folder containing 4 MAS files as a sample\n#' data set to identify the sequence recombination event.\n#'\n#' \\itemize{\n#'   \\item A.Mont.fas MSA with sequences of 'Mont' and 'CF_YL21'\n#'   \\item B.Oz.fas MSA with sequences of 'Oz' and 'CF_YL21'\n#'   \\item C.Wilga5.fas MSA with sequences of 'Wilga5' and 'CF_YL21'\n#'   \\item sample_alignment.fa MSA with sequences of 'Mont', 'CF_YL21', \n#'   'Oz', and 'Wilga5'\n#' }\n#' @docType data\n#' @keywords datasets\n#' @name GVariation\n#' @format a folder \n#' @source \\url{https://link.springer.com/article/10.1007/s11540-015-9307-3}\nNULL\n\n\n\n#' Rfam\n#'\n#' A folder containing seed alignment sequences and \n#' corresponding consensus RNA secondary structure. \n#'\n#' \\itemize{\n#'   \\item RF00458.fasta seed alignment sequences of Cripavirus internal \n#'   ribosome entry site (IRES)\n#'   \\item RF03120.fasta seed alignment sequences of Sarbecovirus 5'UTR\n#'   \\item RF03120_SS.txt consensus RNA secondary structure of \n#'   Sarbecovirus 5'UTR\n#'  \n#' }\n#' @docType data\n#' @keywords datasets\n#' @name Rfam\n#' @format a folder \n#' @source \\url{https://rfam.xfam.org/}\nNULL\n\n\n\n#' Gram-negative_AKL\n#'\n#' Amino acids in the adenylate kinase lid (AKL) domain\n#' from Gram-negative bacteria. \n#'\n#' @docType data\n#' @keywords datasets\n#' @name Gram-negative_AKL.fasta\n#' @format A MSA fasta with 100 sequences and 36 positions.\n#' @source \\url{http://biovis.net/year/2013/info/redesign-contest}\nNULL\n\n\n\n#' Gram-positive_AKL\n#'\n#' Amino acids in the adenylate kinase lid (AKL) domain\n#' from Gram-positive bacteria. 
\n#'\n#' @docType data\n#' @keywords datasets\n#' @name Gram-positive_AKL.fasta\n#' @format A MSA fasta with 100 sequences and 36 positions.\n#' @source \\url{http://biovis.net/year/2013/info/redesign-contest}\nNULL\n\n\n\n#' Sample DNA alignment sequences\n#'\n#' DNA alignment sequences with 24 sequences and 56 positions.\n#'\n#'\n#' @docType data\n#' @keywords datasets\n#' @name LeaderRepeat_All.fa\n#' @format A MSA fasta \nNULL\n\n\n\n#' microRNA data used in ggmsa\n#'\n#' FASTA-format mature miRNA sequences \n#' from miRBase\n#'\n#'\n#' @docType data\n#' @keywords datasets\n#' @name seedSample.fa\n#' @format A MSA fasta with 6 sequences and 22 positions.\n#' @source \\url{https://www.mirbase.org/ftp.shtml}\nNULL\n\n\n\n#' sequence-link-tree\n#'\n#' Alignment sequences used to demonstrate circular MSA layout\n#'\n#' @docType data\n#' @keywords datasets\n#' @name sequence-link-tree.fasta\n#' @format A MSA fasta with 28 sequences and 480 positions.\nNULL\n\n\n\n#' TP53 MSA\n#'\n#' Alignment sequences used to show graphical combination\n#'\n#' @docType data\n#' @keywords datasets\n#' @name tp53.fa\n#' @format A MSA fasta with 5 sequences and 404 positions.\nNULL\n\n\n\n#' genome locus\n#'\n#' The local genome map shows the 30000 sites around the TP53 gene.\n#'\n#' @docType data\n#' @keywords datasets\n#' @name TP53_genes.xlsx\n#' @format xlsx\nNULL\n\n"
  },
  {
    "path": "R/dms.R",
    "content": "##' assign dms value to alignments.\n##'\n##' @title assign_dms\n##' @param x data frame from tidy_msa()\n##' @param dms dms data frame\n##' @return tree\n##' @export\n##' @author Lang Zhou\n\nassign_dms <- function(x, dms) {\n    dms_value <- lapply(unique(x$position), function(i) {\n        xx <- x[x$position == i,]\n        dmss <- dms[dms$site_RBD == i,]\n        \n        wt <- unique(dmss[,\"wildtype\"])\n        xx$mutation <- paste0(wt, xx$position, xx$character)\n        xx$bind_avg  <- lapply(seq_along(xx$mutation),function(j) {\n            bind_avg <- dmss[dmss$mutation_RBD %in% xx[j,\"mutation\"],\"bind_avg\"]\n            return(bind_avg)\n        }) %>% unlist\n        \n        return(xx)\n    }) %>% do.call(\"rbind\",.)\n    return(dms_value )\n}\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/facet_msa.R",
    "content": "##' The MSA would be plot in a field that you set.\n\n##' @title segment MSA\n##' @param field a numeric vector of the field size.\n##' @return ggplot layers\n##' @examples\n##' library(ggplot2)\n##' f <- system.file(\"extdata/sample.fasta\", package=\"ggmsa\")\n##' # 2 fields\n##' ggmsa(f, end = 120, font = NULL, color=\"Chemistry_AA\") + \n##'   facet_msa(field = 60)\n##' # 3 fields\n##' ggmsa(f, end = 120, font = NULL,  color=\"Chemistry_AA\") + \n##'   facet_msa(field = 40)\n##' @export\n##' @author Lang Zhou\nfacet_msa <- function(field) {\n    structure(list(field = field),\n              class = \"facet_msa\"\n              )\n}\n\nfacet_data <- function(msaData, field) {\n\n    if(min(msaData$position) > 1){\n        pos_reset <- msaData$position - min(msaData$position)\n        pos_reset[pos_reset == 0] <- 1\n    }else {\n        pos_reset <- msaData$position\n    }\n    msaData$facet <- pos_reset %/% field\n\n\n    msaData[(pos_reset %% field) == 0, \"facet\"] <- \n        msaData[(pos_reset %% field) == 0, \"facet\"] - 1\n\n    return(msaData)\n\n}\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/geom_GC.R",
    "content": "##' Multiple sequence alignment layer for ggplot2. It plot points of GC content.\n\n##' @title geom_GC\n##' @param show.legend logical. Should this layer be included in the legends?\n##' @return a ggplot layer\n##' @examples\n##' #plot GC content\n##' f <- system.file(\"extdata/LeaderRepeat_All.fa\", package=\"ggmsa\")\n##' ggmsa(f, font = NULL, color=\"Chemistry_NT\") + geom_GC()\n##' @export\n##' @author Lang Zhou\ngeom_GC <- function(show.legend = FALSE) {\n    structure(list(show.legend = show.legend),\n              class = \"GCcontent\")\n}\n\n\ngeom_GC1 <- function(tidyData, show.legend = FALSE){\n    tidy <- tidyData\n    #tidy <- tidy_msa(msa = msa, start = start, end = end)\n    GC_pos <- getOption(\"GC_pos\")\n\n    GC <- content_GC(tidy)\n    GC <-GC[GC$character == \"GC\",]\n    col_num <- levels(factor(tidy$position))\n    col_len <- length(col_num) + GC_pos\n    ly_GC <- geom_point(data = GC,\n                        mapping = aes_(x = ~col_len, \n                                       y = ~ypos, \n                                       size = ~fre),\n                        color = \"#51a6e9\", \n                        na.rm = TRUE, \n                        show.legend = show.legend)\n    return(ly_GC)\n}\n##' get GC content\n\n##' @title content_GC\n##' @param data  Multiple aligned sequence files or objects\n##'  for representing nucleotide sequences\n##' @return A data frame\n##' @noRd\n##' @author Lang Zhou\ncontent_GC<- function(data){\n    tidy <- data\n    tidy$name <- factor(tidy$name, levels = unique(tidy$name))\n    tidy$ypos <- as.numeric(tidy$name)\n    seq_num <- unique(tidy$ypos)\n    lchar_num <- lapply(seq_num, function(j){\n        clo <- tidy[tidy$ypos == j, ]\n        y <- prop.table(table(clo$character))\n        y[\"GC\"] <- y[\"G\"] + y[\"C\"]\n        num <-setNames(rep(0,5), c(\"A\", \"T\", \"G\", \"C\", \"GC\"))\n        num[names(y)] <- y\n        return(num)\n    })\n\n    char_num <- 
do.call(rbind,lchar_num)\n    char_num <- as.data.frame(char_num)\n    char_num[\"ypos\"] =  seq_num\n    char_num2 <- gather(char_num,character,fre, \"A\", \"T\", \"C\",\"G\",\"GC\")\n    return(char_num2)\n}\n\n\n\n"
  },
  {
    "path": "R/geom_asterisk.R",
    "content": "##' a ggplot2 layer of asterisk as a polygon\n##'\n##'\n##' @title a ggplot2 layer of asterisk as a polygon\n##' @param mapping aes mapping\n##' @param data a data frame\n##' @param stat the statistical transformation to use on the data \n##' for this layer, as a string.\n##' @param position position adjustment, either as a string, \n##' or the result of a call to a position adjustment function.\n##' @param na.rm a logical value\n##' @param show.legend a logical value\n##' @param inherit.aes a logical value\n##' @param ... additional parameters\n##' @importFrom ggplot2 layer\n##' @return ggplot2 layer\n## @export\n##' @noRd\n##' @author Lang Zhou\n##' @examples\n##' #library(ggplot2)\n##' #ggplot(mtcars, aes(mpg, disp)) + geom_asterisk()\ngeom_asterisk <- function(mapping = NULL, \n                          data = NULL, \n                          stat = \"identity\",\n                          position = \"identity\", \n                          na.rm = FALSE, \n                          show.legend = NA,\n                          inherit.aes = TRUE, ...) 
{\n\n  layer(geom = Geomasterisk, \n        mapping = mapping, \n        data = data, \n        stat = stat,\n        position = position,\n        show.legend = show.legend,\n        inherit.aes = inherit.aes,\n        params = list(na.rm = na.rm, ...))\n}\n\n##' @importFrom grid polygonGrob\n##' @importFrom grid gpar\nSeedStar <- function(x = NULL , y = NULL) {\n\n    char_width <- getOption(\"asterisk_width\")\n    char_scale_2 <- getOption(\"char_scale_2\")\n\n    x_width <- char_scale_2 * diff(range(star$y))\n    star$x = star$x * x_width/diff(range(star$x))\n\n    char_scale <- diff(range(star$x))/diff(range(star$y))\n    star$x = star$x * (char_width * char_scale)/diff(range(star$x))\n    star$y = star$y * char_width/diff(range(star$y))\n\n    star$x = star$x - min(star$x)  - (char_width * char_scale)/2 + x\n    star$y = star$y - min(star$y)  - char_width/2 + y\n\n    polygonGrob(star$x, star$y, gp = gpar(fill = \"black\") )\n\n}\n\n\n##' @importFrom ggplot2 ggproto\n##' @importFrom ggplot2 Geom\n##' @importFrom ggplot2 draw_key_polygon\n##' @importFrom ggplot2 aes\n##' @importFrom grid gTree\nGeomasterisk <- ggproto(\"Geomasterisk\", Geom,\n                         required_aes = c(\"x\", \"y\"),\n                         default_aes = aes(fill = \"black\"),\n                         draw_key = draw_key_polygon,\n\n                         draw_panel = function(data, panel_params, coord) {\n                             data <- coord$transform(data, panel_params)\n                             grobs <- lapply(seq_len(nrow(data)), function(i) {\n                                          SeedStar(data$x[i], data$y[i])\n                                      })\n                             class(grobs) <- \"gList\"\n                             ggplot2:::ggname(\"geom_asterisk\", \n                                              gTree(children = grobs))\n                         }\n\n)\n\n\n\n\n"
  },
  {
    "path": "R/geom_msa.R",
    "content": "##' Multiple sequence alignment layer for ggplot2. \r\n##' It creates background tiles with/without sequence characters.\r\n##'\r\n##' @title geom_msa\r\n##' @param data sequence alignment with data frame, generated by tidy_msa().\r\n##' @param font font families, possible values are 'helvetical', 'mono', \r\n##' and 'DroidSansMono', 'TimesNewRoman'. Defaults is 'helvetical'.\r\n##' @param mapping aes mapping\r\n##' If font = NULL, only plot the background tile.\r\n##' @param color A Color scheme. One of 'Clustal', 'Chemistry_AA', 'Shapely_AA',\r\n##'  'Zappo_AA', 'Taylor_AA', 'LETTER','CN6',, 'Chemistry_NT', 'Shapely_NT', \r\n##'  'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.\r\n##' @param custom_color A data frame with two column called \"names\" and \r\n##' \"color\".Customize the color scheme.\r\n##' @param char_width a numeric vector. Specifying the character width in \r\n##' the range of 0 to 1. Defaults is 0.9.\r\n##' @param by_conservation a logical value. The most conserved regions have\r\n##'  the brightest colors.\r\n##' @param none_bg a logical value indicating whether background \r\n##' should be displayed. Defaults is FALSE.\r\n##' @param position_highlight A numeric vector of the position that\r\n##'  need to be highlighted.\r\n##' @param seq_name a logical value indicating whether sequence names\r\n##'  should be displayed. Defaults is 'NULL' which indicates that the \r\n##'  sequence name is displayed when 'font = null', but 'font = char' \r\n##'  will not be displayed. If 'seq_name = TRUE' the sequence name will \r\n##'  be displayed in any case. If 'seq_name = FALSE' the sequence name will not\r\n##'   be displayed under any circumstances.\r\n##' @param border a character string. The border color.\r\n##' @param consensus_views a logical value that opening consensus views.\r\n##' @param use_dot a logical value. 
Displays characters as dots instead of\r\n##'  fading their color in the consensus view.\r\n##' @param disagreement a logical value. Displays characters that disagreement\r\n##'  to consensus(excludes ambiguous disagreements).\r\n##' @param ignore_gaps a logical value. When selected TRUE, \r\n##' gaps in column are treated as if that row didn't exist.\r\n##' @param ref a character string. Specifying the reference sequence\r\n##'  which should be one of input sequences when 'consensus_views' is TRUE.\r\n##' @param position Position adjustment, either as a string, or\r\n##'  the result of a call to a position adjustment function,\r\n##' default is 'identity' meaning 'position_identity()'.\r\n##' @param show.legend logical. Should this layer be included in the legends?\r\n##' @param dms logical. \r\n##' @param position_color logical. \r\n##' @param ... additional parameter\r\n##' @return A list\r\n##' @importFrom ggplot2 scale_fill_manual\r\n##' @importFrom utils modifyList\r\n##' @export\r\n##' @examples\r\n##' library(ggplot2)\r\n##'aln <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\r\n##'tidy_aln <- tidy_msa(aln, start = 150, end = 170)\r\n##'ggplot() + geom_msa(data = tidy_aln, font = NULL) + coord_fixed()\r\n##' @author Guangchuang Yu, Lang Zhou\r\ngeom_msa <- function(data, font = \"helvetical\",\r\n                     mapping = NULL,\r\n                     color = \"Chemistry_AA\",\r\n                     custom_color = NULL,\r\n                     char_width = 0.9,\r\n                     none_bg = FALSE,\r\n                     by_conservation = FALSE,\r\n                     position_highlight = NULL,\r\n                     seq_name = NULL,\r\n                     border = NULL,\r\n                     consensus_views = FALSE,\r\n                     use_dot = FALSE,\r\n                     disagreement = TRUE,\r\n                     ignore_gaps = FALSE,\r\n                     ref = NULL,\r\n                     position = 
\"identity\",\r\n                     show.legend = FALSE,\r\n                     dms = FALSE,\r\n                     position_color = FALSE,\r\n                     ... ) {\r\n\r\n    data <- msa_data(data,\r\n                     font = font,\r\n                     color = color,\r\n                     custom_color = custom_color,\r\n                     char_width = char_width,\r\n                     by_conservation = by_conservation,\r\n                     consensus_views  = consensus_views,\r\n                     use_dot = use_dot,\r\n                     disagreement = disagreement,\r\n                     ignore_gaps = ignore_gaps,\r\n                     ref = ref)\r\n\r\n    #legend work\r\n    xx <- data[,c(\"character\",\"color\")] %>% unique()\r\n    xx <- xx[!is.na(xx$color),]\r\n    labs <- lapply(unique(xx$color) %>% seq_along, function(i) {\r\n        cols <- unique(xx$color)[i]\r\n        dup_char <- xx[xx$color == cols, \"character\"]\r\n        lab <- paste0(dup_char, collapse = \",\")\r\n    }) %>% do.call(\"rbind\",.) 
%>% as.vector()\r\n\r\n    cols <- xx$color %>% unique()\r\n    names(cols) <- cols\r\n    sacle_tile_cols <- scale_fill_manual(values = cols,\r\n                                         breaks = cols,\r\n                                         labels = labs)\r\n\r\n\r\n    bg_data <- data\r\n\r\n    #work to ggtreeExtra\r\n    if (is.null(mapping)) {\r\n        mapping <- aes_(x = ~position, y = ~name, fill = ~I(color))\r\n    }\r\n    \r\n    #dms color work\r\n    if (dms) {\r\n        mapping <- modifyList(mapping, aes_(fill = ~bind_avg))\r\n    }\r\n    if (position_color) {\r\n        mapping <- modifyList(mapping, aes_(fill = ~I(pos_color)))\r\n    }\r\n    \r\n    \r\n    #'seq_name' work\r\n    if (!isTRUE(seq_name)) {\r\n        if ('y' %in% colnames(data) || isFALSE(seq_name) ) {\r\n            y <- as.numeric(bg_data$name)\r\n            mapping <- modifyList(mapping, aes_(y = ~y)) #\"~y\" is seq numbers\r\n        }\r\n    }\r\n\r\n    #'position_highlight' work\r\n    if (!is.null(position_highlight)) {\r\n        none_bg = TRUE\r\n        bg_data <- bg_data[bg_data$position %in% position_highlight,]\r\n        bg_data$postion <- as.factor(bg_data$position)\r\n        mapping <- modifyList(mapping, aes_(x = ~position, \r\n                                            fill = ~color, \r\n                                            width = 1))\r\n    }\r\n\r\n    #'border' work\r\n    if(is.null(border)){\r\n        ly_bg <- geom_tile(mapping = mapping, data = bg_data, color = 'grey', \r\n                           inherit.aes = FALSE, position = position, \r\n                           show.legend = show.legend)\r\n    }else{\r\n        ly_bg <- geom_tile(mapping = mapping, data = bg_data, color = border,\r\n                           inherit.aes = FALSE, position = position, \r\n                           show.legend = show.legend)\r\n    }\r\n\r\n    if (!all(c(\"yy\", \"order\", \"group\") %in% colnames(data))) {\r\n        if(position_color) {\r\n    
        return(list(ly_bg))\r\n        }else{\r\n            return(list(ly_bg, sacle_tile_cols))\r\n        }\r\n    }\r\n\r\n    if ('y' %in% colnames(data)) {\r\n        data$yy = data$yy - as.numeric(data$name) + data$y\r\n    }\r\n\r\n    label_mapping <- aes_(x = ~x, y = ~yy, group = ~group)\r\n\r\n    # use_dot work\r\n    if (consensus_views && !use_dot) {\r\n        if(show.legend) {\r\n            stop(\"legends can't be shown in the consensus view!\")\r\n        }\r\n        label_mapping <- modifyList(label_mapping, aes_(fill = ~I(font_color)))\r\n    }\r\n    ly_label <- geom_polygon(mapping = label_mapping, data = data, \r\n                             inherit.aes = FALSE, position = position)\r\n\r\n    #'none_bg' work\r\n    if (none_bg && is.null(position_highlight)) {\r\n        return(ly_label)\r\n    }\r\n\r\n    if(consensus_views) {\r\n        return(list(ly_bg, ly_label))\r\n    }else {\r\n        if(position_color){\r\n            return(list(ly_bg, ly_label))\r\n        }else{\r\n            return(list(ly_bg, ly_label, sacle_tile_cols))\r\n        }\r\n    }\r\n\r\n}\r\n\r\n"
  },
  {
    "path": "R/geom_msaBar.R",
    "content": "##' Multiple sequence alignment layer for ggplot2.\n##'  It plot sequence conservation bar.\n\n##' @title geom_msaBar\n\n##' @return A list\n##' @examples\n##' #plot multiple sequence alignment and conservation bar.\n##' f <- system.file(\"extdata/sample.fasta\", package=\"ggmsa\")\n##' ggmsa(f, 221, 280, font = NULL, seq_name = TRUE) + geom_msaBar()\n##' @export\n##' @author Lang Zhou\ngeom_msaBar <- function() {\n    structure(list(),\n              class = \"msaBar\")\n}\n\n##' @importFrom ggplot2 geom_col\nly_bar <- function(tidy){\n    data <- bar_data(tidy)\n    mapping <- aes_(x = ~pos, y = ~Freq, fill = ~Freq)\n    ly_bar <- geom_col(data = data, \n                       mapping = mapping, \n                       width = 1, \n                       show.legend = FALSE)\n    return(ly_bar)\n}\n\n\n##' get bar data\n\n##' @title bar_data\n##' @param tidy  Multiple aligned sequence files or \n##' object for representing nucleotide sequences\n##' @return A data frame\n##' @noRd\n##' @author Lang Zhou\nbar_data <- function(tidy){\n    character_position <- unique(tidy$position)\n    conservation_score <- lapply(character_position, function(j) {\n        cloumn_data <- tidy[tidy$position == j, ]\n        character_frequency <- table(cloumn_data$character) %>% as.data.frame\n        max_frequency <- character_frequency[character_frequency[2] ==\n                                                 max(character_frequency[2]),]\n        max_frequency$Var1 <- as.character(max_frequency$Var1)\n        if(nrow(max_frequency) == 1) {\n            max_frequency <- max_frequency[1,]\n        }else {\n            max_frequency <- max_frequency[1,]\n        }\n}) %>% do.call(\"rbind\", .)\n    conservation_score[\"pos\"] <- character_position\n    return(conservation_score)\n}\n"
  },
  {
    "path": "R/geom_seed.R",
    "content": "##' Highlighting the seed in miRNA sequences\r\n##'\r\n##'\r\n##' @title geom_seed\r\n##' @param seed a character string specifying the miRNA seed sequence,\r\n##'  like 'GAGGUAG'.\r\n##' @param star a logical value indicating whether asterisks should \r\n##' be displayed.\r\n##' @return a ggplot layer\r\n##' @author Lang Zhou\r\n##' @examples\r\n##' miRNA_sequences <- system.file(\"extdata/seedSample.fa\", package=\"ggmsa\")\r\n##' ggmsa(miRNA_sequences, font = 'DroidSansMono', \r\n##'       color = \"Chemistry_NT\", none_bg = TRUE) +\r\n##' geom_seed(seed = \"GAGGUAG\", star = FALSE)\r\n##' ggmsa(miRNA_sequences, font = 'DroidSansMono', \r\n##'       color = \"Chemistry_NT\") +\r\n##' geom_seed(seed = \"GAGGUAG\", star = TRUE)\r\n##' @export\r\ngeom_seed <- function(seed, star = FALSE) {\r\n    structure(list(seed = seed,\r\n                   star = star),\r\n              class = \"seed\")\r\n}\r\n\r\n\r\ngeom_seed1 <- function(tidyData, seed, star) {\r\n    get_asteriskScale(tidyData)\r\n    tidyData$y <- as.numeric(tidyData$name)\r\n    seq_first <- tidyData[tidyData$y == 1,]\r\n    char <- seq_first$character\r\n    char <- paste(char, collapse = \"\")\r\n    seedPos <- regexpr(seed,char) # start position of seed region\r\n    #locate <- str_locate(char, seed)\r\n    #df_locate <- as.data.frame(locate)\r\n    #seedPos <- df_locate$start # start position of seed region\r\n    seedLen <- nchar(seed) # length of seed region\r\n    numSeq <- max(tidyData$y) # number of sequences\r\n    shadingLen <- getOption(\"shadingLen\") #shading width\r\n    shading_alpha <- getOption(\"shading_alpha\")\r\n\r\n    x <- seedPos - .5 #the x coordinate of the lower left corner\r\n    y <- 1 - .5 - shadingLen #the y coordinate of the lower left corner\r\n    yy <- numSeq + .5 + shadingLen #the y coordinate of the top right corner\r\n    xx <- x + seedLen #the x coordinate of the top right corner\r\n\r\n    shadingData <- data.frame(x = c(x, x, xx, xx),\r\n                    y = c(y, yy, yy, 
y),\r\n                    t = c('a', 'a', 'a','a'))\r\n    starData <- data.frame(star_x = seq(seedPos, length.out = nchar(seed)),\r\n                            star_y = rep(y, times = nchar(seed)))\r\n\r\n    if(isTRUE(star)) {\r\n        ly_star <- geom_asterisk(data = starData, \r\n                                 aes_(x = ~star_x, y = ~star_y))\r\n        return(ly_star)\r\n    }\r\n\r\n    mapping <- aes_(x= ~x, y= ~y, group= ~t, fill = ~I('#bebebe'))\r\n    ly_seed <- geom_polygon(data = shadingData, \r\n                            mapping = mapping, \r\n                            alpha = shading_alpha)\r\n    return(ly_seed)\r\n }\r\n\r\n\r\nget_asteriskScale <- function(tidyData) {\r\n    m <- max(tidyData$position)\r\n    seq_name <- factor(tidyData$name, levels = unique(tidyData$name))\r\n    n <- max(as.numeric(seq_name))\r\n    char_scale <- diff(range(star$x))/diff(range(star$y))\r\n    char_scale_2 <- char_scale * 3/2 * n/m\r\n\r\n    return(options(\"char_scale_2\" = char_scale_2))\r\n\r\n}\r\n"
  },
  {
    "path": "R/ggmaf.R",
    "content": "##' plot MAF\n##'\n##' @title ggmaf \n##' @param data a tidy MAF data frame. You can get it from tidy_maf_df().\n##' @param ref character, the name of the reference genome, \n##' e.g. \"hg38.chr1_KI270707v1_random\"\n##' @param block_start a number (> 0). The first block to plot. \n##' If NULL, plotting starts at the first block in 'data'.\n##' @param block_end a number (<= max block). The last block to plot. \n##' If NULL, plotting ends at the last block in 'data'.\n##' @param facet_field a number. The number of blocks shown in each \n##' facet panel.\n##' @param heights a numeric vector of length two. The plot proportion between \n##' the \"Genomic location\" panel (upper) and the \"Alignment\" panel (lower).\n##' Default: c(0.4,0.6)\n##' @param facet_heights a numeric vector. The facet proportions.\n##' @return ggplot object\n##' @export\n##' @author Lang Zhou\nggmaf <- function(data, \n                  ref, \n                  block_start = NULL, \n                  block_end = NULL, \n                  facet_field = NULL, \n                  heights = c(0.4,0.6),\n                  facet_heights = NULL) {\n  \n  ## NULL:NULL is an error, so fall back to the full block range\n  if (is.null(block_start)) block_start <- min(data$block_number)\n  if (is.null(block_end)) block_end <- max(data$block_number)\n  d <- data[data$block_number %in% c(block_start : block_end),]\n  \n  if(is.null(facet_field)) {\n    maf_p <- maf_plot(d = d, ref = ref)\n    p <- plot_list(gglist = maf_p, heights = heights)\n    return(p)\n  }else {\n    d <- facet_maf(mafData = d, field = facet_field)\n    p_ls <- lapply(unique(d$facet), function(i) {\n      facet_d <- d[d$facet == i,]\n      maf_p <- maf_plot(d = facet_d, ref = ref)\n      pp <- plot_list(gglist = maf_p, heights = heights)\n      return(pp)\n    })\n    p <- plot_list(gglist = p_ls, ncol =  1, heights = facet_heights)\n    return(p)\n  }\n}\n\n\n##' tidy MAF data frame \n##'\n##' @title tidy_maf_df\n##' @param maf_df a MAF data frame. You can get it from read_maf().\n##' @param ref character, the name of reference genome. 
\n##' eg:\"hg38.chr1_KI270707v1_random\"\n##' @return data frame\n##' @export\n##' @author Lang Zhou\ntidy_maf_df <- function(maf_df,ref) {\n  ##add ref position to other genome\n  block_num <- unique(maf_df$block)\n  tidy_df <- lapply(block_num, function(i) {\n    x <- maf_df[maf_df$block == i,]\n    x$ref_start <- x[x$src == ref, \"start\"]\n    x$ref_end <- x[x$src == ref, \"end_gap\"]\n    return(x)\n  })%>% do.call(\"rbind\", .)\n  \n  tidy_df$block_number <- factor(tidy_df$block, levels = \n                                   unique(tidy_df$block)) %>% as.numeric\n  tidy_df$bs <- paste0(tidy_df$src,\"-\",tidy_df$block) \n  tidy_df$merge_y <- factor(tidy_df$src) %>% as.numeric\n  tidy_df$label <- paste0(\"B\",tidy_df$block_number)\n  tidy_df <- order_aln(tidy_df,ref)\n  return(tidy_df)\n  \n}\n\n\n#put the ref sequence the first in each block, new col \"y\"\norder_aln <- function(tidy_df, ref) {\n    block_num <- unique(tidy_df$block)\n    lev <- sapply(block_num, function(i) {\n        x <- tidy_df[tidy_df$block == i,]\n        order <- c(ref, x$src[!x$src %in% ref]) \n        \n        lev <- paste0(order, \"-\",x$block)  \n        return(lev)\n    })%>% unlist %>% rev\n    tidy_df$y <- factor(tidy_df$bs,levels = lev) %>% as.numeric\n    return(tidy_df)\n}\n\n##' @importFrom utils getFromNamespace\n##' @importFrom ggplot2 aes_\n##' @importFrom ggplot2 geom_text\nmaf_plot <- function(d, ref, \n                     positive_color = \"#a9c9d4\",\n                     negative_color = \"#ffa389\") {\n  geom_rrect <- getFromNamespace(\"geom_rrect\",\"statebins\")\n  ##plot down panel\n  p_maf_aln <- ggplot(data = d) + \n    geom_rrect(mapping=aes_(xmin =~ ref_start,\n                            xmax =~ ref_end,\n                            ymin =~ y - 0.3,\n                            ymax =~ y + 0.3,\n                            fill =~ strand)) +\n    geom_rrect(data = d,\n               mapping=aes_(xmin =~ ref_start,\n                            xmax =~ 
ref_end,\n                            ymin =~ max(y) + 1 - 0.3,\n                            ymax =~ max(y) + 1 + 0.3),\n               fill = \"#a9c9d4\",color = \"black\") +\n    scale_y_continuous(breaks = c(d$y,max(d$y + 1)),labels = c(d$bs, ref)) +\n    scale_fill_manual(breaks = c(\"+\",\"-\"),\n                      values = c(positive_color,negative_color)) +\n    theme_void() +\n    theme(axis.text.x = element_text(),\n          axis.text.y = element_text(),\n          panel.grid.minor.y = element_blank(),\n          panel.grid.major.y = element_line(color = \"grey\"))\n  \n  ##plot upon panel\n  aim <- d[d$src != ref, ]\n  p_maf_genomePos <- ggplot(data = aim) + \n    geom_rrect(mapping = aes_(xmin =~ start,\n                              xmax =~ end_gap,\n                              ymin =~ merge_y - 0.3,\n                              ymax =~ merge_y + 0.3,\n                              fill =~ strand),\n               color = \"black\", \n               size = 0.5, \n               alpha = 0.8, \n               show.legend = FALSE) + \n    scale_y_continuous(breaks = unique(aim$merge_y),\n                       labels = unique(aim$src)) +\n    scale_fill_manual(breaks = c(\"+\",\"-\"),\n                      values = c(positive_color,negative_color)) +\n    theme_void() + theme(panel.grid.major.y = element_line(color = \"grey\"),\n                         axis.text.x = element_text(),\n                         axis.text.y = element_text(),\n                         strip.text = element_blank()) + \n    geom_text(aes_(x =~ (start + end_gap)/2, \n                   y =~ merge_y,label =~ label), \n              size = 3) +\n    facet_wrap(~src, scales = \"free\", ncol = 1)\n  return(list(p_maf_genomePos, p_maf_aln))\n}\n\n#assign facet number to blocks\nfacet_maf <- function(mafData, field) {\n    \n    if(min(mafData$block_number) > 1){\n        pos_reset <- mafData$block_number - min(mafData$block_number) + 1\n        #pos_reset[pos_reset == 0] <- 
1\n    }else {\n        pos_reset <- mafData$block_number\n    }\n    mafData$facet <- pos_reset %/% field\n    \n    mafData[(pos_reset %% field) == 0, \"facet\"] <-\n        mafData[(pos_reset %% field) == 0, \"facet\"] - 1\n    \n    return(mafData)\n}\n\n\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/ggmsa.R",
    "content": "##' Plot multiple sequence alignment using ggplot2 with multiple color schemes \r\n##' supported.\r\n##'\r\n##'\r\n##' @title ggmsa\r\n##' @param msa Multiple aligned sequence files or objects representing either \r\n##' nucleotide sequences or AA sequences.\r\n##' @param start a numeric vector. Start position to plot.\r\n##' @param end a numeric vector. End position to plot.\r\n##' @param font font family. Possible values are 'helvetical', 'mono', \r\n##' 'DroidSansMono' and 'TimesNewRoman'. Default is 'helvetical'. \r\n##' If font = NULL, only the background tiles are plotted.\r\n##' @param color a color scheme. One of 'Clustal', 'Chemistry_AA', \r\n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \r\n##' 'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Default is 'Chemistry_AA'.\r\n##' @param custom_color A data frame with two columns called \"names\" and \r\n##' \"color\", used to customize the color scheme.\r\n##' @param char_width a numeric vector. Specifying the character width in \r\n##' the range of 0 to 1. Default is 0.9.\r\n##' @param by_conservation a logical value. The most conserved regions have \r\n##' the brightest colors.\r\n##' @param none_bg a logical value indicating whether the background should be\r\n##'  hidden. Default is FALSE.\r\n##' @param position_highlight A numeric vector of the positions that need to be\r\n##'  highlighted.\r\n##' @param seq_name a logical value indicating whether sequence names \r\n##' should be displayed. Default is 'NULL', which means the sequence \r\n##' names are displayed when 'font = NULL' but not when a font is \r\n##' specified. If 'seq_name = TRUE' the sequence names will \r\n##' be displayed in any case. If 'seq_name = FALSE' the sequence names \r\n##' will not be displayed under any circumstances.\r\n##' @param border a character string. The border color.\r\n##' @param consensus_views a logical value that enables the consensus view.\r\n##' @param use_dot a logical value. 
Displays characters as dots instead \r\n##' of fading their color in the consensus view.\r\n##' @param disagreement a logical value. Displays characters that \r\n##' disagree with the consensus (excludes ambiguous disagreements).\r\n##' @param ignore_gaps a logical value. When TRUE, gaps in a column \r\n##' are treated as if that row didn't exist.\r\n##' @param ref a character string. Specifying the reference sequence, which \r\n##' should be one of the input sequences when 'consensus_views' is TRUE.\r\n##' @param show.legend logical. Should this layer be included in the legends?\r\n##' @return ggplot object\r\n##' @importFrom tidyr gather\r\n##' @importFrom ggplot2 ggplot\r\n##' @importFrom ggplot2 aes_\r\n##' @importFrom ggplot2 theme\r\n##' @importFrom ggplot2 theme_minimal\r\n##' @importFrom ggplot2 geom_tile\r\n##' @importFrom ggplot2 geom_polygon\r\n##' @importFrom ggplot2 xlab\r\n##' @importFrom ggplot2 ylab\r\n##' @importFrom ggplot2 coord_fixed\r\n##' @importFrom ggplot2 geom_point\r\n##' @importFrom ggplot2 element_blank\r\n##' @importFrom magrittr %>%\r\n##' @importFrom stats setNames\r\n##' @importFrom grid unit\r\n##' @examples\r\n##' #plot multiple sequences by loading fasta format\r\n##' fasta <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\r\n##' ggmsa(fasta, 164, 213, color=\"Chemistry_AA\")\r\n##'\r\n##'\\dontrun{\r\n##' #XMultipleAlignment objects can be used as input in the 'ggmsa'\r\n##' AAMultipleAlignment <- Biostrings::readAAMultipleAlignment(fasta)\r\n##' ggmsa(AAMultipleAlignment, 164, 213, color=\"Chemistry_AA\")\r\n##'\r\n##' #XStringSet objects can be used as input in the 'ggmsa'\r\n##' AAStringSet <- Biostrings::readAAStringSet(fasta)\r\n##' ggmsa(AAStringSet, 164, 213, color=\"Chemistry_AA\")\r\n##'\r\n##' #Xbin objects from 'seqmagick' can be used as input in the 'ggmsa'\r\n##' AAbin <- seqmagick::fa_read(fasta)\r\n##' ggmsa(AAbin, 164, 213, color=\"Chemistry_AA\")\r\n##' }\r\n##' @export\r\n##' @author Guangchuang 
Yu\r\nggmsa <- function(msa,\r\n                  start = NULL,\r\n                  end = NULL,\r\n                  font = \"helvetical\",\r\n                  color = \"Chemistry_AA\",\r\n                  custom_color = NULL,\r\n                  char_width = 0.9,\r\n                  none_bg = FALSE,\r\n                  by_conservation = FALSE,\r\n                  position_highlight = NULL,\r\n                  seq_name = NULL,\r\n                  border = NULL,\r\n                  consensus_views = FALSE,\r\n                  use_dot = FALSE,\r\n                  disagreement = TRUE,\r\n                  ignore_gaps = FALSE,\r\n                  ref = NULL,\r\n                  show.legend = FALSE) {\r\n\r\n    data <- tidy_msa(msa, start = start, end = end)\r\n\r\n    ggplot() + geom_msa(data, font = font,\r\n                        color = color,\r\n                        custom_color = custom_color,\r\n                        char_width = char_width,\r\n                        none_bg = none_bg,\r\n                        by_conservation = by_conservation,\r\n                        position_highlight = position_highlight,\r\n                        seq_name = seq_name,\r\n                        border = border,\r\n                        consensus_views = consensus_views,\r\n                        use_dot = use_dot,\r\n                        disagreement = disagreement,\r\n                        ignore_gaps = ignore_gaps,\r\n                        ref = ref,\r\n                        show.legend = show.legend) +\r\n               theme_msa()\r\n\r\n}\r\n\r\n\r\n\r\n\r\n\r\n\r\n"
  },
  {
    "path": "R/import-functions.R",
    "content": "##' @importFrom utils globalVariables\nglobalVariables(\".\")\nglobalVariables(\"fre\") #geom_GC.R:\nglobalVariables(\"read.delim\") #arc.R\nglobalVariables(c(\"name\", \"position_adj\", \"y_adj\")) #SeqBundles.R\n\n\n\n\n"
  },
  {
    "path": "R/method-plot.R",
    "content": "##' plot method for SeqDiff object\n##'\n##' @name plot\n##' @rdname plot-methods\n##' @exportMethod plot\n##' @aliases plot,SeqDiff,ANY-method\n##' @docType methods\n##' @param x SeqDiff object\n##' @param width bin width\n##' @param title plot title\n##' @param xlab xlab\n##' @param by one of 'bar' and 'area'\n##' @param fill fill color of upper part of the plot\n##' @param colors color of lower part of the plot\n##' @param xlim limits of x-axis\n##' @return plot\n##' @importFrom ggplot2 ggtitle\n##' @importFrom ggplot2 xlim\n##' @importFrom ggplot2 ggplot_gtable\n##' @importFrom ggplot2 ggplot_build\n##' @importFrom grid unit.pmax\n##' @importFrom aplot plot_list\n##' @author Guangchuang Yu\n##' @examples\n##' fas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n##'                   pattern=\"fas\", full.names=TRUE)\n##' x1 <- seqdiff(fas[1], reference=1)\n##' plot(x1)\nsetMethod(\"plot\", signature(x=\"SeqDiff\"),\n          function(x, width=50, title=\"auto\",\n                   xlab = \"Nucleotide Position\",\n                   by=\"bar\", fill=\"firebrick\",\n                   colors=c(A=\"#ff6d6d\", C=\"#769dcc\", G=\"#f2be3c\", T=\"#74ce98\"),\n                   xlim = NULL) {\n              nn <- names(x@sequence)\n              if (is.null(title) || is.na(title)) {\n                  title <- \"\"\n              } else if (title == \"auto\") {\n                  title <- paste(nn[-x@reference], \n                                 \"nucleotide differences relative to\", \n                                 nn[x@reference])\n              }\n\n              p1 <- plot_difference_count(x@diff, width, by=by, fill=fill) + \n                  ggtitle(title)\n              p2 <- plot_difference(x@diff, colors=colors, xlab)\n\n              if (!is.null(xlim)) {\n                  p1 <- p1 + xlim(xlim)\n                  p2 <- p2 + xlim(xlim)\n              }\n\n              plot_list(p1, p2, ncol=1, heights=c(.7, 
.4))\n          }\n          )\n\n\n\n##' @importFrom ggplot2 ggplot\n##' @importFrom ggplot2 aes_\n##' @importFrom ggplot2 geom_segment\n##' @importFrom ggplot2 xlab\n##' @importFrom ggplot2 ylab\n##' @importFrom ggplot2 scale_y_continuous\n##' @importFrom ggplot2 theme_minimal\n##' @importFrom ggplot2 theme\n##' @importFrom ggplot2 element_blank\n##' @importFrom ggplot2 scale_color_manual\nplot_difference <- function(x, colors, xlab=\"Nucleotide Position\") {\n    x$difference <-  x$difference %>% toupper\n    yy = 4:1\n    names(yy) = c(\"A\", \"C\", \"G\", \"T\")\n    x$y <- yy[x$difference]\n    n <- sum(is.na(x$y))\n    if (n > 0) {\n        message(n, \" sites contain deletions or ambiguous bases, \n                which will be ignored in current implementation...\")\n    }\n    x <- x[!is.na(x$y),]\n    p <- ggplot(x, aes_(x=~position, y=~y, color=~difference))\n\n    p + geom_segment(aes_(x=~position, xend=~position, y=~y, yend=~y+.8)) +\n        xlab(xlab) + ylab(NULL) +\n        scale_y_continuous(breaks=yy, labels=names(yy)) +\n        theme_minimal() +\n        theme(legend.position=\"none\")+\n        theme(axis.text.x=element_blank(), axis.ticks.x = element_blank()) +\n        scale_color_manual(values=colors)\n}\n\n##' @importFrom ggplot2 geom_col\n##' @importFrom ggplot2 geom_area\n##' @importFrom ggplot2 theme_bw\nplot_difference_count <- function(x, width, by = 'bar', fill='red') {\n    by <- match.arg(by, c(\"bar\", \"area\"))\n    if (by == 'bar') {\n        geom <- geom_col(fill=fill, width=width)\n        keep0 <- FALSE\n    } else if (by == \"area\") {\n        geom <- geom_area(fill=fill)\n        keep0 <- TRUE\n    }\n    d <- nucleotide_difference_count(x, width, keep0)\n    p <- ggplot(d, aes_(x=~position, y=~count))\n    p + geom + xlab(NULL) + ylab(\"Difference\") + theme_bw()\n}\n\n"
  },
  {
    "path": "R/method-show.R",
    "content": "##' show method\n##'\n##'\n##' @name show\n##' @docType methods\n##' @rdname show-methods\n##' @title show method\n##' @param object SeqDiff object\n##' @return message\n##' @importFrom methods show\n##' @exportMethod show\n##' @aliases SeqDiff-class\n##'   show,SeqDiff-method\n##' @usage show(object)\n##' @examples\n##' fas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n##'                   pattern=\"fas\", full.names=TRUE)\n##' x1 <- seqdiff(fas[1], reference=1)\n##' x1\nsetMethod(\"show\",signature(object=\"SeqDiff\"),\n          function(object) {\n              ## message() pastes its arguments without a separator,\n              ## so the trailing space is needed\n              message(\"sequence differences of \", \n                      paste0(names(object@sequence), collapse=\" and \"), \n                      '\\n')\n              d <- object@diff$difference %>% table %>% as.data.frame\n              message(sum(d$Freq), \" \", \"sites differ:\\n\")\n              freq <- d[,2]\n              names(freq) <- d[,1]\n              print(freq)\n          })\n"
  },
  {
    "path": "R/methods-diff.R",
    "content": "##' @method diff SeqDiff\n##' @export\ndiff.SeqDiff <- function(x, ...) {\n    x@diff\n}\n\n\n"
  },
  {
    "path": "R/methods-ggplot_add.R",
    "content": "##' @method ggplot_add seqlogo\r\n##' @export\r\nggplot_add.seqlogo <- function(object, plot, object_name) {\r\n    msaData <- plot$layers[[1]]$data\r\n    logo_tidyData <- msa2tidy(msaData)\r\n    logo_font <- object$font\r\n    logo_color <- object[[\"color\"]]\r\n    adaptive <- object$adaptive\r\n    top <- object$top\r\n    logo_custom_color <- object[[\"custom_color\"]]\r\n    show.legend <- object$show.legend\r\n\r\n    ly_logo <- geom_logo(data  = logo_tidyData, \r\n                         font = logo_font, \r\n                         color = logo_color,\r\n                         adaptive = adaptive, \r\n                         top = top, \r\n                         custom_color = logo_custom_color, \r\n                         show.legend = show.legend)\r\n    ggplot_add(ly_logo, plot, object_name)\r\n}\r\n\r\n##' @method ggplot_add seed\r\n##' @export\r\nggplot_add.seed <- function(object, plot, object_name) {\r\n    msaData <- plot$layers[[1]]$data\r\n    seed_tidyData <- msa2tidy(msaData)\r\n    seed <- object$seed\r\n    star <- object$star\r\n\r\n    ly <- geom_seed1(seed_tidyData, seed, star)\r\n\r\n    ggplot_add(ly, plot, object_name)\r\n}\r\n\r\n\r\n\r\n##' @method ggplot_add GCcontent\r\n##' @export\r\nggplot_add.GCcontent <- function(object, plot, object_name) {\r\n    msaData <- plot$layers[[1]]$data\r\n    show.legend <- object$show.legend\r\n    GC_tidyData <- msa2tidy(msaData)\r\n\r\n    ly <- geom_GC1(GC_tidyData, show.legend = show.legend )\r\n\r\n    ggplot_add(ly, plot, object_name)\r\n}\r\n\r\n\r\n##' @importFrom ggplot2 facet_wrap\r\n##' @importFrom ggplot2 ggplot_add\r\n##' @importFrom ggplot2 scale_x_continuous\r\n##' @importFrom ggplot2 coord_cartesian\r\n##' @importFrom ggplot2 geom_blank\r\n##' @method ggplot_add facet_msa\r\n##' @export\r\nggplot_add.facet_msa <- function(object, plot, object_name){\r\n    msaData <- plot$layers[[1]]$data\r\n    field <- object$field\r\n    facetData <- facet_data(msaData, 
field)\r\n\r\n    ##update data\r\n    plot$layers[[1]]$data <- facetData #ly_bg\r\n    if (length(plot$layers) > 1){\r\n        plot$layers[[2]]$data <- facetData #ly_label\r\n    }\r\n\r\n    region <- diff(range(facetData$position))\r\n    xl_scale <- facet_scale(facetData, field)\r\n\r\n    if (region %% field == 0) {\r\n        plot + facet_wrap(.~facet, ncol = 1, scales = \"free_x\") +\r\n            scale_x_continuous(expand = c(0,0), \r\n                               breaks = xl_scale, \r\n                               labels = xl_scale) +\r\n            coord_cartesian()\r\n    }else {\r\n        max_pos <- facetData$position %>% max\r\n        min_pos <- facetData$position %>% min\r\n        max_facet <- facetData$facet %>% max\r\n        minpos_maxfacet <- facetData[facetData$facet == \r\n                                         max_facet,\"position\"] %>% min\r\n        expand_pos <-  (region %/% field + 1) * field + min_pos\r\n\r\n        dummy <- data.frame(x = c(minpos_maxfacet, expand_pos), \r\n                            facet = max_facet)\r\n        plot +\r\n            facet_wrap(.~facet, ncol = 1, scales = \"free_x\") +\r\n            geom_blank(aes_(x = ~x), dummy, inherit.aes = FALSE) +\r\n            scale_x_continuous(expand = c(0,0), \r\n                               breaks = xl_scale, \r\n                               labels = xl_scale) +\r\n            coord_cartesian()\r\n    }\r\n\r\n}\r\n\r\n##' @method ggplot_add msaBar\r\n##' @importFrom aplot insert_top\r\n##' @importFrom ggplot2 coord_cartesian\r\n##' @export\r\nggplot_add.msaBar <- function(object, plot, object_name){\r\n    msaData <- plot$layers[[1]]$data\r\n    bar_tidyData <- msa2tidy(msaData)\r\n    ly <- ly_bar(bar_tidyData)\r\n\r\n    p_bar <- ggplot() + ly_bar(bar_tidyData) + bar_theme(bar_tidyData)\r\n    plot <- plot + coord_cartesian()\r\n    p_bar %>% insert_top(plot, height = 3)\r\n}\r\n\r\n\r\n##' @method ggplot_add nucleotideeHelix\r\n##' 
@export\r\nggplot_add.nucleotideeHelix <- function(object, plot, object_name){\r\n    msa_data <- plot$layers[[1]]$data\r\n    tidy_data <- msa2tidy(msa_data)\r\n    seq_numbers <- levels(tidy_data$name) %>% length\r\n\r\n    helix_data <- object$helix_data\r\n    color_by <- object$color_by\r\n    overlap <- object$overlap\r\n\r\n    if(is.data.frame(helix_data)) {\r\n        helix_tidy <- tidy_helix(helix_data, color_by = color_by)\r\n    }else {\r\n        helix_tidy <- tidy_list_helix(helix_data, color_by = color_by)\r\n    }\r\n    ly <- layer_helix(helix_data = helix_tidy, \r\n                      overlap = overlap, \r\n                      seq_numbers = seq_numbers)\r\n    ggplot_add(ly, plot, object_name)\r\n}\r\n"
  },
  {
    "path": "R/msa_data.R",
    "content": "##' This function parses FASTA files or other sequence objects \r\n##' and assigns a color to each molecule (amino acid or nucleotide) according to\r\n##'  the selected color scheme.\r\n##'\r\n##'\r\n##' @title msa_data\r\n##' @param tidymsa sequence alignment as a data frame, generated by tidy_msa().\r\n##' @param font font family. Possible values are 'helvetical', 'mono', \r\n##' 'DroidSansMono' and 'TimesNewRoman'. Default is 'helvetical'. \r\n##' If you specify font = NULL, only the background box will be printed.\r\n##' @param color a color scheme. One of 'Clustal', 'Chemistry_AA', \r\n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \r\n##' 'Shapely_NT', 'Zappo_NT', 'Taylor_NT', 'Hydrophobicity'. \r\n##' Default is 'Chemistry_AA'.\r\n##' @param custom_color A data frame with two columns called \"names\" and \r\n##' \"color\", used to customize the color scheme.\r\n##' @param order a vector specifying the sequence order.\r\n##' @param char_width a numeric vector. Specifying the character \r\n##' width in the range of 0 to 1. Default is 0.9.\r\n##' @param by_conservation a logical value. The most conserved \r\n##' regions have the brightest colors.\r\n##' @param consensus_views a logical value that enables the consensus view.\r\n##' @param use_dot a logical value. Displays characters as dots \r\n##' instead of fading their color in the consensus view.\r\n##' @param disagreement a logical value. Displays characters that \r\n##' disagree with the consensus (excludes ambiguous disagreements).\r\n##' @param ignore_gaps a logical value. When TRUE, gaps \r\n##' in a column are treated as if that row didn't exist.\r\n##' @param ref a character string. 
Specifying the reference sequence \r\n##' which should be one of input sequences when 'consensus_views' is TRUE.\r\n##' @return A data frame\r\n##' @examples\r\n##' fasta <- system.file(\"extdata/sample.fasta\", package=\"ggmsa\")\r\n##' data <- msa_data(fasta, 20, 120, \r\n##'                  font = \"helvetical\", \r\n##'                  color = 'Chemistry_AA' )\r\n## @export\r\n##' @noRd\r\n##' @author Guangchuang Yu, Lang Zhou\r\nmsa_data <- function(tidymsa, font = \"helvetical\",\r\n                     color = \"Chemistry_AA\",\r\n                     custom_color = NULL,\r\n                     char_width = 0.9,\r\n                     by_conservation = FALSE,\r\n                     consensus_views = FALSE,\r\n                     use_dot = FALSE,\r\n                     disagreement = TRUE,\r\n                     ignore_gaps = FALSE,\r\n                     ref = NULL) {\r\n\r\n    if (is.null(custom_color)) {\r\n        color <- match.arg(color, c(\"Clustal\", \"Chemistry_AA\", \"Shapely_AA\", \r\n                                    \"Zappo_AA\", \"Taylor_AA\",\"Chemistry_NT\",\r\n                                    \"Shapely_NT\", \"Zappo_NT\", \"Taylor_NT\", \r\n                                    \"LETTER\", \"CN6\", \"Hydrophobicity\" ))\r\n    }\r\n    y <- tidymsa\r\n\r\n    ## add color\r\n    if (color == \"Clustal\"){\r\n        y <- color_Clustal(y)\r\n    }else {\r\n        if (consensus_views) {\r\n            consensus <- get_consensus(y, #extract a consensus/ref sequence\r\n                                       ignore_gaps = ignore_gaps, \r\n                                       ref = ref)\r\n            tc <- color_scheme(y, color) %>% #assigning color for other seq.\r\n                  tidy_color(consensus, disagreement, ref = ref)# tidy colors\r\n            \r\n            y <- color_scheme(consensus, color) %>% #assigning color for con/ref\r\n                 rbind(tc) #add consensus sequence\r\n\r\n            if (use_dot){\r\n  
              y[is.na(y$color), \"character\"] <- \".\"\r\n            }else {\r\n                y$font_color <- \"#000000\"\r\n                y[is.na(y$color), \"font_color\"] <- \"#aaacaf\"\r\n                y[is.na(y$color), \"color\"] <- \"#ffffff\"\r\n            }\r\n\r\n        }else {\r\n            y <- color_scheme(y, color, custom_color)\r\n        }\r\n    }\r\n\r\n    if (by_conservation){\r\n        y <- color_visibility(y)\r\n    }\r\n\r\n\r\n    if (is.null(font)) {\r\n        return(y)\r\n    }\r\n\r\n    ## calling internal polygons\r\n    font_f <- font_fam[[font]]\r\n    \r\n    #debug using'as.character()'\r\n    data_sp <- font_f[as.character(unique(y$character))]\r\n\r\n    ## To adapt to tree data\r\n    if (!'name' %in% names(y) & !consensus_views) {\r\n        if ('label' %in% names(y)) {\r\n            names(y)[names(y) == 'label'] <- \"name\"\r\n        }else {\r\n            stop(\"unknown sequence name...\")\r\n        }\r\n    }\r\n\r\n    if(!is.factor(y$name) & !consensus_views){\r\n        lev <- unique(data.frame(y[,c(\"name\",\"y\")]))\r\n        \r\n        # y is the order of the nodes in the tree\r\n        lev <- lev[order(lev$y), \"name\"] \r\n        y$name <- factor(y$name, levels = lev)\r\n    } else if(consensus_views) {\r\n        y$name <- order_name(y$name, \r\n                             consensus_views = consensus_views, \r\n                             ref = ref)\r\n    }\r\n    y$ypos <- as.numeric(y$name)\r\n\r\n    # for ggtreeExtra\r\n    if (\"new_position\" %in% colnames(y)) {\r\n        scale_n <- 5 * length(unique(y$name))/diff(range(y$new_position))\r\n        char_width <- char_width * \r\n          diff(range(y$new_position))/diff(range(y$position))\r\n    }\r\n\r\n    yy <- lapply(seq_len(nrow(y)), function(i) {\r\n        d <- y[i, ]\r\n        dd <- data_sp[[d$character]]\r\n        if(d$character == \".\"){ # '.' 
without zooming\r\n          if (\"new_position\" %in% colnames(d)){\r\n              dd$x <- dd$x - min(dd$x) + d$new_position - diff(range(dd$x))/2\r\n          }else{\r\n              dd$x <- dd$x - min(dd$x) + d$position - diff(range(dd$x))/2\r\n          }\r\n          dd$y <- dd$y - min(dd$y) + d$ypos - diff(range(dd$y))/2\r\n        }else {# other characters\r\n            char_scale <- diff(range(dd$x))/diff(range(dd$y))#equal proportion\r\n            #y_width = char_width, x-width scaled proportionally\r\n            if(diff(range(dd$x)) <= diff(range(dd$y))) {\r\n                dd$x <- dd$x * (char_width * char_scale)/diff(range(dd$x))\r\n                # for ggtreeExtra\r\n                if (\"new_position\" %in% colnames(d)){\r\n                    dd$y <- (dd$y * char_width)/diff(range(dd$y)) * scale_n\r\n                    dd$x <- dd$x - min(dd$x) + d$new_position - \r\n                      (char_width * char_scale)/2\r\n                    dd$y <- dd$y - min(dd$y) + d$ypos - scale_n * char_width/2\r\n                }else{\r\n                    dd$y <- (dd$y * char_width)/diff(range(dd$y))\r\n                    dd$x <- dd$x - min(dd$x) + d$position - \r\n                      (char_width * char_scale)/2\r\n                    dd$y <- dd$y - min(dd$y) + d$ypos - char_width/2\r\n                }\r\n            }else{#x_width = char_width, y-width scaled proportionally                                       \r\n                dd$x <- dd$x * char_width/diff(range(dd$x))\r\n                # for ggtreeExtra\r\n                if (\"new_position\" %in% colnames(d)){\r\n                    dd$y <- dd$y * \r\n                      char_width/(diff(range(dd$y)) * char_scale) * scale_n\r\n                    dd$x <- dd$x - min(dd$x) + d$new_position - char_width/2\r\n                    dd$y <- dd$y - min(dd$y) + d$ypos - \r\n                      (scale_n * char_width/char_scale)/2\r\n                }else{\r\n                    dd$y <- dd$y * 
char_width/(diff(range(dd$y)) * char_scale)\r\n                    dd$x <- dd$x - min(dd$x) + d$position - char_width/2\r\n                    dd$y <- dd$y - min(dd$y) + d$ypos - \r\n                      (char_width/char_scale)/2\r\n                }\r\n            }\r\n        }\r\n        cn <- colnames(d)\r\n        cn <- cn[!cn %in% c('x','y', 'ypos')]\r\n        for (nn in cn) {\r\n            dd[[nn]] <- d[[nn]]\r\n        }\r\n\r\n        dd$group <- paste0(\"V\", d$position, \"L\", d$ypos)\r\n        return(dd)\r\n    })\r\n\r\n    ydf <- do.call(rbind, yy)\r\n    colnames(ydf)[colnames(ydf) == 'y'] <- 'yy'\r\n    ydf$y <- as.numeric(ydf$name)\r\n    ydf <- cbind(label = ydf$name, ydf)\r\n    return(ydf)\r\n}\r\n\r\n##' Convert msa file/object to tidy data frame.\r\n##'\r\n##'\r\n##' @title tidy_msa\r\n##' @param msa multiple sequence alignment file or sequence object in \r\n##' DNAStringSet, RNAStringSet, AAStringSet, BStringSet, DNAMultipleAlignment, \r\n##' RNAMultipleAlignment, AAMultipleAlignment, DNAbin or AAbin\r\n##' @param start start position to extract subset of alignment\r\n##' @param end end position to extract subset of alignment\r\n##' @return tibble data frame\r\n##' @export\r\n##' @examples\r\n##' fasta <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\r\n##' aln <- tidy_msa(msa = fasta, start = 10, end = 100)\r\n##' @author Guangchuang Yu\r\ntidy_msa <- function(msa, start = NULL, end = NULL) {\r\n    if(inherits(msa, \"character\") && length(msa) > 1) {\r\n        aln <- msa\r\n    }else {\r\n        aln <- prepare_msa(msa)\r\n    }\r\n    alnmat <- lapply(seq_along(aln), function(i) {\r\n        ##Preventing function collisions\r\n        base::strsplit(as.character(aln[[i]]), '')[[1]]\r\n    }) %>% do.call('rbind', .)\r\n    ## for DNAbin and AAbin\r\n    alndf <- as.data.frame(alnmat, stringsAsFactors = FALSE)\r\n\r\n    if(unique(names(aln)) %>% length == length(aln)) {\r\n        alndf$name = names(aln)\r\n    
}else{\r\n      stop(\"Sequences must have unique names\")\r\n    }\r\n    cn = colnames(alndf)\r\n    cn <- cn[!cn %in% \"name\"]\r\n    df <- gather(alndf, \"position\", \"character\", cn)\r\n\r\n    y <- df\r\n    y$position = as.numeric(sub(\"V\", \"\", y$position))\r\n    y$character = toupper(y$character)\r\n\r\n    y$name = factor(y$name, levels=rev(names(aln)))\r\n\r\n\r\n    if (is.null(start)) start <- min(y$position)\r\n    if (is.null(end)) end <- max(y$position)\r\n\r\n    y <- y[y$position >=start & y$position <= end, ]\r\n\r\n    return(y)\r\n}\r\n\r\n\r\n\r\n\r\n\r\n##' This function converts the msa_data to the tidy data.\r\n##'\r\n##' @param msaData sequence alignment data generated by msa_data().\r\n##' @noRd\r\nmsa2tidy <- function(msaData) {\r\n  if (\"order\" %in% names(msaData)) {\r\n    msaData <- msaData[msaData$order == 1,]\r\n  }\r\n  df_tidy <- data.frame(name = msaData$name,\r\n                        position = msaData$position,\r\n                        character = msaData$character)\r\n  df_tidy$character <- as.character(df_tidy$character)\r\n\r\n  return(df_tidy)\r\n}\r\n\r\n\r\n"
  },
  {
    "path": "R/pp_interactive.R",
    "content": "\nmake_gap <- function(gap, previous_seq) {\n    gap_df <- previous_seq[rep(1, each=gap),] \n    gap_start <- max(previous_seq$position) + 1\n    gap_df$position <- gap_start : (gap_start + gap - 1 )\n    gap_df$character <- \"-\"\n    \n    if(\"pos_previous\"  %in% names(gap_df)) {\n        gap_df$pos_previous <- 0\n    }\n    \n    return(gap_df)\n}\n\n##' merge two MSA\n##'\n##' @title merge_seq\n##' @param previous_seq previous MSA\n##' @param subsequent_seq subsequent MSA\n##' @param gap gap length\n##' @param adjust_name logical value. merge seq name or not\n##' @return tidy MSA data frame\n##' @export\n##' @author Lang Zhou\nmerge_seq <- function(previous_seq, gap, subsequent_seq, adjust_name = TRUE) {\n    \n    name_pre <- levels(previous_seq$name)\n    name_subse <- levels(subsequent_seq$name)\n    \n    if(length(name_pre) != length(name_subse)) {\n        stop(\"The sequences number of previous_seq and subsequent_seq is inconsistent\")\n    }\n    \n    gap_df <- make_gap(gap = gap, previous_seq = previous_seq)\n    subsequent_seq$position <- \n        subsequent_seq$position - min(subsequent_seq$position) + 1\n    subsequent_seq$position <- \n        subsequent_seq$position + max(previous_seq$position) + gap\n    \n    t_merge <- rbind(previous_seq,gap_df,subsequent_seq)\n    \n    if (adjust_name) {\n        rownames(t_merge) <- seq(nrow(t_merge))\n        names(t_merge)[1] <- \"name_previous\"\n        t_merge$name <- \"\"\n        \n        for(i in seq(length(name_pre))) {\n            t_merge[t_merge$name_previous %in% c(name_pre[i], name_subse[i]),\"name\"] <- \n                paste0(name_pre[i],\"-\", name_subse[i])\n        }\n        t_merge$name <- factor(t_merge$name)\n    }\n    return(t_merge)\n}\n\n\n##' tidy protein-protein interactive position data\n##'\n##' @title tidy_hdata\n##' @param gap gap length\n##' @param inter protein-protein interactive position data\n##' @param previous_seq previous MSA\n##' @param 
subsequent_seq subsequent MSA\n##' @importFrom R4RNA as.helix\n##' @return helix data\n##' @export\n##' @author Lang Zhou\ntidy_hdata <- function(gap, inter, previous_seq,subsequent_seq) {\n    inter$j <- inter$Res.no..2 - \n        min(subsequent_seq$position) + \n        max(previous_seq$position) + gap + 1\n    hdata <- data.frame(i = inter$Res.no.1, \n                        j = inter$j,\n                        length = 1, \n                        value = NA, \n                        colour = \"blue\")\n    hdata <- as.helix(hdata)\n    return(hdata)\n}\n\n##' reset MSA position\n##'\n##' @title reset_pos\n##' @param seq_df MSA data\n##' @return data frame\n##' @export\n##' @author Lang Zhou\nreset_pos <- function(seq_df) {\n    names(seq_df)[2] <- \"pos_previous\"\n    seq_df$position <- \"\"\n    \n    for(i in unique(seq_df$pos_previous)%>% seq) {\n        uni <- unique(seq_df$pos_previous)\n        seq_df[seq_df$pos_previous == uni[i],\"position\"] <- i\n    }\n    \n    seq_df$position <- as.numeric(seq_df$position)\n    return(seq_df)\n    \n}\n\n##' reset hdata data position\n##'\n##' @title simplify_hdata \n##' @param hdata data from tidy_hdata()\n##' @param sim_msa MSA data frame\n##' @return data frame\n##' @export\n##' @author Lang Zhou\nsimplify_hdata <- function(hdata, sim_msa) {\n    \n    new_hdata <- lapply(seq(nrow(hdata)), function(a) {\n        n <- hdata[a,]\n        n$pre_i <- n$i\n        n$i <- sim_msa[sim_msa$pos_previous == n$i,\"position\"] %>% unique\n        return(n)\n    }) %>% do.call(\"rbind\",.)\n    \n    new_hdata <- lapply(seq(nrow(new_hdata)), function(a) {\n        n <- new_hdata[a,]\n        n$pre_j <- n$j\n        n$j <- sim_msa[sim_msa$pos_previous == n$j,\"position\"] %>% unique\n        return(n)\n    }) %>% do.call(\"rbind\",.)\n    \n    new_hdata <- as.helix(new_hdata)\n    \n    return(new_hdata)\n    \n}\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/prepare_fasta.R",
    "content": "##' preparing multiple sequence alignment\n##'\n##' This function supports both NT and AA sequences. It supports multiple \n##' input formats such as \"DNAStringSet\", \"BStringSet\", \"AAStringSet\", \n##' \"DNAbin\", \"AAbin\" and a filepath.\n##' @title prepare_msa\n##' @param msa a multiple sequence alignment file or object\n##' @return BStringSet based object\n##' @importFrom Biostrings DNAStringSet\n##' @importFrom Biostrings RNAStringSet\n##' @importFrom Biostrings AAStringSet\n##' @importFrom methods missingArg\n##' @importFrom seqmagick fa_read\n## @export\n##' @author Lang Zhou and Guangchuang Yu\n##' @noRd\nprepare_msa <- function(msa) {\n    if (missingArg(msa)) {\n        stop(\"no input...\")\n    } else if (inherits(msa, \"character\")) {\n        msa <- fa_read(msa)\n    } else if (!class(msa) %in% supported_msa_class) {\n        stop(\"multiple sequence alignment object not supported...\")\n    }\n\n    res <- switch(class(msa),\n                  DNAbin = DNAbin2DNAStringSet(msa),\n                  AAbin = AAbin2AAStringSet(msa),\n                  DNAMultipleAlignment = DNAStringSet(msa),\n                  RNAMultipleAlignment = RNAStringSet(msa),\n                  AAMultipleAlignment = AAStringSet(msa),\n                  msa ## DNAStringSet, RNAStringSet, AAStringSet, BStringSet\n                  )\n    return(res)\n}\n\n\nDNAbin2DNAStringSet <- function(msa) {\n    seqs <- vapply(seq_along(msa),\n                   function(i) paste0(as.character(msa[i]) %>% unlist, \n                                      collapse=''),\n                   character(1))\n    names(seqs) <- names(msa)\n\n    switch(class(msa),\n           DNAbin = DNAStringSet(seqs),\n           AAbin = AAStringSet(seqs))\n}\n\nAAbin2AAStringSet <- DNAbin2DNAStringSet\n\n\n\nsupported_msa_class <- c(\"DNAStringSet\",  \n                         \"RNAStringSet\", \n                         \"AAStringSet\", \n                         \"BStringSet\",\n                  
       \"DNAMultipleAlignment\", \n                         \"RNAMultipleAlignment\", \n                         \"AAMultipleAlignment\",\n                         \"DNAbin\", \n                         \"AAbin\")\n\n\n\n"
  },
  {
    "path": "R/read_maf.R",
    "content": "##' read 'multiple alignment format'(MAF) file\n##'\n##' @title read_maf\n##' @param multiple_alignment_format a multiple alignment format(MAF) file\n##' @return data frame\n##' @export\n##' @author Lang Zhou\nread_maf <- function(multiple_alignment_format) {\n    \n    line <- readLines(multiple_alignment_format)\n    head <- sapply(line, function(i) substring(i,1,1))\n    rm(line)# 'line' in names(heads) \n    \n    #remove header\n    head <- head[-seq(which(head == \"#\"))]\n    \n    #split block\n    blank <- which(head == \"\")\n    block_ls <- lapply(seq(blank), function(i) {\n        if (blank[i] == min(blank)) {\n            x <- names(head)[1:blank[i]]\n        }else {\n            x <- names(head)[blank[i-1]:blank[i]]\n        }\n        return(x)\n    })\n    names(block_ls) <- paste0(\"block_\",seq(length(block_ls)))\n    \n    #extra lines starting with \"s\"\n    s_block <- lapply(seq(length(block_ls)), function(i) {\n        blocki <- block_ls[[i]]\n        line_s <- blocki[sapply(blocki, function(j) substring(j,1,1))  == \"s\"] \n    }) \n    names(s_block) <- names(block_ls)\n    \n    #get a MAF df\n    s_name <- c(\"type\", \"src\", \"start\", \"size\", \"strand\", 'src_size', \"text\")\n    seq_df <-lapply(seq(length(s_block)), function(i) {\n        \n        blocki <- s_block[[i]]\n        seq_df <- lapply(seq(length(blocki)), function(j) {\n            x <- blocki[[j]]\n            #extra all columns\n            x <- strsplit(x, \" \") %>% unlist \n            x1 <- x[sapply(x, nchar) > 0]\n            #convert to data frame\n            seq <- t(as.matrix(x1)) %>% as.data.frame()\n            names(seq) <- s_name\n            seq[,c(\"start\",\"size\",'src_size')] <- \n                seq[,c(\"start\",\"size\",'src_size')] %>%as.numeric()\n            \n            seq$size_gap <- nchar(seq$text)\n            seq$end <- seq$start + seq$size\n            seq$end_gap <- seq$start + seq$size_gap\n            seq$block <- 
names(s_block[i])\n            return(seq)\n        })%>% do.call(\"rbind\", .)\n        return(seq_df)\n        \n    }) %>% do.call(\"rbind\", .)\n}\n"
  },
  {
    "path": "R/seqdiff.R",
    "content": "\n##' calculate difference of two aligned sequences\n##'\n##'\n##' @title seqdiff\n##' @param fasta fasta file\n##' @param reference which sequence serves as the reference, 1 or 2\n##' @return SeqDiff object\n##' @export\n##' @importFrom Biostrings readBStringSet\n##' @importClassesFrom Biostrings BStringSet\n##' @importFrom methods new\n##' @author guangchuang yu\n##' @examples\n##' fas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n##'                   pattern=\"fas\", full.names=TRUE)\n##' seqdiff(fas[1], reference=1)\nseqdiff <- function(fasta, reference=1) {\n    sequence <- readBStringSet(fasta)\n    if (length(sequence) != 2 || length(unique(width(sequence))) != 1) {\n        stop(\"fasta should contain 2 aligned sequences...\")\n    }\n    diff <- nucleotide_difference(sequence, reference)\n    new(\"SeqDiff\",\n        file = fasta,\n        sequence = sequence,\n        reference = reference,\n        diff = diff)\n}\n\n##' @importFrom magrittr %>%\n##' @importFrom Biostrings toString\n##' @importFrom Biostrings width\nnucleotide_difference <- function(x, reference=1) {\n    n <- width(x[1])\n    nn <- seq_len(n)\n    s1 <- x[1] %>% toString %>% substring(nn, nn)\n    s2 <- x[2] %>% toString %>% substring(nn, nn)\n\n    pos <- which(s1 != s2)\n    if (reference == 1) {\n        diff <- s2[pos]\n    } else {\n        diff <- s1[pos]\n    }\n\n    return(data.frame(position = pos,\n                      difference = diff,\n                      stringsAsFactors = FALSE))\n}\n\n\n\n\n##' @importFrom dplyr group_by\n##' @importFrom dplyr summarize\n##' @importFrom dplyr select\n##' @importFrom dplyr n\nnucleotide_difference_count <- function(x, width=50, keep0=FALSE) {\n    n <- max(x$position)\n    bin <- rep(seq_len(ceiling(n/width)), each=width)\n    position <- c(seq_len(n)[!duplicated(bin)], n)\n    x$bin <- bin[x$position]\n    y <- x %>% group_by(bin) %>%\n        summarize(position=min(position), count = n()) %>%\n        
select(-bin)\n    y$position <- position[findInterval(y$position, position)]\n    if (keep0) {\n        itv <- seq(1, n, width)\n        yy <- data.frame(position = itv[!itv %in% y$position],\n                         count = 0)\n        y <- rbind(y, yy)\n        y <- y[order(y$position, decreasing=FALSE),]\n    }\n    return(y)\n}\n\n"
  },
  {
    "path": "R/seqlogo.R",
    "content": "##' plot sequence logo for MSA based on 'ggplot2'\n\n##' @title seqlogo\n##' @param msa Multiple sequence alignment file or object for representing \n##' either nucleotide sequences or peptide sequences.\n##' @param start Start position to plot.\n##' @param end End position to plot.\n##' @param font font family; possible values are 'helvetical', 'mono', \n##' 'DroidSansMono' and 'TimesNewRoman'. Defaults to 'DroidSansMono'. \n##' If font=NULL, only the background tiles are drawn.\n##' @param color A color scheme. One of 'Clustal', 'Chemistry_AA', \n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n##' 'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults to 'Chemistry_AA'.\n##' @param custom_color A data frame with two columns called \"names\" and \n##' \"color\", used to customize the color scheme.\n##' @param adaptive A logical value indicating whether the overall height of \n##' seqlogo corresponds to the number of sequences. If FALSE, the overall \n##' height of seqlogo is fixed at 4.\n##' @param top A logical value. 
If TRUE, seqlogo is aligned to the top of MSA.\n##' @return ggplot object\n##' @examples\n##' #plot sequence motif independently\n##' nt_sequence <- system.file(\"extdata\", \"LeaderRepeat_All.fa\", \n##'                            package = \"ggmsa\")\n##' seqlogo(nt_sequence, color = \"Chemistry_NT\")\n##' @export\n##' @author Lang Zhou\nseqlogo <- function(msa, \n                    start = NULL, \n                    end = NULL, \n                    font = \"DroidSansMono\", \n                    color = \"Chemistry_AA\", \n                    adaptive = FALSE, \n                    top = FALSE, \n                    custom_color = NULL) {\n  \n    data <- tidy_msa(msa, start = start, end = end)\n    ggplot() + geom_logo(data, \n                         font = font, \n                         color = color, \n                         adaptive = adaptive, \n                         top = top, \n                         custom_color = custom_color) +\n        theme_minimal() + xlab(NULL) + ylab(NULL) +\n        theme(legend.position = 'none') + \n        theme(panel.grid = element_blank(), axis.text.y = element_blank()) +\n        coord_fixed()\n}\n\n##' Multiple sequence alignment layer for ggplot2. It plot sequence motifs.\n\n##' @title geom_seqlogo\n##' @param font font families, possible values are 'helvetical', 'mono', \n##' and 'DroidSansMono', 'TimesNewRoman'. Defaults is 'DroidSansMono'.\n##' @param color A Color scheme. One of 'Clustal', 'Chemistry_AA', \n##' 'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n##' 'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.\n##' @param custom_color A data frame with two cloumn called \"names\" and \n##' \"color\".Customize the color scheme.\n##' @param adaptive A logical value indicating whether the overall height \n##' of seqlogo corresponds to the number of sequences.If is FALSE, \n##' seqlogo overall height = 4,fixedly.\n##' @param top A logical value. 
If TRUE, seqlogo is aligned to the top of MSA.\n##' @param show.legend logical. Should this layer be included in the legends?\n##' @param ... additional parameter\n##' @return A list\n##' @examples\n##' #plot multiple sequence alignment and sequence motifs\n##' f <- system.file(\"extdata/LeaderRepeat_All.fa\", package=\"ggmsa\")\n##' ggmsa(f,font = NULL,color = \"Chemistry_NT\") + geom_seqlogo()\n##' @export\n##' @author Lang Zhou\ngeom_seqlogo <- function(font = \"DroidSansMono\", color = \"Chemistry_AA\", \n                         adaptive = TRUE, top = TRUE, custom_color = NULL, \n                         show.legend = FALSE, ...) {\n    structure(list(font = font,\n                   color = color,\n                   adaptive = adaptive,\n                   top = top,\n                   custom_color = custom_color,\n                   show.legend = show.legend),\n              class = \"seqlogo\")\n}\n\n\ngeom_logo <- function(data, font = \"DroidSansMono\", color = \"Chemistry_AA\",\n                      adaptive = FALSE, top = TRUE, custom_color = NULL, \n                      show.legend = FALSE, ...) 
{\n    mapping  <- aes_(x = ~logo_x, \n                     y = ~logo_y,  \n                     group = ~group, \n                     fill = ~I(color))\n    logo_data <- seqlogo_data(data, font = font, color = color, \n                              adaptive = adaptive, top = top, \n                              custom_color = custom_color)\n\n    ly_logo <- geom_polygon(mapping = mapping, data = logo_data, \n                            inherit.aes = FALSE, show.legend = show.legend)\n    return(ly_logo)\n}\n\nseqlogo_data <- function(data, font = \"DroidSansMono\", \n                         color = \"Chemistry_AA\", adaptive = FALSE, \n                         top = TRUE, custom_color = NULL){\n    tidy <- data\n\n    if (color == \"Clustal\") {\n        tidy <- color_Clustal(tidy)\n    } else{\n        tidy <- color_scheme(tidy, color, custom_color)\n    }\n\n    if (adaptive) {\n        seq_number  <-  as.character(unique(tidy[[1]]))\n        total_heigh <- length(seq_number) / 6\n    } else {\n        total_heigh <- 4\n    }\n\n    #total_heigh <- getOption(\"total_heigh\")\n    logo_width <- getOption(\"logo_width\")\n    ## assign the start postion to the first label\n    col_num <- as.numeric(levels(factor(tidy$position))) \n    moti_da <- lapply(col_num, function(j){\n        ## Calculate the char frequency in each column\n        clo <- tidy[tidy$position == j, ] \n        fre <- prop.table(table(clo$character))\n        ## total_heigh is overall hight, the height of each char is assigned.\n        ywidth <- sort(total_heigh * fre ) \n        ## calling color scheme\n        column_char_color <- data.frame(unique(clo[c(\"character\", \"color\")])) \n        font_f <- font_fam[[font]]\n        motif_char <- font_f[names(ywidth)]\n        ds_ <- lapply(seq_along(motif_char), function(i){\n            ds_ <- motif_char[[i]]\n            names(ds_)[names(ds_) == \"x\"] <- \"logo_x\"\n            names(ds_)[names(ds_) == \"y\"] <- \"logo_y\"\n            
ds_$char <- names(motif_char[i])\n            #width = .9\n            ds_$logo_x <- ds_$logo_x * logo_width/diff(range(ds_$logo_x)) \n            #hight = overall hight * frequency\n            ds_$logo_y <- ds_$logo_y * ywidth[[i]]/diff(range(ds_$logo_y))\n            ymotif <- sum(ywidth[0:(i - 1)]) # sum-hight currently\n            #  moving char horizontally\n            ds_$logo_x <- ds_$logo_x - min(ds_$logo_x) - logo_width/2 + j\n            ds_$logo_y <- ds_$logo_y - min(ds_$logo_y) - ywidth[[i]]/2 + \n                          ymotif + ywidth[[i]]/2\n            if (top) {\n              ds_$logo_y <- ds_$logo_y + nrow(tidy[tidy$position == j, ]) + .5\n            }\n            ## ds_$y - min(ds_$y) - ywidth[[i]]/2: Centered at zero\n            ## + ymotif: sum-hight that are below the char currently\n            ## + ywidth[[i]]/2: the char height currently\n            ds_$group <- paste0(\"P\", j, '-', \"Char\", names(motif_char[i]))\n            ds_$color <- column_char_color[column_char_color$character == \n                                           unique(ds_$char), \"color\"]\n            return(ds_)\n         })\n        ds <- do.call(rbind, ds_)\n        return(ds)\n  })\n    moti_da <- do.call(rbind, moti_da)\n    moti_da$name <- as.character(tidy[1,1])\n    other_cn <- names(moti_da)[!names(moti_da) == 'name']\n    moti_da <- moti_da[c(\"name\", other_cn)]\n    add_col <- tidy[,!names(tidy) %in% names(moti_da)]\n    moti_da <- cbind(add_col[1,], moti_da, row.names = NULL)\n    return(moti_da)\n}\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n\n"
  },
  {
    "path": "R/simplot.R",
    "content": "##' Sequence similarity plot\n##'\n##'\n##' @title simplot\n##' @param file alignment fasta file\n##' @param query query sequence\n##' @param window sliding window size (bp)\n##' @param step step size to slide the window (bp)\n##' @param group whether to group sequences (e.g. for \"A-seq1, A-seq2, B-seq1 \n##' and B-seq2\", use sep = \"-\" and id = 1 to divide sequences into groups A \n##' and B)\n##' @param id position to extract id for grouping; only works if group = TRUE\n##' @param sep separator to split sequence name; only works if group = TRUE\n##' @param sd whether to display the standard deviation of \n##' similarity within each group; only works if group = TRUE\n##' @param smooth FALSE (default) or TRUE; whether to display a smoothed curve.\n##' @param smooth_params a list of parameters passed to geom_smooth\n##' (default: smooth_params = list(method = \"loess\", se = FALSE))\n##' @return ggplot object\n##' @importFrom Biostrings readDNAStringSet\n##' @importFrom ggplot2 aes_\n##' @importFrom ggplot2 geom_line\n##' @importFrom ggplot2 ggtitle\n##' @importFrom ggplot2 geom_ribbon\n##' @importFrom ggplot2 geom_smooth\n##' @importFrom magrittr %<>%\n##' @importFrom dplyr group_by_\n##' @importFrom dplyr summarize_\n##' @export\n##' @author guangchuang yu\n##' @examples\n##' fas <- system.file(\"extdata/GVariation/sample_alignment.fa\", \n##'                     package=\"ggmsa\")\n##' simplot(fas, 'CF_YL21')\nsimplot <- function(file, \n                    query, \n                    window=200, \n                    step=20, \n                    group=FALSE, \n                    id, \n                    sep, \n                    sd=FALSE,\n                    smooth = FALSE,\n                    smooth_params = list(method = \"loess\", \n                                         se = FALSE)) {\n    aln <- readDNAStringSet(file)\n    nn <- names(aln)\n    if (group) {\n        g <- vapply(strsplit(nn, sep), function(x) x[id], character(1))\n    }\n\n    idx <- 
which(nn != query)\n    w <- width(aln[query])\n    start <- seq(1, w, by=step)\n    end <- start + window - 1\n    start <- start[end <= w]\n    end <- end[end <= w]\n    res <- lapply(idx, function(i) {\n        x <- toCharacter(aln[i]) == toCharacter(aln[query])\n        pos <- round((start+end)/2)\n        sim <- vapply(seq_along(start), function(j) {\n            mean(x[start[j]:end[j]])\n        }, numeric(1))\n\n        y <- data.frame(sequence=nn[i], position = pos, similarity = sim)\n        if(group) {\n            y$group <- g[i]\n        }\n        return(y)\n    }) %>% do.call(rbind, .)\n\n    if (group) {\n        res %<>% group_by_(~position, ~group) %>%\n            summarize_(msim=~mean(similarity), sd=~sd(similarity))\n    }\n\n\n    if (group) {\n        p <- ggplot(res, aes_(x=~position, y=~msim, group=~group))\n        if (sd) p <- p + geom_ribbon(aes_(ymin=~msim-sd, \n                                          ymax=~msim+sd, \n                                          fill=~group), alpha=.25)\n        if (smooth) {\n            smooth_layer <- do.call(geom_smooth, \n                                    smooth_params)\n            p <- p + smooth_layer\n        } else {\n            p <- p + geom_line(aes_(color=~group))\n        }\n        \n        \n    } else {\n        mapping = aes_(x=~position, \n                       y=~similarity,\n                       group=~sequence, \n                       color=~sequence)\n        p <- ggplot(res, mapping = mapping) \n        \n        if (smooth) {\n            smooth_layer <- do.call(geom_smooth, \n                                    smooth_params)\n            p <- p + smooth_layer\n            \n        } else {\n            p <- p + geom_line()\n        }\n    }\n\n    p + xlab(\"Nucleotide Position\") + ylab(\"Similarity (%)\") +\n        ggtitle(paste(\"Sequence similarities compare to\", query)) +\n        theme_minimal() +\n        theme(legend.title=element_blank()) \n}\n\n\ntoCharacter 
<- function(x) {\n    unlist(strsplit(toString(x),\"\"))\n}\n\n\n"
  },
  {
    "path": "R/theme_msa.R",
    "content": "##' Theme for ggmsa.\n##'\n##' @title theme_msa\n##' @importFrom ggplot2 theme_minimal\n##' @importFrom ggplot2 labs\n##' @export\n##' @author Lang Zhou\ntheme_msa <- function(){\n  list(\n    xlab(NULL),\n    ylab(NULL),\n    labs(fill = \"Fills\"),\n    coord_fixed(),\n    scale_x_continuous(expand = c(0,0)),\n    theme_minimal() +\n        theme(\n            strip.text = element_blank(),\n            panel.spacing.y = unit(.4, \"in\"),\n            panel.grid = element_blank())\n  )\n}\n\n\n##' @importFrom grDevices colorRampPalette\n##' @importFrom RColorBrewer brewer.pal\n##' @importFrom ggplot2 coord_cartesian\n##' @importFrom ggplot2 scale_x_continuous\n##' @importFrom ggplot2 scale_y_continuous\n##' @importFrom ggplot2 scale_fill_gradientn\nbar_theme <- function(tidy){\n    data <- bar_data(tidy)\n    color_palettes <- colorRampPalette(brewer.pal(n = 9, \n                                                  name = \"Blues\")[c(4:7)])\n    list(\n        xlab(NULL),\n        ylab(\"consensus\"),\n        scale_x_continuous(breaks = data[[3]], \n                           labels = data[[1]],\n                           expand = c(0,0)),\n        scale_y_continuous(breaks = NULL),\n        scale_fill_gradientn(colours = color_palettes(100)),\n        theme_minimal() +\n            theme(panel.grid.minor.x = element_blank(), \n                  panel.grid.major.x = element_blank())\n        )\n}\n\nfacet_scale <- function(facetData, field) {\n    facet0_pos <- facetData[facetData$facet == 0,\"position\"]\n    msa_start <- min(facet0_pos)\n    \n    ## x labels of facet 0\n    facet0_xl_scale <- pretty(min(facet0_pos):max(facet0_pos)) \n    \n    ## assign the start postion to the first label\n    facet0_xl_scale[1] <- msa_start \n    xl_scale <- facet0_xl_scale\n    for(i in max(facetData$facet) %>% seq_len) {\n        scale_i <- facet0_xl_scale + field * i\n        if(msa_start > 1) scale_i[1] <- scale_i[1] + 1\n        #print(scale_i)\n        
xl_scale <- xl_scale %>% c(scale_i)\n    }\n    max_pos <- facetData$position %>% max\n    xl_scale <- xl_scale[xl_scale <= max_pos]\n    return(xl_scale)\n}\n\n\n\n\n"
  },
  {
    "path": "R/zzz.R",
    "content": "#' @importFrom utils packageDescription\n.onAttach <- function(libname, pkgname){\n    #options(total_heigh = 4)\n    options(logo_width = 0.9)\n    options(asterisk_width = .03)\n    options(GC_pos = 2)\n    options(shadingLen = .5)\n    options(shading_alpha = .3)\n    \n    pkgVersion <- packageDescription(pkgname, fields=\"Version\")\n    msg <- paste0(pkgname, \" v\", pkgVersion, \"  \",\n                  \"Document: http://yulab-smu.top/ggmsa/\", \"\\n\\n\")\n    citation <- paste0(\"If you use \", pkgname,\n                       \" in published research, please cite:\\n\",\n                       \"L Zhou, T Feng, S Xu, F Gao, TT Lam, Q Wang, T Wu, \",\n                       \"H Huang, L Zhan, L Li, Y Guan, Z Dai*, G Yu* \",\n                       \"ggmsa: a visual exploration tool for multiple sequence alignment and associated data. \",\n                       \"Briefings in Bioinformatics. DOI:10.1093/bib/bbac222\")\n    packageStartupMessage(paste0(msg, citation))\n    \n}"
  },
  {
    "path": "README.Rmd",
    "content": "---\noutput: \n  md_document:\n    variant: gfm\nhtml_preview: TRUE\n---\n<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  fig.path = \"man/figures/REAMED-\",\n  message = FALSE,\n  warning = FALSE\n)\n```\n#  ggmsa: a visual exploration tool for multiple sequence alignment and associated data <img src=\"man/figures/logo.png\" height=\"140\" align=\"right\" />\n\n```{r echo=FALSE, results=\"hide\", message=FALSE}\nlibrary(badger)\n```\n\n```{r, echo = FALSE, results='asis'}\ncat(\n\tbadge_devel(\"YuLab-SMU/ggmsa\", \"blue\"),\n\tbadge_lifecycle(\"experimental\", \"orange\"),\n\tbadge_license(\"Artistic-2.0\")\n)\n```\n<!-- badges: start -->\n<!-- [![CRAN_Release_Badge](https://www.r-pkg.org/badges/version-ago/ggmsa)](https://cran.r-project.org/package=ggmsa)-->\n<!-- [![CRAN_Download_Badge](https://cranlogs.r-pkg.org/badges/grand-total/ggmsa?color=green)](https://cran.r-project.org/package=ggmsa)-->\n<!-- badges: end -->\n\n\n`ggmsa` is designed for visualization and annotation of multiple sequence alignments. It implements functions that make visualizing publication-quality multiple sequence alignments (protein/DNA/RNA) in R simple and powerful. 
\n\nFor details, please visit <http://yulab-smu.top/ggmsa/>\n\n\n##  :hammer: Installation\n\nThe released version from `Bioconductor`\n\n```{r eval=FALSE}\nif (!requireNamespace(\"BiocManager\", quietly=TRUE))\n    install.packages(\"BiocManager\")\n## BiocManager::install(\"BiocUpgrade\") ## you may need this\nBiocManager::install(\"ggmsa\")\n```\n\nAlternatively, you can grab the development version from github using devtools:\n\n```{r eval=FALSE}\nif (!requireNamespace(\"devtools\", quietly=TRUE))\n    install.packages(\"devtools\")\ndevtools::install_github(\"YuLab-SMU/ggmsa\")\n```\n\n\n##  :bulb: Quick Example \n\n```{r fig.height = 2.5, fig.width = 11, message=FALSE, warning=FALSE, dpi=300}\nlibrary(ggmsa)\nprotein_sequences <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\nggmsa(protein_sequences, start = 221, end = 280, char_width = 0.5, seq_name = TRUE) + geom_seqlogo() + geom_msaBar()\n```\n\n\n##  :books: Learn more\n\nCheck out the guides for learning everything there is to know about all the different features:\n\n- [Getting Started](https://yulab-smu.github.io/ggmsa/articles/ggmsa.html)\n- [Annotations](https://yulab-smu.github.io/ggmsa/articles/guides/Annotations.html)\n- [Color Schemes and Font Families](https://yulab-smu.github.io/ggmsa/articles/guides/Color_schemes_And_Font_Families.html)\n- [Theme](https://yulab-smu.github.io/ggmsa/articles/guides/MSA_theme.html)\n- [Other Modules](https://yulab-smu.github.io/ggmsa/articles/guides/Other_Modules.html)\n- [View Modes](https://yulab-smu.github.io/ggmsa/articles/guides/View_modes.html)\n\n##  :runner: Author\n\n- [Guangchuang Yu](https://guangchuangyu.github.io)  Professor, PI \n- [Lang Zhou](https://github.com/nyzhoulang)  Master's Student\n- [Shuangbin Xu](https://github.com/xiangpin)  PhD Student\n\n**YuLab**  <https://yulab-smu.top/>\n\n**Department of Bioinformatics, School of Basic Medical Sciences, Southern Medical University**\n\n\n\n## :sparkling_heart: Contributing\n\nWe 
welcome any contributions! By participating in this project you agree to abide\nby the terms outlined in the [Contributor Code of Conduct](https://github.com/YuLab-SMU/ggmsa/blob/master/CONDUCT.md).\n\n\n"
  },
  {
    "path": "README.md",
    "content": "<!-- README.md is generated from README.Rmd. Please edit that file -->\n\n# ggmsa:a visual exploration tool for multiple sequence alignment and associated data <img src=\"man/figures/logo.png\" height=\"140\" align=\"right\" />\n\n[![](https://img.shields.io/badge/devel%20version-1.3.2-blue.svg)](https://github.com/YuLab-SMU/ggmsa)\n[![](https://img.shields.io/badge/lifecycle-experimental-orange.svg)](https://lifecycle.r-lib.org/articles/stages.html#experimental)\n[![License:\nArtistic-2.0](https://img.shields.io/badge/license-Artistic--2.0-blue.svg)](https://cran.r-project.org/web/licenses/Artistic-2.0)\n<!-- badges: start -->\n<!-- [![CRAN_Release_Badge](https://www.r-pkg.org/badges/version-ago/ggmsa)](https://cran.r-project.org/package=ggmsa)-->\n<!-- [![CRAN_Download_Badge](https://cranlogs.r-pkg.org/badges/grand-total/ggmsa?color=green)](https://cran.r-project.org/package=ggmsa)-->\n<!-- badges: end -->\n\n`ggmsa` is designed for visualization and annotation of multiple\nsequence alignment. 
It implements functions that make visualizing\npublication-quality multiple sequence alignments (protein/DNA/RNA) in R\nsimple and powerful.\n\nFor details, please visit <http://yulab-smu.top/ggmsa/>\n\n## :hammer: Installation\n\nInstall the released version from `Bioconductor`:\n\n``` r\nif (!requireNamespace(\"BiocManager\", quietly=TRUE))\n    install.packages(\"BiocManager\")\n## BiocManager::install(\"BiocUpgrade\") ## you may need this\nBiocManager::install(\"ggmsa\")\n```\n\nAlternatively, you can grab the development version from GitHub using\ndevtools:\n\n``` r\nif (!requireNamespace(\"devtools\", quietly=TRUE))\n    install.packages(\"devtools\")\ndevtools::install_github(\"YuLab-SMU/ggmsa\")\n```\n\n## :bulb: Quick Example\n\n``` r\nlibrary(ggmsa)\nprotein_sequences <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\nggmsa(protein_sequences, start = 221, end = 280, char_width = 0.5, seq_name = TRUE) + geom_seqlogo() + geom_msaBar()\n```\n\n![](man/figures/REAMED-unnamed-chunk-6-1.png)<!-- -->\n\n## :books: Learn more\n\nCheck out the guides for learning everything there is to know about all\nthe different features:\n\n-   [Getting\n    Started](https://yulab-smu.github.io/ggmsa/articles/ggmsa.html)\n-   [Annotations](https://yulab-smu.github.io/ggmsa/articles/guides/Annotations.html)\n-   [Color Schemes and Font\n    Families](https://yulab-smu.github.io/ggmsa/articles/guides/Color_schemes_And_Font_Families.html)\n-   [Theme](https://yulab-smu.github.io/ggmsa/articles/guides/MSA_theme.html)\n-   [Other\n    Modules](https://yulab-smu.github.io/ggmsa/articles/guides/Other_Modules.html)\n-   [View\n    Modes](https://yulab-smu.github.io/ggmsa/articles/guides/View_modes.html)\n\n## :runner: Author\n\n-   [Guangchuang Yu](https://guangchuangyu.github.io) Professor, PI\n-   [Lang Zhou](https://github.com/nyzhoulang) Master’s Student\n-   [Shuangbin Xu](https://github.com/xiangpin) PhD Student\n\n**YuLab** <https://yulab-smu.top/>\n\n**Department of 
Bioinformatics, School of Basic Medical Sciences,\nSouthern Medical University**\n\n## :sparkling_heart: Contributing\n\nWe welcome any contributions! By participating in this project you agree\nto abide by the terms outlined in the [Contributor Code of\nConduct](https://github.com/YuLab-SMU/ggmsa/blob/master/CONDUCT.md).\n"
  },
  {
    "path": "inst/CITATION",
    "content": "citHeader(\"To cite ggmsa in publications use:\")\n\ncitEntry(\n    entry  = \"book\",\n    title = \"Data Integration, Manipulation and Visualization of Phylogenetic Treess\",\n    author = person(\"Guangchuang\", \"Yu\"),\n\tpublisher = \"Chapman and Hall/{CRC}\",\n    year = \"2022\",\n\tedition = \"1st edition\",\n    url = \"https://www.amazon.com/Integration-Manipulation-Visualization-Phylogenetic-Computational-ebook/dp/B0B5NLZR1Z/\",\n    textVersion = paste(\"Guangchuang Yu. (2022).\",\n                        \"Data Integration, Manipulation and Visualization of Phylogenetic Trees (1st edition).\",\n                        \"Chapman and Hall/CRC.\")   \n)\n\n\ncitEntry(\n    entry  = \"article\",\n    title  = \"ggmsa: a visual exploration tool for multiple sequence alignment and associated data \",\n    author = personList(\n        as.person(\"Lang Zhou\"),\n        as.person(\"Tingze Feng\"),\n        as.person(\"Shuangbin Xu\"),\n        as.person(\"Fangluan Gao\"),\n        as.person(\"Tommy T Lam\"),\n        as.person(\"Qianwen Wang\"),\n        as.person(\"Tianzhi Wu\"),\n        as.person(\"Huina Huang\"),\n        as.person(\"Li Zhan\"),\n        as.person(\"Lin Li\"),\n        as.person(\"Yi Guan\"),\n        as.person(\"Zehan Dai\"),\n        as.person(\"Guangchuang Yu\")\n        ),\n    journal = \"BRIEFINGS IN BIOINFORMATICS\",\n    volume  = \"23\",\n    issue   = \"4\",\n    year    = \"2022\",\n    month   = \"06\",\n    ISSN    = \"1467-5463\",\n    doi     = \"10.1093/bib/bbac222\",\n    PMID    = \"35671504\",\n    url     = \"https://academic.oup.com/bib/article-abstract/23/4/bbac222/6603927\",\n    textVersion = paste(\"L Zhou, T Feng, S Xu, F Gao, TT Lam, Q Wang, T Wu, H Huang, L Zhan, L Li, Y Guan, Z Dai, G Yu.\",\n                        \"ggmsa: a visual exploration tool for multiple sequence alignment and associated data.\",\n                        \"Bioinformatics. 2022, 23(4):bbac222. 10.1093/bib/bbac222\")\n)"
  },
  {
    "path": "inst/extdata/GVariation/A.Mont.fas",
    "content": ">Mont\nATGGCAACTTACACATCAACAATCCAGTTTGGTTCCATTGAATGCAAACTTCCATACTCACCCGCTCCTTTTGGGCTAGTTGCGGGGAAACGAGAAGTTTTAACCACCACTGACCCCTTCGCAAGTTTGGAGATGCAGCTTAGTGCGCGATTACGAAGGCAAGAGTTTGCAACTATTCGAACATCCAAGAATGGTACTTGCATGTATCGATACAAGACTGATGTCCAGATTGCGCGCATTCAAAAGAAGCGCGAGGAAAGAGAAAGAGAGGAATATAATTTCCAAATGGCTGCGTCAAGTGTTGTGTCGAAGATCACTATTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAGCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACCAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACTCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGTTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGCATCTCGCCAGGACGGACAAGTGGACTAACCAAGTTCGTGCTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGTAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCCTATTCATAGTGCGTGGGTCGCACGAAGGAAAAATCTATGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGATTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGAGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTTGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTCTAAATCGATTGGGGGCAGACAAAGATCGCTTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAAGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTGAAAGGAAAGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATAATATCAAGAAAGGAGACATCTCGTTCTTTAGGAATAAACTATCTGCAAAAGCAAATTGGAACTTGTATCTGTCATGTGATAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGAGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAAACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTATAAAAGACAGCCAGGGGTGAGTAAGAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAGTCAACATTTTACCCGCCAACTAAGAAGCACCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGAATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTTCTCGCGATGTTGATTAACATTAGTGAGGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTTGGA
ACCTGGCCAACCATGATGGATTTGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACGAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCCCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTGGAGCATGCCCTGAGCTTGGGTCCACAATATCACCTTTTAGAGAAGGAGGAATCATAATGTCTGAGTCAGCAGCGCTAAAACTGCTCCTAAAGGGAATTTTTAGGCCCAAAGTGATGAAGCAATTGCTACTGGATGAACCATATTTGCTCATTTTATCGATATTATCTCCTGGTATACTTATGGCCATGTACAACAATGGGATATTTGAGTTAGCGGTGAAGTTGTGGATCAATGAGAAACAATCTATAGCCATGATAGCATCGTTATTGTCCGCCTTGGCTTTACGAGTGTCAGCAGCAGAAACACTCGTTGCACAGAGGATTATAATTGACACGGCAGCAACAGATCTTCTCGATGCTACGTGTGATGGATTCAACTTACATCTAACATATCCCACTGCACTCATGGTGTTGCAAGTTGTTAAGAACAGAAATGAATGTGATGATACGTTGTTTAAAGCAGGTTTTTCACATTACAACATGAGTGTCGTGCAGATTATGGAAAAAAATTATCTAAGCCTCTTGGGCGATGCTTGGAAAGATTTAACCTGGCGAGAAAAATTATCCGCAACATGGCACTCATACAAAGCAAAGCGCTCTATCACTCAGTTCATAAAACCCATAGGCAAAGCAGATTTAAAAGGGTTGTACAACATATCACCGCAAGCATTCTTGGGTCAGGGCGTACAGAGAGTCAAAGGCACCGCCTCAGGGTTGAATGAGCGACTCAATAATTATATCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATTTTCCGGCGCTTGCCAACTTTTGTAACTTTCATTAATTCATTATTAGTTATTAGTATGCTAACTAGTGTAGTAGCAGTGTGTCAAGCAATAATTCTAGATCAAAGGAAGTATAGAAAAGAAATTGAGTTGATGCAGATTGAGAAGAATGAAATTGTTTGTATGGAGTTGTATGCGAGTCTGCAGCGCAAACTTGAGCGTGAATTCACATGGGATGAATATATGGAATATTTGAAATCTGTGAATCCCCAGATAGTTCAATTCGCGCAAGCTCAAATGGAAGAATATAATGTGCGACATCAGCGCTCCACACCAGGTGTTAAGAATTTAGAGCAGGTGGTAGCATTTATAACTCTAATTATCATGATGTTTGATGCTGAAAGGAGCGACTGTGTATTTAAGACTCTCAACAAATTCAAAGGCATCGTTTCTTCAATGGATCATGAAGTTAGACACCAGTCCTTGGATGATGTAATCAAGAATTTCGATGAAAGGAACGAAGTTATTGATTTTGAACTAAATGAGGATACAATTAAAACATCATCAGTGTTGGACACAAAGTTTAGCGACTGGTGGGATCGGCAAATCCAAATGGGACACACACTTCCCCATTATAGAACTGAGGGACACTTCATGGAATTCACAAGGGCAACTGCTGTACAAGTGGCCAACGACATCGCGCATAGTGAGCACCTAGACTTTCTAGTGAGGGGAGCTGTTGGGTCTGGAAAATCTACTGGACTGCCTGTCCATCTCAGTGCAGCTGGATCTGTGCTTTTGATAGAACCAACTCGACCACTTGCAGAAAACGTGTTCAAGCAATTATCCAGTGAACCGTTTTTCAAGAAGCCAACACTGCGCATGCGAGGAAATAGTGTGTTTGGTTCCTCTCCAATCTCCATTATGACTAGCGGCTTTGCGTTGCACTACTATGCTAATAATCGCTCTCAGCTAACTCAGTTTAA
TTTCATAATTTTTGATGAATGTCATGTTTTAGATCCTTCTGCAATGGCATTTCGTAGCTTGTTAAGTGTGTATCACCAAACATGCAAAGTGTTAAAGGTGTCAGCCACTCCAGTGGGAAGGGAGGTCGAGTTCACAACACAACAACCAGTTAAATTGGTGGTTGAGGATACACTTTCATTCCAATCTTTTGTTGATGCGCAAGGCTCAAAAACCAATGCTGACGTAGTTCAGCATGGTTCGAACATACTCGTGTATGTGTCGAGTTACAATGAAGTGGATACATTAGCCAAGCTTCTAACAGATAGGAATATGATAGTCTCAAAAGTTGATGGCAGAACAATGAAGCACGGATGCTTAGAAATTGTAACGAAAGGGACTAGTGCAAAGCCACATTTTGTCGTAGCAACCAACATTATTGAAAATGGAGTAACTTTAGATATAGATGTAGTTGTAGATTTTGGGCTTAAAGTCTCACCGTTTTTAGATATTGACAATAGGAGCATAGCATACAATAAGATTAGTGTTAGCTATGGAGAAAGAATTCAGAGGTTGGGCCGTGTTGGGCGCTTTAAGAAGGGAGTGGCATTGCGTATTGGACACACCGAAAAGGGAATTATTGAGATTCCAAGTATGATTGCTAGTGAAGCTGCGCTTGCGTGCTTTGCATACAATTTGCCAGTAATGACAGGGGGTGTTTCAACTAGCCTCATTGGCAATTGTACTGTTCGTCAAGTTAAAACTATGCAACAATTTGAGCTGAGTCCATTCTTTATACAAAATTTTGTTGCCCATGATGGATCAATGCATCCTGTCATACATGACATTCTTAAGAAGTATAAACTGCGAGATTGTATGACGCCCTTGTGTGATCAATCCATACCTTACAGAGCCTCAAGCACTTGGTTGTCTGTTAGTGAGTACGAACGACTCGGAGTGGTTTTGGACATTCCAAAACAGATCAAGATTGCATTCCACATCAAGGATATCCCTCCTAAGTTGCATGAAATGCTTTGGGAAACAGTTATCAAATATAAGGATGTTTGTTTGTTTCCAAGTATTCGGGCTTCATCCATTAGCAAAATTGCATACACACTGCGCACTGATCTTTTTGCAATTCCCAGAACCCTAATTCTAGTTGAAAGATTGCTCGAGGAGGAACGAGTGAAACAGAGTCAATTCAGAAGTCTCATTGATGAAGGATGCTCAAGCATGTTTTCAATTGTTAATTTAACAAACACTCTTAGAGCTAGATATGCAAAGGATTACACTGCAGAAAACATACAGAAGCTCGAGAAAGTGAGGAGTCAGTTAAAGGAGTTCTCAAATTTAAATGGCTCTGCATGCGAGGAGAACTTAATGAAGAGGTATGAATCTCTACAGTTTGTGCATCATCAAGCAACAACTGCACTCGCAAAGGATTTGAAGTTGAAAGGAGTTTGGAAGAAGTCATTAGTTGTGCAGGACTTAATCATAGCGGGTGCCGTTGCTATTGGTGGAATAGGGCTCATCTATAGTTGGTTTACTCAATCAGTTGAAACTGTGTCTCACCAGGGCAAGAACAAATCCAAAAGAATTCAAGCATTGAAGTTTCGACACGCCCGCGATAAGAGGGCTGGTTTTGAAATTGATAACAATGATGATACAATAGAAGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGCACCACTGTTGGTATGGGCAAGTCAAGCAGGAGGTTTGTTAATATGTATGGATTTGACCCAACAGAATATTCATTCATCCAGTTCGTTGATCCGCTCACTGGAGCTCAAATTGAAGAGAACGTCTATGCTGATATTAGAGACATCCAAGAGCGCTTTAGTGATGTCCGCAAGAAAATGGTAGAGGATGATGAAATCGAATTGCAAGCATTGGGCAGCAACACAATCATTCATGCTTACTTCAGGAAGGATTGGTCTGACAAGGCTCTAAAAATTGATTTGATGCCACACAACCCACTCAAAATCT
GTGATAAATCGAATGGCATTGCTAAGTTTCCTGAAAGAGAACTTGAGTTGAGGCAAACTGGGCCAGCAACAGAGGTTGATGTGAAAGACATTCCAAAACAGGAAGTGGAGCATGAAGCCAAATCACTCATGAGAGGTTTAAGGGATTTCAATCCAATTGCTCAAACAGTTTGCAGAGTAAAAGTGTCTGTTGAATATGGAACGTCTGAAATGTATGGGTTCGGTTTTGGTGCGTATATTATAGTAAACCACCATCTATTCAAGAGTTTCAATGGATCCATGGAAGTGCGATCAATGCATGGAACATTCAGAGTGAAGAATTTGCATAGCCTGAGCGTTTTACCGATCAAAGGCAGAGACATTATCATCATAAAGATGCCAAAGGATTTCCCTGTTTTCCCACAAAAACTGCACTTCCGAGCTCCAGTGCAGAATGAGAGGATTTGTTTGGTTGGAACTAATTTTCAAGAAAAACATGCATCATCAATCATCACAGAAACGAGTACTACATACAATGTACCGGGCAGCACTTTTTGGAAGCATTGGATTGAAACAAATGATGGGCATTGTGGATTACCAGTAGTGAGTACAGCTGATGGATGTCTAGTTGGAATACACAGCTTGGCGAATAATGTGCAAACCACGAATTATTATTCAGCCTTTGATGAGGATTTTGAAAGTAAGTATCTCCGAACTGATGAGCATAATGAGTGGACCAAATCGTGGGTATATAACCCAGATACTGTGTTGTGGGGTCCATTGAAGCTCAAAGAGAGTACCCCTAAAGGCCTGTTTAAGACAACAAAACTTGTACAGGATTTAATTGATCATGATGTTGTTGTAGAGCAAGCTAAACATTCTGCGTGGATGTATGAGGCTCTAACAGGGAATTTGCAAGCTGTGGCGACAATGAAGAGTCAGCTAGTGACAAAGCACGTGGTCAAAGGGGAGTGTCGGCACTTCAAAGAGTTCTTAACTGTGGATTCGGAAGCAGAAGCTTTCTTCAGGCCTTTGATGGATGCTTATGGGAAGAGCTTGTTAAATAGAGAAGCATATATAAAGGACATAATGAAATACTCAAAGCCTATTGATGTTGGAATAGTAGACTGTGATGCTTTTGAAGAGGCTATCAATAGGGTTATCATTTATCTGCAAGTACATGGCTTCCAGAAATGCAATTACATCACCGATGAGCAGGAAATTTTCAAAGCTCTCAATATGAAAGCTGCTGTCGGGGCTATGTATGGAGGCAAGAAGAAAGACTACTTCGAGCATTTTACTGAGGCGGATAAAGAGGAAATTGTTATGCAAAGTTGCTTACGATTGTACAAGGGCTCACTTGGCATATGGAATGGATCATTGAAAGCAGAACTTCGGTGCAAAGAGAAGATACTTGCAAATAAGACAAGGACATTCACTGCTGCACCTTTAGATACTCTACTGGGTGGGAAGGTGTGCGTTGATGATTTTAATAATCAATTCTACTCAAAGAACATTGAATGCTGCTGGACTGTTGGAATGACTAAGTTTTATGGAGGTTGGGACAAATTGCTTCGGCGTCTACCTGAAAATTGGGTGTACTGCGATGCCGATGGTTCACAATTCGATAGTTCACTCACCCCATACCTAATTAATGCTGTTCTCATCATCAGAAGCACATACATGGAAGATTGGGACTTGGGGTTGCAAATGTTGCGCAATTTGTACACAGAAATAATTTACACACCAATCTCAACTCCAGATGGAACAATTGTCAAGAAGTTTAGAGGTAATAATAGCGGTCAACCTTCTACCGTTGTGGATAATTCTCTCATGGTTGTTCTTGCTATGCATTACGCTCTCATTAAGGAGTGCGTTGAGTTTGAAGAAATCGACAGCACGTGTGTATTCTTTGTTAATGGTGATGACTTATTGATTGCTGTGAATCCGGAGAAAGAGAGCATTCTCGATAGAATGTCACAACATTTCTCAGATCTTGGTTTGAACTATGATTTTTCGTCG
AGAACAAGAAGGAAGGAGGAATTGTGGTTCATGTCCCATAGAGGCCTGCTAATTGAGGGTATGTACGTGCCAAAGCTTGAAGAAGAGAGAATTGTATCCATTCTGCAATGGGATAGGGCTGATCTGCCAGAGCACAGATTAGAAGCGATTTGTGCAGCAATGATAGAATCCTGGGGTTATTTTGAGTTAACGCACCAAATTAGGAGATTCTACTCATGGTTGTTACAACAGCAACCTTTTTCAACGATAGCACAGGAAGGAAAAGCTCCATACATAGCGAGCATGGCATTGAAGAAGCTGTACATGAATAGGACAGTAGATGAGGAGGAACTGAAGGCTTTCACTGAAATGATGGTTGCCTTGGATGACGAATTTGAGTGCGATACTTATGAAGTGCACCATCAAGGAAATGACACAATCGATGCAGGAGGAAGCACTAAGAAGGATGCAAAACAAGAGCAAGGTAGCATTCAACCAAATCTCAACAAGGAAAAGGAAAAGGACGTGAATGTTGGAACATCTGGAACTCATACTGTGCCACGAATTAAAGCTATCACGTCCAAAATGAGAATGCCTAAGAGTAAAGGTGCAACTGTACTAAATTTGGAACACTTACTCGAGTATGCTCCACAGCAAATTGACATCTCAAATACTCGAGCAACTCAATCACAGTTTGATACGTGGTATGAAGCAGTACAACTTGCATACGACATAGGAGAAACTGAAATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAACATCAACGGAGTTTGGGTTATGATGGATGGAGATGAACAAGTCGAATACCCACTGAAACCAATCGTTGAGAATGCAAAACCAACACTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAGTTCGTAATCTGCGCGATGGAAGTCTGGCTCGCTATGCTTTTGACTTTTATGAAGTTACATCACGGACACCAGTGAGGGCTAGAGAGGCACACATTCAAATGAAGGCCGCAGCTTTAAAATCAGCTCAATCTCGACTTTTCGGATTGGATGGTGGCATTAGTACACAAGAGGAAAACACAGAGAGGCACACCACCGAGGATGTTTCTCCAAGTATGCATACTCTACTTGGAGTGAAGAACATGTGA\n>CF_YL21\nATGGCAATCTACACGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATATTGTGAAGGAACGAGAAGTGCTGGCTTCCATCGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGCAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATCCAGAAGAGACTGGAAAGGAAGGATAGGGAAGAATACCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTATTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAGCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACCAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGTTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGCATCTCGCCAGGACGGATAAGTGGAATAACCAAGTTCGTGTTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGTAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGTGG
GTCGCACGAAGGAAAAATCTATGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGATTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGAGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTTGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCTTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAAGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTAAAAGGAAAGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATAATATCAAGAAAGTAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGATAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGAGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAAACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTTTAAAAGACAGCCAGGGGTGAGTAAGAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATCAACATTTTACCCGCCAACTAAGAAGCGCCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGAATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAGGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTCGGAACCTGGCCAACCATGATGGATCTGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACGAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCCCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTAATGCATGCCCTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTGAAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTCTATCAATATTATCCCCTGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTGAGGTTGTGGATTAATGAGAAGCAATCCATAGCGATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTGATGCTGCAGCTACAGACCTCCTCGATGCTACGTGTGACGGGTTCAACCTGCATCTAACGTACCCCACTGCATTGATGGTGTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACTCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGTACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAACTT
GGCGGGAAAAATTTTCCGCAACATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATACAACATATCACCACAAGCATTCTTGGGCCGAAGCGTCCAGGTGGTCAAAGGCACTGCCTCAGGATTGAGCGAGCGATTTAATAATTACTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCACTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACTAGTGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAAGAAGTATAGGAGAGAGATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATACGCAAGTTTACAGCGCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTGAACCCTCAGATAGTTCAATTTGCTCAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTTATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGACTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCTTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTGGATGATGTGATCAAGAATTTTGATGAGAGGAATGAGACTATTGATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATCCAAATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTTACAAGGGCAACTGCTGTTCAAGTGGCTAATGACATTGCCCACAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTGTTCATCTTAGTGTAGCTGGATCTGTGCTTTTAATTGAGCCAACGCGGCCACTAACGGAGAACGTTTTCAAACAGCTATCTAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCCCCAATCTCCGTTATGACTAGCGGGTTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAATTTTGTAATATTTGATGAGTGCCATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGTAAAGTATTAAAAGTGTCAGCTACTCCAGTGGGAAGAGAGGTTGAATTCACAACACAGCAGCCAGTCAAGTTAATAGTGGAAGACACACTGTCTTTCCAATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTCCAGTTTGGTTCAAACATACTTGTTTACGTGTCGAGCTACAATGAAGTTGACAACTTGGCTAAGCTCCTAACAGATAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATGAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAATGGAGTGACTTTGGACATAGATGTAGTAGTGGATTTTGGGTTGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCATTGCTTACAATAAGGTGAGTGTTAGTTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTAGCATTGCGCATTGGACACACTGAGAAGGGAATTATCGAAATTCCAAGCATGATCGCTACTGAGGCGGCTCTTGCTTGCTTTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAGGTCAAAACAATGCAGCAATTTGAATTGAGCCCCTTCTTTATCCAGAATTTCGTTGCTCATGATGGATCA
ATGCATCCTGTCATACATGACATTCTCAAAAAGTATAAACTCCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTGGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCTTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAGAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGAGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAGATTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGGTGCTCAAGCATGTTTTCAATTGTCAACTTGACCAACACTCTCAGGGCCAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGAAGTCAATTAAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTTCGTTCACCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCTAAAGACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTGTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGAAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGTACCACAGTTGGTATGGGCAAGTCAAGCAGGAGGTTCATTAACATGTACGGGTTTGACCCGACAGAGTACTCATTCATCCAATTCGTTGATCCACTCACTGGGGCACAAATAGAAGAGAATGTCTATGCTGACATTAGAGATGTTCAAGAAAGATTTAGTGAGGTGCGAAAGAAAATGGTTGAGAATGATGATATTGAAATGCAAGCCTTGGGTAGTAACACGACCATACATGCATACTTTAGGAAAGATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAACCCACTCAAAGTTTGTGACAAAACAAATGGCATTGCCAAACTTCCTGAGAGAGAGTTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCGCAGGAAGTGGAGCATGAAGCCAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCTCAAACAGTTTGTAGGCTGAAAGTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTTAGGAGTTACAATGGTTCCATGGAGGTGCGATCCATGCATGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCCAATTAAAGGCAGGGACATCATCCTCATCAAAATGCCGAAGGATTTCCCTGTTTTTCCACAGAAATTGCATTTCCGAGCTCCTACACAGAATGAAAGAGTTTGTTTAGTTGGGACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGCACTACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGTGAGCACAGCCGATGGATGTCTAGTCGGAATTCATAGTTTGGCAAACAATGCACACACCACGAACTACTACTCAGCCTTCGATGAAGATTTTGAAAGCAAGTACCTCCGAACCAATGAGCACAATGAATGGGTCAAATCTTGGGTTTATAATCCAGACACAGTGTTGTGGGGCCCGTTGAAGCTTAAAGATAGCACTCCCAAAGGGTTATTCAAAACAAC
AAAGCTTGTGCAAGATCTAATCGATCATGATGAAGTCGCGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCGCAACAATGAAGAGTCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGATGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCATATGGGAAAAGCTTGCTGAATAGAGATGCGTACATCAAAGACATAATGAAGTACTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAAGAAGCCATCAATAGGGTTATCATCTACTTGCAAGTGCACGGCTTCAAGAAGTGTGCATATGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAAGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGTCATGCAAAGCTGTCTGCGACTGTATAAAGGCTTGCTCGGCATTTGGAACGGATCGTTGAAGGCAGAGCTCCGGTGTAAGGAAAAGATACTTGCAAACAAGACGAGGACGTTCACTGCTGCACCTCTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGATGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACAGTTGGGATGACTAAGTTTTATGGTGGTTGGGATAAACTGCTTCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCATACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGACTGGGATGTGGGGTTGCAAATGCTGCGTAATTTATACACTGAGATTATTTACACACCTATCTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCAGCCTTCTACTGTTGTGGACAACTCTCTTATGGTTGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTGAAGAGATTGACAGCACGTGCGTGTTCTTCGTCAATGGTGATGATTTGCTGATTGCCGTGAATCCGGATAAAGAGGGCATTCTTGATAGATTGTCACAGCACTTCTCAGATCTTGGTTTAAATTATGATTTTTCGTCAAGAACAAGAAATAAGGAAGAGTTATGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTCTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCTGAACTAACACACCAAATTAGAAGATTCTACTCATGGTTATTGCAACAGCAACCTTTTGCAACAATAGCGCAGGAAGGGAAGGCTCCTTATATAGCAAGCATGGCATTAAGGAAACTGTATATGGATAGAGCTGTGGATGAGGAAGAGCTGAGAGCCTTCACTGAAATGATGGTCGCATTAGACGATGAGTTTGAGTGTGACTCTTATGAAGTACACCACCAGGGAAACGATACAATCGATGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGATGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCAAAGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACTCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTCTGGG
TTATGATGGATGGGGATGAACAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGTGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAATGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACAGAGAGGCACACCGCCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAAAACATGTGA"
  },
  {
    "path": "inst/extdata/GVariation/B.Oz.fas",
    "content": ">Oz\nATGGCAACTTACATGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCTTCTTGCGGGCATATTGTGAAGGAGCGAGAAGTGCTGGCTTCCGTTGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGCAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACTTACCGATACAAAACTGATGCCCAGATAACGCGCATTCAGAAGAAACTGGAGAGGAAGGATAGGGAAGAATATCACTTCCAGATGGCCGCTCCTAGTATTGTGTCAAAAATTACAATAGCTGGTGGAGATCCTCCATCAAAGTCTGAGCCACAAGCACCAAGAGGGATCATTCATACAACTCCAAGGGTGCGTAAAGTCAAGACACGTCCCATAATAAAGTTGACAGAAGGCCAGATGAATCATCTCATTAAGCAGGTGAAGCAGATTATGTCGGAGAAGAGAGGGTCTGTCCACTTAATTAGTAAGAAGACCACTCATGTTCAATATAAGGAGATACTTGGAGCAACTCGCGCAGCGGTTCGAACTGCACATATGATGGGTTTGCGACGGAGAGTGGACTTCCGATGTGATATGTGGACAGTCGGACTTTTGCAACGTCTCGCTCGGACGGACAAATGGTCCAATCAAGTCCGCACTATCAACATACGAAGGGGTGATAGTGGAGTCATTTTGAACACAAAAAGCCTCAAAGGCCACTTTGGTAGAAGTTCAGGAGACTTGTTCATAGTGCGTGGATCACACGAAGGGAAATTGTACGATGCACGTTCTAGAGTTACTCAGAGTGTTTTGAACTCAATGATCCAGTTTTCGAATGCTGATAATTTTTGGAAGGGTCTAGACGGTAATTGGGCACAACTGAGATATCCTTCGGATCACACATGTGTAGCTGGTTTACCTGTCGAAGATTGTGGTAGAGTTGCTGCATTGATGGCACACAGTATCCTCCCGTGCTACAAGATAACCTGCCCCACCTGTGCTCAACAGTATGCCAGCTTGCCGGTTAGCGATCTGTTTAAGCTGTTGCATAAACATGCGAGAGATGGTTTGAACCGATTGGGAGCGGATAAAGACCGGTTTATACATGTTAATAAGTTCTTGATAGCGTTAGAGCATCTAACTGAACCGGTGGATTTGAATCTCGAGCTTTTCAATGAGATATTTAAATCCATAGGGGAGAAGCAGCAAGCACCGTTCAAGAATTTAAATGTCTTAAATAATTTCTTCCTGAAAGGAAAAGAAAATACAGCTCATGAATGGCAAGTGGCTCAATTGAGTTTGCTCGAATTAGCAAGGTTCCAGAAGAATAGAACTGATAACATCAAGAAAGGTGATATATCTTTCTTCAGAAATAAATTATCTGCCAAGGCAAACTGGAATCTGTATTTGTCGTGCGACAACCAGTTGGATAAAAATGCAAATTTTCTGTGGGGACAAAGGGAGTATCATGCTAAGCGGTTTTTCTCAAACTTCTTTGAGGAAATTGATCCAGCAAAGGGATACTCAGCATATGAAATCCGCAAGCATCCAAATGGAACAAGGAAGCTCTCAATTGGTAACTTAGTTGTCCCACTTGATTTAGCTGAGTTTAGGCAGAAGATGAAAGGTGACTATAGGAAACAACCAGGAGTTAGCAGAAAGTGCACGAGTTCGAAAGATGGTAATTATGTGTATCCCTGTTGTTGCACAACACTTGATGATGGTTCAGCTATTGAATCAACATTCTATCCACCAACCAAAAAGCACCTTGTAATAGGCAATAGCGGTGACCAAAAATTTGTTGATTTACCAAAAGGGGATTCGGAGATGTTATACATTGCCAAGCAGGGTTATTGTTATATCAACGTGTTTCTTGCAATGCTTATTAACATTAGCGAGGAGGATGCAAAGGATTTCACAAAGAAAGTTCGCGACATGTGTGTGCCAAAGCTTGGAAC
CTGGCCAACTATGATGGATTTGGCGACCACTTGTGCTCAAATGAGAATATTCTATCCTGACGTGCATGATGCAGAGCTGCCTAGAATATTGGTTGACCATGACACTCAAACGTGTCACGTGGTTGACTCATTTGGCTCGCAAACAACTGGATATCATATTCTAAAAGCATCCAGCGTGTCTCAACTTATCTTGTTTGCAAATGATGAATTAGAATCTGATATAAAACATTATAGAGTTGGTGGTGTTCCTAATGCATGCCCTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTAAAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTTTATCAATATTATCTCCTGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTAAGGTTGTGGATTAATGAGAAACAATCCATAGCTATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATCATTGATGCTGCAGCTACAGACCTCCTTGATGCTACGTGTGATGGGTTCAACCTACATCTAACGTACCCCACTGCATTGATGGTGTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACCCTATTCAAGGCGGGTTTTTCAAGTTACAACACGAGCGTCGTACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAACTTGGCGGGAAAAATTATCCGCAACATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATACAACATATCACCACAAGCATTCTTGGGCCGAAGCGCCCAGGTGGTCAAAGGTACTGCCTCAGGATTGAGTGAGCGATTTAATAATTATTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCACTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACTAGCGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAGGAAATATAGGAGAGAAATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATATGCAAGTTTACAGCGCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTAAACCCTCAGATAGTTCAGTTTGCTCAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTTATGGCTTTAGTCATTATGGTGTTCGATGCTGAAAGGAGTGATTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCTTTCCTCAATGGACTATGAAGTTAGACATCAGTCCTTAGACGATGTGATCAAGAATTTTGATGAGAGGAATGAGATTATTGATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATCCAGATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTCACAAGAGCAACTGCTGTCCAAGTGGCTAATGACATTGCCCATAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTACCTGTTCATCTTAGTGTAGCCGGATCTGTGCTTTTAATTGAACCAACGCGACCACTAGCGGAGAACGTTTTCAAACAGCTATCTAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCTCCAATCTCCGTCATGACTAGCGGATTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAACT
TTGTAATATTTGATGAGTGCCATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGCAAAGTATTAAAAGTGTCAGCTACTCCAGTGGGAAGAGAGGTTGAATTTACAACACAGCAACCAGTCAAGTTAATAGTGGAGGACACACTGTCTTTCCAATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTTCAGTTTGGTTCAAACGTACTTGTGTACGTGTCGAGCTACAATGAAGTTGATACCTTGGCTAAGCTCCTAACAGACAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATGAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAATGGAGTGACTTTGGACATAGACGTGGTTGTAGATTTTGGGTTAAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCATTGCTTACAATAAGGTGAGTGTTAGCTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTAGCATTGCGCATTGGACACACTGAGAAGGGAATTATTGAAATTCCAAGCATGATCGCTACAGAGGCAGCTCTTGCTTGCTTTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAAGTTAAAACAATGCAGCAATTTGAATTGAGTCCCTTCTTTATCCAGAATTTCGTTGCCCATGATGGATCAATGCATCCTGTCATACATGACATTCTTAAAAAGTATAAACTTCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTGGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCCTTAGAAATTCCAAAGCAAGCCAAAATTGCATTCCATATCAAAGAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGAGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAGACTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGATGCTCAAGCATGTTTTCAATTGTCAACCTGACAAACACTCTCAGAGCTAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGAAGTCAATTGAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTTCGTTCATCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTAAAGGGGACTTGGAAGAAGTCATTAGTGGCCAAAGACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTGTCTCACCAAGGGAAAAATAAATCCAAAAGAATTCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGAAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAAAAGGGAAAAGGTAAAGGTACCACAGTTGGTATGGGCAAGTCAAGCAGGAGGTTCATCAACATGTATGGGTTTGATCCAACAGAGTACTCATTCATCCAATTCGTTGATCCACTCACTGGGGCGCAAATAGAAGAGAATGTCTATGCTGACATTAGAGATATTCAAGAGAGATTTAGTGAAGTGCGAAAGAAAATGGTTGAGAATGATGACATTGAAATGCAAGCCTTGGGTGGTAACACGACCATACATGCATACTTTAGGAAAGATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAATCCACTCAAAGTTTGT
GACAAGACAAATGGCATTGCCAAATTTCCTGAGAGAGAGCTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGATGTGAAGGACATACCAGCACAGGAAGTGGAGCATGAAGCTAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCCCAAACAGTTTGTAGGCTGAAAGTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTCAGGAGTTACAATGGTTCCATGGAGGTGCGATCCATGCACGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCCAATTAAAGGTAGGGATATCATCCTCATCAAAATGCCGAAAGATTTCCCTGTCTTTCCACAGAAATTGCATTTCCGAGCTCCTACACAGAATGAAAGAGTTTGTTTAGTTGGAACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGCACCACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGATTACCAGTGGTGAGCACCACCGATGGATGTCTAGTCGGAATTCACAGTTTGGCAAACAACAAACACACCACGAACTACTACTCAGCCTTCGATGAAGATTTTGAAAGCAAGTATCTCCGAACCAATGAGCACAATGAATGGGTCAAGTCTTGGATTTATAATCCAGACACAGTGTTGTGGGGCCCGTTGAAACTTAAAGACAGCACTCCCAAAGGATTATTCAAAACAACAAAGCTTGTGCAAGATCTAATCGATCATGATGTAGTGGTGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCGCAACAATGAAGAGCCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGATGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCGTATGGGAAAAGCTTGCTGAATAGAGATGCATACATCAAGGACATAATGAAGTATTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAGGAAGCCATCAATAGGGTTATCATCTACCTGCAAGTGCACGGCTTCAAGAAGTGCGCATACGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAAGCTGCAGTTGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGTCATGCAAAGCTGTCTGCGATTGTATAAAGGCTTGCTTGGCATTTGGAATGGATCATTGAAGGCAGAGCTCCGGTGTAAGGAAAAGATACTTGCAAATAAGACGAGGACATTCACTGCTGCACCTTTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGATGATTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACGGTTGGGATGACTAAGTTTTATGGTGGTTGGGATAAACTGCTGCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCATACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGATTGGGATGTGGGGTTGCAAATGTTGCGCAATTTATACACTGAGATTGTTTACACACCTATTTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCAGCCTTCTACTGTTGTGGACAACTCTCTTATGGTCGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTGAAGAGATTGACAGCACGTGCGTGTTCTTTGTCAATGGTGATGATTTGCTGATTGCTGTGAATCCGGATAAAGAGGGCATTCTTGACAGATTGTCACAACACTTCTCAGATCTTGGTTTGAATTATGATTTCTCGTCAAG
AACAAGAAATAAGGAGGAATTGTGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTCTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCTGAACTAACACACCAAATCAGGAGATTCTACTCATGGTTATTGCAACAGCAACCCTTTGCAACAATAGCGCAGGAAGGGAAGGCTCCTTATATAGCAAGCATGGCATTAAGGAAATTGTATATGGATAGGGCTGTGGATGAGGAAGAGCTGAGAGCCTTCACTGAAATGATGGTCGCATTAGACGATGAGTTTGAATTTGACTCTTATGAAGTACACCATCAAGCAAATGACACAATCGATGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCGGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGATGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCAAGGGAGCAACCGTGCTAAACCTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACTCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTTTGGGTTATGATGGATGGGAATGAACAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGTGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAATGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACAGAGAGGCACACCACCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAGAACATGTGA\n>CF_YL21\nATGGCAATCTACACGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATATTGTGAAGGAACGAGAAGTGCTGGCTTCCATCGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGCAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATCCAGAAGAGACTGGAAAGGAAGGATAGGGAAGAATACCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTATTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAGCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACCAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGTTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGCATCTCGCCAGGACGGATAAGTGGAATAACCAAGTTCGTGTTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGTAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGTGGGT
CGCACGAAGGAAAAATCTATGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGATTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGAGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTTGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCTTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAAGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTAAAAGGAAAGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATAATATCAAGAAAGTAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGATAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGAGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAAACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTTTAAAAGACAGCCAGGGGTGAGTAAGAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATCAACATTTTACCCGCCAACTAAGAAGCGCCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGAATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAGGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTCGGAACCTGGCCAACCATGATGGATCTGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACGAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCCCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTAATGCATGCCCTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTGAAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTCTATCAATATTATCCCCTGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTGAGGTTGTGGATTAATGAGAAGCAATCCATAGCGATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTGATGCTGCAGCTACAGACCTCCTCGATGCTACGTGTGACGGGTTCAACCTGCATCTAACGTACCCCACTGCATTGATGGTGTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACTCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGTACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAACTTGG
CGGGAAAAATTTTCCGCAACATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATACAACATATCACCACAAGCATTCTTGGGCCGAAGCGTCCAGGTGGTCAAAGGCACTGCCTCAGGATTGAGCGAGCGATTTAATAATTACTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCACTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACTAGTGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAAGAAGTATAGGAGAGAGATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATACGCAAGTTTACAGCGCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTGAACCCTCAGATAGTTCAATTTGCTCAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTTATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGACTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCTTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTGGATGATGTGATCAAGAATTTTGATGAGAGGAATGAGACTATTGATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATCCAAATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTTACAAGGGCAACTGCTGTTCAAGTGGCTAATGACATTGCCCACAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTGTTCATCTTAGTGTAGCTGGATCTGTGCTTTTAATTGAGCCAACGCGGCCACTAACGGAGAACGTTTTCAAACAGCTATCTAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCCCCAATCTCCGTTATGACTAGCGGGTTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAATTTTGTAATATTTGATGAGTGCCATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGTAAAGTATTAAAAGTGTCAGCTACTCCAGTGGGAAGAGAGGTTGAATTCACAACACAGCAGCCAGTCAAGTTAATAGTGGAAGACACACTGTCTTTCCAATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTCCAGTTTGGTTCAAACATACTTGTTTACGTGTCGAGCTACAATGAAGTTGACAACTTGGCTAAGCTCCTAACAGATAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATGAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAATGGAGTGACTTTGGACATAGATGTAGTAGTGGATTTTGGGTTGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCATTGCTTACAATAAGGTGAGTGTTAGTTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTAGCATTGCGCATTGGACACACTGAGAAGGGAATTATCGAAATTCCAAGCATGATCGCTACTGAGGCGGCTCTTGCTTGCTTTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAGGTCAAAACAATGCAGCAATTTGAATTGAGCCCCTTCTTTATCCAGAATTTCGTTGCTCATGATGGATCAAT
GCATCCTGTCATACATGACATTCTCAAAAAGTATAAACTCCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTGGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCTTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAGAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGAGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAGATTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGGTGCTCAAGCATGTTTTCAATTGTCAACTTGACCAACACTCTCAGGGCCAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGAAGTCAATTAAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTTCGTTCACCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCTAAAGACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTGTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGAAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGTACCACAGTTGGTATGGGCAAGTCAAGCAGGAGGTTCATTAACATGTACGGGTTTGACCCGACAGAGTACTCATTCATCCAATTCGTTGATCCACTCACTGGGGCACAAATAGAAGAGAATGTCTATGCTGACATTAGAGATGTTCAAGAAAGATTTAGTGAGGTGCGAAAGAAAATGGTTGAGAATGATGATATTGAAATGCAAGCCTTGGGTAGTAACACGACCATACATGCATACTTTAGGAAAGATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAACCCACTCAAAGTTTGTGACAAAACAAATGGCATTGCCAAACTTCCTGAGAGAGAGTTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCGCAGGAAGTGGAGCATGAAGCCAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCTCAAACAGTTTGTAGGCTGAAAGTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTTAGGAGTTACAATGGTTCCATGGAGGTGCGATCCATGCATGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCCAATTAAAGGCAGGGACATCATCCTCATCAAAATGCCGAAGGATTTCCCTGTTTTTCCACAGAAATTGCATTTCCGAGCTCCTACACAGAATGAAAGAGTTTGTTTAGTTGGGACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGCACTACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGTGAGCACAGCCGATGGATGTCTAGTCGGAATTCATAGTTTGGCAAACAATGCACACACCACGAACTACTACTCAGCCTTCGATGAAGATTTTGAAAGCAAGTACCTCCGAACCAATGAGCACAATGAATGGGTCAAATCTTGGGTTTATAATCCAGACACAGTGTTGTGGGGCCCGTTGAAGCTTAAAGATAGCACTCCCAAAGGGTTATTCAAAACAACAA
AGCTTGTGCAAGATCTAATCGATCATGATGAAGTCGCGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCGCAACAATGAAGAGTCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGATGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCATATGGGAAAAGCTTGCTGAATAGAGATGCGTACATCAAAGACATAATGAAGTACTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAAGAAGCCATCAATAGGGTTATCATCTACTTGCAAGTGCACGGCTTCAAGAAGTGTGCATATGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAAGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGTCATGCAAAGCTGTCTGCGACTGTATAAAGGCTTGCTCGGCATTTGGAACGGATCGTTGAAGGCAGAGCTCCGGTGTAAGGAAAAGATACTTGCAAACAAGACGAGGACGTTCACTGCTGCACCTCTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGATGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACAGTTGGGATGACTAAGTTTTATGGTGGTTGGGATAAACTGCTTCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCATACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGACTGGGATGTGGGGTTGCAAATGCTGCGTAATTTATACACTGAGATTATTTACACACCTATCTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCAGCCTTCTACTGTTGTGGACAACTCTCTTATGGTTGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTGAAGAGATTGACAGCACGTGCGTGTTCTTCGTCAATGGTGATGATTTGCTGATTGCCGTGAATCCGGATAAAGAGGGCATTCTTGATAGATTGTCACAGCACTTCTCAGATCTTGGTTTAAATTATGATTTTTCGTCAAGAACAAGAAATAAGGAAGAGTTATGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTCTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCTGAACTAACACACCAAATTAGAAGATTCTACTCATGGTTATTGCAACAGCAACCTTTTGCAACAATAGCGCAGGAAGGGAAGGCTCCTTATATAGCAAGCATGGCATTAAGGAAACTGTATATGGATAGAGCTGTGGATGAGGAAGAGCTGAGAGCCTTCACTGAAATGATGGTCGCATTAGACGATGAGTTTGAGTGTGACTCTTATGAAGTACACCACCAGGGAAACGATACAATCGATGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGATGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCAAAGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACTCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTCTGGGTT
ATGATGGATGGGGATGAACAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGTGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAATGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACAGAGAGGCACACCGCCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAAAACATGTGA"
  },
  {
    "path": "inst/extdata/GVariation/C.Wilga5.fas",
    "content": ">Wilga5\nATGGCAACTTACATGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATATTGTGAAGGAACGAGAAGTGCTGGCTTCCGTTGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGCAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATCCAGAAGAAACTGGAAAGGAAGGATAGGGAAGAATATCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTATTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAGCAAAAACATATCACACGCCAAAGTTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACCAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGTTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGCATCTCGCCAGGACGGACAAGTGGAATAACCAAGTTCGTGCTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGTAATACTAATCTCAAAGGAAGCTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGTGGGTCGCACGAAGGAAAAATCTATGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTCTGGAAGGGATTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGCGGCAGAGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAATTTGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCTTTGTGCACGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGGGTCTAGAAATTTTCAATGAAGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTGAAAGGAAAAGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAGGCTTACTTGAATGGGCAAGATTCCAAAAGAACAGAACGGATAATATCAAGAAAGGAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGATAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGAGGAAATTGATCCAGCGAAGGGCTATTCAGCATATGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAAACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTATAAAAGACAGCCAGGGGTGAGTAAGAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATCAACATTTTACCCGCCAACTAAGAAGCACCTTGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGAATTCTGAGATGTTATATATTGCCAGGCAAGGCTTTTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAGGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTTG
GAACCTGGCCAACCATGATGGATCTGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACGAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCCCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAACATTATAGAGTTGGTGGTGTTCCTAATGCATGCCCTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCTCTGAAACTGCTTTTGAAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGTTGTTAGATGAGCCTTATCTGTTGATTCTATCAATATTATCCCCTGGCATACTGATGGCTATGTACAATAATGGGATTTTTGAACTTGCGGTAAGGTTGTGGATTAATGAGAAACAATCCATAGCTATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTGATACTGCAGCTACAGATCTCCTTGATGCTACGTGCGATGGGTTCAACCTACATCTAACGTACCCCACTGCGTTGATGGTGTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACCCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGTACAGATCATGGAAAAAAATTATCTAAATCTCTTAAACGATGCTTGGAAAGATTTAACTTGGCGGGAAAAATTATCCGCAACATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATACAACATATCACCACAAGCATTCTTGGGCCGAGGCGCCAAGGTGGTCAAAGGCACTGCCTCAGGATTGTGCGAGCGATTTAATAATTATTTCTACACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCACTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACCAGCGTAGTGGCAGTCTGTCAGGCAATAATTTTAGATCAGAGGAAGTATAGGAGAGAAATCGAGTTGATGCAGATAGAGAAGAATGAGATCGTCTGCATGGAGCTATATGCAAGTTTACAGCGCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAGTCAGTAAACCCTCAGATAGTTCAGTTTGCTCAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTTATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGATTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCTTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTAGACGATGTGATCAAGAATTTTGATGAGAGGAATGAGGTTATTGATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATCCAGATGGGGCATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTCACAAGAGCAACTGCTGTCCAAGTGGCTAATGACATTGCCCATAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTGTTCATCTTAGTGTAGCCGGATCTGTGCTTTTAATTGAACCAACGCGACCACTAGCGGAGAACGTTTTCAAACAGCTATCTAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCTCCAATCTCCGTCATGACTAGCGGATTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTC
AACTTTGTAATATTTGATGAGTGCCATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGCAAAGTATTAAAAGTGTCAGCTACTCCAGTGGGAAGAGAGGTTGAATTTACAACACAGCAACCAGTCAAGTTAATAGTGGAGGACACACTGTCTTTTCTATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTTCAGTTTGGTTCAAACGTACTTGTGTACGTGTCGAGCTACAATGAAGTTGATACCTTGGCTAAGCTCCTAACAGACAAGAATATGATGGTCACAAAAGTTGATGGCAGAACAATGAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAATGGAGTGACTTTGGACATAGACGTGGTTGTAGATTTTGGGTTGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCATTGCTTACAATAAGGTGAGTGTTAGCTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTAGCATTGCGCATTGGACACACTGAGAAGGGAATTATTGAAATTCCAAGCATGATCGCTACAGAGGCGGCTCTTGCTTGCTTTGCATATAACTTGCCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAAGTTAAAACAATGCAGCAATTTGAATTGAGTCCCTTCTTTATCCAGAATTTTGTTGCCCATGATGGATCAATGCATCCTGTCATACATGACATTCTTAAAAAGTATAAACTTCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTGGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCCTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAGAGATCCCTCCTAAGCTCCATGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGAGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTTTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAGACTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGACGAAGGATGCTCAAGCATGTTTTCAATTGTCAACTTGACAAACACTCTCAGAGCTAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGAAGTCAATTGAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTTCGTTCATCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCCAAAGACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTGTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGAAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAAAAGGGAAAAGGTAAAGGTACCACAGTTGGTATGGGCAAGTCAAGCAGGAGGTTCATCAACATGTATGGGTTTGACCCAACAGAGTACTCATTCATCCAATTCGTTGATCCACTCACTGGGGCGCAAATAGAAGAGAATGTCTATGCTGACATTAGAGATATTCAAGAGAGATTTAGTGAAGTGCGAAAGAAAATGGTTGAGAATGATGACATTGAAATGCAAGCCTTGGGCAGTAACACGACCATACATGCATACTTCAGGAAAGATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAATCCACTCAAAGT
TTGTGACAAGACAAATGGCATTGCCAAATTTCCTGAGAGAGAACTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCACAGGAGGTGGAGCATGAAGCTAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCCCAAACAGTTTGTAGGCTGAAAGTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCGTACATAATAGCGAACCACCATTTGTTCAGGAGTTACAATGGTTCCATGGAGGTGCGATCCATGCACGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGTGTTCTGCCAATTAAAGGTAGGGATATCATCCTCATCAAAATGCCGAAAGATTTCCCTGTCTTTCCACAGAAATTGCATTTCCGAGCTCCTACACAGAATGAAAGAGTTTGTTTAGTTGGAACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAGGAGGCACCACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGTGAGCACCACCGATGGATGTCTAGTCGGAATTCACAGTTTGGCAAACAACAAACACACCACGAATTACTACTCAGCCTTCGATGAAGATTTTGAAAGCAAGTATCTCCGAACCAATGAGCACAATGAATGGGTCAAGTCTTGGATTTATAATCCAGACACAGTATTGTGGGGCCCGTTGAAACTTAAAGACAGCACTCCCAAAGGATTATTCAAAACAACAAAGCTTGTGCAAGATCTAATTGATCATGATGAAGTGGTGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCGCAACAATGAAGAGCCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGATGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCGTATGGGAAAAGTTTGCTGAATAGAGATGCATACATCAAGGACATAATGAAGTATTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAGGAAGCCATCAATAGGGTTATCATCTACCTGCAAGTGCACGGCTTCAAGAAGTGCGCATACGTTACTGACGAGCAAGAAATTTTTAAAGCGCTCAACATGAAAGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGTCATGCAAAGCTGTCTGCGATTGTATAAAGGCTTGCTTGGCATTTGGAATGGATCATTGAAGGCAGAGCTCCGGTGTAAGGAAAAGATACTTGCAAATAAGACGAGGACATTCACTGCTGCACCTTTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGATGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACGGTTGGGATGACAAAGTTTTATGGTGGTTGGGATAAACTGCTGCGGCGTTTACCTGAGAATTGGGTTTACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCATACTTAATCAATGCAGTTCTCACCATTAGAAGCACATACATGGAAGATTGGGATGTGGGGTTGCAAATGTTGCGCAATTTATACACTGAGATTGTTTACACACCTATTTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGGAATAACAGTGGTCAGCCTTCTACTGTTGTGGACAACTCTCTTATGGTCGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTGAAGAGATTGACAGCACGTGCGTGTTCTTTGTCAATGGTGATGATTTGCTGATTGCTGTGAATCCGGATAAAGAGGGCATTCTTGACAGATTGTCACAACACTTCTCAGATCTTGGTTTGAATTATGATTTCTCGT
CAAGAACAAGAAATAAGGAGGAGTTGTGGTTTATGTCTCATAGAGGCCTACTGATTGAGGCAATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTCTCCAATGGGACAGAGCAGACTTGGCTGAACATAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCTGAACTAACACACCAAATCAGGAGATTCTACTCATGGTTATTGCAACAGCAACCCTTTGCAACAATAGCGCAGGAAGGGAAGGCTCCTTATATAGCAAGCATGGCATTAAGGAAATTGTATATGGATAGGGCTGTGGATGAGGAAGAGCTGAGAGCCTTCACTGAAATGATGGTCGCATTAGATGATGAGTTTGAATTTGACTCTTATGAAGTATACCATCAAGCAAATGACACAATCGATGCAGGAGAAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGATGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGGATGCCCAAAAGCAAGGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACTCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTTTGGGTTATGATGGATGGGAATGAACAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGTGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAATGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACAGAGAGGCACACCACCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAGAACATGTGA\n>CF_YL21\nATGGCAATCTACACGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATATTGTGAAGGAACGAGAAGTGCTGGCTTCCATCGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGCAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATCCAGAAGAGACTGGAAAGGAAGGATAGGGAAGAATACCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTATTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAGCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACCAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGTTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGCATCTCGCCAGGACGGATAAGTGGAATAACCAAGTTCGTGTTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGTAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGT
GGGTCGCACGAAGGAAAAATCTATGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGATTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGAGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTTGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCTTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAAGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTAAAAGGAAAGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATAATATCAAGAAAGTAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGATAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGAGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAAACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTTTAAAAGACAGCCAGGGGTGAGTAAGAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATCAACATTTTACCCGCCAACTAAGAAGCGCCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGAATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAGGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTCGGAACCTGGCCAACCATGATGGATCTGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACGAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCCCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTAATGCATGCCCTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTGAAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTCTATCAATATTATCCCCTGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTGAGGTTGTGGATTAATGAGAAGCAATCCATAGCGATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTGATGCTGCAGCTACAGACCTCCTCGATGCTACGTGTGACGGGTTCAACCTGCATCTAACGTACCCCACTGCATTGATGGTGTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACTCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGTACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAAC
TTGGCGGGAAAAATTTTCCGCAACATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATACAACATATCACCACAAGCATTCTTGGGCCGAAGCGTCCAGGTGGTCAAAGGCACTGCCTCAGGATTGAGCGAGCGATTTAATAATTACTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCACTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACTAGTGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAAGAAGTATAGGAGAGAGATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATACGCAAGTTTACAGCGCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTGAACCCTCAGATAGTTCAATTTGCTCAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTTATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGACTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCTTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTGGATGATGTGATCAAGAATTTTGATGAGAGGAATGAGACTATTGATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATCCAAATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTTACAAGGGCAACTGCTGTTCAAGTGGCTAATGACATTGCCCACAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTGTTCATCTTAGTGTAGCTGGATCTGTGCTTTTAATTGAGCCAACGCGGCCACTAACGGAGAACGTTTTCAAACAGCTATCTAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCCCCAATCTCCGTTATGACTAGCGGGTTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAATTTTGTAATATTTGATGAGTGCCATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGTAAAGTATTAAAAGTGTCAGCTACTCCAGTGGGAAGAGAGGTTGAATTCACAACACAGCAGCCAGTCAAGTTAATAGTGGAAGACACACTGTCTTTCCAATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTCCAGTTTGGTTCAAACATACTTGTTTACGTGTCGAGCTACAATGAAGTTGACAACTTGGCTAAGCTCCTAACAGATAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATGAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAATGGAGTGACTTTGGACATAGATGTAGTAGTGGATTTTGGGTTGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCATTGCTTACAATAAGGTGAGTGTTAGTTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTAGCATTGCGCATTGGACACACTGAGAAGGGAATTATCGAAATTCCAAGCATGATCGCTACTGAGGCGGCTCTTGCTTGCTTTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAGGTCAAAACAATGCAGCAATTTGAATTGAGCCCCTTCTTTATCCAGAATTTCGTTGCTCATGATGGAT
CAATGCATCCTGTCATACATGACATTCTCAAAAAGTATAAACTCCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTGGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCTTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAGAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGAGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAGATTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGGTGCTCAAGCATGTTTTCAATTGTCAACTTGACCAACACTCTCAGGGCCAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGAAGTCAATTAAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTTCGTTCACCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCTAAAGACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTGTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGAAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGTACCACAGTTGGTATGGGCAAGTCAAGCAGGAGGTTCATTAACATGTACGGGTTTGACCCGACAGAGTACTCATTCATCCAATTCGTTGATCCACTCACTGGGGCACAAATAGAAGAGAATGTCTATGCTGACATTAGAGATGTTCAAGAAAGATTTAGTGAGGTGCGAAAGAAAATGGTTGAGAATGATGATATTGAAATGCAAGCCTTGGGTAGTAACACGACCATACATGCATACTTTAGGAAAGATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAACCCACTCAAAGTTTGTGACAAAACAAATGGCATTGCCAAACTTCCTGAGAGAGAGTTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCGCAGGAAGTGGAGCATGAAGCCAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCTCAAACAGTTTGTAGGCTGAAAGTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTTAGGAGTTACAATGGTTCCATGGAGGTGCGATCCATGCATGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCCAATTAAAGGCAGGGACATCATCCTCATCAAAATGCCGAAGGATTTCCCTGTTTTTCCACAGAAATTGCATTTCCGAGCTCCTACACAGAATGAAAGAGTTTGTTTAGTTGGGACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGCACTACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGTGAGCACAGCCGATGGATGTCTAGTCGGAATTCATAGTTTGGCAAACAATGCACACACCACGAACTACTACTCAGCCTTCGATGAAGATTTTGAAAGCAAGTACCTCCGAACCAATGAGCACAATGAATGGGTCAAATCTTGGGTTTATAATCCAGACACAGTGTTGTGGGGCCCGTTGAAGCTTAAAGATAGCACTCCCAAAGGGTTATTCAAAACA
ACAAAGCTTGTGCAAGATCTAATCGATCATGATGAAGTCGCGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCGCAACAATGAAGAGTCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGATGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCATATGGGAAAAGCTTGCTGAATAGAGATGCGTACATCAAAGACATAATGAAGTACTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAAGAAGCCATCAATAGGGTTATCATCTACTTGCAAGTGCACGGCTTCAAGAAGTGTGCATATGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAAGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGTCATGCAAAGCTGTCTGCGACTGTATAAAGGCTTGCTCGGCATTTGGAACGGATCGTTGAAGGCAGAGCTCCGGTGTAAGGAAAAGATACTTGCAAACAAGACGAGGACGTTCACTGCTGCACCTCTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGATGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACAGTTGGGATGACTAAGTTTTATGGTGGTTGGGATAAACTGCTTCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCATACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGACTGGGATGTGGGGTTGCAAATGCTGCGTAATTTATACACTGAGATTATTTACACACCTATCTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCAGCCTTCTACTGTTGTGGACAACTCTCTTATGGTTGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTGAAGAGATTGACAGCACGTGCGTGTTCTTCGTCAATGGTGATGATTTGCTGATTGCCGTGAATCCGGATAAAGAGGGCATTCTTGATAGATTGTCACAGCACTTCTCAGATCTTGGTTTAAATTATGATTTTTCGTCAAGAACAAGAAATAAGGAAGAGTTATGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTCTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCTGAACTAACACACCAAATTAGAAGATTCTACTCATGGTTATTGCAACAGCAACCTTTTGCAACAATAGCGCAGGAAGGGAAGGCTCCTTATATAGCAAGCATGGCATTAAGGAAACTGTATATGGATAGAGCTGTGGATGAGGAAGAGCTGAGAGCCTTCACTGAAATGATGGTCGCATTAGACGATGAGTTTGAGTGTGACTCTTATGAAGTACACCACCAGGGAAACGATACAATCGATGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGATGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCAAAGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACTCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAATGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTCTG
GGTTATGATGGATGGGGATGAACAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTTGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGTGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAATGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACAGAGAGGCACACCGCCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAAAACATGTGA"
  },
  {
    "path": "inst/extdata/GVariation/sample_alignment.fa",
    "content": ">Mont\nATGGCAACTTACACATCAACAATCCAGTTTGGTTCCATTGAATGCAAACTTCCATACTCACCCGCTCCTTTTGGGCTAGT\nTGCGGGGAAACGAGAAGTTTTAACCACCACTGACCCCTTCGCAAGTTTGGAGATGCAGCTTAGTGCGCGATTACGAAGGC\nAAGAGTTTGCAACTATTCGAACATCCAAGAATGGTACTTGCATGTATCGATACAAGACTGATGTCCAGATTGCGCGCATT\nCAAAAGAAGCGCGAGGAAAGAGAAAGAGAGGAATATAATTTCCAAATGGCTGCGTCAAGTGTTGTGTCGAAGATCACTAT\nTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAG\nCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACC\nAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACTCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGT\nTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGC\nATCTCGCCAGGACGGACAAGTGGACTAACCAAGTTCGTGCTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGT\nAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCCTATTCATAGTGCGTGGGTCGCACGAAGGAAAAATCTA\nTGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGAT\nTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGA\nGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTT\nGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTCTAAATCGATTGGGGGCAGACAAAGATCGCT\nTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAA\nGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTGAAAGGAAA\nGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATA\nATATCAAGAAAGGAGACATCTCGTTCTTTAGGAATAAACTATCTGCAAAAGCAAATTGGAACTTGTATCTGTCATGTGAT\nAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGA\nGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAA\nACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTATAAAAGACAGCCAGGGGTGAGTAAG\nAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAGTC\nAACATTTTACCCGCCAACTAAGAAGCACCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGA\nATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTTCTCGCGATGTTGATTAACATTAGTGAG\nGAAGATGCA
AAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTTGGAACCTGGCCAACCATGATGGATTT\nGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACG\nAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCC\nCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTGGAGCATGCCC\nTGAGCTTGGGTCCACAATATCACCTTTTAGAGAAGGAGGAATCATAATGTCTGAGTCAGCAGCGCTAAAACTGCTCCTAA\nAGGGAATTTTTAGGCCCAAAGTGATGAAGCAATTGCTACTGGATGAACCATATTTGCTCATTTTATCGATATTATCTCCT\nGGTATACTTATGGCCATGTACAACAATGGGATATTTGAGTTAGCGGTGAAGTTGTGGATCAATGAGAAACAATCTATAGC\nCATGATAGCATCGTTATTGTCCGCCTTGGCTTTACGAGTGTCAGCAGCAGAAACACTCGTTGCACAGAGGATTATAATTG\nACACGGCAGCAACAGATCTTCTCGATGCTACGTGTGATGGATTCAACTTACATCTAACATATCCCACTGCACTCATGGTG\nTTGCAAGTTGTTAAGAACAGAAATGAATGTGATGATACGTTGTTTAAAGCAGGTTTTTCACATTACAACATGAGTGTCGT\nGCAGATTATGGAAAAAAATTATCTAAGCCTCTTGGGCGATGCTTGGAAAGATTTAACCTGGCGAGAAAAATTATCCGCAA\nCATGGCACTCATACAAAGCAAAGCGCTCTATCACTCAGTTCATAAAACCCATAGGCAAAGCAGATTTAAAAGGGTTGTAC\nAACATATCACCGCAAGCATTCTTGGGTCAGGGCGTACAGAGAGTCAAAGGCACCGCCTCAGGGTTGAATGAGCGACTCAA\nTAATTATATCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATTTTCCGGCGCTTGCCAACTTTTGTAA\nCTTTCATTAATTCATTATTAGTTATTAGTATGCTAACTAGTGTAGTAGCAGTGTGTCAAGCAATAATTCTAGATCAAAGG\nAAGTATAGAAAAGAAATTGAGTTGATGCAGATTGAGAAGAATGAAATTGTTTGTATGGAGTTGTATGCGAGTCTGCAGCG\nCAAACTTGAGCGTGAATTCACATGGGATGAATATATGGAATATTTGAAATCTGTGAATCCCCAGATAGTTCAATTCGCGC\nAAGCTCAAATGGAAGAATATAATGTGCGACATCAGCGCTCCACACCAGGTGTTAAGAATTTAGAGCAGGTGGTAGCATTT\nATAACTCTAATTATCATGATGTTTGATGCTGAAAGGAGCGACTGTGTATTTAAGACTCTCAACAAATTCAAAGGCATCGT\nTTCTTCAATGGATCATGAAGTTAGACACCAGTCCTTGGATGATGTAATCAAGAATTTCGATGAAAGGAACGAAGTTATTG\nATTTTGAACTAAATGAGGATACAATTAAAACATCATCAGTGTTGGACACAAAGTTTAGCGACTGGTGGGATCGGCAAATC\nCAAATGGGACACACACTTCCCCATTATAGAACTGAGGGACACTTCATGGAATTCACAAGGGCAACTGCTGTACAAGTGGC\nCAACGACATCGCGCATAGTGAGCACCTAGACTTTCTAGTGAGGGGAGCTGTTGGGTCTGGAAAATCTACTGGACTGCCTG\nTCCATCTCAGTGCAGCTGGATCTGTGCTTTTGATAGAACCAACTCGACCACTTGCAGAAAACGTGTTCAAGCAATTATCC\nAGTGAACCGTTTTTCAAGAAGCCAACACTGCGCATGCGAGG
AAATAGTGTGTTTGGTTCCTCTCCAATCTCCATTATGAC\nTAGCGGCTTTGCGTTGCACTACTATGCTAATAATCGCTCTCAGCTAACTCAGTTTAATTTCATAATTTTTGATGAATGTC\nATGTTTTAGATCCTTCTGCAATGGCATTTCGTAGCTTGTTAAGTGTGTATCACCAAACATGCAAAGTGTTAAAGGTGTCA\nGCCACTCCAGTGGGAAGGGAGGTCGAGTTCACAACACAACAACCAGTTAAATTGGTGGTTGAGGATACACTTTCATTCCA\nATCTTTTGTTGATGCGCAAGGCTCAAAAACCAATGCTGACGTAGTTCAGCATGGTTCGAACATACTCGTGTATGTGTCGA\nGTTACAATGAAGTGGATACATTAGCCAAGCTTCTAACAGATAGGAATATGATAGTCTCAAAAGTTGATGGCAGAACAATG\nAAGCACGGATGCTTAGAAATTGTAACGAAAGGGACTAGTGCAAAGCCACATTTTGTCGTAGCAACCAACATTATTGAAAA\nTGGAGTAACTTTAGATATAGATGTAGTTGTAGATTTTGGGCTTAAAGTCTCACCGTTTTTAGATATTGACAATAGGAGCA\nTAGCATACAATAAGATTAGTGTTAGCTATGGAGAAAGAATTCAGAGGTTGGGCCGTGTTGGGCGCTTTAAGAAGGGAGTG\nGCATTGCGTATTGGACACACCGAAAAGGGAATTATTGAGATTCCAAGTATGATTGCTAGTGAAGCTGCGCTTGCGTGCTT\nTGCATACAATTTGCCAGTAATGACAGGGGGTGTTTCAACTAGCCTCATTGGCAATTGTACTGTTCGTCAAGTTAAAACTA\nTGCAACAATTTGAGCTGAGTCCATTCTTTATACAAAATTTTGTTGCCCATGATGGATCAATGCATCCTGTCATACATGAC\nATTCTTAAGAAGTATAAACTGCGAGATTGTATGACGCCCTTGTGTGATCAATCCATACCTTACAGAGCCTCAAGCACTTG\nGTTGTCTGTTAGTGAGTACGAACGACTCGGAGTGGTTTTGGACATTCCAAAACAGATCAAGATTGCATTCCACATCAAGG\nATATCCCTCCTAAGTTGCATGAAATGCTTTGGGAAACAGTTATCAAATATAAGGATGTTTGTTTGTTTCCAAGTATTCGG\nGCTTCATCCATTAGCAAAATTGCATACACACTGCGCACTGATCTTTTTGCAATTCCCAGAACCCTAATTCTAGTTGAAAG\nATTGCTCGAGGAGGAACGAGTGAAACAGAGTCAATTCAGAAGTCTCATTGATGAAGGATGCTCAAGCATGTTTTCAATTG\nTTAATTTAACAAACACTCTTAGAGCTAGATATGCAAAGGATTACACTGCAGAAAACATACAGAAGCTCGAGAAAGTGAGG\nAGTCAGTTAAAGGAGTTCTCAAATTTAAATGGCTCTGCATGCGAGGAGAACTTAATGAAGAGGTATGAATCTCTACAGTT\nTGTGCATCATCAAGCAACAACTGCACTCGCAAAGGATTTGAAGTTGAAAGGAGTTTGGAAGAAGTCATTAGTTGTGCAGG\nACTTAATCATAGCGGGTGCCGTTGCTATTGGTGGAATAGGGCTCATCTATAGTTGGTTTACTCAATCAGTTGAAACTGTG\nTCTCACCAGGGCAAGAACAAATCCAAAAGAATTCAAGCATTGAAGTTTCGACACGCCCGCGATAAGAGGGCTGGTTTTGA\nAATTGATAACAATGATGATACAATAGAAGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGCACCACTG\nTTGGTATGGGCAAGTCAAGCAGGAGGTTTGTTAATATGTATGGATTTGACCCAACAGAATATTCATTCATCCAGTTCGTT\nGATCCGCTCACTGGAGCTCAAATTGAAGAGAACGTCTATGCTGATATTAGAGACATCCAAGAGCGCTTTAGTG
ATGTCCG\nCAAGAAAATGGTAGAGGATGATGAAATCGAATTGCAAGCATTGGGCAGCAACACAATCATTCATGCTTACTTCAGGAAGG\nATTGGTCTGACAAGGCTCTAAAAATTGATTTGATGCCACACAACCCACTCAAAATCTGTGATAAATCGAATGGCATTGCT\nAAGTTTCCTGAAAGAGAACTTGAGTTGAGGCAAACTGGGCCAGCAACAGAGGTTGATGTGAAAGACATTCCAAAACAGGA\nAGTGGAGCATGAAGCCAAATCACTCATGAGAGGTTTAAGGGATTTCAATCCAATTGCTCAAACAGTTTGCAGAGTAAAAG\nTGTCTGTTGAATATGGAACGTCTGAAATGTATGGGTTCGGTTTTGGTGCGTATATTATAGTAAACCACCATCTATTCAAG\nAGTTTCAATGGATCCATGGAAGTGCGATCAATGCATGGAACATTCAGAGTGAAGAATTTGCATAGCCTGAGCGTTTTACC\nGATCAAAGGCAGAGACATTATCATCATAAAGATGCCAAAGGATTTCCCTGTTTTCCCACAAAAACTGCACTTCCGAGCTC\nCAGTGCAGAATGAGAGGATTTGTTTGGTTGGAACTAATTTTCAAGAAAAACATGCATCATCAATCATCACAGAAACGAGT\nACTACATACAATGTACCGGGCAGCACTTTTTGGAAGCATTGGATTGAAACAAATGATGGGCATTGTGGATTACCAGTAGT\nGAGTACAGCTGATGGATGTCTAGTTGGAATACACAGCTTGGCGAATAATGTGCAAACCACGAATTATTATTCAGCCTTTG\nATGAGGATTTTGAAAGTAAGTATCTCCGAACTGATGAGCATAATGAGTGGACCAAATCGTGGGTATATAACCCAGATACT\nGTGTTGTGGGGTCCATTGAAGCTCAAAGAGAGTACCCCTAAAGGCCTGTTTAAGACAACAAAACTTGTACAGGATTTAAT\nTGATCATGATGTTGTTGTAGAGCAAGCTAAACATTCTGCGTGGATGTATGAGGCTCTAACAGGGAATTTGCAAGCTGTGG\nCGACAATGAAGAGTCAGCTAGTGACAAAGCACGTGGTCAAAGGGGAGTGTCGGCACTTCAAAGAGTTCTTAACTGTGGAT\nTCGGAAGCAGAAGCTTTCTTCAGGCCTTTGATGGATGCTTATGGGAAGAGCTTGTTAAATAGAGAAGCATATATAAAGGA\nCATAATGAAATACTCAAAGCCTATTGATGTTGGAATAGTAGACTGTGATGCTTTTGAAGAGGCTATCAATAGGGTTATCA\nTTTATCTGCAAGTACATGGCTTCCAGAAATGCAATTACATCACCGATGAGCAGGAAATTTTCAAAGCTCTCAATATGAAA\nGCTGCTGTCGGGGCTATGTATGGAGGCAAGAAGAAAGACTACTTCGAGCATTTTACTGAGGCGGATAAAGAGGAAATTGT\nTATGCAAAGTTGCTTACGATTGTACAAGGGCTCACTTGGCATATGGAATGGATCATTGAAAGCAGAACTTCGGTGCAAAG\nAGAAGATACTTGCAAATAAGACAAGGACATTCACTGCTGCACCTTTAGATACTCTACTGGGTGGGAAGGTGTGCGTTGAT\nGATTTTAATAATCAATTCTACTCAAAGAACATTGAATGCTGCTGGACTGTTGGAATGACTAAGTTTTATGGAGGTTGGGA\nCAAATTGCTTCGGCGTCTACCTGAAAATTGGGTGTACTGCGATGCCGATGGTTCACAATTCGATAGTTCACTCACCCCAT\nACCTAATTAATGCTGTTCTCATCATCAGAAGCACATACATGGAAGATTGGGACTTGGGGTTGCAAATGTTGCGCAATTTG\nTACACAGAAATAATTTACACACCAATCTCAACTCCAGATGGAACAATTGTCAAGAAGTTTAGAGGTAATAATAGCGGTCA\nACCTTCTACCGTTGTGGATAATT
CTCTCATGGTTGTTCTTGCTATGCATTACGCTCTCATTAAGGAGTGCGTTGAGTTTG\nAAGAAATCGACAGCACGTGTGTATTCTTTGTTAATGGTGATGACTTATTGATTGCTGTGAATCCGGAGAAAGAGAGCATT\nCTCGATAGAATGTCACAACATTTCTCAGATCTTGGTTTGAACTATGATTTTTCGTCGAGAACAAGAAGGAAGGAGGAATT\nGTGGTTCATGTCCCATAGAGGCCTGCTAATTGAGGGTATGTACGTGCCAAAGCTTGAAGAAGAGAGAATTGTATCCATTC\nTGCAATGGGATAGGGCTGATCTGCCAGAGCACAGATTAGAAGCGATTTGTGCAGCAATGATAGAATCCTGGGGTTATTTT\nGAGTTAACGCACCAAATTAGGAGATTCTACTCATGGTTGTTACAACAGCAACCTTTTTCAACGATAGCACAGGAAGGAAA\nAGCTCCATACATAGCGAGCATGGCATTGAAGAAGCTGTACATGAATAGGACAGTAGATGAGGAGGAACTGAAGGCTTTCA\nCTGAAATGATGGTTGCCTTGGATGACGAATTTGAGTGCGATACTTATGAAGTGCACCATCAAGGAAATGACACAATCGAT\nGCAGGAGGAAGCACTAAGAAGGATGCAAAACAAGAGCAAGGTAGCATTCAACCAAATCTCAACAAGGAAAAGGAAAAGGA\nCGTGAATGTTGGAACATCTGGAACTCATACTGTGCCACGAATTAAAGCTATCACGTCCAAAATGAGAATGCCTAAGAGTA\nAAGGTGCAACTGTACTAAATTTGGAACACTTACTCGAGTATGCTCCACAGCAAATTGACATCTCAAATACTCGAGCAACT\nCAATCACAGTTTGATACGTGGTATGAAGCAGTACAACTTGCATACGACATAGGAGAAACTGAAATGCCAACTGTGATGAA\nTGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAACATCAACGGAGTTTGGGTTATGATGGATGGAGATGAAC\nAAGTCGAATACCCACTGAAACCAATCGTTGAGAATGCAAAACCAACACTTAGGCAAATCATGGCACATTTCTCAGATGTT\nGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAGTTCGTAATCTGCGCGATGG\nAAGTCTGGCTCGCTATGCTTTTGACTTTTATGAAGTTACATCACGGACACCAGTGAGGGCTAGAGAGGCACACATTCAAA\nTGAAGGCCGCAGCTTTAAAATCAGCTCAATCTCGACTTTTCGGATTGGATGGTGGCATTAGTACACAAGAGGAAAACACA\nGAGAGGCACACCACCGAGGATGTTTCTCCAAGTATGCATACTCTACTTGGAGTGAAGAACATGTGA\n>CF_YL21\nATGGCAATCTACACGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATAT\nTGTGAAGGAACGAGAAGTGCTGGCTTCCATCGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGC\nAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATC\nCAGAAGAGACTGGAAAGGAAGGATAGGGAAGAATACCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTAT\nTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAG\nCAAAAACATATCACACGCCAAAGCTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACC\nAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGT
TTTGGGATCACATCGCGCAGT\nTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGC\nATCTCGCCAGGACGGATAAGTGGAATAACCAAGTTCGTGTTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGT\nAATACTAATCTCAAAGGAAACTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGTGGGTCGCACGAAGGAAAAATCTA\nTGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTTTGGAAGGGAT\nTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGTGGCAGA\nGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAACTT\nGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCT\nTTGTGCATGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGAGTCTAGAAATTTTCAATGAA\nGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTAAAAGGAAA\nGGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAAGCTTACTTGAATTGGCAAGATTCCAAAAGAACAGAACGGATA\nATATCAAGAAAGTAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGAT\nAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGA\nGGAAATTGATCCAGCGAAGGGCTATTCAGCATACGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAA\nACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTTTAAAAGACAGCCAGGGGTGAGTAAG\nAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATC\nAACATTTTACCCGCCAACTAAGAAGCGCCTCGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGA\nATTCTGAGATGTTATATATTGCCAGGCAAGGCTTCTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAG\nGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTCGGAACCTGGCCAACCATGATGGATCT\nGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACG\nAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCC\nCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAGCACTATAGAGTTGGTGGTATTCCTAATGCATGCCC\nTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTGA\nAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTCTATCAATATTATCCCCT\nGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTGAGGTTGTGGATTAATGAGAAGCAATCCATAGC\nGATGATAGC
ATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTG\nATGCTGCAGCTACAGACCTCCTCGATGCTACGTGTGACGGGTTCAACCTGCATCTAACGTACCCCACTGCATTGATGGTG\nTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACTCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGT\nACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAACTTGGCGGGAAAAATTTTCCGCAA\nCATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATAC\nAACATATCACCACAAGCATTCTTGGGCCGAAGCGTCCAGGTGGTCAAAGGCACTGCCTCAGGATTGAGCGAGCGATTTAA\nTAATTACTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCA\nCTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACTAGTGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAAG\nAAGTATAGGAGAGAGATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATACGCAAGTTTACAGCG\nCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTGAACCCTCAGATAGTTCAATTTGCTC\nAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTT\nATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGACTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCT\nTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTGGATGATGTGATCAAGAATTTTGATGAGAGGAATGAGACTATTG\nATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATC\nCAAATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTTACAAGGGCAACTGCTGTTCAAGTGGC\nTAATGACATTGCCCACAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTG\nTTCATCTTAGTGTAGCTGGATCTGTGCTTTTAATTGAGCCAACGCGGCCACTAACGGAGAACGTTTTCAAACAGCTATCT\nAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCCCCAATCTCCGTTATGAC\nTAGCGGGTTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAATTTTGTAATATTTGATGAGTGCC\nATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGTAAAGTATTAAAAGTGTCA\nGCTACTCCAGTGGGAAGAGAGGTTGAATTCACAACACAGCAGCCAGTCAAGTTAATAGTGGAAGACACACTGTCTTTCCA\nATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTCCAGTTTGGTTCAAACATACTTGTTTACGTGTCGA\nGCTACAATGAAGTTGACAACTTGGCTAAGCTCCTAACAGATAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATG\nAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAA\nTGGAGTGACTTTGGACATAGATGTAGTAGTGGATTTTGGGT
TGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCA\nTTGCTTACAATAAGGTGAGTGTTAGTTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTA\nGCATTGCGCATTGGACACACTGAGAAGGGAATTATCGAAATTCCAAGCATGATCGCTACTGAGGCGGCTCTTGCTTGCTT\nTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAGGTCAAAACAA\nTGCAGCAATTTGAATTGAGCCCCTTCTTTATCCAGAATTTCGTTGCTCATGATGGATCAATGCATCCTGTCATACATGAC\nATTCTCAAAAAGTATAAACTCCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTG\nGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCTTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAG\nAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGA\nGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAG\nATTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGGTGCTCAAGCATGTTTTCAATTG\nTCAACTTGACCAACACTCTCAGGGCCAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGA\nAGTCAATTAAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTT\nCGTTCACCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCTAAAG\nACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTG\nTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGA\nAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAGAAGGGAAAAGGTAAAGGTACCACAG\nTTGGTATGGGCAAGTCAAGCAGGAGGTTCATTAACATGTACGGGTTTGACCCGACAGAGTACTCATTCATCCAATTCGTT\nGATCCACTCACTGGGGCACAAATAGAAGAGAATGTCTATGCTGACATTAGAGATGTTCAAGAAAGATTTAGTGAGGTGCG\nAAAGAAAATGGTTGAGAATGATGATATTGAAATGCAAGCCTTGGGTAGTAACACGACCATACATGCATACTTTAGGAAAG\nATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAACCCACTCAAAGTTTGTGACAAAACAAATGGCATTGCC\nAAACTTCCTGAGAGAGAGTTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCGCAGGA\nAGTGGAGCATGAAGCCAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCTCAAACAGTTTGTAGGCTGAAAG\nTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTTAGG\nAGTTACAATGGTTCCATGGAGGTGCGATCCATGCATGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCC\nAATTAAAGGCAGGGACATCATCCTCATCAAAATGCCGAAGGATTTCCCTGTTTTTCCACAGAAATTGCATTTC
CGAGCTC\nCTACACAGAATGAAAGAGTTTGTTTAGTTGGGACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGC\nACTACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGT\nGAGCACAGCCGATGGATGTCTAGTCGGAATTCATAGTTTGGCAAACAATGCACACACCACGAACTACTACTCAGCCTTCG\nATGAAGATTTTGAAAGCAAGTACCTCCGAACCAATGAGCACAATGAATGGGTCAAATCTTGGGTTTATAATCCAGACACA\nGTGTTGTGGGGCCCGTTGAAGCTTAAAGATAGCACTCCCAAAGGGTTATTCAAAACAACAAAGCTTGTGCAAGATCTAAT\nCGATCATGATGAAGTCGCGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCG\nCAACAATGAAGAGTCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGAT\nGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCATATGGGAAAAGCTTGCTGAATAGAGATGCGTACATCAAAGA\nCATAATGAAGTACTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAAGAAGCCATCAATAGGGTTATCA\nTCTACTTGCAAGTGCACGGCTTCAAGAAGTGTGCATATGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAA\nGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGT\nCATGCAAAGCTGTCTGCGACTGTATAAAGGCTTGCTCGGCATTTGGAACGGATCGTTGAAGGCAGAGCTCCGGTGTAAGG\nAAAAGATACTTGCAAACAAGACGAGGACGTTCACTGCTGCACCTCTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGAT\nGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACAGTTGGGATGACTAAGTTTTATGGTGGTTGGGA\nTAAACTGCTTCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCAT\nACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGACTGGGATGTGGGGTTGCAAATGCTGCGTAATTTA\nTACACTGAGATTATTTACACACCTATCTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCA\nGCCTTCTACTGTTGTGGACAACTCTCTTATGGTTGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTG\nAAGAGATTGACAGCACGTGCGTGTTCTTCGTCAATGGTGATGATTTGCTGATTGCCGTGAATCCGGATAAAGAGGGCATT\nCTTGATAGATTGTCACAGCACTTCTCAGATCTTGGTTTAAATTATGATTTTTCGTCAAGAACAAGAAATAAGGAAGAGTT\nATGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTC\nTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCT\nGAACTAACACACCAAATTAGAAGATTCTACTCATGGTTATTGCAACAGCAACCTTTTGCAACAATAGCGCAGGAAGGGAA\nGGCTCCTTATATAGCAAGCATGGCATTAAGGAAACTGTATATGGATAGAGCTGTGGATGAGGAAGAGCTGAGAGCCTTCA\nCTGAAATGATGGTCGCATTAGAC
GATGAGTTTGAGTGTGACTCTTATGAAGTACACCACCAGGGAAACGATACAATCGAT\nGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGA\nTGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCA\nAAGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACT\nCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAA\nTGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTCTGGGTTATGATGGATGGGGATGAAC\nAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTT\nGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGT\nGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAA\nTGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACA\nGAGAGGCACACCGCCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAAAACATGTGA\n>Oz\nATGGCAACTTACATGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCTTCTTGCGGGCATAT\nTGTGAAGGAGCGAGAAGTGCTGGCTTCCGTTGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGC\nAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACTTACCGATACAAAACTGATGCCCAGATAACGCGCATT\nCAGAAGAAACTGGAGAGGAAGGATAGGGAAGAATATCACTTCCAGATGGCCGCTCCTAGTATTGTGTCAAAAATTACAAT\nAGCTGGTGGAGATCCTCCATCAAAGTCTGAGCCACAAGCACCAAGAGGGATCATTCATACAACTCCAAGGGTGCGTAAAG\nTCAAGACACGTCCCATAATAAAGTTGACAGAAGGCCAGATGAATCATCTCATTAAGCAGGTGAAGCAGATTATGTCGGAG\nAAGAGAGGGTCTGTCCACTTAATTAGTAAGAAGACCACTCATGTTCAATATAAGGAGATACTTGGAGCAACTCGCGCAGC\nGGTTCGAACTGCACATATGATGGGTTTGCGACGGAGAGTGGACTTCCGATGTGATATGTGGACAGTCGGACTTTTGCAAC\nGTCTCGCTCGGACGGACAAATGGTCCAATCAAGTCCGCACTATCAACATACGAAGGGGTGATAGTGGAGTCATTTTGAAC\nACAAAAAGCCTCAAAGGCCACTTTGGTAGAAGTTCAGGAGACTTGTTCATAGTGCGTGGATCACACGAAGGGAAATTGTA\nCGATGCACGTTCTAGAGTTACTCAGAGTGTTTTGAACTCAATGATCCAGTTTTCGAATGCTGATAATTTTTGGAAGGGTC\nTAGACGGTAATTGGGCACAACTGAGATATCCTTCGGATCACACATGTGTAGCTGGTTTACCTGTCGAAGATTGTGGTAGA\nGTTGCTGCATTGATGGCACACAGTATCCTCCCGTGCTACAAGATAACCTGCCCCACCTGTGCTCAACAGTATGCCAGCTT\nGCCGGTTAGCGATCTGTTTAAGCTGTTGCATAAACATGCGAGAGATGGTTTGAACCGATTGGGA
GCGGATAAAGACCGGT\nTTATACATGTTAATAAGTTCTTGATAGCGTTAGAGCATCTAACTGAACCGGTGGATTTGAATCTCGAGCTTTTCAATGAG\nATATTTAAATCCATAGGGGAGAAGCAGCAAGCACCGTTCAAGAATTTAAATGTCTTAAATAATTTCTTCCTGAAAGGAAA\nAGAAAATACAGCTCATGAATGGCAAGTGGCTCAATTGAGTTTGCTCGAATTAGCAAGGTTCCAGAAGAATAGAACTGATA\nACATCAAGAAAGGTGATATATCTTTCTTCAGAAATAAATTATCTGCCAAGGCAAACTGGAATCTGTATTTGTCGTGCGAC\nAACCAGTTGGATAAAAATGCAAATTTTCTGTGGGGACAAAGGGAGTATCATGCTAAGCGGTTTTTCTCAAACTTCTTTGA\nGGAAATTGATCCAGCAAAGGGATACTCAGCATATGAAATCCGCAAGCATCCAAATGGAACAAGGAAGCTCTCAATTGGTA\nACTTAGTTGTCCCACTTGATTTAGCTGAGTTTAGGCAGAAGATGAAAGGTGACTATAGGAAACAACCAGGAGTTAGCAGA\nAAGTGCACGAGTTCGAAAGATGGTAATTATGTGTATCCCTGTTGTTGCACAACACTTGATGATGGTTCAGCTATTGAATC\nAACATTCTATCCACCAACCAAAAAGCACCTTGTAATAGGCAATAGCGGTGACCAAAAATTTGTTGATTTACCAAAAGGGG\nATTCGGAGATGTTATACATTGCCAAGCAGGGTTATTGTTATATCAACGTGTTTCTTGCAATGCTTATTAACATTAGCGAG\nGAGGATGCAAAGGATTTCACAAAGAAAGTTCGCGACATGTGTGTGCCAAAGCTTGGAACCTGGCCAACTATGATGGATTT\nGGCGACCACTTGTGCTCAAATGAGAATATTCTATCCTGACGTGCATGATGCAGAGCTGCCTAGAATATTGGTTGACCATG\nACACTCAAACGTGTCACGTGGTTGACTCATTTGGCTCGCAAACAACTGGATATCATATTCTAAAAGCATCCAGCGTGTCT\nCAACTTATCTTGTTTGCAAATGATGAATTAGAATCTGATATAAAACATTATAGAGTTGGTGGTGTTCCTAATGCATGCCC\nTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCGCTGAAACTGCTTTTAA\nAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGCTGTTAGATGAGCCTTACCTGTTGATTTTATCAATATTATCTCCT\nGGCATACTGATGGCTATGTATAATAATGGGATTTTTGAACTTGCGGTAAGGTTGTGGATTAATGAGAAACAATCCATAGC\nTATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATCATTG\nATGCTGCAGCTACAGACCTCCTTGATGCTACGTGTGATGGGTTCAACCTACATCTAACGTACCCCACTGCATTGATGGTG\nTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACCCTATTCAAGGCGGGTTTTTCAAGTTACAACACGAGCGTCGT\nACAGATTATGGAAAAAAATTATCTAAATCTCTTGAACGATGCTTGGAAAGATTTAACTTGGCGGGAAAAATTATCCGCAA\nCATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATAC\nAACATATCACCACAAGCATTCTTGGGCCGAAGCGCCCAGGTGGTCAAAGGTACTGCCTCAGGATTGAGTGAGCGATTTAA\nTAATTATTTCAATACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCA\nCTTTTGTTAACTCA
TTATTAGTTATTAGTATGTTAACTAGCGTAGTGGCAGTGTGTCAGGCAATAATTTTAGATCAGAGG\nAAATATAGGAGAGAAATCGAGTTGATGCAGATAGAGAAGAATGAGATTGTCTGCATGGAGCTATATGCAAGTTTACAGCG\nCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAATCAGTAAACCCTCAGATAGTTCAGTTTGCTC\nAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTT\nATGGCTTTAGTCATTATGGTGTTCGATGCTGAAAGGAGTGATTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCT\nTTCCTCAATGGACTATGAAGTTAGACATCAGTCCTTAGACGATGTGATCAAGAATTTTGATGAGAGGAATGAGATTATTG\nATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATC\nCAGATGGGACATACACTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTCACAAGAGCAACTGCTGTCCAAGTGGC\nTAATGACATTGCCCATAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTACCTG\nTTCATCTTAGTGTAGCCGGATCTGTGCTTTTAATTGAACCAACGCGACCACTAGCGGAGAACGTTTTCAAACAGCTATCT\nAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCTCCAATCTCCGTCATGAC\nTAGCGGATTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAACTTTGTAATATTTGATGAGTGCC\nATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGCAAAGTATTAAAAGTGTCA\nGCTACTCCAGTGGGAAGAGAGGTTGAATTTACAACACAGCAACCAGTCAAGTTAATAGTGGAGGACACACTGTCTTTCCA\nATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTTCAGTTTGGTTCAAACGTACTTGTGTACGTGTCGA\nGCTACAATGAAGTTGATACCTTGGCTAAGCTCCTAACAGACAAGAATATGATGGTCACAAAGGTTGATGGCAGAACAATG\nAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAA\nTGGAGTGACTTTGGACATAGACGTGGTTGTAGATTTTGGGTTAAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCA\nTTGCTTACAATAAGGTGAGTGTTAGCTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTA\nGCATTGCGCATTGGACACACTGAGAAGGGAATTATTGAAATTCCAAGCATGATCGCTACAGAGGCAGCTCTTGCTTGCTT\nTGCATATAACTTACCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAAGTTAAAACAA\nTGCAGCAATTTGAATTGAGTCCCTTCTTTATCCAGAATTTCGTTGCCCATGATGGATCAATGCATCCTGTCATACATGAC\nATTCTTAAAAAGTATAAACTTCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTG\nGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCCTTAGAAATTCCAAAGCAAGCCAAAATTGCATTCCATATCAAAG\nAGATCCCTCCTAAGCTCCACGAAATGCTTTGGGAAACGGTTGTCAA
GTACAAAGACGTTTGCTTATTTCCAAGCATTCGA\nGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTCTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAG\nACTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGATGAAGGATGCTCAAGCATGTTTTCAATTG\nTCAACCTGACAAACACTCTCAGAGCTAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGA\nAGTCAATTGAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTT\nCGTTCATCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTAAAGGGGACTTGGAAGAAGTCATTAGTGGCCAAAG\nACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTG\nTCTCACCAAGGGAAAAATAAATCCAAAAGAATTCAAGCCTTGAAGTTTCGCCATGCTCGTGACAAAAGGGCTGGCTTTGA\nAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAAAAGGGAAAAGGTAAAGGTACCACAG\nTTGGTATGGGCAAGTCAAGCAGGAGGTTCATCAACATGTATGGGTTTGATCCAACAGAGTACTCATTCATCCAATTCGTT\nGATCCACTCACTGGGGCGCAAATAGAAGAGAATGTCTATGCTGACATTAGAGATATTCAAGAGAGATTTAGTGAAGTGCG\nAAAGAAAATGGTTGAGAATGATGACATTGAAATGCAAGCCTTGGGTGGTAACACGACCATACATGCATACTTTAGGAAAG\nATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAATCCACTCAAAGTTTGTGACAAGACAAATGGCATTGCC\nAAATTTCCTGAGAGAGAGCTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGATGTGAAGGACATACCAGCACAGGA\nAGTGGAGCATGAAGCTAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCCCAAACAGTTTGTAGGCTGAAAG\nTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCATACATAATAGCGAACCACCATTTGTTCAGG\nAGTTACAATGGTTCCATGGAGGTGCGATCCATGCACGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGCGTTCTGCC\nAATTAAAGGTAGGGATATCATCCTCATCAAAATGCCGAAAGATTTCCCTGTCTTTCCACAGAAATTGCATTTCCGAGCTC\nCTACACAGAATGAAAGAGTTTGTTTAGTTGGAACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAACAAGC\nACCACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGATTACCAGTGGT\nGAGCACCACCGATGGATGTCTAGTCGGAATTCACAGTTTGGCAAACAACAAACACACCACGAACTACTACTCAGCCTTCG\nATGAAGATTTTGAAAGCAAGTATCTCCGAACCAATGAGCACAATGAATGGGTCAAGTCTTGGATTTATAATCCAGACACA\nGTGTTGTGGGGCCCGTTGAAACTTAAAGACAGCACTCCCAAAGGATTATTCAAAACAACAAAGCTTGTGCAAGATCTAAT\nCGATCATGATGTAGTGGTGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCG\nCAACAATGAAGAGCCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGG
AT\nGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCGTATGGGAAAAGCTTGCTGAATAGAGATGCATACATCAAGGA\nCATAATGAAGTATTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAGGAAGCCATCAATAGGGTTATCA\nTCTACCTGCAAGTGCACGGCTTCAAGAAGTGCGCATACGTCACTGACGAGCAAGAAATTTTCAAAGCGCTCAACATGAAA\nGCTGCAGTTGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGT\nCATGCAAAGCTGTCTGCGATTGTATAAAGGCTTGCTTGGCATTTGGAATGGATCATTGAAGGCAGAGCTCCGGTGTAAGG\nAAAAGATACTTGCAAATAAGACGAGGACATTCACTGCTGCACCTTTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGAT\nGATTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACGGTTGGGATGACTAAGTTTTATGGTGGTTGGGA\nTAAACTGCTGCGGCGTTTACCTGAGAATTGGGTATACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCAT\nACTTAATCAATGCTGTTCTCACCATCAGAAGCACATACATGGAAGATTGGGATGTGGGGTTGCAAATGTTGCGCAATTTA\nTACACTGAGATTGTTTACACACCTATTTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGAAATAACAGTGGTCA\nGCCTTCTACTGTTGTGGACAACTCTCTTATGGTCGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTG\nAAGAGATTGACAGCACGTGCGTGTTCTTTGTCAATGGTGATGATTTGCTGATTGCTGTGAATCCGGATAAAGAGGGCATT\nCTTGACAGATTGTCACAACACTTCTCAGATCTTGGTTTGAATTATGATTTCTCGTCAAGAACAAGAAATAAGGAGGAATT\nGTGGTTTATGTCTCATAGAGGCCTACTGATTGAGGGCATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTC\nTCCAATGGGACAGAGCAGACTTGGCTGAACACAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCT\nGAACTAACACACCAAATCAGGAGATTCTACTCATGGTTATTGCAACAGCAACCCTTTGCAACAATAGCGCAGGAAGGGAA\nGGCTCCTTATATAGCAAGCATGGCATTAAGGAAATTGTATATGGATAGGGCTGTGGATGAGGAAGAGCTGAGAGCCTTCA\nCTGAAATGATGGTCGCATTAGACGATGAGTTTGAATTTGACTCTTATGAAGTACACCATCAAGCAAATGACACAATCGAT\nGCAGGAGGAAGCAGCAAGAAAGATGCAAGACCGGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGA\nTGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGAATGCCCAAAAGCA\nAGGGAGCAACCGTGCTAAACCTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACT\nCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAA\nTGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTTTGGGTTATGATGGATGGGAATGAAC\nAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTT\nGCAGAAGCGTATATAGAAATGCGCAACA
AAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGT\nGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAA\nTGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACA\nGAGAGGCACACCACCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAGAACATGTGA\n>Wilga5\nATGGCAACTTACATGTCAACAATCTGTTTCGGTTCGTTTGAATGCAAGCTACCATACTCACCCGCCTCTTGCGGGCATAT\nTGTGAAGGAACGAGAAGTGCTGGCTTCCGTTGATCCTTTCGCAGATCTGGAAACACAACTTAGTGCACGATTGCTCAAGC\nAAGAATATGCTACTGTTCGTGTGCTCAAGAACGGTACTCTTACGTACCGATACAAGACTGATGCCCAGATAACGCGCATC\nCAGAAGAAACTGGAAAGGAAGGATAGGGAAGAATATCACTTCCAGATGGCAGCTCCTAGTATTGTGTCAAAGATCACTAT\nTGCTGGTGGAGAGCCACCTTCAAAACTTGAATCACAAGTGCGGAGGGGTGTCATCCACACAACTCCAAGGATGCGCACAG\nCAAAAACATATCACACGCCAAAGTTGACAGAGGGACAAATGAACCACCTTATCAAGCAGGTGAAGCAAATTATGTCAACC\nAAAGGAGGGTCTGTTCAACTGATTAGCAAGAAAAGTACCCATGTTCACTATAAAGAAGTTTTGGGATCACATCGCGCAGT\nTGTTTGCACTGCACATATGAGAGGTTTACGAAAGAGAGTGGACTTTCGGTGTGATAAATGGACCGTTGTGCGTCTACAGC\nATCTCGCCAGGACGGACAAGTGGAATAACCAAGTTCGTGCTACTGATCTACGCAAGGGCGATAGTGGAGTTATATTGAGT\nAATACTAATCTCAAAGGAAGCTTTGGGAGAAGCTCGGAGGGCATATTCATAGTGCGTGGGTCGCACGAAGGAAAAATCTA\nTGATGCACGTTCCAAGGTTACTCAAGGGGTTATGGATTCAATGGTTCAGTTCTCAAGCGCTGAAAGCTTCTGGAAGGGAT\nTGGACGGCAATTGGGCACAAATGAGATATCCTACAGATCATACATGTGTGGCAGGCTTACCAGTTGAAGACTGCGGCAGA\nGTTGCAGCGATAATGACACACAGTATTTTACCGTGCTATAAGATAACCTGCCCTACCTGTGCCCAACAATATGCCAATTT\nGCCAGCCAGTGACTTACTTAAGATATTACACAAGCACGCAAGTGATGGTTTAAATCGATTGGGGGCAGACAAAGATCGCT\nTTGTGCACGTCAAAAAGTTCTTGACAATCTTAGAGCACTTAACTGAACCGGTTGATCTGGGTCTAGAAATTTTCAATGAA\nGTATTCAAGTCTATAGGGGAGAAGCAACAATCACCTTTCAAAAACCTGAATATTCTGAATAATTTCTTTTTGAAAGGAAA\nAGAAAATACAGCTCGTGAATGGCAGGTGGCTCAATTAGGCTTACTTGAATGGGCAAGATTCCAAAAGAACAGAACGGATA\nATATCAAGAAAGGAGACATCTCGTTCTTTAGGAATAAACTATCTGCCAAAGCAAATTGGAACTTGTATCTGTCATGTGAT\nAACCAGCTGGATAAGAATGCAAACTTCCTGTGGGGACAGAGGGAATATCATGCTAAGCGATTTTTCTCGAACTATTTCGA\nGGAAATTGATCCAGCGAAGGGCTATTCAGCATATGAAAATCGTTTGCATCCGAATGGGACAAGAAAACTTGCAATTGGAA\nACCTAATTGTACCACTTGATCTGGCTGAGTTTAGGCGGAAGATGAAAGGTGATTATAAAAGACAG
CCAGGGGTGAGTAAG\nAAGTGCACGAGCTCGAAGGATGGAAACTACGTGTATCCCTGTTGTTGCACTACACTTGATGATGGCTCAGCTGTTGAATC\nAACATTTTACCCGCCAACTAAGAAGCACCTTGTAATAGGTAATAGTGGCGACCAAAAGTATGTTGACTTACCAAAAGGGA\nATTCTGAGATGTTATATATTGCCAGGCAAGGCTTTTGTTACATTAACATTTTCCTCGCGATGTTGATTAACATTAGTGAG\nGAAGATGCAAAGGATTTCACTAAGAAGGTTCGTGACATGTGTGTGCCAAAGCTTGGAACCTGGCCAACCATGATGGATCT\nGGCTACAACTTGTGCTCAAATGAAAATATTCTACCCTGATGTTCATGATGCAGAACTGCCTAGAATACTAGTCGATCACG\nAAACGCAGACATGCCATGTGGTTGACTCGTTTGGCTCACAAACAACTGGGTATCATATTTTGAAAGCATCTAGCGTGTCC\nCAACTTATTTTGTTTGCTAATGATGAGTTGGAGTCTGACATTAAACATTATAGAGTTGGTGGTGTTCCTAATGCATGCCC\nTGAACTTGGGTCCACAATATCACCTTTCAGAGAAGGAGGAGTTATAATGTCTGAGTCGGCAGCTCTGAAACTGCTTTTGA\nAGGGAATTTTTAGACCTAAGGTGATGAGACAGTTGTTGTTAGATGAGCCTTATCTGTTGATTCTATCAATATTATCCCCT\nGGCATACTGATGGCTATGTACAATAATGGGATTTTTGAACTTGCGGTAAGGTTGTGGATTAATGAGAAACAATCCATAGC\nTATGATAGCATCGCTACTATCAGCTTTAGCCCTACGAGTGTCAGCGGCAGAAACACTCGTCGCACAGAGGATTATAATTG\nATACTGCAGCTACAGATCTCCTTGATGCTACGTGCGATGGGTTCAACCTACATCTAACGTACCCCACTGCGTTGATGGTG\nTTGCAAGTTGTTAAGAATAGAAATGAATGTGATGATACCCTATTCAAGGCGGGTTTTCCAAGTTACAACACGAGCGTCGT\nACAGATCATGGAAAAAAATTATCTAAATCTCTTAAACGATGCTTGGAAAGATTTAACTTGGCGGGAAAAATTATCCGCAA\nCATGGTACTCATACAGAGCAAAACGCTCTATCACTCGGTACATAAAACCCACAGGAAGGGCAGATTTGAAAGGGTTATAC\nAACATATCACCACAAGCATTCTTGGGCCGAGGCGCCAAGGTGGTCAAAGGCACTGCCTCAGGATTGTGCGAGCGATTTAA\nTAATTATTTCTACACTAAGTGTGTAAATATTTCATCCTTTTTCATTCGTAGAATCTTTAGGCGTTTGCCAACTTTCGTCA\nCTTTTGTTAACTCATTATTAGTTATTAGTATGTTAACCAGCGTAGTGGCAGTCTGTCAGGCAATAATTTTAGATCAGAGG\nAAGTATAGGAGAGAAATCGAGTTGATGCAGATAGAGAAGAATGAGATCGTCTGCATGGAGCTATATGCAAGTTTACAGCG\nCAAACTTGAACGCGATTTCACATGGGATGAGTACATTGAGTATTTGAAGTCAGTAAACCCTCAGATAGTTCAGTTTGCTC\nAAGCGCAGATGGAAGAATATGATGTGCGACACCAGCGTTCCACACCAGGTGTTAAAAATTTGGAACAAGTGGTAGCATTT\nATGGCTTTAGTCATCATGGTGTTCGATGCTGAAAGGAGTGATTGCGTGTTCAAAACTCTCAATAAATTTAAGGGTGTCCT\nTTCCTCACTGGACCATGAAGTTAGACATCAGTCCTTAGACGATGTGATCAAGAATTTTGATGAGAGGAATGAGGTTATTG\nATTTTGAATTGAGTGAGGACACAATTCGAACATCATCAGTGCTAGATACAAAGTTTAGTGATTGGTGGGACCGACAAATC\nCAGATGGGGCATACA
CTTCCACATTACAGAACCGAGGGGCACTTCATGGAATTCACAAGAGCAACTGCTGTCCAAGTGGC\nTAATGACATTGCCCATAGCGAACACCTAGACTTTCTAGTAAGGGGAGCTGTTGGGTCTGGAAAGTCAACTGGGTTGCCTG\nTTCATCTTAGTGTAGCCGGATCTGTGCTTTTAATTGAACCAACGCGACCACTAGCGGAGAACGTTTTCAAACAGCTATCT\nAGTGAACCATTCTTCAAGAAGCCAACACTGCGTATGCGTGGAAATAGTATATTTGGCTCTTCTCCAATCTCCGTCATGAC\nTAGCGGATTTGCGCTACACTACTTCGCCAATAATCGCTCTCAATTAGCTCAGTTCAACTTTGTAATATTTGATGAGTGCC\nATGTTCTGGATCCTTCCGCAATGGCGTTCCGCAGTCTGCTGAGTGTTTATCATCAAGCATGCAAAGTATTAAAAGTGTCA\nGCTACTCCAGTGGGAAGAGAGGTTGAATTTACAACACAGCAACCAGTCAAGTTAATAGTGGAGGACACACTGTCTTTTCT\nATCATTTGTTGATGCACAAGGTTCTAAAACTAATGCTGATGTTGTTCAGTTTGGTTCAAACGTACTTGTGTACGTGTCGA\nGCTACAATGAAGTTGATACCTTGGCTAAGCTCCTAACAGACAAGAATATGATGGTCACAAAAGTTGATGGCAGAACAATG\nAAGCACGGTTGCCTAGAAATTGTCACAAAAGGAACCAGTGCGAGACCACATTTTGTTGTAGCAACCAACATAATTGAGAA\nTGGAGTGACTTTGGACATAGACGTGGTTGTAGATTTTGGGTTGAAAGTCTCACCGTTCTTGGACATTGACAATAGGAGCA\nTTGCTTACAATAAGGTGAGTGTTAGCTATGGTGAGAGAATTCAAAGGCTGGGTCGTGTTGGACGCTTCAAGAAAGGAGTA\nGCATTGCGCATTGGACACACTGAGAAGGGAATTATTGAAATTCCAAGCATGATCGCTACAGAGGCGGCTCTTGCTTGCTT\nTGCATATAACTTGCCAGTGATGACAGGAGGCGTCTCAACTAGTCTGATTGGCAATTGTACTGTGCGCCAAGTTAAAACAA\nTGCAGCAATTTGAATTGAGTCCCTTCTTTATCCAGAATTTTGTTGCCCATGATGGATCAATGCATCCTGTCATACATGAC\nATTCTTAAAAAGTATAAACTTCGAGATTGTATGACACCTTTGTGCGATCAGTCTATACCATACAGGGCATCGAGCACTTG\nGTTATCGGTTAGTGAATATGAGCGACTTGGAGTGGCCTTAGAAATTCCAAAGCAAGTCAAAATTGCATTCCATATCAAAG\nAGATCCCTCCTAAGCTCCATGAAATGCTTTGGGAAACGGTTGTCAAGTACAAAGACGTTTGCTTATTTCCAAGCATTCGA\nGCATCGTCCATCAGCAAAATCGCATACACATTGCGTACAGACCTTTTCGCCATCCCAAGAACTCTAATATTGGTGGAGAG\nACTGCTTGAAGAGGAGCGAGTGAAGCAGAGCCAATTCAGAAGTCTCATCGACGAAGGATGCTCAAGCATGTTTTCAATTG\nTCAACTTGACAAACACTCTCAGAGCTAGATATGCAAAAGATTACACCGCAGAGAACATACAAAAACTTGAGAAAGTGAGA\nAGTCAATTGAAAGAATTCTCAAATTTGGATGGTTCTGCATGTGAGGAAAATTTAATAAAGAGGTATGAGTCTTTGCAGTT\nCGTTCATCACCAAGCTGCGACGTCACTTGCAAAGGATCTCAAGTTGAAGGGGACTTGGAAGAAGTCATTAGTGGCCAAAG\nACTTGATCATAGCAGGCGCTGTTGCAATTGGTGGAATAGGACTCATATATAGTTGGTTCACACAATCAGTTGAGACTGTG\nTCTCACCAAGGGAAAAATAAATCCAAAAGAATCCAAGCCTTGAAGTT
TCGCCATGCTCGTGACAAAAGGGCTGGCTTTGA\nAATTGACAACAATGATGACACAATAGAGGAATTCTTTGGATCTGCATACAGGAAAAAGGGAAAAGGTAAAGGTACCACAG\nTTGGTATGGGCAAGTCAAGCAGGAGGTTCATCAACATGTATGGGTTTGACCCAACAGAGTACTCATTCATCCAATTCGTT\nGATCCACTCACTGGGGCGCAAATAGAAGAGAATGTCTATGCTGACATTAGAGATATTCAAGAGAGATTTAGTGAAGTGCG\nAAAGAAAATGGTTGAGAATGATGACATTGAAATGCAAGCCTTGGGCAGTAACACGACCATACATGCATACTTCAGGAAAG\nATTGGTCTGACAAAGCTTTGAAGATTGATTTAATGCCACATAATCCACTCAAAGTTTGTGACAAGACAAATGGCATTGCC\nAAATTTCCTGAGAGAGAACTCGAACTAAGGCAGACTGGGCCAGCTGTAGAAGTCGACGTGAAGGACATACCAGCACAGGA\nGGTGGAGCATGAAGCTAAATCGCTCATGAGAGGCTTGAGAGACTTCAACCCAATTGCCCAAACAGTTTGTAGGCTGAAAG\nTATCTGTTGAATATGGGACATCAGAGATGTACGGTTTTGGATTTGGAGCGTACATAATAGCGAACCACCATTTGTTCAGG\nAGTTACAATGGTTCCATGGAGGTGCGATCCATGCACGGTACATTCAGGGTGAAGAATCTACACAGTTTGAGTGTTCTGCC\nAATTAAAGGTAGGGATATCATCCTCATCAAAATGCCGAAAGATTTCCCTGTCTTTCCACAGAAATTGCATTTCCGAGCTC\nCTACACAGAATGAAAGAGTTTGTTTAGTTGGAACCAACTTTCAGGAGAAGTATGCATCGTCGATCATCACAGAAGGAGGC\nACCACTTACAATATACCAGGCAGCACATTCTGGAAGCATTGGATTGAAACAGATAATGGACATTGTGGACTACCAGTGGT\nGAGCACCACCGATGGATGTCTAGTCGGAATTCACAGTTTGGCAAACAACAAACACACCACGAATTACTACTCAGCCTTCG\nATGAAGATTTTGAAAGCAAGTATCTCCGAACCAATGAGCACAATGAATGGGTCAAGTCTTGGATTTATAATCCAGACACA\nGTATTGTGGGGCCCGTTGAAACTTAAAGACAGCACTCCCAAAGGATTATTCAAAACAACAAAGCTTGTGCAAGATCTAAT\nTGATCATGATGAAGTGGTGGAGCAAGCTAAGCACTCTGCGTGGATGTTTGAAGCCTTGACAGGAAATTTGCAAGCTGTCG\nCAACAATGAAGAGCCAATTAGTAACCAAGCATGTAGTTAAAGGAGAGTGTCGACACTTCAAAGAATTCCTGACTGTGGAT\nGCAGAAGCAGAGGCATTCTTCAGGCCTTTGATGGATGCGTATGGGAAAAGTTTGCTGAATAGAGATGCATACATCAAGGA\nCATAATGAAGTATTCAAAACCTATAGATGTTGGTATCGTGGACTGTGATGCATTTGAGGAAGCCATCAATAGGGTTATCA\nTCTACCTGCAAGTGCACGGCTTCAAGAAGTGCGCATACGTTACTGACGAGCAAGAAATTTTTAAAGCGCTCAACATGAAA\nGCTGCAGTCGGAGCCATGTATGGTGGCAAAAAGAAAGACTATTTTGAGCATTTCACTGATGCAGATAAGGAAGAAATAGT\nCATGCAAAGCTGTCTGCGATTGTATAAAGGCTTGCTTGGCATTTGGAATGGATCATTGAAGGCAGAGCTCCGGTGTAAGG\nAAAAGATACTTGCAAATAAGACGAGGACATTCACTGCTGCACCTTTAGACACTTTGCTGGGTGGTAAAGTGTGTGTTGAT\nGACTTCAATAATCAATTTTATTCAAAGAATATTGAATGCTGTTGGACGGTTGGGATGACAAAGTTTTATGGTGGTTGGG
A\nTAAACTGCTGCGGCGTTTACCTGAGAATTGGGTTTACTGTGATGCCGATGGCTCACAGTTTGATAGTTCACTAACTCCAT\nACTTAATCAATGCAGTTCTCACCATTAGAAGCACATACATGGAAGATTGGGATGTGGGGTTGCAAATGTTGCGCAATTTA\nTACACTGAGATTGTTTACACACCTATTTCAACTCCAGATGGAACAATTGTTAAGAAGTTCAGAGGGAATAACAGTGGTCA\nGCCTTCTACTGTTGTGGACAACTCTCTTATGGTCGTCCTTGCCATGCACTATGCTCTCATCAAAGAATGCATTGAGTTTG\nAAGAGATTGACAGCACGTGCGTGTTCTTTGTCAATGGTGATGATTTGCTGATTGCTGTGAATCCGGATAAAGAGGGCATT\nCTTGACAGATTGTCACAACACTTCTCAGATCTTGGTTTGAATTATGATTTCTCGTCAAGAACAAGAAATAAGGAGGAGTT\nGTGGTTTATGTCTCATAGAGGCCTACTGATTGAGGCAATGTACGTGCCGAAACTTGAAGAAGAAAGGATTGTGTCCATTC\nTCCAATGGGACAGAGCAGACTTGGCTGAACATAGGCTTGAGGCGATTTGCGCAGCTATGATAGAGTCCTGGGGTTATTCT\nGAACTAACACACCAAATCAGGAGATTCTACTCATGGTTATTGCAACAGCAACCCTTTGCAACAATAGCGCAGGAAGGGAA\nGGCTCCTTATATAGCAAGCATGGCATTAAGGAAATTGTATATGGATAGGGCTGTGGATGAGGAAGAGCTGAGAGCCTTCA\nCTGAAATGATGGTCGCATTAGATGATGAGTTTGAATTTGACTCTTATGAAGTATACCATCAAGCAAATGACACAATCGAT\nGCAGGAGAAAGCAGCAAGAAAGATGCAAGACCAGAGCAAGGCAGCATCCAGTCAAACCCGAACAAAGGAAAAGATAAGGA\nTGTGAATGCTGGTACATCTGGGACACATACTGTGCCGAGAATCAAGGCTATCACGTCCAAAATGAGGATGCCCAAAAGCA\nAGGGAGCAACCGTGCTAAACTTAGAACACTTGCTTGAGTATGCTCCACAACAAATTGATATTTCAAATACTCGGGCAACT\nCAATCACAGTTTGATACGTGGTATGAGGCAGTGCGGATGGCATACGACATAGGAGAAACTGAGATGCCAACTGTGATGAA\nTGGGCTTATGGTTTGGTGCATTGAAAATGGAACCTCGCCAAATGTCAACGGAGTTTGGGTTATGATGGATGGGAATGAAC\nAAGTCGAGTACCCGTTGAAACCAATCGTTGAGAATGCAAAACCAACCCTTAGGCAAATCATGGCACATTTCTCAGATGTT\nGCAGAAGCGTATATAGAAATGCGCAACAAAAAGGAACCATATATGCCACGATATGGTTTAATTCGAAATCTGCGGGATGT\nGGGTTTAGCGCGTTATGCCTTTGACTTTTATGAGGTCACATCACGAACACCAGTGAGGGCTAGGGAAGCGCACATTCAAA\nTGAAGGCCGCAGCATTGAAATCAGCCCAACCTCGACTTTTCGGGTTGGACGGTGGCATCAGTACACAAGAGGAGAACACA\nGAGAGGCACACCACCGAGGATGTCTCTCCAAGTATGCATACTCTACTTGGAGTCAAGAACATGTGA\n"
  },
  {
    "path": "inst/extdata/Gram-negative_AKL.fasta",
    "content": ">Random_Gram-negative_AKL_gjtez\nRWTHLASGRTYNYKFNPPKQYGKDDITGEDLIQRED\n>Random_Gram-negative_AKL_dibhu\nRWTHLNSGRTYHYKFNPPKVHGVDDVTGEPLVQRED\n>Random_Gram-negative_AKL_elirp\nRWTHLASGRTYNYKFNPPKQYGKDDITGEDLIQRED\n>Random_Gram-negative_AKL_dnjtf\nRWTHLASGRTYNYKFNPPKQYGKDDITGEDLIQRED\n>Random_Gram-negative_AKL_qzcvn\nRLIHQPSGRSYHEEFNPPKEPMKDDVTGEPLIRRSD\n>Random_Gram-negative_AKL_mqvro\nRRVHPGSGRVYHVVYNPPKVEGKDDETGEELIVRAD\n>Random_Gram-negative_AKL_qjvxv\nRRVHPASGRIYHLVHNPPEVDGVDDATGEMLIQRDD\n>Random_Gram-negative_AKL_mlmcf\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_bfqnk\nRWVHEPSGRVYNTDFNAPKVPGKDDITGEPLTQRQD\n>Random_Gram-negative_AKL_kvcas\nRWVHEPSGRVYNTDFNVPKVPGKDDVTGEPLTQRQD\n>Random_Gram-negative_AKL_xrbtp\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_yggsb\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_wntes\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_gbdos\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_lhrmd\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_zhrxk\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_taozi\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_reram\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_rukmd\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_dfkbq\nRRVHQPSGRSYHIVYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_dvomf\nRRVHQPSGRSYHIIYNPPKTEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_vgzym\nRWIHPSSGRVYNLDFNPPQVQGIDDITGEPLVQQED\n>Random_Gram-negative_AKL_ptlzq\nRLFHPGSGRVYHKVTNPPKKPMTDDITGEPLIIRKD\n>Random_Gram-negative_AKL_lgmpt\nRLFHPGSGRTYHTKFNPPKVPMKDDQTGEDLIVRKD\n>Random_Gram-negative_AKL_stqhz\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_leceq\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_arqwq\nRWVHEPSGRVYNTDFNAPKVPGKDDVTGEPLTQRED\n>Random_Gram-negative_AKL_edhmf\nRRVHPGSGRSYHVKFNPPKVEGKDDVTGEPLVQR
DD\n>Random_Gram-negative_AKL_jefev\nRRVHPGSGRVYHVVFNPPKVEGKDDVTGEDLAIRPD\n>Random_Gram-negative_AKL_mgvft\nRRTHPASGRTYHVKFNPPKVDGKDDVTGEPLIQRDD\n>Random_Gram-negative_AKL_pdjwi\nRWIHPSSGRSYHTKFAPPKVPGVDDVTGEPLIQRKD\n>Random_Gram-negative_AKL_hbdlm\nRWIHPSSGRSYHTKFAPPKTPGLDDVTGEPLIQRKD\n>Random_Gram-negative_AKL_qinsk\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_xiszr\nRRVHAPSGRVYHVKFNPPKVEGKDDVTDEELTTRKD\n>Random_Gram-negative_AKL_tsjls\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_ivaqd\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_uceun\nRRVHESSGRIYHVKYDPPKVEDKDNETGEALIQRED\n>Random_Gram-negative_AKL_yfnqy\nRRVHPASGRVYHTEHNPPKVAGKDDVTGEELIQRED\n>Random_Gram-negative_AKL_fquul\nRRVHPASGRVYHTEHNPPKVAGKDDVTGEELIQRED\n>Random_Gram-negative_AKL_hrvsw\nRWVHVPSGRVYNLDYNPPKVPFKDDVTGEPLSKRED\n>Random_Gram-negative_AKL_wdkfx\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_lpxmt\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_bkmgo\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_rqtgn\nRRVHAPSGRVYHVKFNPPKVAGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_fzfio\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEXLTTRKD\n>Random_Gram-negative_AKL_ptxxd\nRRVHAPSGRVYHVKFNPPKVEGKDDVTGEELTTRKD\n>Random_Gram-negative_AKL_fmdzi\nRRVHVASGRTYHVKYNPPKNEGKDDETGEPLIQRDD\n>Random_Gram-negative_AKL_ehnfi\nRRAHLPSGRTYHSVYNPPKEEGKDDITGEELVVRDD\n>Random_Gram-negative_AKL_gwaom\nRRVHPGSGRVYHIKHNPPKEEGKDDETGEELVIRPD\n>Random_Gram-negative_AKL_ngobh\nRRAHLPSGRTYHNVYNPPKEEGKDDITGEELVVRDD\n>Random_Gram-negative_AKL_jgpqr\nRRVHPESGRIYHTVYNPPKVEGKDDETGEDLVQRPD\n>Random_Gram-negative_AKL_jvlnt\nRRVHPGSGRIYHVEHNPPKVEGVDDETGEALVHRDD\n>Random_Gram-negative_AKL_dnrym\nRRVHEASGRVYHVMHNPPKESGIDDITGEPLIQRDD\n>Random_Gram-negative_AKL_omfoc\nRRVHPGSGRVYHRIHNPPTLDDRDDLTGEPLVQRDD\n>Random_Gram-negative_AKL_gnjvq\nRRVHPGSGRVYHVVYNPSKVEGKDDVTGEDLIIRDD\n>Random_Gram-negative_AKL_xapht\nRRMHPASGRNYHIIFNPPKVEGKDDATGEDLIQRED\n>Random_
Gram-negative_AKL_vhtcj\nRWYHLKSGRIYHTLYNPPLTAGKDDDTGEPLEQ---\n>Random_Gram-negative_AKL_jnrhr\nRWVHKSSGRTYHEVFRPPRTPGKDDVTGEDLHQRPD\n>Random_Gram-negative_AKL_kmyvp\nRWIHKPSGRTYHEVFRPPKTPGKDDITGEDLYQRPD\n>Random_Gram-negative_AKL_bbbbb\nRWYHPKSGRIYHTFYNPPLNAGKDDYTGEPLVQ---\n>Random_Gram-negative_AKL_obouo\nRRIHPASGRTYHTKFNPPKVADKDDVTGEPLITRTD\n>Random_Gram-negative_AKL_ellkt\nRRTHPASGRTYHVKFNPPKVEGKDDVTGEPLVQRDD\n>Random_Gram-negative_AKL_sxldp\nRRVHQASGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_sckku\nRRVHQASGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_fbqzv\nRRVHQTSGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_ypuig\nRRVHQTSGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_ltbjc\nRRVHQASGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_wmjap\nRRVHQASGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_oxood\nRRVHQASGRSYHIVYNPPKVEGKDDVTGEDLIIRAD\n>Random_Gram-negative_AKL_ddrjb\nRWVHEPSGRVYNDTFNAPQVPGRDDVTGEPLVRRPD\n>Random_Gram-negative_AKL_nwqma\nRWIHPASGRSYHTKFAPPKVEGKDDFTGEPLIKRKD\n>Random_Gram-negative_AKL_dtzyf\nRRVHPASGRVYHTEHNPPKVAGKDDETGEELIQRED\n>Random_Gram-negative_AKL_whwzb\nRWIHPSSGRTYHTKFAPPKVSGVDDVTGEPLIQRKD\n>Random_Gram-negative_AKL_dmvij\nRWVHVPSGRVYNLDYNPPKVPFKDDITGEPLTKRSD\n>Random_Gram-negative_AKL_xtwaf\nRYVHLPSGRIYSLDYNPPKVPFKDDVTGEDLVKRED\n>Random_Gram-negative_AKL_iyejp\nRRTHPASGRTYHVKFNPPKVEGKDDVTGEPLVQRDD\n>Random_Gram-negative_AKL_cbxjs\nRLIHKPSGRIYHKIFNPPKTPFKDDITNEPLIQRED\n>Random_Gram-negative_AKL_oglie\nRRAHLPSGRTYHTVYNPPKEEGKDDVTGEELVVRDD\n>Random_Gram-negative_AKL_jqtgo\nRRTHPASGRTYHVKFNPPKVEGKDDVTGEPLVQRDD\n>Random_Gram-negative_AKL_prkvf\nRRQHPGSGRVYHLKYNPPKQEGLDDETGEPLIQRDD\n>Random_Gram-negative_AKL_aincb\nRRTHPASGRTYHVKFNPPKVEGKDDVTGEPLVQRDD\n>Random_Gram-negative_AKL_whyuk\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_tbpgo\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_lkebr\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVVRDD\n>Random_Gram-negativ
e_AKL_npiwv\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVVRDD\n>Random_Gram-negative_AKL_zzajl\nRRVHAASGRVYHVKFNPPKVEDKDDVTGEELTIRKD\n>Random_Gram-negative_AKL_pwhal\nRRAHLASGRTYHVVYNPPKVEGKDDVTGEDLVVRDD\n>Random_Gram-negative_AKL_hcuqd\nRRTHPASGRTYHVKFNPPKQEGIDDITGEPLVQRDD\n>Random_Gram-negative_AKL_pswng\nRWVHAPSGRVYNTQFNAPKEPGKDDVTGEPLVQRAD\n>Random_Gram-negative_AKL_eueyh\nRWVHAPSGRVYNTTFHAPKVAGLDDITGEKLTKRPD\n>Random_Gram-negative_AKL_iplvh\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_ykocu\nRRAHLPSGRTYHVVYNPPKVEGKDDVTGEDLVIRED\n>Random_Gram-negative_AKL_pzokl\nRYVHVPSGRVYNLQYNPPKVPGLDDITGEPLTKRLD\n>Random_Gram-negative_AKL_dpucn\nRLVHEPSGRVYHMTSKPPKVPMRDDITNEPLTQRKD\n>Random_Gram-negative_AKL_hcasp\nRRVHVASGRTYHVKYNPPKTEGVDDETGEPLIQRDD\n>Random_Gram-negative_AKL_ynuts\nRWVHAPSGRVYNTTFHAPKVPGLDDITGEKLTKRPD\n>Random_Gram-negative_AKL_kfbqi\nRRIHPASGRTYHTKFNPPKVADKDDVTGEPLITRTD\n>Random_Gram-negative_AKL_fbphm\nRRIHPASGRTYHTKFNPPKVADKDDVTGEPLITRTD\n>Random_Gram-negative_AKL_xrebl\nRRIHPASGRTYHTKFNPPKVADKDDVTGEPLITRTD\n>Random_Gram-negative_AKL_snboh\nRRVHQPSGRTYHVVYNPPKVEGKDDVTGEDLIIRQD\n"
  },
  {
    "path": "inst/extdata/Gram-positive_AKL.fasta",
    "content": ">Random_Gram-positive_AKL_pjxgp\nRRTCVGCGTAFNYVMEPPKKEGICDACGGKLVVRDD\n>Random_Gram-positive_AKL_essyp\nRRTCVGCGTAFNYVMEPPKKEGICDACGGKLVVRDD\n>Random_Gram-positive_AKL_lopeh\nRRIHEASGRVYHVVFNPPKKSGVDDETGDQLLQRED\n>Random_Gram-positive_AKL_mzuep\nRRICRSCGATYHIHFNPPAQAGICDKCGGELYQRAD\n>Random_Gram-positive_AKL_pjycw\nRLYCPNCGETYHVSWKPPRKPGVCDNCGSRLVRRRD\n>Random_Gram-positive_AKL_tmsgs\nRRICARCGAIYHVKYMPPKIPGICDKCGGPLVQRRD\n>Random_Gram-positive_AKL_byrtv\nRRICQSCGGIFNIYTLPTKEKGICDLCKGSLYQRKD\n>Random_Gram-positive_AKL_hynwj\nRRICKSCGGIFNIYTLPTKEKEICDLCKGILYQRKD\n>Random_Gram-positive_AKL_ycsho\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_kgtzr\nRWIHPSSGRVYNLDFNPPQVQED-------------\n>Random_Gram-positive_AKL_tmtym\nRRLDPVTGKIYHLKYSPPENEEIAS----RLTQRFD\n>Random_Gram-positive_AKL_diswt\nRYICPKCGRVYNLLFNPPKNDLRCDDDGTPLIRRSD\n>Random_Gram-positive_AKL_mgvxi\nRRLCPNCQRTYHILFAPPKKDSLCDYCSVQLVQRAD\n>Random_Gram-positive_AKL_cfbwo\nRRICKTCGASYHLVFNPPAEEGKCDKDGGELYTRAD\n>Random_Gram-positive_AKL_rmuqg\nRYTCGNCGAGYHDDFKKPKVEGTCDDCGEQMKRRAD\n>Random_Gram-positive_AKL_ejowt\nRSTCGSCGEVYNDITKPIPQDGKCTKCGGEFKRRAD\n>Random_Gram-positive_AKL_qzdve\nRFTCGGCGEGYHDSFKQPQAMGTCDKCGGEFKRRAD\n>Random_Gram-positive_AKL_yiwft\nRYSCGSCGAVYHDDTKPTKVEGVCDVCGSDLRRRAD\n>Random_Gram-positive_AKL_ubfcu\nRSTCAACGEGYHDSFKQPARAGTCDKCGGEFKRRPD\n>Random_Gram-positive_AKL_yqxwz\nRSTCGNCGEVYHDVTKPQPADGKCEKCGADFKRRAD\n>Random_Gram-positive_AKL_vvrnb\nRSTCGNCGEVYHDVTKPQPADGKCEKCGADFKRRAD\n>Random_Gram-positive_AKL_kcxvy\nRSTCANCGEVYHDETKPIPADGKCSVCGGEFKRRAD\n>Random_Gram-positive_AKL_qimxn\nRYTCGGCGEGYHDSFKTPAVAGVCDKCSGDMQRRPD\n>Random_Gram-positive_AKL_gooyo\nRYTCGGCGEGYHDSFKVPSVEGTCDKCGGEMKRRAD\n>Random_Gram-positive_AKL_zgspg\nRSTCGNCGEVYNDMTKPWPADGKCAKCGSDVRRRAD\n>Random_Gram-positive_AKL_fbyaj\nRFSCANCGALYHDTANPPAKEGVCDVCHSEFKRRPD\n>Random_Gram-positive_AKL_spvav\nRSTCGGCGEVYHDETKPWPEDGKCTNCGSEVKRRTD\n>Random_Gram-positive_AKL_elbro\nRYTCGGCGEGYHDSFKQPAVAGTCDKCGSNMTRR
AD\n>Random_Gram-positive_AKL_oxdgk\nRRLCSGCGLDYNLIHHRPQVIDQCDVCGAPLTQRAD\n>Random_Gram-positive_AKL_fxbao\nRFTCGDCGEGYHDTFKTPKVADTCDNCGANMTRRAD\n>Random_Gram-positive_AKL_siwfh\nRYTCAGCGEGYHDSFKQPAVEGKCDKCGGEMTRRAD\n>Random_Gram-positive_AKL_riiin\nRSTCGGCGEVYHDETKPWAADGKCTNCGSDVKRRAD\n>Random_Gram-positive_AKL_eposf\nRRMCGQCGRSWHVEFNPTRVEGICDTCAGSLHQRED\n>Random_Gram-positive_AKL_klpbd\nRRIHLSSGRSYHIEFNPPRVEGKDDLSGEDLIQRED\n>Random_Gram-positive_AKL_txwex\nRRTDPLTGTIYHLKYNPPPEDDT-------------\n>Random_Gram-positive_AKL_lyfma\nRRSCPDCGFVYNIKMDPPKVDGVCDKCGCPLITRKD\n>Random_Gram-positive_AKL_vlwew\nRRSCPDCGFVYNIKMDPPKLDGVCDKCGCPLITRKD\n>Random_Gram-positive_AKL_xzjec\nRRLDPVTGRIYNLKSDPPSPDVVDR-----------\n>Random_Gram-positive_AKL_zyvla\nRWVHKASGRSYHATFNPPKSLKAC------------\n>Random_Gram-positive_AKL_xxxrz\nRYTCAKCGAGYHDKFQQPKVAGTCDSCGGEFTRRAD\n>Random_Gram-positive_AKL_hlisn\nRFTCAACGEGYHDHFKQAAVAGTCDKCGGDFRRRPD\n>Random_Gram-positive_AKL_xeeqt\nRYSCGNCGAVYHDETKPTKVEGVCDVCGSDLRRRAD\n>Random_Gram-positive_AKL_yuxgl\nRRLDPVTGRIYHLKYSPPENEEIAA----RLTQRFD\n>Random_Gram-positive_AKL_gzzla\nRRICRSCGASYHVLFNKPAIEGRCNACGGELYQRSD\n>Random_Gram-positive_AKL_lrjxz\nRRICESCGTTYHLVFNPPKVEGICDIDGGKLYQRED\n>Random_Gram-positive_AKL_subqs\nRRFCPNCKAGFHIDFMPSSKGNICDKCGTELITRKD\n>Random_Gram-positive_AKL_rqgxp\nRRACVDCGATYHLVYAPTKEEGICDKCGGGLILRDD\n>Random_Gram-positive_AKL_ffzzd\nRYTCANCGAGYHDTFKQPKIEGVCDECGSEFKRRPD\n>Random_Gram-positive_AKL_ytofs\nRRTSKVTGKIYHIKFNPPVDEKEED-----LVQRAD\n>Random_Gram-positive_AKL_oygvi\nRQTCKTCGSTYNIYYFPSKHPNVCDDCGGKLYQRSD\n>Random_Gram-positive_AKL_cyvfv\nRFICRNCGATYHKLYNAPKVEGTCDVCGHEFYQRDD\n>Random_Gram-positive_AKL_mwtjn\nRRACVGCGATYHVVYNPTKEEGTCDTCGGELIVRDD\n>Random_Gram-positive_AKL_vaoec\nRRACLKCGATYHIVYAAPKVENVCDTCGENLVLRDD\n>Random_Gram-positive_AKL_gxjsl\nRRACVGCGATYHLVYAPTKTEGICDVCGKELILRDD\n>Random_Gram-positive_AKL_nyalo\nRYTCGGCGEGYHDSFKMPNVAGICDKCGGEMKRRAD\n>Random_Gram-positive_AKL_unitb\nRSTCGGCGEVYNDITKPWPADGKCAKCGSDVKRRAD\n>Random_
Gram-positive_AKL_cgcxi\nRRVHEGSGRIYHVKYDPPKTEGKDDETGDALIQRED\n>Random_Gram-positive_AKL_umndp\nRRVHAPSGRVYHTVYNPPKVAGKDNETGDELTIRVD\n>Random_Gram-positive_AKL_diikj\nRFTCGGCGEGYHDSFKPTDKPGICDACGGDMKRRAD\n>Random_Gram-positive_AKL_uwyra\nRSTCAGCGEVYNDITKPIPADGICPKCGGEFKRRAD\n>Random_Gram-positive_AKL_nlnlr\nRRQDPETGAIYHLKFNPPADEAVLA----RLVQRKD\n>Random_Gram-positive_AKL_easkj\nRRVCSHCGTPFHLESNPPKKPDVCDVCGGELIERDD\n>Random_Gram-positive_AKL_bzgzv\nRQNCRKCGEIYNKLFMPSKVEGVCDKCGGELFQRPD\n>Random_Gram-positive_AKL_rdkpc\nRRICESCGTTYHLVFNPPKVEGICDIDGGKLYQRED\n>Random_Gram-positive_AKL_etmff\nRRMCKECGATYHILFNPPTKADQCDKCGGQLYQRDD\n>Random_Gram-positive_AKL_ynwrz\nRRICESCGTTYHLVFNPPKVEGICDIDGGKLYQRED\n>Random_Gram-positive_AKL_fxmpj\nRRACVDCGATYHIVYAPTEKEDVCDKCGGSLILRDD\n>Random_Gram-positive_AKL_wkyro\nRFICRNCGTTYHKLYNAPKVEGTCDVCGHEFYQRDD\n>Random_Gram-positive_AKL_yemlt\nRQTCKTCGATYNIYYFPSKHPNICDDCGGKLYQRSD\n>Random_Gram-positive_AKL_ekvfi\nRRLDPVTGRIYHLKYSPPENEEIAA----RLTQRFD\n>Random_Gram-positive_AKL_quwnk\nRRACVGCGATYHIVYNPTKVEGKCDVCSSDLILRDD\n>Random_Gram-positive_AKL_qrxch\nRRVCEKCGATYHLLYKKPKAEGVCDICGGTLIQRKD\n>Random_Gram-positive_AKL_qrppk\nRRICESCGTTYHLVFNPPKVEGICDIDGGKLYQRED\n>Random_Gram-positive_AKL_fnlju\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_kvakt\nRRICKECGATYHLEFNPPAKADVCDKCGGKLYQRSD\n>Random_Gram-positive_AKL_rcuhp\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_tbwwv\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_acvdx\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_wojzq\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_jipjw\nRRICKECGATYHLEFNPPAKADVCDKCSGELYQRSD\n>Random_Gram-positive_AKL_hrayu\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_kgavl\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_pusim\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_vwdds\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positiv
e_AKL_tkoth\nRRICKECGATYHLEFNPPATADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_hcbyk\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_ahjnk\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_dryqn\nRRICKECGATYHLEFNAPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_bkgrl\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_piqkt\nRRICKECGATYHLEFNAPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_xmjgd\nRRICKECGATYHLEFNAPAKADVCDKCGGKLYQRSD\n>Random_Gram-positive_AKL_fmrku\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_kiqtd\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_awqsq\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_cjuqw\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_etzsu\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_ndrpd\nRRICKECGATYHLEFNPPANADVCDKCGGDLYQRSD\n>Random_Gram-positive_AKL_taebr\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_qvzlz\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n>Random_Gram-positive_AKL_yhzrz\nRRICKECGATYHLEFNPPAKADVCDKCGGELYQRSD\n"
  },
  {
    "path": "inst/extdata/LeaderRepeat_All.fa",
    "content": ">Ain_RyC-MR95\nATCCGTTGATCAAATTTGAGGTTTGAGAGATATGTAAATTCAAAGGATAATCAAAC\n>Asp_D21\nATCCGTTGATCAAATTTGAGGTTTGAGAGATATGTAAATTCAAAGGATAATCAAAC\n>Bbi_S17\nAGGAATCCTTAAGGCTATCGGTTTCAGATGCCTGTCAGATCAATGACTTTGACCAC\n>Bca_FSLF6-1037\nACAAAATCGACGCATTTGAGGTTTTAGAGCTGTGTTAAATTGAATGGTATTAAAAC\n>Bfi_16/4\nGTTGGAAATTTGGATTTGACGTTTTAGTACCCGGGAAAATTAAGTGATTGGAAAAC\n>Bpe_CAG:437\nTTTGGATAACATGATTTGGTATTTTAGTACCTGAACAAATTACGTGACTGTAAAAC\n>Bsp_AC2005\nTTTATCATACTATATTTGGTGTTTTAGTACCTAGAGAAATTAAGTGATTAGAAAAC\n>Bth_DSM20171\nACAAAATTTCATTGTTTGAGGTTTTAGAGCTGTGTTAAATTGAATGGTATTAAAAC\n>Cgl_PW2\nGAGAAGATTTTGATCCAATGGTTTTGGAGCAGTGTCGTTCTGACTGGTAATCCAAC\n>Cla_DSM_14151\nATGGCTCTCTAAAATTTGAGGTTTTAGACCAGTGTAATTTTAGAGAGTAGTAAAAC\n>Cma_M35/04/3\nTTTAAATATTACAATTTAAGGTTCTTGTACTTTCTAGATTTTCATATTAGTAAAAC\n>Cmi_DSM15897\nATTGGATTTTTGAATTTGAGGTTTTAGGGTTATGTTATTTTGAACTGAATTAAAAC\n>Csp_CAG:230\nCGATTATATTTGAATTTGATATTTTAGTACCTGAAAGAATTGAGTTATCGTAAAAC\n>Csp_ZWU0011\nTACGTTATAATGAAATTGACATTTTGGTACTCTCGCATCTTTTGGTATAAGGAAAC\n>Dlo_AGR2136\nCGGCGAGAACCGGATTTGAGGTTTGAGAGTCTTGTTAATACGGAAGGATTTTAAAC\n>Edo_DSM3991\nTATGTTAAAATATGTTTGAGGTTTTGTTACCATATGGATTTTTGCTAGATTAAGAC\n>Efa_1141733\nGGAAAAATTTTTTCTGCGAGGTTTTAGAGCTATGCTGATTTGAATGCTTCCAAAAC\n>Efa_D32_1\nGAAAAAAATAATTCTCCGAGGTTTTAGAGTCATGTTGTTTAGAATGGTACCAAAAC\n>Efa_OG1RF_1\nGAAAAAAATAATTCTCCGAGGTTTTAGAGTCATGTTGTTTAGAATGGTACCAAAAC\n>Efa_TX0012\nTTTCAAATTTTAAATTTGAGGTTTTTGTACTCTCAATAATTTCTTATCAGTAAAAC\n>Efa_TX0012_2\nTTTCAAATTTTAAATTTGAGGTTTTTGTACTCTCAATAATTTCTTATCAGTAAAAC\n>Eit_DSM15952\nAGATAAAAAATATCTGCGAGGTTTTAGAGCTATGTTGAATCGAATGCTTCCAAAAC\n>Emu_QU25_DNA\nGAAAAAATTTTTTCTACGAGGTTTTAGAGCTATGTTGAATTGAATGCTTCCAAAAC\n>Eph_ATCCBAA-412\nAGAAAGAAAATGGCTGCGAGGTTTTAGAGCCATGTTGAATTGAATGCTTCCAAAAC\n"
  },
  {
    "path": "inst/extdata/Rfam/RF00458.fasta",
    "content": ">AF178440.1/5925-6123\nUUGACUAUGUGAUCUUGCUUUCG----UAAUAAAAUUCUGUACAUAAAAGUCGAAAGUAUUGCUAUAGUUAAGGUUGCGCUUGCCUAUUUAGGCAUACUUCUCAGGAUGGCGCG-UUGCAGUCCAA-CAAG-AUCCAGGGACUGUACAGAAUUUUCC-UAUACCUCGAGUCGGGUUU-GGAA--UCUAAGGUUGACUCGCUGUAAAUAAU\n>AB017037.1/6286-6484\nGAAAAUGUGUGAUCUGAUUAGAAG--UAAGAAAAUUCCUAG-UUAUAAUAUUUUUAAUACUGCUACAUUUUU-AAGACCCUUAGUUAUUUAGCUUUACCGCCCAGGAUGGGGUG-CAGCGUUCCUG-CAA-UAUCCAGGGCAC--CUAGGUGCAGCCUUGUAGUUUUAGUGGACUUUAGGCU--AAAGAAUUUCACUAGCAAAUAAUAAU\n>AF014388.1/6078-6278\nGUUAAGAUGUGAUCUUGCUUCCUU--AUACAAUUUUGAGAGGUUAAUAAGAAGGAAGUAGUGCUAUCUUAAU-AAUUAGGUUAACUAUUUAGUUUUACUGUUCAGGAUGCCUAU-UGGCAGCCCCA-UAA-UAUCCAGGACAC-CCUCUCUGCUUCUUAUAUGAUUAGGUUGUCAUUUAGAA--UAAGAAAAUAACCUGCUAACUUUCAA\n>AF218039.1/6028-6228\nGCAAAAAUGUGAUCUUGCUUGUAA--AUACAAUUUUGAGAGGUUAAUAAAUUACAAGUAGUGCUAUUUUUGU-AUUUAGGUUAGCUAUUUAGCUUUACGUUCCAGGAUGCCUAG-UGGCAGCCCCA-CAA-UAUCCAGGAAGC-CCUCUCUGCGGUUUUUCAGAUUAGGUAGUCGAAAAACC--UAAGAAAUUUACCUGCUACAUUUCAA\n>AF183905.1/5647-5848\nCCAACAAUGUGAUCUUGCUUGCGGA-GGCAAAAUUUGCACAGUAUAAAAUCUGCAAGUAGUGCUAUUGUUGG-AAUCACCGUACCUAUUUAGGUUUACGCUCCAAGAUCGGUGGAUAGCAGCCCUAUCAA-UAUCUAGGAGAA-CUGUGCU-AUGUUUAGAAGAUUAGGUAGUCUCUAAACA---GAACAAUUUACCUGCUGAACAAAUU\n>AB006531.1/6003-6204\nCUGACUAUGUGAUCUUAUUAAAAUUAGGUUAAAUUUCGAGGUUAAAAAUAGUUUUAAUAUUGCUAUAGUCUU-AGAGGUCUUGUAUAUUUAUACUUACCACACAAGAUGGACCG-GAGCAGCCCUC-CAA-UAUCUAGUGUAC--CCUCGUGCUCGCUCAAACAUUAAGUGGUGUUGUGCGA--AAAGAAUCUCACUUCAAGAAAAAGAA\n>AF022937.1/6935-7121\nAGUGUUGUGUGAUCUUGCGCGAU-------AAAUGCUGACG---UGAAAACGUUGCGUAUUGCUACAACACU-----UGGUUAGCUAUUUAGCUUUACUAAUCAAGACGCCGUC-GUGCAGCCCAC-AAAA-GUCUAGAUA----CGUCACAGGAGAGCAUACGCUAGGUCGCGUUGACUAUCCUUAUAUAU-GACCUGCAAAUAUAAAC"
  },
  {
    "path": "inst/extdata/Rfam/RF03120.fasta",
    "content": ">KU973692.1/1-298\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAACUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGAAUCAACGAGAAAA\n>DQ071615.1/1-298\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUUGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAACUCUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAGAGGUAAGAUGGAGAGCCUUGUUCUUGGAAUCAACGAGAAAA\n>KF367457.1/1-298\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCCCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAAUUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>MK211377.1/1-296\n--UUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAAUUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>MK062184.1/1-299\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUCUAAUGGUCGCUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAAUUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>AY559082.1/1-297\nAAAUUUUGUUU-CAUCUAUACAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAAUUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>DQ412043.
1/1-294\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUUUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCGCUUGGCUGUAUGCCUAGUGCACCUACACAGUAUAAA---UAAU-AACUUUACUGUCGUUGACAAGAAACGGGUAACUCGUCCUUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGUAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>KP886809.1/1-297\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUCGAUCUCUUGUAGAUCUGUUCUUUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCGCUCGGCUGUAUGCCUAGUGCACCUACACAGUAUAAAUAUUAAU-AACUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCUUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUCCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>DQ022305.2/1-295\n--GUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CCUUGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCGCUCGGCUGCAUGCCUAGCGCACCUACGCAGUAUAAAUAUUAAU-AACUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAAUGAGAAAA\n>MK211374.1/1-294\n---UUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-CUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCGCUUGGCUGUAUGCCUAGUGCACCUACGCAGUACAAAUAUUAAU-AACUCUAUUGUCGUUGACAAGAAACGAGUAACUCGUCCUUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUCGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGUGUCAACGAGAAAA\n>DQ648857.1/1-297\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAA-UCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACUUACGCAGUAUAAAUAUUAAU-AACUUUACUGUCGCUGACUGGAUACGAGUAACUCGUCCUUCUUCUGCAGACUGCUUACGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGCUCUUGGUGUCAGCGAGAAAA\n>KF569996.1/1-305\nAUAUUAGGUUUUUACCUACCCAGGAA--AAGCCAACCAACCCUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAAGCAUUCUCUGUGUAGCUGUCGCUCGGCUGCAUGCCUAGUGCACCUACGCAGUAUAAACAAUAAUAAACUUUACUGUCGUUGACAAGAAACGAGUAACUCGUCCCCCUUCUGCAGACUGCUUGCGGUUUCGUCCGUGUUGCAGUCGAUCAUCAGCAUACCUAGGUUCCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUUCUUGGAAUCAACGAGAAAA\n>MT163718.1/1-299\nUUGGUUGGUUUAUACCU
UCSCAGGUAACAAACCAACCAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCUGCAUGCUUAGUGCACUCACGCAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MT344963.1/1-299\nAUUAAAGGUUUAUACCUUCCCAGGUAACAAACCAACCAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCUGCAUGCUUAGUGCACUCACGCAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCKUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUUGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MT019530.1/1-299\nAUUAAAGGUUUAUACCUUCCCAGGUAACAAACCAACCAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCAGCAUGCCGAGUGCAGCCACACAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MT263421.1/1-296\nAU---AGGUUUAUACCUUCCCAGGUAACAAACCHUUHAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCUGCAUGCUUAGUGCACUCACGCAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MT345869.1/1-293\n------GGUUUAUACCUUCCCAGGUAACAAUCCAWUCAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCUGCAUGCUUAGUGCACUCACGCAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUUGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MT345841.1/1-293\n------CUYYUAUACCUUCCCAGGUAACAAACCHWYCAACUUUCGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGGCUGUCACUCGGCUGCAUGCUUAGUGCACUCACGCAGUAUAAUUAAUAACUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGCUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCACAUCUAGGUUUCGUCCGGGUGUGACCGAAAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA\n>MG772934.1/1-298\nAUAUUAGGUUUUUACCUUCCCAGGUAACAAACCAACUAACUCU
CGAUCUCUUGUAGAUCUGUUCUCUAAACGA-ACUUUAAAA------UCUGUGUGACUGUCACUUAGCUGCAUGCUUAGUGCACUCACGCAGUUUAAUUA-UAAUUAA--UUACUGUCGUUGACAGGACACGAGUAACUCGUCUAUCUUCUGCAGGUUGCUUACGGUUUCGUCCGUGUUGCAGCCGAUCAUCAGCAUACCUUGGUUUCGUCCGGGUGUGACCGAGAGGUAAGAUGGAGAGCCUUGUCCCUGGUUUCAACGAGAAAA"
  },
  {
    "path": "inst/extdata/Rfam/RF03120_SS.txt",
    "content": ">RF03120\n......<<<<<<<.<<<....>>>>>..>>>>>...........<<<<<.....>>>>>.<<<<.......>>.>>..............<<<<<<<<.<<.<<<<.<<<.....>>>.>>>>>>.>>>>>>>>........................((((((((((((.(((((...(((.(((.((((<<<..<<<<<<.<<<<<......>>>>>..>>>>>>......>>><<<<<<<.<<......>>>>>>>>><<<....>>>)))).)))))).))))))))))...))))))).....\n"
  },
  {
    "path": "inst/extdata/sample.fasta",
    "content": ">PH4H_Rattus_norvegicus\nMAAVVLENGVLSRKLSDFGQETSYIEDNSNQNGAISLIFSLKEEVGALAKVLRLFEENDINLTHIESRPSRLNKDEYEFF\nTYLDKRTKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ\nFADIAYNYRHGQPIPRVEYTEEEKQTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF\nRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIG-LASLGAPDEYIE\nKLATIYWFTVEFGLCKEG-DSIKAYGAGLLSSFGELQYCLSD-KPKLLPLELEKTACQEYSVTEFQPLYYVAESFSDAKE\nKVRTFAATIPRPFSVRYDPYTQRVEVLDNTQQLKILADSINSEVGILCNALQKIKS\n>PH4H_Mus_musculus\nMAAVVLENGVLSRKLSDFGQETSYIEDNSNQNGAVSLIFSLKEEVGALAKVLRLFEENEINLTHIESRPSRLNKDEYEFF\nTYLDKRSKPVLGSIIKSLRNDIGATVHELSRDKEKNTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ\nFADIAYNYRHGQPIPRVEYTEEERKTWGTVFRTLKALYKTHACYEHNHIFPLLEKYCGFREDNIPQLEDVSQFLQTCTGF\nRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIG-LASLGAPDEYIE\nKLATIYWFTVEFGLCKEG-DSIKAYGAGLLSSFGELQYCLSD-KPKLLPLELEKTACQEYTVTEFQPLYYVAESFNDAKE\nKVRTFAATIPRPFSVRYDPYTQRVEVLDNTQQLKILADSINSEVGILCHALQKIKS\n>PH4H_Homo_sapiens\nMSTAVLENPGLGRKLSDFGQETSYIEDNCNQNGAISLIFSLKEEVGALAKVLRLFEENDVNLTHIESRPSRLKKDEYEFF\nTHLDKRSLPALTNIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDRFANQILSYGAELDADHPGFKDPVYRARRKQ\nFADIAYNYRHGQPIPRVEYMEEEKKTWGTVFKTLKSLYKTHACYEYNHIFPLLEKYCGFHEDNIPQLEDVSQFLQTCTGF\nRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIG-LASLGAPDEYIE\nKLATIYWFTVEFGLCKQG-DSIKAYGAGLLSSFGELQYCLSE-KPKLLPLELEKTAIQNYTVTEFQPLYYVAESFNDAKE\nKVRNFAATIPRPFSVRYDPYTQRIEVLDNTQQLKILADSINSEIGILCSALQKIK-\n>PH4H_Bos_taurus\nMSALVLESRALGRKLSDFGQETSYIEGNSDQN-AVSLIFSLKEEVGALARVLRLFEENDINLTHIESRPSRLRKDEYEFF\nTNLDQRSVPALANIIKILRHDIGATVHELSRDKKKDTVPWFPRTIQELDNFANQVLSYGAELDADHPGFKDPVYRARRKQ\nFADIAYNYRHGQPIPRVEYTEEEKKTWGTVFRTLKSLYKTHACYEHNHIFPLLEKYCGFREDNIPQLEEVSQFLQSCTGF\nRLRPVAGLLSSRDFLGGLAFRVFHCTQYIRHGSKPMYTPEPDICHELLGHVPLFSDRSFAQFSQEIG-LASLGAPDEYIE\nKLATIYWFTVEFGLCKQG-DSIKAYGAGLLSSFGELQYCLSD-KPKLLPLELEKTAVQEYTITEFQPLYYVAESFNDAKE\nKVRNFAATIPRPFSVHYDPYTQRIEVLDNTQQLKILADSISSEVEILCSALQKLK-\n>PH4H_Chromobacterium_violace
um\n--------------------------------------------------------------------------------\n----------------------------------------------------MNDRADFVVPD-----ITTRKNVGLSHD\nAN------DFTLPQPLDRYSAEDHATWATLYQRQCKLLPGRACDEFMEGL----ERLEVDADRVPDFNKLNQKLMAATGW\nKIVAVPGLIPDDVFFEHLANRRFPVTWWLREPHQLDYLQEPDVFHDLFGHVPLLINPVFADYLEAYGKGGVKAKALGALP\nMLARLYWYTVEFGLINTP-AGMRIYGAGILSSKSESIYCLDSASPNRVGFDLMRIMNTRYRIDTFQKTYFVIDSFKQLFD\nATA-PDFAPLYLQLADAQPWGAGDVAPDDLVLNAGDRQGWADTEDV----------\n>PH4H_Ralstonia_solanacearum\n--------------------------------------------------------------------------------\n-----------------------------------------------MAIATPTSAAPTPAPAGFTGTLTDKLREQFAEG\nLDGQTLRPDFTMEQPVHRYTAADHATWRTLYDRQEALLPGRACDEFLQGL----STLGMSREGVPSFDRLNETLMRATGW\nQIVAVPGLVPDEVFFEHLANRRFPASWWMRRPDQLDYLQEPDGFHDIFGHVPLLINPVFADYMQAYGQGGLKAARLGALD\nMLARLYWYTVEFGLIRTP-AGLRIYGAGIVSSKSESVYALDSASPNRIGFDVHRIMRTRYRIDTFQKTYFVIDSFEQLFD\nATR-PDFTPLYEALGTLPTFGAGDVVDGDAVLNAGTREGWADTADI----------\n>PH4H_Caulobacter_crescentus\n--------------------------------------------------------------------------------\n----------------------------------------------------MSG---------------DGLSNGPPPG\nAR-----PDWTIDQGWETYTQAEHDVWITLYERQTDMLHGRACDEFMRGL----DALDLHRSGIPDFARINEELKRLTGW\nTVVAVPGLVPDDVFFDHLANRRFPAGQFIRKPHELDYLQEPDIFHDVFGHVPMLTDPVFADYMQAYGEGGRRALGLGRLA\nNLARLYWYTVEFGLMNTP-AGLRIYGAGIVSSRTESIFALDDPSPNRIGFDLERVMRTLYRIDDFQQVYFVIDSIQTLQE\nVTL-RDFGAIYERLASVSDIGVAEIVPGDAVLTRGT-QAYATAGGRLAGAAAG---\n>PH4H_Pseudomonas_aeruginosa\n--------------------------------------------------------------------------------\n----------------------------------------------------------------------MKTTQYVARQ\nPD----------DNGFIHYPETEHQVWNTLITRQLKVIEGRACQEYLDGI----EQLGLPHERIPQLDEINRVLQATTGW\nRVARVPALIPFQTFFELLASQQFPVATFIRTPEELDYLQEPDIFHEIFGHCPLLTNPWFAEFTHTYGKLGLKASKE-ERV\nFLARLYWMTIEFGLVETD-QGKRIYGGGILSSPKETVYSLSD-EPLHQAFNPLEAMRTPYRIDILQPLYFVLPDLKRLFQ\nLAQ-EDIMALVHEAMRLG-LHAPLFPPKQAA-------------------------\n>PH4H_Rhizobium_loti\n------------
--------------------------------------------------------------------\n----------------------------------------------------MSVAEYAR----------DCAAQGLRGD\nYS--VCRADFTVAQDYD-YSDEEQAVWRTLCDRQTKLTRKLAHHSYLDGV----EKLGL-LDRIPDFEDVSTKLRKLTGW\nEIIAVPGLIPAAPFFDHLANRRFPVTNWLRTRQELDYIVEPDMFHDFFGHVPVLSQPVFADFMQMYGKKAGDIIALGGDE\nMITRLYWYTAEYGLVQEAGQPLKAFGAGLMSSFTELQFAVEGKDAHHVPFDLETVMRTGYEIDKFQRAYFVLPSFDALRD\nAFQTADFEAIVARRKDQKALDPATV-------------------------------\n"
  },
  {
    "path": "inst/extdata/seedSample.fa",
    "content": ">hsa-let-7a-5p MIMAT0000062 Homo sapiens let-7a-5p\nUGAGGUAGUAGGUUGUAUAGUU\n>hsa-let-7b-5p MIMAT0000063 Homo sapiens let-7b-5p\nUGAGGUAGUAGGUUGUGUGGUU\n>hsa-let-7c-5p MIMAT0000064 Homo sapiens let-7c-5p\nUGAGGUAGUAGGUUGUAUGGUU\n>hsa-let-7d-5p MIMAT0000065 Homo sapiens let-7d-5p\nAGAGGUAGUAGGUUGCAUAGUU\n>hsa-let-7e-5p MIMAT0000066 Homo sapiens let-7e-5p\nUGAGGUAGGAGGUUGUAUAGUU\n>hsa-let-7f-5p MIMAT0000067 Homo sapiens let-7f-5p\nUGAGGUAGUAGAUUGUAUAGUU\n"
  },
  {
    "path": "inst/extdata/sequence-link-tree.fasta",
    "content": ">Phy000B0HV_NEUCR\nM-----GIGSATLG-----------------------------------SRIPTPVLVARAVVSSSDGK-----DC--VA\nNPNLCEKP-VGGSQLTVPIVLGLW----------------RNMKKLAAEEAHDPHKSLDFGLDENM-----------GKA\nKGRNMAG------EKDGNGSRFHAHQMSMDMNLSSPYLLPPDAH-GSQSSLNSLARTL-NPQDDPFRPVTQYTASDAASV\nKSMP-R-----GTD-----------R-------GPGG------PFRGPPPRQGSMP-RSPEPTHA---RPGNG----PRP\nPRI-SVQD------------P---SSNA-TS-D--NE-TS----------------------D----------------S\nERTLT-GSPRELHAATHK-------------------DGVKPPA-SPSQPISPANP-AV---------------------\n>Phy000FCLK_ASPCL\n--------------------------------------------------------------------------------\n-------------------------------------------------------------------------------M\n-------------REAEKGNPMHAKGMSLDIV-PSPYLLPPGLH-GSRESLHSLSRSV-IGDDDKYRHATSFL-GDNASV\nRSQP-R---G-YHDDAMTYSR-SQ-S------K-VS------M--R-DDMNQGLLQ------------NAQRMSR--SSP\nPL-YNTPPDGGSVHSPVGQD------------------------------------------R-----------------\nGQDSG-LQLNLPRSLSPVHI-----------------PGFNGSR-GPSPV-----P-TS-PE------GNDDKLPS----\n>Phy000FJDH_ASPFL\nMH---YHHRHQT--HQDIHMV-VRSPP-RRPDI-VPRHRLP------YLV-PEPPTFVKRDSDPS--------QTCSAGD\nTSSKCEKPTSTTTTTTLPVVLGAVVPILCAV-IVLIYLHRRNVRKLRSEDANDKHRSLDFGLDLEP-TG----GGNA--M\nR-------Q----TEKSNGSYNHNKGISLDIG-PSPYLLPPGLH-GSRDSLHSLSRSI--GGDDKYRHATSFL-GDNASV\nRSQS-R---G-AQDDAPSFTG-SA-R------K-AA------L---GDDMKQGLLG------------NAQRMSR--SSP\nPL-YISPGEDGA-HVQVDPI------------------------------------------A-----------------\nQPDHG-FQFELPRSPSPVLI-----------------PGAPSTK-ESITP-----TNNV-DK------------------\n>Phy000FQ5O_ASPFU\nMH---HHHQHLHFPRHGIHLA-VRSPP-RRPDI-VPRDRVP------LLVGTEDPTLVKRVPSTSTTSTA--STRCPEGD\nTSSACEKYTNSSSTTTLPIVLGAVIPIVCAI-IVLFYLHRRNVKKLRQEDANDKHKSLDFGLDLEP-RA----GSKP--M\n-------------TQAEKGSNMHSKGMSLDIG-QSPYLLPPGLH-GSRESLHSLSRSI-IGDDDKYRHASSFL-GDNASV\nRSQP-R---G-FHDETSAFSR-SQ-S------K-AS------L--RGDDMNQGLLQ------------NAQRMSR--SSP\nPL-YNAPSDGGSSHSPRGQG------------------------------------------N-----------------\nGQDMG-LQLNLPRSLSPVHI--
---------------PGVNGSR-GTSPA-----P-GGHAD------GSEDISSS----\n>Phy000G05U_EMENI\nMH---RHQQHQH--RHGKYLG-ARFAP-VEPAL-MPRNRPP------YLLMPEAPTLVKREPMPTTDSGR--VETCSPGD\nNSARCEKNTSTASNTTLPVVLGAVIPIVCAI-IVLIFLHRRNVKKLRNEDANDKHKSLDFGMDLAP-SG----GRSG--M\nQ-------E---------KGSHHMKGISLDIG-PSPYMLPPSIR-GSKDSLNSLPRTI-LADDDKYRHAHTYFSTDAQSI\nRSQR-R-----VHDDAASVAG-ST-R------R-GA------F---GDEMNQGLLG------------NAQRISR--SSP\nPL-YNPPEPTAGRAQ----P------------------------------------------Q-----------------\nVQDAG-FELSLPRSPSPVHV-----------------SGLTSIN-ESTTE-----TGRE-AN------------------\n>Phy000GDP6_ASPNG\nM--------------------------------------------------RETPTLARREPLPSTDSSS------ASSS\nTASSGTKPTSTLTTTTLPVVLGAVVPIVIAI-GILLYLHRRNVKKLRNEDANDKHKSLDFGLDLAP-TN----GAVP--M\nQ-------Q----AEKTDRNAAHNKGISLDIG-PSPYLLPPGLH-NSRESLASMSRSIGDGDDDKYRHVGSFL-GDNSSL\nRSHS-R---G-PHDDAASFTG-ST-R------R-AA------L---GDDMNQGLLR------------NAQRMSR--SSP\nPL-YKTSSGDRNVQSPASSD------------------------------------------H-----------------\nEHDHG-FQLDLPRSPSPVHV-----------------PGMAISE-PHT-------TSNE-VG------FAGDHAVTETSA\n>Phy000HD5X_BOTFU\nMADHQRLANIVRLARRV----P--LAE-AAAED-IGNIASIL------KMSLPDPVLMVRSATTSAAASS---STCAADD\nTSAACEKP-VGPSAYTLPVILGIVIPVGGAI-ILFTILQRRYMKKAREEDLNDPTKNMDFGMGRIS-R--T------AGG\nESG-I--S---N-FDDEKGGAVRTRQMSLDLGGKSPYLLPPELH-NSRESLHSLSRTI-HSNEDPYRPVHEAV---GGSI\nRSKQ-------GRNGSSIMTESSA-A--------PSK-----MYDAGSPDGQGLLS------------NAAAMSR--TTP\nPSTGSSPP------------P---KSNS--I-----P-P-----------------------------------------\n-ANMP-AEPKQAESPQNVARKGL--------------PGNFRPQ-DRFPTAMPVPM-PYP-------------DRESYAG\n>Phy000IAZP_COCIM\nMA------------RHTYRDP--SRLV-SRALA-IPVERSI------ILTALEPPSLVKRNPADAASSSSVPTKTCGPDD\nTTGVCTRPVNSTTTLTLPIVLGAVIPLTCAF-IAFFFLHRRHVKKLRLEDANDKHKSLDFGLDFVP-SG----SNNNRRG\nNGGNG-P------SMAEKSTRRGGHGVSMDLTLNSPYLLPPGLH-GSHESIHSLSRSL-HGEDDKYRHASAFPTGDSGSI\nRSCS-PSFKRGGDD-ASSHNSPSS-K------Y-PY----------GDDMNQHLLK------------NAQRMSR--SPP\nAI-ELDPIESDLGHPPHHA--------------------T--------------------
--A-----VSASE------S\nGNTTF-HGRSELTVPTAVSS-----------------HGDRSSS-SSSER-----DDSV-LR---------KS-------\n>Phy000KG2Q_MAGGR\nMVGVTVHEGEYHLGSRM----P--VMA-RDAST-PAL-QIAA------DGPGFFKRLVARQSSDD----------CVNGE\nPSNLCEKP-VTSQTLALPIALGVTIPLVALV-VMLIWLHRKNVRRQRQEDANDPHKSLDFGLDMGP------------GK\nRKS-K--L---F-GGEKLGGGPHNRQISMDMNLSSPYLLPPNMQ-NSRESIHSLAKTL--HNEDPYRHITQYNASDAGSL\nRSYK-A---G-GMD-------------------RPIG-----PKITVPTSRKGSLQATSPTSTIGSVPPRYEASQ---DD\nYV-KPPPP------------A---ALK---S-P--TQ-DS----------------------TPYPDDKSGP-------L\nATVMP-SVP-EIQEPKPASLSK----E-SS---QAPS---LAAV-PPSSPLTISAP-EI---------------------\n>Phy000ODBJ_SCLSC\nMEDHQRLANIVRLARRV----P--LAE-AAAED-IDNIASIL------KMAVPDRVIMGRSSTTTSSTSS---STCAADD\nTSAACEKP-IGPSAYTLPVILGIVIPVAGAI-ILFTILQRRYTKRAREEDANDPTKNMDFGMGRIS-R--T------AGG\nESS-I--S---N-FDDEKGDSGRPRQMSLDLGGKSPYLLPPELQ-DSRESLHSLSKTI-HQNEDPYRPVHEAV--GAASI\nRSKQ-------GRNGSSILSASTV-A--------PSG-----MNDTGSPDGQGLLS------------NAAAMSR--TTP\nPTAGFNPP------------P---RSNS--I-----P-P-----------------------------------------\n-AKMP-EEPRQSP-EQNVDKKGP--------------PGNFRPQ-NGFSSTRSIPM-PFL-------------DWESYAG\n>Phy000PFY6_UNCRE\nMA------------RHAFQPA--SGLV-PRALA-IPLDRSI------LLTSLDHPSHVKRSPAATASSSAAATTSCGPND\nTTGICTRPVSSTTTMTLPIVLGAAIPITCAI-IAFFFLHRRHVKKLRLEDANDKHKSLDFGLDFVP-SG----SNNNKRG\nNGGNG-G------LMGEKSTRQRAHGVSLDLTMGNPYLLPPVSM-GSHESIHSLSKSL-HGGDDKYRHAAAFPSSENRF-\n--------------------------------------------------RHSVLQ------------PTNPLA---SEP\nRS-PLSPPGRNELTKLKQQ--------------------L----------------------------------------\n------------------------------------------------DK-----EQSV-LR---------KS-------\n>Phy00201Y5_COCHE\nM-------------------------------------------------------------------------------\n--------------------CATTVPVVGIA-VVLAFLHRRNKQKLREEDQRDKYKSNDWGMEGVIPK---------TSK\nKGG-P-EM---S-ISEKEISGGHDRGLSIETG--SPYILPPGLH-GSRESFHSLSRST-HDPHDPYGPVAFLR--DDQSL\nRSH----GPY-KGETNSVYT--A-SS-------SGT---------KKEGLQAGLLQ------------NAQRMST--SAP\nVR-GESLS--------
----P---DSTR--SPD--SK-FAEAGIPLSPLNPRYEPEAPA-----AAPAPAPA-------P\nAHAAP-VASKPTDVP-TI----------------------SIPE-PQVTEKQV---------------------------\n>Phy00208KX_MYCGR\nMY---IPRA------------EDS-----------R-VQRMV------DGAAAGLRIVARSL------A-------ERAE\nSNSKEDTPNDRMKVQNIGIALGVIIPIGGAI-IVLTYLHRRHVKRQRVEDMNDPHKSLDFGLEGLG-SMPPQAPKKSRRG\nKKGPE-MIV-TDFGGPTAHPSKRGHGMSLDLGVPSPYLLPAGLQ-GSKESIHSMSRN--YDEHDPYRSVAMM--RPSGET\nDRF---R----GDDKGSVYSMSTG-N------R-SA------L--PQD--RASLIA------------NARPMS---ITP\nSK-RSDPATSHPSTPADVSP------------------------------------------R-----------------\nDSHSPISRTRSPLAKLSVDE-----------------TAIAEKQLEPLPS-----P-PTVPE------VALMMPPP-RKS\n>Phy0020GNV_PYRTR\nMP---HSHHLHHMRHQL----R--HDN-QLGSP-ITGSKTMH------VFERATRVLVARAESS-----------C-TND\nSDPGCTKP---TQVPTMAIALAVIVPIVGVS-IVLCFLHRRNKRKLAEEDSKDQYKSNDWGMEGVA-K---------TNK\nKKR-P-EM---S-LSEKDAGGGHDRGLSIEAG--SPYILPVGLH-GSRESFHSLSRSQ-HDPHDPYGPVAFLK--DDQST\nRGSSVRGGPY-RNETGSVYT--T-SS-------SGT---------RKEGLQAGLLQ------------NAQRMST--SNP\nVR-GDSLS------------P---VSTS--SPD--TK-FPDPGIPLSPLNPRFENQSPI-----SPPAASPS-------P\nS-------IKPNSVP-TI----------------------SIPE-PGVTEKQV---------------------------\n>Phy0022J75_CRYPA\nMDEMLARRNGHLMGPRI----P--IGR-RVAAV--AE-DTSV------EASTPPSHVVGRSSSSTSDASSST-ATCSSSS\nASNTCEKP--TSTSIAGEISIGIAVPMAIIFICVLIYFHRRNLKRQAAEDRDPHHRSLDFGLGDTS-S----------GK\nSKR-K-SM-----LGLGGEKSKHPRGLSIDMNLSSPYLLPEHVQ-GSRESMNSLAKTL-HQADDPYRPITKYM-SETGSV\nGSLE-K---N-GRYTPSVMTASTK-RVSRQSYANPM------SPALQQPLRQNSYP-KSPLTPSAA--------------\n----SSVT------------A---VETDIST-P--TAAKE----------------------PTVPEDGPMPPPQC---D\nLPPLP-VVP-EIRQPAPVAQRGAA----REPVMQEHEEELDLPD-FSNNSKRESAD-EL---------------------\n>Phy0022OIS_VERA1\nMAATAFNGNGYRMGSRI----H--VRT-AEPTHEDAA-L----------LRSPGPVIAARKEC---------------DP\nDHPDCEAPAVKPQTLI--IALSVVIPIVAIM-SILYYLHRRGIKKQRMEEASDPTMSLDFGINDDK-M---------GRG\nGKRKS-VF---R-EKMLNLDPKHRAQVSMDMNLSSPYLLPPALQ-GSKQSLHSLARNL-HDDDDPYRPVNQYG-SEVGSI\nRSFRPEK--E-GRAGSSVYTGSTE-R-------GSSL------HSRTHPPRQNS
LP-KPPPLT-A---DPFATPTGARTP\nQLETSPIS------------P---TGGS--------L-PH----------------------AIIPEIGTVSYAEDFDDS\nNRNLP-HVP-DVTQPAPVAQRDARRVSSGASQSSWNEPAAQFPD-PAAHQVHNAAP-TL---------------------\n>Phy003AMS0_602072\nM--------------------------------------------------AETPTLARREPLPSTESSS-------SSS\nSSSSETKPTSTLTTTTLPVVLGAVIPVVIAI-AILLYLHRRNVKKLRNEDANDKHKSLDFGLDLAP-T-----GAKP--M\nQ-------Q----AEKLDRNAAHNKGVSLDIG-PSPYLLPPGLH-NSRESLSSLSRSIGDGDDDKYRHVGSFL-GDNASL\nRSHS-R---G-PQDDASSFTG-ST-R------R-GA------L---GDDMNQGLLR------------NAQRMSR--SSP\nPL-YTIPSGDRNVQSPASSD------------------------------------------H-----------------\nERDPG-FQLDLPRSPSPVHV-----------------PGMTISE-PTNSM-----TSNE-PE------FSGVHANTENSA\n>Phy003BKXA_GIBZE\nMGLTHYH--DQ----------R--ADIGQGASS-ISQ-KMAS------SSSHIFRRLARRENC----------------K\nDDNSCAQS-SVSNS--------LVLPIVVAI-I--------NMKKQMLEDAHDPHKSLDFGLGDEG-G---------AKK\nSAR-R-SI---FMGGGEKTLAHKPSQLSMDMNLSSPYLLPPGLQ-ESRESLNSLAKSLGNDNQDPYQYVAAITQSETGSL\nRSFNPK---D-SHSRNTKFNSPRN---------SGKP-----GSLKMPPSRMNSLP-ETPVSATESRVDPFGTPKM--PA\nPA-HPAKS------------P---FDS---E-KDAFH-PA----------------------PIVPEIGVVSD-------\nFDEKN-AVP-SVQQPPIARSKT----------------------------------------------------------\n>Phy003BOHC_AJECA\n-----------------------------------------------MQIPPPPPTLARRHVVPK---------------\nTPPEDARD----LLVMLPLPLYPYIPLTIAI-LVLVFLHRRHIRKLRSEDANDKHKSLDFGLDVVP-SG------NKKRG\nRGRKG-G-MEMTTADAEKSVRRNDRGLSMDITMTSPYLLPPALN-GSHDSLHSLSRSV-HADDDRYRTATAFSAGDNSSM\nRSFT-SNLKP-FPDDSVSFTGMSS-R------H-AP------P---GDEMHANLLR------------NAQRMSR--ASP\nPP-GTATHSIGSSQSHRSPPR---KLTT--PT------PN----------------------I-----VS----------\nDRSGI-HSPD---------------------------------------R-----SLAP-KSISTPGSELRKS-------\n>Phy003DGO9_PENCH\nMP---HAHH-------AGLVMRNH----VRRDV-IPPHRLPFLVPSTSSIATELPSLVARAE-----------AS-----\nTTVTGEKPTSNLTTTVLPVVLGAGVPILCAI-VVLIVLHRRHVKKLLREDAMDKHKSLDFGMDTVG-PA----TRRK--G\nP------------GMPPMSEPTHTKGLSLDVG---PYLMPPGLK-NSPESLRSMSI-----DDDKYRPATA-------SI\nRSYP-R---
------GSRFEG-------------------------ADDGNSGLLQ------------NAQRMSR--SSP\nPL-YSSPIESHGRSLDQHND------------------------------------------Y-----------------\nL-----GEVPGVTHPPAAQQ-----------------PGMAIGS-PNANRIPSPEP-LP--------------HLDSSLG\n>Phy003PHXT_PENMQ\nMS--HRHGMHHHVRRHI------PEDP-VQLES-VPLEPAP--------TISEAPSVIRRTSSATST--------C----\nTGSSCETTSSSNLVNTLPVVLGVVIPVVLAI-AVLLFLHRRHVRKLRQEDANDKHKSLDFGMEVVR-AG----GGK----\n-------------ANPEMGEKPHKHGMSLDI-ISSPYLLPPGLH-GSKESLRSLSKVI-SPDDDKYRLGLAAQ-SDTASL\nRSYR-SHPRM-GQDDASSFRG-ST-R------H-GP------L---PDDMNQGLLQ------------NASRMSR--SPP\nVD-ATSPLSVNHTIHEEQFD------------------------------------------H-----------------\nPRTVG-NQSPIRQAESPPMA--------------------KSPK-NHVSP-----DHSG-QG---------DE-------\n>Phy003PVXT_TALSN\n-M--PHRHGIHHVHRRN------AENL-IKLES-LPLKPAP--------TISEPPSVVRRASSETST--------C----\nSGASCEKSSSSGLVNTLPVVLGVVIPVVAAI-IVLLILHRRHVRKLRQEDANDKHKSLDFGMEVVR-AG----GGN----\n---P---------KQPEMGEKPHKHGMSLDI-IGSPYLLPPGLH-NSKESLRSLTKVI-SVEDDKYRVAAQ---SDTASL\nRSHR-T---M-GNDDASSFGG-ST-R------H-GP------I---PDDMNQGLLQ------------NASRMSR--SPP\nVD-ASSPLSVSQTIHEEPFD------------------------------------------H-----------------\nSNAMR-NQSQNHQAVDS-HM-----------------PPEDLPK-NHSSP-----APSG-PG---------DE-------\n>Phy003PZPF_FUSOX\nMGIAHYE--GARLR-------P--RTNIEDVSS-ASQ-NGVA------LSSSIFRRLVTRENC----------------Q\nDTDSCAAA-SANTNLVVPIVVAIVVPIVLIA-IFLYYLHRKNMKRQMLEDANDPHKSLDFGLDGA--G---------GKK\nSAR-R-SL---FMGGGEKGLNHKPSQLSMDMNLSSPYLLPPGLQ-ESRESLNSLAKSLGNDNQDPYHP------------\n----------------------RN---------SGKP-----GSMKMPPSRMNSLP-ETPVSATDSKVDPFGTPKA--PA\nPT-HQPNS------------H---FDE-----KDGFQ-PT----------------------AIIPEIGVVSD-------\nFDEKR-DGA-SVQPPPAVRSKT----------------------------------------------------------\n>Phy003QBJJ_PENDI\nMS---HAHH-------AGLVMRNH----VRRDV-IPANRVPIFVPS-LSVATQLPTLVARSE------------S-----\nEPTSGPKATSNLATTVFPIVFGAGIPIFCAL-IILVVLHRRQVKKLVREDAMDKHKSLDFGLDTVG-PA----TRRK--G\nA-------K----GMPPMSEHNHTKGLSLDVG---PYLLPPGLQ-HS
TDSLRSMSI-----DDGKYRPATA-------SI\nRSNS-R---------NSKYGG-------------------------TDDGNSGLLQ------------NAQRIPR--SSP\nPL-CSPIEPRARSPLNQHDD------------------------------------------Y-----------------\nI-----GQVPEVTHPPAVHQ-----------------PGMAIGS-PNTNRIPSPEP-LP--------------HVDSSSG\n>Phy0043OCA_COLGM\nMASASFSANGYVMGSRI----P--IRD-VNPINMTPT-PASP-------IRIASRIIGARDE------------QC--TG\nSATLCEKP-VDPASLTLPITLGVTIPIVGAL-FLLYYFHRRNMRRQAQEDATDPNRGLDFGLGDAP-I--D------KGG\nKKRKS-LM---FREKGMGIETNKQRQLSMDMNLSSPYLLPPGLQ-SSRESLNSLARTL-HNEADPYRPVYASS--DAGSI\nYTKT-------TSR-----------R-------GSSMTGRTTMTQNTLPPRQTSLP-RPPPAT-A---DPLGASR--SGS\nPSL-PPTS------------P---AIR---S-P--LV-AE----------------------PVIPQIETVP-------S\nGSSLP-QIP-DVPEPEPVAQRGL--------------PGNSRPS-PGHPTILEARE-PE---------------------\n>Phy0043W64_36779\nM-AGVAEAGSYRMSGRI----P--IVR-RNASG-VEA-LDVP-------QPDQTRPLVARESID-----------C-TGE\nNANLCEKP-YGANSLGVPIALGVAIPIVALL-GVVFWLHRRNIKKQRSEEANDPHKSLDFGLGDGS-R--G------SKG\nGKRKS-AF---FGGGGAEKASHRNNQLSMDMNLSSPYLLPPSAQAGSRESLHSLARTL-HGNEDPYSPVYQ--QSDARSM\nRSTK-K---G-SRDD-------YN---------GPSG-----PGLSVPPSRKSSFP-TSPTSPVTSIPPRYEASK---DE\nVT-PPPPA------------HSPGQAN---F-P--LN-DT----------------------SPYPNDHQLDA------H\nGVSMP-AVP-ELQEPAQAKMPS-------------SP---RFPL-P----------------------------------\n>Phy00443NV_MAGO7\nMVGVTVHEGEYHLGSRM----P--VMA-RDAST-PAL-QIAA------DGPGFFKRLVARQSSDD----------CVNGE\nPSNLCEKP-VTSQTLALPIALGVTIPLVALV-VMLIWLHRKNVRRQRQEDANDPHKSLDFGLDMGP------------GK\nRKS-K--L---F-GGEKLGGGPHNRQISMDMNLSSPYLLPPNMQ-NSRESIHSLAKTL--HNEDPYRHITQYNASDAGSL\nRSYK-A---G-GMD-------------------RPIG-----PKITVPTSRKGSLQATSPTSTIGSVPPRYEASQ---DD\nYV-KPPPP------------A---ALK---S-P--TQ-DS----------------------TPYPDDKSGP-------L\nATVMP-SVP-EIQEPKPASLSK----E-SS---QAPS---LAAV-PPSSPLTISAP-EI--------------------A\n>Phy0044G80_PHANO\nM-------------HHL----R--RDA-QMAAS-TSATHTL--------VDRASRVLVARTT-------------C-TND\nSDPGCTKP---TQVPTIAIALAAIVPVVGLL-IVLVFLHRRNQKKLAAEDAKDKYKSMDFGMGGAG-K---------KNK\n-GG
-P-EM---SITEKDIRGGAHSRGISLEGG--NPYILPVGLH-GSRESFHSLSRSQ-NDPHDPYRPVTFLR-NDNQSI\nRSQS-RG--Y-GHDNGSLYTTRTMSS-------GGT---------QRNRMGDGLLN------------NAQRMST--SRP\nMR-SESLS------------P---DSTT--SPD--VK-FPEQNIALSPLNPRFEGEPLAMPATELPHSRTPP-------S\nA-------SSPPNVP-II----------------------AVPA-PAAAKPEI---------------------------\n"
  },
  {
    "path": "inst/extdata/tp53.fa",
    "content": ">Homo_sapiens\n----MDDLMLSP-------DDIEQWFTED-----------------PGPDEAPRMPEAAPPVAPAPA---------APTPAAPAPAPSWPLSSSVPSQKTYQGSYGFRLGFLHSGTAKSVTCTYSPALNKMFCQLAKTCPVQLWVDSTPPPGTRVRAMAIYKQSQHMTEVVRRCPHHERCSD-SDGLAPPQHLIRVEGNLRVEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENLRKKGEPHHELPPGS-TKRALPNNTSSS---PQPKKKP----LDGEYFTLQIRGRERFEMFRELNEALELKDAQAGKEPGGSRAHSSHLKSKKGQSTS---RHKKLMFKTEG-PDSD\n>Sus_scrofa\nMEESQSELGVEPPLSQETFSDLWKLLPENNLLSSELSL-AAVNDLLLSP-VTNWLDENPDDASRVPAPPA----ATAPAPAAPAPATSWPLSSFVPSQKTYPGSYDFRLGFLHSGTAKSVTCTYSPALNKLFCQLAKTCPVQLWVSSPPPPGTRVRAMAIYKKSEYMTEVVRRCPHHERSSDYSDGLAPPQHLIRVEGNLRAEYLDDRNTFRHSVVVPYEPPEVGSDCTTIHYNFMCNSSCMGGMNRRPILTIITLEDASGNLLGRNSFEVRVCACPGRDRRTEEENFLKKGQSCPEPPPGS-TKRALPTSTSSS---PVQKKKP----LDGEYFTLQIRGRERFEMFRELNDALELKDAQTARESGENRAHSSHLKSKKGQSPS---RHKKPMFKREG-PDSD\n>Rattus_norvegicus\nMEDSQSDMSIELPLSQETFSCLWKLLPPDDILPTTATGSPNSMEDLFLPQDVAELLEGPEEALQVSAPAAQEPGTEAPAPVAPASATPWPLSSSVPSQKTYQGNYGFHLGFLQSGTAKSVMCTYSISLNKLFCQLAKTCPVQLWVTSTPPPGTRVRAMAIYKKSQHMTEVVRRCPHHERCSD-GDGLAPPQHLIRVEGNPYAEYLDDRQTFRHSVVVPYEPPEVGSDYTTIHYKYMCNSSCMGGMNRRPILTIITLEDSSGNLLGRDSFEVRVCACPGRDRRTEEENFRKKEEHCPELPPGS-AKRALPTSTSSS---PQQKKKP----LDGEYFTLKIRGRERFEMFRELNEALELKDARAAEESGDSRAHSSYPKTKKGQSTS---RHKKPMIKKVG-PDSD\n>Equus_caballus\nMEETQTELGIEPPLSQETFSDLWKLLPENNVLSPDLS--PAVNNLLLSPDVVNWLDEGPDEAPRMPA---------APAPLAPAPATSWPLSSFVPSQKTYPGCYGFRLGFLNSGTAKSVTCTYSPTLNKLFCQLAKTCPVQLLVSSPPPPGTRVRAMAIYKKSEFMTEVVRRCPHHERCSDSSDGLAPPQHLIRVEGNLRAEYLEDRNTFRHSVVVPYEPPEVGSDCTTIHYNFMCNSSCMGGMNRRPILTIITLEDSSGNLLGRNSFEVRVCACPGRDRRTEEENFRKKEEPCPEPPPRS-TKRVLSSNTSSS---PPQKKKP----LDGEYFTLQIRGRERFEMFRELNEALELKDAQTGKEPGGSKAHSSHLKSKKGQSTS---SHKKLIFKREG-PDSD\n>Danio_rerio\nMAQNDSQE----------FAELWEKNLIS-----------------IQPPGGGSCWDIINDEEYLPGSFDPN--FFENVLEEQPQPSTLPPTSTVPETSDYPGDHGFRLRFPQSGTAKSVTCTYSPDLNKLFCQLAKTCPVQMVVDVAPPQGSVVRATAIYKKSEHVAEVVRRCPHHERTPD-GDNLAPAGHLIRVEGNQRANYREDNITLRHSVFVPYEAPQLGAEWTTVLLNYMCNSSCMGGMNRRPILTIITLETQEGQLLGRRSFEVRVCACPGRDR
KTEESNFKKDQETKTMAKTTTGTKRSLVKESSSATLRPEGSKKAKGSSSDEEIFTLQVRGRERYEILKKLNDSLELSDVVPASDAEKYRQKFMTKNKKENRESSEPKQGKKLMVKDEGRSDSD\n"
  },
  {
    "path": "man/GVariation.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{GVariation}\n\\alias{GVariation}\n\\title{GVariation}\n\\format{\na folder\n}\n\\source{\n\\url{https://link.springer.com/article/10.1007/s11540-015-9307-3}\n}\n\\description{\nA folder containing 4 MAS files as a sample\ndata set to identify the sequence recombination event.\n}\n\\details{\n\\itemize{\n  \\item A.Mont.fas MSA with sequences of 'Mont' and 'CF_YL21'\n  \\item B.Oz.fas MSA with sequences of 'Oz' and 'CF_YL21'\n  \\item C.Wilga5.fas MSA with sequences of 'Wilga5' and 'CF_YL21'\n  \\item sample_alignment.fa MSA with sequences of 'Mont', 'CF_YL21', \n  'Oz', and 'Wilga5'\n}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/Gram-negative_AKL.fasta.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{Gram-negative_AKL.fasta}\n\\alias{Gram-negative_AKL.fasta}\n\\title{Gram-negative_AKL}\n\\format{\nA MSA fasta with 100 sequences and 36 positions.\n}\n\\source{\n\\url{http://biovis.net/year/2013/info/redesign-contest}\n}\n\\description{\nAmino acids in the adenylate kinase lid (AKL) domain\nfrom Gram-negative bacteria.\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/Gram-positive_AKL.fasta.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{Gram-positive_AKL.fasta}\n\\alias{Gram-positive_AKL.fasta}\n\\title{Gram-positive_AKL}\n\\format{\nA MSA fasta with 100 sequences and 36 positions.\n}\n\\source{\n\\url{http://biovis.net/year/2013/info/redesign-contest}\n}\n\\description{\nAmino acids in the adenylate kinase lid (AKL) domain\nfrom Gram-positive bacteria.\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/LeaderRepeat_All.fa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{LeaderRepeat_All.fa}\n\\alias{LeaderRepeat_All.fa}\n\\title{A sample DNA alignment sequences}\n\\format{\nA MSA fasta\n}\n\\description{\nDNA alignment sequences with 24 sequences and 56 positions.\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/Rfam.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{Rfam}\n\\alias{Rfam}\n\\title{Rfam}\n\\format{\na folder\n}\n\\source{\n\\url{https://rfam.xfam.org/}\n}\n\\description{\nA folder containing seed alignment sequences and \ncorresponding consensus RNA secondary structure.\n}\n\\details{\n\\itemize{\n  \\item RF00458.fasta seed alignment sequences of Cripavirus internal \n  ribosome entry site (IRES)\n  \\item RF03120.fasta seed alignment sequences of Sarbecovirus 5'UTR\n  \\item RF03120_SS.txt consensus RNA secondary structure of \n  Sarbecovirus 5'UTR\n \n}\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/TP53_genes.xlsx.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{TP53_genes.xlsx}\n\\alias{TP53_genes.xlsx}\n\\title{genome locus}\n\\format{\nxlsx\n}\n\\description{\nThe local genome map shows the 30000 sites around the TP53 gene.\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/adjust_ally.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ancestor_seq.R\n\\name{adjust_ally}\n\\alias{adjust_ally}\n\\title{adjust_ally}\n\\usage{\nadjust_ally(tree, node, sub = FALSE, seq_colname = \"mol_seq\")\n}\n\\arguments{\n\\item{tree}{ggtree object}\n\n\\item{node}{internal node in tree}\n\n\\item{sub}{logical value.}\n\n\\item{seq_colname}{the colname of MSA on tree$data}\n}\n\\value{\ntree\n}\n\\description{\nadjust the tree branch position after assigning ancestor node\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/assign_dms.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/dms.R\n\\name{assign_dms}\n\\alias{assign_dms}\n\\title{assign_dms}\n\\usage{\nassign_dms(x, dms)\n}\n\\arguments{\n\\item{x}{data frame from tidy_msa()}\n\n\\item{dms}{dms data frame}\n}\n\\value{\ntree\n}\n\\description{\nassign dms value to alignments.\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/available_colors.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/available.R\n\\name{available_colors}\n\\alias{available_colors}\n\\title{List Color Schemes currently available}\n\\usage{\navailable_colors()\n}\n\\value{\nA character vector of available color schemes\n}\n\\description{\nThis function lists color schemes currently available that\n can be used by 'ggmsa'\n}\n\\examples{\navailable_colors()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/available_fonts.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/available.R\n\\name{available_fonts}\n\\alias{available_fonts}\n\\title{List Font Families currently available}\n\\usage{\navailable_fonts()\n}\n\\value{\nA character vector of available font family names\n}\n\\description{\nThis function lists font families currently available \nthat can be used by 'ggmsa'\n}\n\\examples{\navailable_fonts()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/available_msa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/available.R\n\\name{available_msa}\n\\alias{available_msa}\n\\title{List MSA objects currently available}\n\\usage{\navailable_msa()\n}\n\\value{\nA character vector of available objects\n}\n\\description{\nThis function lists MSA objects currently available that\n can be used by 'ggmsa'\n}\n\\examples{\navailable_msa()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/extract_seq.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ancestor_seq.R\n\\name{extract_seq}\n\\alias{extract_seq}\n\\title{extract_seq}\n\\usage{\nextract_seq(tree_adjust, seq_colname = \"mol_seq\")\n}\n\\arguments{\n\\item{tree_adjust}{ggtree object}\n\n\\item{seq_colname}{the colname of MSA on tree$data}\n}\n\\value{\ncharacter\n}\n\\description{\nextract ancestor sequence from tree data\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/facet_msa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/facet_msa.R\n\\name{facet_msa}\n\\alias{facet_msa}\n\\title{segment MSA}\n\\usage{\nfacet_msa(field)\n}\n\\arguments{\n\\item{field}{a numeric vector of the field size.}\n}\n\\value{\nggplot layers\n}\n\\description{\nThe MSA would be plot in a field that you set.\n}\n\\examples{\nlibrary(ggplot2)\nf <- system.file(\"extdata/sample.fasta\", package=\"ggmsa\")\n# 2 fields\nggmsa(f, end = 120, font = NULL, color=\"Chemistry_AA\") + \n  facet_msa(field = 60)\n# 3 fields\nggmsa(f, end = 120, font = NULL,  color=\"Chemistry_AA\") + \n  facet_msa(field = 40)\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/geom_GC.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/geom_GC.R\n\\name{geom_GC}\n\\alias{geom_GC}\n\\title{geom_GC}\n\\usage{\ngeom_GC(show.legend = FALSE)\n}\n\\arguments{\n\\item{show.legend}{logical. Should this layer be included in the legends?}\n}\n\\value{\na ggplot layer\n}\n\\description{\nMultiple sequence alignment layer for ggplot2. It plot points of GC content.\n}\n\\examples{\n#plot GC content\nf <- system.file(\"extdata/LeaderRepeat_All.fa\", package=\"ggmsa\")\nggmsa(f, font = NULL, color=\"Chemistry_NT\") + geom_GC()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/geom_helix.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/arc.R\n\\name{geom_helix}\n\\alias{geom_helix}\n\\title{geom_helix}\n\\usage{\ngeom_helix(helix_data, color_by = \"length\", overlap = FALSE, ...)\n}\n\\arguments{\n\\item{helix_data}{a data frame. The file of nucleltide secondary structure\nand then read by readSSfile().}\n\n\\item{color_by}{generate colors for helices by various rules,\nincluding integer counts and value ranges one of \"length\" and \"value\"}\n\n\\item{overlap}{Logicals. If TRUE, two structures data called predict \nand known must be given(eg:heilx_data = list(known = data1, \n                                             predicted = data2)), \nplots the predicted helices that are known on top,\npredicted helices that are not known on the bottom, and finally plots \nunpredicted helices on top in black.}\n\n\\item{...}{additional parameter}\n}\n\\value{\nggplot2 layers\n}\n\\description{\nThe layer of helix plot\n}\n\\examples{\nRF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\nRF03120_fas <- system.file(\"extdata/Rfam/RF03120.fasta\", package=\"ggmsa\")\nSS <- readSSfile(RF03120, type = \"Vienna\")\nggmsa(RF03120_fas, font = NULL,border = NA, \n    color = \"Chemistry_NT\", seq_name = FALSE) +\ngeom_helix(SS)\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/geom_msa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/geom_msa.R\n\\name{geom_msa}\n\\alias{geom_msa}\n\\title{geom_msa}\n\\usage{\ngeom_msa(\n  data,\n  font = \"helvetical\",\n  mapping = NULL,\n  color = \"Chemistry_AA\",\n  custom_color = NULL,\n  char_width = 0.9,\n  none_bg = FALSE,\n  by_conservation = FALSE,\n  position_highlight = NULL,\n  seq_name = NULL,\n  border = NULL,\n  consensus_views = FALSE,\n  use_dot = FALSE,\n  disagreement = TRUE,\n  ignore_gaps = FALSE,\n  ref = NULL,\n  position = \"identity\",\n  show.legend = FALSE,\n  dms = FALSE,\n  position_color = FALSE,\n  ...\n)\n}\n\\arguments{\n\\item{data}{sequence alignment with data frame, generated by tidy_msa().}\n\n\\item{font}{font families, possible values are 'helvetical', 'mono', \nand 'DroidSansMono', 'TimesNewRoman'. Defaults is 'helvetical'.}\n\n\\item{mapping}{aes mapping\nIf font = NULL, only plot the background tile.}\n\n\\item{color}{A Color scheme. One of 'Clustal', 'Chemistry_AA', 'Shapely_AA',\n'Zappo_AA', 'Taylor_AA', 'LETTER','CN6',, 'Chemistry_NT', 'Shapely_NT', \n'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.}\n\n\\item{custom_color}{A data frame with two column called \"names\" and \n\"color\".Customize the color scheme.}\n\n\\item{char_width}{a numeric vector. Specifying the character width in \nthe range of 0 to 1. Defaults is 0.9.}\n\n\\item{none_bg}{a logical value indicating whether background \nshould be displayed. Defaults is FALSE.}\n\n\\item{by_conservation}{a logical value. The most conserved regions have\nthe brightest colors.}\n\n\\item{position_highlight}{A numeric vector of the position that\nneed to be highlighted.}\n\n\\item{seq_name}{a logical value indicating whether sequence names\nshould be displayed. Defaults is 'NULL' which indicates that the \nsequence name is displayed when 'font = null', but 'font = char' \nwill not be displayed. If 'seq_name = TRUE' the sequence name will \nbe displayed in any case. 
If 'seq_name = FALSE' the sequence name will not\n be displayed under any circumstances.}\n\n\\item{border}{a character string. The border color.}\n\n\\item{consensus_views}{a logical value indicating whether to open consensus views.}\n\n\\item{use_dot}{a logical value. Displays characters as dots instead of\nfading their color in the consensus view.}\n\n\\item{disagreement}{a logical value. Displays characters that disagree\nwith the consensus (excludes ambiguous disagreements).}\n\n\\item{ignore_gaps}{a logical value. When TRUE, \ngaps in a column are treated as if that row didn't exist.}\n\n\\item{ref}{a character string specifying the reference sequence,\nwhich should be one of the input sequences when 'consensus_views' is TRUE.}\n\n\\item{position}{Position adjustment, either as a string, or\n the result of a call to a position adjustment function,\ndefault is 'identity' meaning 'position_identity()'.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?}\n\n\\item{dms}{logical.}\n\n\\item{position_color}{logical.}\n\n\\item{...}{additional parameter}\n}\n\\value{\nA list\n}\n\\description{\nMultiple sequence alignment layer for ggplot2. \nIt creates background tiles with/without sequence characters.\n}\n\\examples{\nlibrary(ggplot2)\naln <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\ntidy_aln <- tidy_msa(aln, start = 150, end = 170)\nggplot() + geom_msa(data = tidy_aln, font = NULL) + coord_fixed()\n}\n\\author{\nGuangchuang Yu, Lang Zhou\nseq_name' work\nposition_highlight' work\nborder' work\nnone_bg' work\n}\n"
  },
  {
    "path": "man/geom_msaBar.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/geom_msaBar.R\n\\name{geom_msaBar}\n\\alias{geom_msaBar}\n\\title{geom_msaBar}\n\\usage{\ngeom_msaBar()\n}\n\\value{\nA list\n}\n\\description{\nMultiple sequence alignment layer for ggplot2.\n It plot sequence conservation bar.\n}\n\\examples{\n#plot multiple sequence alignment and conservation bar.\nf <- system.file(\"extdata/sample.fasta\", package=\"ggmsa\")\nggmsa(f, 221, 280, font = NULL, seq_name = TRUE) + geom_msaBar()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/geom_seed.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/geom_seed.R\n\\name{geom_seed}\n\\alias{geom_seed}\n\\title{geom_seed}\n\\usage{\ngeom_seed(seed, star = FALSE)\n}\n\\arguments{\n\\item{seed}{a character string.Specifying the miRNA seed sequence\nlike 'GAGGUAG'.}\n\n\\item{star}{a logical value indicating whether asterisks should \nbe displayed.}\n}\n\\value{\na ggplot layer\n}\n\\description{\nHighlighting the seed in miRNA sequences\n}\n\\examples{\nmiRNA_sequences <- system.file(\"extdata/seedSample.fa\", package=\"ggmsa\")\nggmsa(miRNA_sequences, font = 'DroidSansMono', \n      color = \"Chemistry_NT\", none_bg = TRUE) +\ngeom_seed(seed = \"GAGGUAG\", star = FALSE)\nggmsa(miRNA_sequences, font = 'DroidSansMono', \n      color = \"Chemistry_NT\") +\ngeom_seed(seed = \"GAGGUAG\", star = TRUE)\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/geom_seqlogo.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/seqlogo.R\n\\name{geom_seqlogo}\n\\alias{geom_seqlogo}\n\\title{geom_seqlogo}\n\\usage{\ngeom_seqlogo(\n  font = \"DroidSansMono\",\n  color = \"Chemistry_AA\",\n  adaptive = TRUE,\n  top = TRUE,\n  custom_color = NULL,\n  show.legend = FALSE,\n  ...\n)\n}\n\\arguments{\n\\item{font}{font families, possible values are 'helvetical', 'mono', \nand 'DroidSansMono', 'TimesNewRoman'. Defaults is 'DroidSansMono'.}\n\n\\item{color}{A Color scheme. One of 'Clustal', 'Chemistry_AA', \n'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.}\n\n\\item{adaptive}{A logical value indicating whether the overall height \nof seqlogo corresponds to the number of sequences.If is FALSE, \nseqlogo overall height = 4,fixedly.}\n\n\\item{top}{A logical value. If TRUE, seqlogo is aligned to the top of MSA.}\n\n\\item{custom_color}{A data frame with two cloumn called \"names\" and \n\"color\".Customize the color scheme.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?}\n\n\\item{...}{additional parameter}\n}\n\\value{\nA list\n}\n\\description{\nMultiple sequence alignment layer for ggplot2. It plot sequence motifs.\n}\n\\examples{\n#plot multiple sequence alignment and sequence motifs\nf <- system.file(\"extdata/LeaderRepeat_All.fa\", package=\"ggmsa\")\nggmsa(f,font = NULL,color = \"Chemistry_NT\") + geom_seqlogo()\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/ggSeqBundle.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/SeqBundles.R\n\\name{ggSeqBundle}\n\\alias{ggSeqBundle}\n\\title{ggSeqBundle}\n\\usage{\nggSeqBundle(\n  msa,\n  line_width = 0.3,\n  line_thickness = 0.3,\n  line_high = 0,\n  spline_shape = 0.3,\n  size = 0.5,\n  alpha = 0.2,\n  bundle_color = c(\"#2ba0f5\", \"#424242\"),\n  lev_molecule = c(\"-\", \"A\", \"V\", \"L\", \"I\", \"P\", \"F\", \"W\", \"M\", \"G\", \"S\", \"T\", \"C\", \"Y\",\n    \"N\", \"Q\", \"D\", \"E\", \"K\", \"R\", \"H\")\n)\n}\n\\arguments{\n\\item{msa}{Multiple sequence alignment file(FASTA) or object for \nrepresenting either nucleotide sequences or peptide sequences.Also receives\n multiple MSA files.\n eg:msa = c(\"Gram-negative_AKL.fasta\", \"Gram-positive_AKL.fasta\").}\n\n\\item{line_width}{The width of bundles at each site, default is 0.3.}\n\n\\item{line_thickness}{The thickness of bundles at each site, default is 0.3.}\n\n\\item{line_high}{The high of bundles at each site, default is 0.}\n\n\\item{spline_shape}{A numeric vector of values between -1 and 1, which \ncontrol the shape of the spline relative to the control points.}\n\n\\item{size}{A numeric vector of values between 0 and 1, \nwhich control the size of each lines.}\n\n\\item{alpha}{A numeric vector of values between 0 and 1, \nwhich control the alpha of each lines.}\n\n\\item{bundle_color}{The colors of each sequence bundles.\neg: bundle_color = c(\"#2ba0f5\",\"#424242\").}\n\n\\item{lev_molecule}{Reassigning the Y-axis and displaying \nletter-coded amino acids/nucleotides arranged by physiochemical \nproperties or others.eg:amino acids hydrophobicity \nlev_molecule = c(\"-\",\"A\", \"V\", \"L\", \"I\", \"P\", \"F\", \"W\", \"M\", \n   \"G\", \"S\",\"T\", \"C\", \"Y\", \"N\", \"Q\", \"D\", \"E\", \"K\",\"R\", \"H\").}\n}\n\\value{\nggplot object\n}\n\\description{\nplot Sequence Bundles for MSA based 'ggolot2'\n}\n\\examples{\naln <- system.file(\"extdata\", 
\"Gram-negative_AKL.fasta\", package = \"ggmsa\")\nggSeqBundle(aln)\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/gghelix.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/arc.R\n\\name{gghelix}\n\\alias{gghelix}\n\\title{gghelix}\n\\usage{\ngghelix(helix_data, color_by = \"length\", overlap = FALSE)\n}\n\\arguments{\n\\item{helix_data}{a data frame. The file of nucleltide secondary structure\nand then read by readSSfile().}\n\n\\item{color_by}{generate colors for helices by various rules,\nincluding integer counts and value ranges one of \"length\" and \"value\"}\n\n\\item{overlap}{Logicals. If TRUE, two structures data called predict \nand known must be given(eg:heilx_data = list(known = data1, \n                                             predicted = data2)), \nplots the predicted helices that are known on top, predicted helices that\n are not known on the bottom, and finally plots unpredicted helices \n on top in black.}\n}\n\\value{\nggplot object\n}\n\\description{\nPlots nucleltide secondary structure as helices in arc diagram\n}\n\\examples{\nRF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\nhelix_data <- readSSfile(RF03120, type = \"Vienna\")\ngghelix(helix_data)\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/ggmaf.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ggmaf.R\n\\name{ggmaf}\n\\alias{ggmaf}\n\\title{ggmaf}\n\\usage{\nggmaf(\n  data,\n  ref,\n  block_start = NULL,\n  block_end = NULL,\n  facet_field = NULL,\n  heights = c(0.4, 0.6),\n  facet_heights = NULL\n)\n}\n\\arguments{\n\\item{data}{a tidy MAF data frame.You can get it by tidy_maf_df()}\n\n\\item{ref}{character, the name of reference genome. \neg:\"hg38.chr1_KI270707v1_random\"}\n\n\\item{block_start}{a numeric vector(>0). The start block to plot.}\n\n\\item{block_end}{a numeric vector(< max block). The end block to plot.}\n\n\\item{facet_field}{a numeric vector. The field in a facet panel.}\n\n\\item{heights}{two numeric vector.The plot proportion between \n\"Genomic location\" panel(upon) and \"Alignment\" panel(down).\nDefault:c(0.4,0.6)}\n\n\\item{facet_heights}{Numeric vectors.The facet proportion.}\n}\n\\value{\nggplot object\n}\n\\description{\nplot MAF\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/ggmsa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ggmsa.R\n\\name{ggmsa}\n\\alias{ggmsa}\n\\title{ggmsa}\n\\usage{\nggmsa(\n  msa,\n  start = NULL,\n  end = NULL,\n  font = \"helvetical\",\n  color = \"Chemistry_AA\",\n  custom_color = NULL,\n  char_width = 0.9,\n  none_bg = FALSE,\n  by_conservation = FALSE,\n  position_highlight = NULL,\n  seq_name = NULL,\n  border = NULL,\n  consensus_views = FALSE,\n  use_dot = FALSE,\n  disagreement = TRUE,\n  ignore_gaps = FALSE,\n  ref = NULL,\n  show.legend = FALSE\n)\n}\n\\arguments{\n\\item{msa}{Multiple aligned sequence files or objects representing either \nnucleotide sequences or AA sequences.}\n\n\\item{start}{a numeric vector. Start position to plot.}\n\n\\item{end}{a numeric vector. End position to plot.}\n\n\\item{font}{font families, possible values are 'helvetical', 'mono', and \n'DroidSansMono', 'TimesNewRoman'.  Defaults is 'helvetical'. \nIf font = NULL, only plot the background tile.}\n\n\\item{color}{a Color scheme. One of 'Clustal', 'Chemistry_AA', \n'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.}\n\n\\item{custom_color}{A data frame with two column called \"names\" and \n\"color\".Customize the color scheme.}\n\n\\item{char_width}{a numeric vector. Specifying the character width in \nthe range of 0 to 1. Defaults is 0.9.}\n\n\\item{none_bg}{a logical value indicating whether background should be\ndisplayed. Defaults is FALSE.}\n\n\\item{by_conservation}{a logical value. The most conserved regions have \nthe brightest colors.}\n\n\\item{position_highlight}{A numeric vector of the position that need to be\nhighlighted.}\n\n\\item{seq_name}{a logical value indicating whether sequence names \nshould be displayed. Defaults is 'NULL' which indicates that the \nsequence name is displayed when 'font = null', but 'font = char' \nwill not be displayed. 
If 'seq_name = TRUE' the sequence name will \nbe displayed in any case. If 'seq_name = FALSE' the sequence name \nwill not be displayed under any circumstances.}\n\n\\item{border}{a character string. The border color.}\n\n\\item{consensus_views}{a logical value that opening consensus views.}\n\n\\item{use_dot}{a logical value. Displays characters as dots instead \nof fading their color in the consensus view.}\n\n\\item{disagreement}{a logical value. Displays characters that \ndisagreememt to consensus(excludes ambiguous disagreements).}\n\n\\item{ignore_gaps}{a logical value. When selected TRUE, gaps in column \nare treated as if that row didn't exist.}\n\n\\item{ref}{a character string. Specifying the reference sequence which \nshould be one of input sequences when 'consensus_views' is TRUE.}\n\n\\item{show.legend}{logical. Should this layer be included in the legends?}\n}\n\\value{\nggplot object\n}\n\\description{\nPlot multiple sequence alignment using ggplot2 with multiple color schemes \nsupported.\n}\n\\examples{\n#plot multiple sequences by loading fasta format\nfasta <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\nggmsa(fasta, 164, 213, color=\"Chemistry_AA\")\n\n\\dontrun{\n#XMultipleAlignment objects can be used as input in the 'ggmsa'\nAAMultipleAlignment <- Biostrings::readAAMultipleAlignment(fasta)\nggmsa(AAMultipleAlignment, 164, 213, color=\"Chemistry_AA\")\n\n#XStringSet objects can be used as input in the 'ggmsa'\nAAStringSet <- Biostrings::readAAStringSet(fasta)\nggmsa(AAStringSet, 164, 213, color=\"Chemistry_AA\")\n\n#Xbin objects from 'seqmagick' can be used as input in the 'ggmsa'\nAAbin <- seqmagick::fa_read(fasta)\nggmsa(AAbin, 164, 213, color=\"Chemistry_AA\")\n}\n}\n\\author{\nGuangchuang Yu\n}\n"
  },
  {
    "path": "man/merge_seq.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/pp_interactive.R\n\\name{merge_seq}\n\\alias{merge_seq}\n\\title{merge_seq}\n\\usage{\nmerge_seq(previous_seq, gap, subsequent_seq, adjust_name = TRUE)\n}\n\\arguments{\n\\item{previous_seq}{previous MSA}\n\n\\item{gap}{gap length}\n\n\\item{subsequent_seq}{subsequent MSA}\n\n\\item{adjust_name}{logical value. merge seq name or not}\n}\n\\value{\ntidy MSA data frame\n}\n\\description{\nmerge two MSA\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/plot-methods.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method-plot.R\n\\docType{methods}\n\\name{plot}\n\\alias{plot}\n\\alias{plot,SeqDiff,ANY-method}\n\\title{plot method for SeqDiff object}\n\\usage{\n\\S4method{plot}{SeqDiff,ANY}(\n  x,\n  width = 50,\n  title = \"auto\",\n  xlab = \"Nucleotide Position\",\n  by = \"bar\",\n  fill = \"firebrick\",\n  colors = c(A = \"#ff6d6d\", C = \"#769dcc\", G = \"#f2be3c\", T = \"#74ce98\"),\n  xlim = NULL\n)\n}\n\\arguments{\n\\item{x}{SeqDiff object}\n\n\\item{width}{bin width}\n\n\\item{title}{plot title}\n\n\\item{xlab}{xlab}\n\n\\item{by}{one of 'bar' and 'area'}\n\n\\item{fill}{fill color of upper part of the plot}\n\n\\item{colors}{color of lower part of the plot}\n\n\\item{xlim}{limits of x-axis}\n}\n\\value{\nplot\n}\n\\description{\nplot method for SeqDiff object\n}\n\\examples{\nfas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n                  pattern=\"fas\", full.names=TRUE)\nx1 <- seqdiff(fas[1], reference=1)\nplot(x1)\n}\n\\author{\nguangchuang yu\n}\n"
  },
  {
    "path": "man/readSSfile.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/arc.R\n\\name{readSSfile}\n\\alias{readSSfile}\n\\title{readSSfile}\n\\usage{\nreadSSfile(file, type = NULL)\n}\n\\arguments{\n\\item{file}{A text file in connect format}\n\n\\item{type}{file type. one of \"Helix, \"Connect\", \"Vienna\" and \"Bpseq\"}\n}\n\\value{\ndata frame\n}\n\\description{\nRead secondary structure file\n}\n\\examples{\nRF03120 <- system.file(\"extdata/Rfam/RF03120_SS.txt\", package=\"ggmsa\")\nhelix_data <- readSSfile(RF03120, type = \"Vienna\")\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/read_maf.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/read_maf.R\n\\name{read_maf}\n\\alias{read_maf}\n\\title{read_maf}\n\\usage{\nread_maf(multiple_alignment_format)\n}\n\\arguments{\n\\item{multiple_alignment_format}{a multiple alignment format(MAF) file}\n}\n\\value{\ndata frame\n}\n\\description{\nread 'multiple alignment format'(MAF) file\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/reset_pos.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/pp_interactive.R\n\\name{reset_pos}\n\\alias{reset_pos}\n\\title{reset_pos}\n\\usage{\nreset_pos(seq_df)\n}\n\\arguments{\n\\item{seq_df}{MSA data}\n}\n\\value{\ndata frame\n}\n\\description{\nreset MSA position\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/sample.fasta.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{sample.fasta}\n\\alias{sample.fasta}\n\\title{A sample data used in ggmsa}\n\\format{\nA MSA fasta with 9 sequences and 456 positions.\n}\n\\description{\nA dataset containing the alignment sequences of \nthe phenylalanine hydroxylase protein (PH4H) \nwithin nine species\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/seedSample.fa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{seedSample.fa}\n\\alias{seedSample.fa}\n\\title{microRNA data used in ggmsa}\n\\format{\nA MSA fasta with 6 sequences and 22 positions.\n}\n\\source{\n\\url{https://www.mirbase.org/ftp.shtml}\n}\n\\description{\nFasta format sequences of mature miRNA sequences \nfrom miRBase\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/seqdiff.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/seqdiff.R\n\\name{seqdiff}\n\\alias{seqdiff}\n\\title{seqdiff}\n\\usage{\nseqdiff(fasta, reference = 1)\n}\n\\arguments{\n\\item{fasta}{fasta file}\n\n\\item{reference}{which sequence serve as reference, 1 or 2}\n}\n\\value{\nSeqDiff object\n}\n\\description{\ncalculate difference of two aligned sequences\n}\n\\examples{\nfas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n                  pattern=\"fas\", full.names=TRUE)\nseqdiff(fas[1], reference=1)\n}\n\\author{\nguangchuang yu\n}\n"
  },
  {
    "path": "man/seqlogo.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/seqlogo.R\n\\name{seqlogo}\n\\alias{seqlogo}\n\\title{seqlogo}\n\\usage{\nseqlogo(\n  msa,\n  start = NULL,\n  end = NULL,\n  font = \"DroidSansMono\",\n  color = \"Chemistry_AA\",\n  adaptive = FALSE,\n  top = FALSE,\n  custom_color = NULL\n)\n}\n\\arguments{\n\\item{msa}{Multiple sequence alignment file or object for representing \neither nucleotide sequences or peptide sequences.}\n\n\\item{start}{Start position to plot.}\n\n\\item{end}{End position to plot.}\n\n\\item{font}{font families, possible values are 'helvetical', 'mono', and \n'DroidSansMono', 'TimesNewRoman'.  Defaults is 'DroidSansMono'. \nIf font=NULL, only the background tiles is drawn.}\n\n\\item{color}{A Color scheme. One of 'Clustal', 'Chemistry_AA', \n'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6','Chemistry_NT', \n'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.}\n\n\\item{adaptive}{A logical value indicating whether the overall height of \nseqlogo corresponds to the number of sequences. If FALSE, seqlogo \noverall height = 4,fixedly.}\n\n\\item{top}{A logical value. If TRUE, seqlogo is aligned to the top of MSA.}\n\n\\item{custom_color}{A data frame with two cloumn called \"names\" and \n\"color\".Customize the color scheme.}\n}\n\\value{\nggplot object\n}\n\\description{\nplot sequence logo for MSA based 'ggolot2'\n}\n\\examples{\n#plot sequence motif independently\nnt_sequence <- system.file(\"extdata\", \"LeaderRepeat_All.fa\", \n                           package = \"ggmsa\")\nseqlogo(nt_sequence, color = \"Chemistry_NT\")\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/sequence-link-tree.fasta.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{sequence-link-tree.fasta}\n\\alias{sequence-link-tree.fasta}\n\\title{sequence-link-tree}\n\\format{\nA MSA fasta with 28 sequences and 480 positions.\n}\n\\description{\nAlignment sequences used to demonstrate circular MSA layout\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/show-methods.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/method-show.R\n\\docType{methods}\n\\name{show}\n\\alias{show}\n\\alias{SeqDiff-class}\n\\alias{show,SeqDiff-method}\n\\title{show method}\n\\usage{\nshow(object)\n}\n\\arguments{\n\\item{object}{SeqDiff object}\n}\n\\value{\nmessage\n}\n\\description{\nshow method\n}\n\\examples{\nfas <- list.files(system.file(\"extdata\", \"GVariation\", package=\"ggmsa\"),\n                  pattern=\"fas\", full.names=TRUE)\nx1 <- seqdiff(fas[1], reference=1)\nx1\n}\n"
  },
  {
    "path": "man/simplify_hdata.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/pp_interactive.R\n\\name{simplify_hdata}\n\\alias{simplify_hdata}\n\\title{simplify_hdata}\n\\usage{\nsimplify_hdata(hdata, sim_msa)\n}\n\\arguments{\n\\item{hdata}{data from tidy_hdata()}\n\n\\item{sim_msa}{MSA data frame}\n}\n\\value{\ndata frame\n}\n\\description{\nreset hdata data position\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/simplot.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/simplot.R\n\\name{simplot}\n\\alias{simplot}\n\\title{simplot}\n\\usage{\nsimplot(\n  file,\n  query,\n  window = 200,\n  step = 20,\n  group = FALSE,\n  id,\n  sep,\n  sd = FALSE,\n  smooth = FALSE,\n  smooth_params = list(method = \"loess\", se = FALSE)\n)\n}\n\\arguments{\n\\item{file}{alignment fast file}\n\n\\item{query}{query sequence}\n\n\\item{window}{sliding window size (bp)}\n\n\\item{step}{step size to slide the window (bp)}\n\n\\item{group}{whether grouping sequence.(eg. For \"A-seq1,A-seq-2,B-seq1 and \nB-seq2\", using sep = \"-\" and id = 1 to divide sequences into groups A and \nB)}\n\n\\item{id}{position to extract id for grouping; only works if group = TRUE}\n\n\\item{sep}{separator to split sequence name; only works if group = TRUE}\n\n\\item{sd}{whether display standard deviation of \nsimilarity among each group; only works if group=TRUE}\n\n\\item{smooth}{FALSE(default)or TRUE; whether display smoothed spline.}\n\n\\item{smooth_params}{a list that add params for geom_smooth,\n(default: smooth_params = list(method = \"loess\", se = FALSE))}\n}\n\\value{\nggplot object\n}\n\\description{\nSequence similarity plot\n}\n\\examples{\nfas <- system.file(\"extdata/GVariation/sample_alignment.fa\", \n                    package=\"ggmsa\")\nsimplot(fas, 'CF_YL21')\n}\n\\author{\nguangchuang yu\n}\n"
  },
  {
    "path": "man/theme_msa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/theme_msa.R\n\\name{theme_msa}\n\\alias{theme_msa}\n\\title{theme_msa}\n\\usage{\ntheme_msa()\n}\n\\description{\nTheme for ggmsa.\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/tidy_hdata.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/pp_interactive.R\n\\name{tidy_hdata}\n\\alias{tidy_hdata}\n\\title{tidy_hdata}\n\\usage{\ntidy_hdata(gap, inter, previous_seq, subsequent_seq)\n}\n\\arguments{\n\\item{gap}{gap length}\n\n\\item{inter}{protein-protein interactive position data}\n\n\\item{previous_seq}{previous MSA}\n\n\\item{subsequent_seq}{subsequent MSA}\n}\n\\value{\nhelix data\n}\n\\description{\ntidy protein-protein interactive position data\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/tidy_maf_df.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ggmaf.R\n\\name{tidy_maf_df}\n\\alias{tidy_maf_df}\n\\title{tidy_maf_df}\n\\usage{\ntidy_maf_df(maf_df, ref)\n}\n\\arguments{\n\\item{maf_df}{a MAF data frame.You can get it by read_maf()}\n\n\\item{ref}{character, the name of reference genome. \neg:\"hg38.chr1_KI270707v1_random\"}\n}\n\\value{\ndata frame\n}\n\\description{\ntidy MAF data frame\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "man/tidy_msa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/msa_data.R\n\\name{tidy_msa}\n\\alias{tidy_msa}\n\\title{tidy_msa}\n\\usage{\ntidy_msa(msa, start = NULL, end = NULL)\n}\n\\arguments{\n\\item{msa}{multiple sequence alignment file or sequence object in \nDNAStringSet, RNAStringSet, AAStringSet, BStringSet, DNAMultipleAlignment, \nRNAMultipleAlignment, AAMultipleAlignment, DNAbin or AAbin}\n\n\\item{start}{start position to extract subset of alignment}\n\n\\item{end}{end position to extract subset of alignemnt}\n}\n\\value{\ntibble data frame\n}\n\\description{\nConvert msa file/object to tidy data frame.\n}\n\\examples{\nfasta <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\naln <- tidy_msa(msa = fasta, start = 10, end = 100)\n}\n\\author{\nGuangchuang Yu\n}\n"
  },
  {
    "path": "man/tp53.fa.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/data.R\n\\docType{data}\n\\name{tp53.fa}\n\\alias{tp53.fa}\n\\title{TP53 MSA}\n\\format{\nA MSA fasta with 5 sequences and 404 positions.\n}\n\\description{\nAlignment sequences of used to show graphical combination\n}\n\\keyword{datasets}\n"
  },
  {
    "path": "man/treeMSA_plot.Rd",
    "content": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ancestor_seq.R\n\\name{treeMSA_plot}\n\\alias{treeMSA_plot}\n\\title{treeMSA_plot}\n\\usage{\ntreeMSA_plot(\n  p_tree,\n  tidymsa_df,\n  ancestral_node = \"none\",\n  sub = FALSE,\n  panel = \"MSA\",\n  font = NULL,\n  color = \"Chemistry_AA\",\n  seq_colname = NULL,\n  ...\n)\n}\n\\arguments{\n\\item{p_tree}{tree view}\n\n\\item{tidymsa_df}{tidy MSA data}\n\n\\item{ancestral_node}{vector, internal node in tree. Assigning a internal \nnode to display \"ancestral sequences\",If ancestral_node = \"none\" hides \nall ancestral sequences, if ancestral_node = \"all\" shows all ancestral \nsequences.}\n\n\\item{sub}{logical value. Displaying a subset of ancestral sequences or not.}\n\n\\item{panel}{panel name for plot of MSA data}\n\n\\item{font}{font families, possible values are 'helvetical', 'mono', and \n'DroidSansMono', 'TimesNewRoman'.  Defaults is 'helvetical'. \nIf font = NULL, only plot the background tile.}\n\n\\item{color}{a Color scheme. One of 'Clustal', 'Chemistry_AA', \n'Shapely_AA', 'Zappo_AA', 'Taylor_AA', 'LETTER', 'CN6', 'Chemistry_NT', \n'Shapely_NT', 'Zappo_NT', 'Taylor_NT'. Defaults is 'Chemistry_AA'.}\n\n\\item{seq_colname}{the colname of MSA on tree$data}\n\n\\item{...}{additional parameters for 'geom_msa'}\n}\n\\value{\nggplot object\n}\n\\description{\nplot Tree-MSA plot\n}\n\\details{\n'treeMSA_plot()' automatically re-arranges the MSA data according to \nthe tree structure,\n}\n\\author{\nLang Zhou\n}\n"
  },
  {
    "path": "tests/testthat/test-main.R",
    "content": "library(ggmsa)\nlibrary(ggplot2)\n\n\ntest_that(\"check whether `ggmsa` create a `ggplot` object\", {\n    p <- ggmsa(msa = system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\"), \n               start = 10, \n               end = 20,\n               font = NULL)\n    \n    \n    expect_true(is.ggplot(p))\n})\n\n\n\n"
  },
  {
    "path": "tests/testthat/test-msa_data.R",
    "content": "\n\nlibrary(ggmsa)\n\nmsa <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\ntidymsa <- tidy_msa(msa, 10, 20)\n\n\ntest_that(\"check msaData integrity when using `font`\", {\n    msaData <- msa_data(tidymsa)\n    msaFull_names <- c(\"label\", \n                       \"x\",\n                       \"yy\",\n                       \"order\",\n                       \"name\",\n                       \"position\",\n                       \"character\", \n                       \"color\",\n                       \"group\", \n                       \"y\")\n    \n    expect_true(is.data.frame(msaData))\n    expect_named(msaData, msaFull_names)\n})\n\n\ntest_that(\"check msaData integrity when using `font = NULL`\", {\n    msaData <- msa_data(tidymsa, font = NULL)\n    msaFull_names <- c(\"name\", \"position\", \"character\", \"color\"  )\n    \n    expect_true(is.data.frame(msaData))\n    expect_named(msaData, msaFull_names)\n})\n"
  },
  {
    "path": "tests/testthat/test-tidy_msa.R",
    "content": "\n\nlibrary(ggmsa)\nlibrary(Biostrings)\n\nmsa <- system.file(\"extdata\", \"sample.fasta\", package = \"ggmsa\")\ntidy_names <- c(\"name\", \"position\", \"character\")\n\n\ntest_that(\"tidy FASTA format by tidy_msa\", {\n  fasta_tidy <- tidy_msa(msa, 10, 20)\n  expect_true(is.data.frame(fasta_tidy))\n  expect_named(fasta_tidy, tidy_names)\n})\n\n\ntest_that(\"tidy Biostrings objects by tidy_msa\", {\n    AAMultipleAlignment <- readAAMultipleAlignment(msa)\n    expect_s4_class(AAMultipleAlignment, \"AAMultipleAlignment\")\n    \n    AAStringSet <- readAAStringSet(msa)\n    expect_s4_class( AAStringSet, \"AAStringSet\")\n    \n    AAMultipleAlignment_tidy <- tidy_msa(AAMultipleAlignment, 10, 20)\n    AAStringSet_tidy <- tidy_msa(AAStringSet, 10, 20)\n    \n    expect_true(is.data.frame(AAMultipleAlignment_tidy))\n    expect_named(AAMultipleAlignment_tidy, tidy_names)\n    \n    \n    expect_true(is.data.frame(AAStringSet_tidy))\n    expect_named(AAStringSet_tidy, tidy_names)\n})\n\n\ntest_that(\"tidy AAbin objects by tidy_msa\", {\n    AAbin <- ape::read.FASTA(msa, \"AA\")\n    expect_s3_class(AAbin, \"AAbin\")\n    \n    AAbin_tidy <- tidy_msa(AAbin, 10, 20)\n    \n    expect_true(is.data.frame(AAbin_tidy))\n    expect_named(AAbin_tidy, tidy_names)\n})\n\n\n\n"
  },
  {
    "path": "tests/testthat.R",
    "content": "library(testthat)\nlibrary(ggmsa)\n\ntest_check(\"ggmsa\")\n"
  },
  {
    "path": "vignettes/.gitignore",
    "content": "Annotations.Rmd\nColor_schemes_And_Font_Families.Rmd\nMSA_theme.Rmd\nOther_Modules.Rmd\nView_modes.Rmd"
  },
  {
    "path": "vignettes/ggmsa.Rmd",
    "content": "---\ntitle: \"ggmsa-Getting Started\"\nauthor: \"GuangChuang Yu and Lang Zhou\"\noutput:\n  prettydoc::html_pretty:\n    toc: false\n    theme: cayman\n    highlight: github\n  pdf_document:\n    toc: true\ndate: \"`r Sys.Date()`\"\nbibliography: ggmsa.bib\nvignette: >\n  %\\VignetteIndexEntry{ggmsa}\n  %\\VignetteEngine{knitr::rmarkdown}\n  %\\VignetteEncoding{UTF-8}\n---\n\n```{r, include = FALSE}\nknitr::opts_chunk$set(\n  collapse = TRUE,\n  comment = \"#>\"\n)\n\n# Packages -------------------------------------------------------------------\nlibrary(ggmsa)\nlibrary(ggplot2)\nlibrary(yulab.utils)\n```\n\n\n#  Install package\n```{r eval = FALSE}\nif (!require(\"BiocManager\"))\n    install.packages(\"BiocManager\")\nBiocManager::install(\"ggmsa\")\n```\n\n#  Introduction\n\nggmsa is a package designed to plot multiple sequence alignments.\n\nThis package implements functions to visualize publication-quality \nmultiple sequence alignments (protein/DNA/RNA) in R extremely \nsimple and powerful. It uses module design to annotate sequence \nalignments and allows to accept other data sets for diagrams combination.\n\nIn this tutorial, we’ll work through the basics of using ggmsa.\n\n```{r results=\"hide\", message=FALSE, warning=FALSE}\nlibrary(ggmsa)\n```\n\n\n```{r echo=FALSE, out.width='50%'}\nknitr::include_graphics(\"man/figures/workflow.png\")\n```\n\n\n#  Importing MSA data\n\nWe’ll start by importing some example data to use throughout this \ntutorial. Expect FASTA files, some of the objects in R can also \nas input. 
`available_msa()` can be used to list MSA objects \ncurrently available.\n\n```{r warning=FALSE}\n available_msa()\n\n protein_sequences <- system.file(\"extdata\", \"sample.fasta\", \n                                  package = \"ggmsa\")\n miRNA_sequences <- system.file(\"extdata\", \"seedSample.fa\", \n                                package = \"ggmsa\")\n nt_sequences <- system.file(\"extdata\", \"LeaderRepeat_All.fa\", \n                             package = \"ggmsa\")\n \n```\n\n# Basic use: MSA Visualization\n\nThe simplest way to use ggmsa:\n```{r fig.height = 2, fig.width = 10, warning=FALSE}\nggmsa(protein_sequences, 300, 350, color = \"Clustal\", \n      font = \"DroidSansMono\", char_width = 0.5, seq_name = TRUE )\n```\n\n##  Color Schemes\n\nggmsa ships several predefined color schemes for rendering MSA. \nIn the same way, use `available_colors()` to list the color \nschemes currently available. \nNote that amino acids (protein) and nucleotides (DNA/RNA) have \ndifferent names.\n\n```{r warning=FALSE}\navailable_colors()\n```\n\n```{r echo=FALSE, out.width = '50%'}\nknitr::include_graphics(\"man/figures/schemes.png\")\n```\n\n##  Font\n\nSeveral predefined fonts are shipped with ggmsa. \nUsers can use `available_fonts()` to list the fonts currently available.\n\n```{r warning=FALSE}\navailable_fonts()\n```\n\n#  MSA Annotation\n\nggmsa supports annotations for MSA. Similar to ggplot2, \nit implements annotations by `geom` and users can perform \nannotation with `+` , like this: `ggmsa() + geom_*()`. 
\nAutomatically generated annotations that contain colored\nlabels and symbols are overlaid on MSAs to indicate \npotentially conserved or divergent regions.\n\nFor example, visualizing a multiple sequence alignment\nwith a **sequence logo** and a **bar chart**:\n\n```{r fig.height = 2.5, fig.width = 11, warning = FALSE, message = FALSE}\nggmsa(protein_sequences, 221, 280, seq_name = TRUE, char_width = 0.5) + \n  geom_seqlogo(color = \"Chemistry_AA\") + geom_msaBar()\n```\n\n\nThe following table lists the annotation layers supported by ggmsa:\n\n```{r  echo=FALSE, results='asis', warning=FALSE, message=FALSE}  \nlibrary(kableExtra)\nx <- \"geom_seqlogo()\\tgeometric layer\\tautomatically generated sequence logos for a MSA\\n\ngeom_GC()\\tannotation module\\tshows GC content with bubble chart\\n\ngeom_seed()\\tannotation module\\thighlights seed region on miRNA sequences\\n\ngeom_msaBar()\\tannotation module\\tshows sequence conservation by a bar chart\\n\ngeom_helix()\\tannotation module\\tdepicts RNA secondary structure as arc diagrams (needs extra data)\\n\n \"\n\nxx <- strsplit(x, \"\\n\\n\")[[1]]\ny <- strsplit(xx, \"\\t\") %>% do.call(\"rbind\", .)\ny <- as.data.frame(y, stringsAsFactors = FALSE)\n\ncolnames(y) <- c(\"Annotation modules\", \"Type\", \"Description\")\n\nknitr::kable(y, align = \"l\", booktabs = TRUE, escape = TRUE) %>% \n    kable_styling(latex_options = c(\"striped\", \"hold_position\", \"scale_down\"))\n  \n```\n\n# Learn more\n\nCheck out the guides to learn more about all the different features:\n\n- [Getting Started](https://yulab-smu.top/ggmsa/articles/ggmsa.html)\n- [Annotations](https://yulab-smu.top/ggmsa/articles/Annotations.html)\n- [Color Schemes and Font Families](https://yulab-smu.top/ggmsa/articles/Color_schemes_And_Font_Families.html)\n- [Theme](https://yulab-smu.top/ggmsa/articles/guides/MSA_theme.html)\n- [Other Modules](https://yulab-smu.top/ggmsa/articles/Other_Modules.html)\n- [View 
Modes](https://yulab-smu.top/ggmsa/articles/View_modes.html)\n\n\n\n#  Session Info\n```{r echo = FALSE}\nsessionInfo()\n```\n\n"
  },
  {
    "path": "vignettes/ggmsa.bib",
    "content": "@article{Taylor1997Residual,\n         title={Residual colours: a proposal for aminochromography.},\n         author={Taylor, W R},\n         journal={Protein Eng},\n         volume={10},\n         number={7},\n         pages={743-746},\n         year={1997},\n}\n\n@article{Waterhouse2009Jalview,\n         title={Jalview Version 2--a multiple sequence alignment editor and analysis workbench},\n         author={Waterhouse, A. M. and Procter, J. B. and Martin, D. M. and Clamp, M and Barton, G. J.},\n         journal={Bioinformatics},\n         volume={25},\n         number={9},\n         pages={1189},\n         year={2009},\n}\n\n@article{yu2017ggtree,\n         title={ggtree: an R package for visualization and annotation of phylogenetic trees with their covariates and other associated data},\n         author={Yu, Guangchuang and Smith, David K and Zhu, Huachen and Guan, Yi and Lam, Tommy Tsanyuk},\n         journal={Methods in Ecology and Evolution},\n         volume={8},\n         number={1},\n         pages={28--36},\n         year={2017}\n}\n\n@article{Wagih2017ggseqlogo,\n         title={ggseqlogo: a versatile R package for drawing sequence logos},\n         author={Wagih, Omar},\n         journal={Bioinformatics},\n         volume={33},\n         number={22},\n         year={2017},\n}"
  }
]