Full Code of YuLab-SMU/ChIPseeker for AI

devel c1f4a507f226 cached

119 files

433.4 KB

123.7k tokens

1 requests

Download .txt

Showing preview only (463K chars total). Download the full file or copy to clipboard to get everything.

Repository: YuLab-SMU/ChIPseeker
Branch: devel
Commit: c1f4a507f226
Files: 119
Total size: 433.4 KB

Directory structure:
gitextract_66qccjqf/

├── .Rbuildignore
├── .github/
│   └── issue_template.md
├── .gitignore
├── .svnignore
├── .travis.yml
├── CONDUCT.md
├── DESCRIPTION
├── GEODATA
├── Makefile
├── NAMESPACE
├── NEWS
├── NEWS.md
├── R/
│   ├── AllGenerics.R
│   ├── ChIPseeker-package.R
│   ├── GEO.R
│   ├── addGeneAnno.R
│   ├── annotatePeak.R
│   ├── covplot.R
│   ├── csAnno.R
│   ├── dplyr-verb.R
│   ├── enrichOverlap.R
│   ├── getFlankingGene.R
│   ├── getGenomicAnnotation.R
│   ├── getNearestFeatureIndicesAndDistances.R
│   ├── plotAnno.R
│   ├── plotDistToTSS.R
│   ├── plotTagMatrix.R
│   ├── readPeakFile.R
│   ├── seq2gene.R
│   ├── subset.R
│   ├── tagMatrix.R
│   ├── upsetplot.R
│   ├── utilities.R
│   ├── vennpie.R
│   ├── vennplot.R
│   └── zzz.R
├── README.Rmd
├── README.md
├── appveyor.yml
├── data/
│   ├── gsminfo.rda
│   ├── tagMatrixList.rda
│   └── ucsc_release.rda
├── inst/
│   ├── CITATION
│   ├── extdata/
│   │   ├── processedGSM.rda
│   │   └── sample_peaks.txt
│   └── test-plot/
│       ├── test-plotPeakProf.R
│       └── test-plotTagMatrix.R
├── man/
│   ├── ChIPseeker-package.Rd
│   ├── ChIPseekerCache.Rd
│   ├── annotatePeak.Rd
│   ├── as.GRanges.Rd
│   ├── as.data.frame.csAnno.Rd
│   ├── check_upstream_and_downstream.Rd
│   ├── combine_csAnno.Rd
│   ├── covplot.Rd
│   ├── csAnno-class.Rd
│   ├── dot-ChIPseekerEnv.Rd
│   ├── dotFun.Rd
│   ├── downloadGEObedFiles.Rd
│   ├── downloadGSMbedFiles.Rd
│   ├── dropAnno.Rd
│   ├── enrichAnnoOverlap.Rd
│   ├── enrichPeakOverlap.Rd
│   ├── getAnnoStat.Rd
│   ├── getBioRegion.Rd
│   ├── getGEOInfo.Rd
│   ├── getGEOgenomeVersion.Rd
│   ├── getGEOspecies.Rd
│   ├── getGeneAnno.Rd
│   ├── getGenomicAnnotation.Rd
│   ├── getNearestFeatureIndicesAndDistances.Rd
│   ├── getPromoters.Rd
│   ├── getSampleFiles.Rd
│   ├── getTagMatrix.Rd
│   ├── getTagMatrix.binning.internal.Rd
│   ├── getTagMatrix.internal.Rd
│   ├── getTagMatrix2.Rd
│   ├── getTagMatrix2.binning.internal.Rd
│   ├── getTagMatrix2.internal.Rd
│   ├── info.Rd
│   ├── makeBioRegionFromGranges.Rd
│   ├── make_label.Rd
│   ├── overlap.Rd
│   ├── peakHeatmap.Rd
│   ├── peakHeatmap_multiple_Sets.Rd
│   ├── peak_Profile_Heatmap.Rd
│   ├── plotAnnoBar-methods.Rd
│   ├── plotAnnoBar.Rd
│   ├── plotAnnoPie-methods.Rd
│   ├── plotAnnoPie.Rd
│   ├── plotAvgProf.Rd
│   ├── plotAvgProf.binning.Rd
│   ├── plotAvgProf2.Rd
│   ├── plotDistToTSS-methods.Rd
│   ├── plotDistToTSS.data.frame.Rd
│   ├── plotMultiProf.Rd
│   ├── plotMultiProf.binning.Rd
│   ├── plotMultiProf.binning.internal.Rd
│   ├── plotMultiProf.normal.Rd
│   ├── plotMultiProf.normal.internal.Rd
│   ├── plotPeakProf.Rd
│   ├── plotPeakProf2.Rd
│   ├── plotPeakProf_MultiWindows.Rd
│   ├── readPeakFile.Rd
│   ├── reexports.Rd
│   ├── seq2gene.Rd
│   ├── show-methods.Rd
│   ├── shuffle.Rd
│   ├── tagHeatmap.Rd
│   ├── upsetplot-methods.Rd
│   ├── vennpie-methods.Rd
│   ├── vennplot.Rd
│   └── vennplot.peakfile.Rd
├── tests/
│   ├── testthat/
│   │   ├── test-bed.R
│   │   ├── test-getTagMatrix.R
│   │   └── test-txdb.R
│   └── testthat.R
└── vignettes/
    ├── ChIPseeker.Rmd
    └── ChIPseeker.bib

================================================
FILE CONTENTS
================================================

================================================
FILE: .Rbuildignore
================================================
.travis.yml
.svnignore
.gitignore
^.*\.DS_Store
Makefile
README.Rmd
appveyor.yml
GEODATA
docs
mkdocs
.github
^CONDUCT\.md$


================================================
FILE: .github/issue_template.md
================================================
### Prerequisites

+ [ ] Have you read [Feedback](https://guangchuangyu.github.io/chipseeker/#feedback) and follow the [guide](https://guangchuangyu.github.io/2016/07/how-to-bug-author/)?
	* [ ] make sure your are using the latest release version
	* [ ] read the [documents](https://guangchuangyu.github.io/chipseeker/documentation/)
	* [ ] google your quesion/issue

### Describe you issue

* [ ] Make a reproducible example (*e.g.* [1](https://gist.github.com/talonsensei/e1fad082657054207f249ec98f0920eb))
* [ ] your code should contain comments to describe the problem (*e.g.* what expected and actually happened?)


### Ask in right place

* [ ] for bugs or feature requests, post here (github issue)
* [ ] for questions, please post to [Bioconductor](https://support.bioconductor.org/) or [Biostars](https://www.biostars.org/) with tag `ChIPseeker`


================================================
FILE: .gitignore
================================================
.DS_Store
data/.DS_Store
inst/.DS_Store
inst/extdata/.DS_Store
inst/extdata/GEO_sample_data/.DS_Store
.svn
*~
docs/__init__.py
docs/__pycache__
__init__.pyc


================================================
FILE: .svnignore
================================================
.git
*.Rhistory
.travis.yml
appveyor.yml
docs
mkdocs
.github


================================================
FILE: .travis.yml
================================================
## reference: http://docs.travis-ci.com/user/languages/r/

language: r
r: bioc-devel
  
cache: packages
bioc_required: true
bioc_use_devel: true

os:
  - linux
  - osx


env:
  global:
    - _R_CHECK_FORCE_SUGGESTS_=False
    - R_LIBS="http://cran.rstudio.com"

r_packages:
    - knitr
    - rmarkdown

bioc_packages:
    - DO.db
    - DOSE
    - graphite
    - ReactomePA
    - org.Hs.eg.db
    - TxDb.Hsapiens.UCSC.hg19.knownGene
    - GenomicRanges
    - GenomicFeatures
    
after_failure:
  - ./travis-tool.sh dump_logs

r_github_packages:
  - jimhester/covr

after_success:
  - Rscript -e 'library(covr); codecov()'

notifications:
  email:
    recipients: gcyu@connect.hku.hk
    on_success: never
    on_failure: always



================================================
FILE: CONDUCT.md
================================================
# Contributor Code of Conduct

As contributors and maintainers of this project, we pledge to respect all people who 
contribute through reporting issues, posting feature requests, updating documentation,
submitting pull requests or patches, and other activities.

We are committed to making participation in this project a harassment-free experience for
everyone, regardless of level of experience, gender, gender identity and expression,
sexual orientation, disability, personal appearance, body size, race, ethnicity, age, or religion.

Examples of unacceptable behavior by participants include the use of sexual language or
imagery, derogatory comments or personal attacks, trolling, public or private harassment,
insults, or other unprofessional conduct.

Project maintainers have the right and responsibility to remove, edit, or reject comments,
commits, code, wiki edits, issues, and other contributions that are not aligned to this 
Code of Conduct. Project maintainers who do not follow the Code of Conduct may be removed 
from the project team.

Instances of abusive, harassing, or otherwise unacceptable behavior may be reported by 
opening an issue or contacting one or more of the project maintainers.

This Code of Conduct is adapted from the Contributor Covenant 
(http:contributor-covenant.org), version 1.0.0, available at 
http://contributor-covenant.org/version/1/0/0/


================================================
FILE: DESCRIPTION
================================================
Package: ChIPseeker
Type: Package
Title: ChIPseeker for ChIP peak Annotation, Comparison, and Visualization
Version: 1.49.0
Authors@R: c( 
    person(given = "Guangchuang", family = "Yu",       email = "guangchuangyu@gmail.com",      role  = c("aut", "cre"), comment = c(ORCID = "0000-0002-6485-8781")),
    person(given = "Ming",        family = "Li",       email = "limiang929@gmail.com",         role = "ctb"),
    person(given = "Qianwen",     family = "Wang",     email = "treywea@gmail.com",            role = "ctb"),
    person(given = "Yun",         family = "Yan",      email = "youryanyun@gmail.com",         role = "ctb"),
    person(given = "Hervé",       family = "Pagès",    email = "hpages.on.github@gmail.com",   role  = "ctb"),
    person(given = "Michael",     family = "Kluge",    email = "michael.kluge@bio.ifi.lmu.de", role  = "ctb"),
    person(given = "Thomas",      family = "Schwarzl", email = "schwarzl@embl.de",             role  = "ctb"),
    person(given = "Zhougeng",    family = "Xu",       email = "xuzhougeng@163.com",           role  = "ctb"),
	person(given = "Chun-Hui",    family = "Gao",      email="gaospecial@gmail.com",           role  = "ctb", comment=c(ORCID = "0000-0002-1445-7939"))
	)
Maintainer: Guangchuang Yu <guangchuangyu@gmail.com>
Description: This package implements functions to retrieve the nearest genes
    around the peak, annotate genomic region of the peak, statstical methods
    for estimate the significance of overlap among ChIP peak data sets, and
    incorporate GEO database for user to compare the own dataset with those
    deposited in database. The comparison can be used to infer cooperative
    regulation and thus can be used to generate hypotheses. Several
    visualization functions are implemented to summarize the coverage of the
    peak experiment, average profile and heatmap of peaks binding to TSS
    regions, genomic annotation, distance to TSS, and overlap of peaks or
    genes.
Depends:
    R (>= 3.5.0)
Imports:
    AnnotationDbi,
    aplot,
    BiocGenerics,
    boot,
    dplyr,
    enrichplot,
    IRanges,
    GenomeInfoDb,
    GenomicRanges,
    GenomicFeatures,
    ggplot2,
    gplots,
    graphics,
    grDevices,
    gtools,
    magrittr,
    methods,
    plotrix,
    parallel,
    RColorBrewer,
    rlang,
    rtracklayer,
    S4Vectors,
    scales,
    stats,
    tibble,
    TxDb.Hsapiens.UCSC.hg19.knownGene,
    utils,
    yulab.utils (>= 0.2.0)
Suggests:
    clusterProfiler,
    ggimage,
    ggplotify,
    ggupset,
    ggVennDiagram,
    knitr,
    org.Hs.eg.db,
    prettydoc,
    ReactomePA,
    rmarkdown,
    testthat,
    TxDb.Hsapiens.UCSC.hg38.knownGene
Remotes:
    GuangchuangYu/enrichplot
URL: https://yulab-smu.top/contribution-knowledge-mining/
BugReports: https://github.com/YuLab-SMU/ChIPseeker/issues
Encoding: UTF-8
VignetteBuilder: knitr
ByteCompile: true
License: Artistic-2.0
biocViews: Annotation, ChIPSeq, Software, Visualization, MultipleComparison
RoxygenNote: 7.3.3


================================================
FILE: GEODATA
================================================
UPDATE OF GEO DATA
 + 20947 bed file information in ChIPseeker (version >=1.9.8) <2016-09-20, Tue>
 + 19348 bed file information in ChIPseeker (version >= 1.7.15) <2016-03-21, Mon>
 + 18813 bed file information in ChIPseeker (version >= 1.5.11, BioC 3.2 devel) <2015-09-24, Thu>
 + >17,726 bed file information in ChIPseeker (version>=1.4.0, BioC 3.1)
 + >15,000 bed file information ChIPseeker (version>=1.2.0, BioC 3.0)


================================================
FILE: Makefile
================================================
PKGNAME := $(shell sed -n "s/Package: *\([^ ]*\)/\1/p" DESCRIPTION)
PKGVERS := $(shell sed -n "s/Version: *\([^ ]*\)/\1/p" DESCRIPTION)
PKGSRC  := $(shell basename `pwd`)
BIOCVER := RELEASE_3_23


all: rd check clean

alldocs: rd readme

rd:
	Rscript -e 'roxygen2::roxygenise(".")'

readme:
	Rscript -e 'rmarkdown::render("README.Rmd", encoding="UTF-8")'

build:
	#cd ..;\
	#R CMD build $(PKGSRC)
	Rscript -e 'devtools::build()'
	
build2:
	cd ..;\
	R CMD build --no-build-vignettes $(PKGSRC)

install:
	cd ..;\
	R CMD INSTALL $(PKGNAME)_$(PKGVERS).tar.gz

check: 
	Rscript -e 'devtools::check()'
	## cd ..;\
	## Rscript -e 'rcmdcheck::rcmdcheck("$(PKGNAME)_$(PKGVERS).tar.gz")'

check2: build
	cd ..;\
	R CMD check $(PKGNAME)_$(PKGVERS).tar.gz

bioccheck:
	cd ..;\
	Rscript -e 'BiocCheck::BiocCheck("$(PKGNAME)_$(PKGVERS).tar.gz")'

clean:
	cd ..;\
	$(RM) -r $(PKGNAME).Rcheck/


gitmaintain:
	git gc --auto;\
	git prune -v;\
	git fsck --full

rmrelease:
	git branch -D $(BIOCVER)

release:
	git checkout $(BIOCVER);\
	git fetch --all


update:
	git fetch --all;\
	git checkout devel;\
	git merge upstream/devel;\
	git merge origin/devel

biocinit:
	git remote add upstream git@git.bioconductor.org:packages/$(PKGNAME).git;\
	git fetch --all

push:
	git push upstream devel;\
	git push origin devel




================================================
FILE: NAMESPACE
================================================
# Generated by roxygen2: do not edit by hand

S3method(arrange,GRanges)
S3method(as.data.frame,csAnno)
S3method(filter,GRanges)
S3method(mutate,GRanges)
S3method(rename,GRanges)
S3method(subset,csAnno)
export(.)
export(GRangesList)
export(annotatePeak)
export(as.GRanges)
export(combine_csAnno)
export(covplot)
export(downloadGEObedFiles)
export(downloadGSMbedFiles)
export(dropAnno)
export(enrichAnnoOverlap)
export(enrichPeakOverlap)
export(getAnnoStat)
export(getBioRegion)
export(getGEOInfo)
export(getGEOgenomeVersion)
export(getGEOspecies)
export(getPromoters)
export(getSampleFiles)
export(getTagMatrix)
export(makeBioRegionFromGranges)
export(overlap)
export(peakHeatmap)
export(peakHeatmap_multiple_Sets)
export(peak_Profile_Heatmap)
export(plotAnnoBar)
export(plotAnnoPie)
export(plotAnnoPie.csAnno)
export(plotAvgProf)
export(plotAvgProf2)
export(plotDistToTSS)
export(plotPeakProf)
export(plotPeakProf2)
export(readPeakFile)
export(rel)
export(seq2gene)
export(shuffle)
export(tagHeatmap)
export(vennpie)
export(vennplot)
export(vennplot.peakfile)
exportClasses(csAnno)
exportMethods(plotAnnoBar)
exportMethods(plotAnnoPie)
exportMethods(plotDistToTSS)
exportMethods(show)
exportMethods(upsetplot)
exportMethods(vennpie)
import(BiocGenerics)
import(GenomeInfoDb)
import(GenomicRanges)
import(IRanges)
import(S4Vectors)
importFrom(AnnotationDbi,get)
importFrom(AnnotationDbi,select)
importFrom(BiocGenerics,end)
importFrom(BiocGenerics,start)
importFrom(GenomicFeatures,exonsBy)
importFrom(GenomicFeatures,fiveUTRsByTranscript)
importFrom(GenomicFeatures,genes)
importFrom(GenomicFeatures,intronsByTranscript)
importFrom(GenomicFeatures,threeUTRsByTranscript)
importFrom(GenomicFeatures,transcripts)
importFrom(GenomicFeatures,transcriptsBy)
importFrom(GenomicRanges,GRangesList)
importFrom(S4Vectors,metadata)
importFrom(S4Vectors,subset)
importFrom(TxDb.Hsapiens.UCSC.hg19.knownGene,TxDb.Hsapiens.UCSC.hg19.knownGene)
importFrom(aplot,insert_bottom)
importFrom(aplot,plot_list)
importFrom(boot,boot)
importFrom(boot,boot.ci)
importFrom(dplyr,arrange)
importFrom(dplyr,filter)
importFrom(dplyr,group_by)
importFrom(dplyr,mutate)
importFrom(dplyr,summarise)
importFrom(enrichplot,upsetplot)
importFrom(ggplot2,aes)
importFrom(ggplot2,aes_)
importFrom(ggplot2,aes_string)
importFrom(ggplot2,coord_fixed)
importFrom(ggplot2,coord_flip)
importFrom(ggplot2,element_blank)
importFrom(ggplot2,element_text)
importFrom(ggplot2,facet_grid)
importFrom(ggplot2,geom_bar)
importFrom(ggplot2,geom_blank)
importFrom(ggplot2,geom_hline)
importFrom(ggplot2,geom_line)
importFrom(ggplot2,geom_rect)
importFrom(ggplot2,geom_ribbon)
importFrom(ggplot2,geom_segment)
importFrom(ggplot2,geom_text)
importFrom(ggplot2,geom_tile)
importFrom(ggplot2,geom_vline)
importFrom(ggplot2,ggplot)
importFrom(ggplot2,ggtitle)
importFrom(ggplot2,guide_legend)
importFrom(ggplot2,labs)
importFrom(ggplot2,rel)
importFrom(ggplot2,scale_color_manual)
importFrom(ggplot2,scale_fill_brewer)
importFrom(ggplot2,scale_fill_distiller)
importFrom(ggplot2,scale_fill_hue)
importFrom(ggplot2,scale_fill_manual)
importFrom(ggplot2,scale_x_continuous)
importFrom(ggplot2,scale_y_continuous)
importFrom(ggplot2,theme)
importFrom(ggplot2,theme_bw)
importFrom(ggplot2,theme_classic)
importFrom(ggplot2,theme_minimal)
importFrom(ggplot2,xlab)
importFrom(ggplot2,xlim)
importFrom(ggplot2,ylab)
importFrom(gplots,plot.venn)
importFrom(grDevices,colorRampPalette)
importFrom(graphics,layout)
importFrom(graphics,legend)
importFrom(graphics,par)
importFrom(graphics,pie)
importFrom(graphics,plot.new)
importFrom(gtools,permutations)
importFrom(magrittr,"%<>%")
importFrom(magrittr,"%>%")
importFrom(methods,as)
importFrom(methods,is)
importFrom(methods,missingArg)
importFrom(methods,new)
importFrom(methods,show)
importFrom(parallel,detectCores)
importFrom(parallel,mclapply)
importFrom(plotrix,floating.pie)
importFrom(rlang,.data)
importFrom(rlang,quos)
importFrom(rtracklayer,import.chain)
importFrom(rtracklayer,liftOver)
importFrom(stats,p.adjust)
importFrom(stats,phyper)
importFrom(utils,data)
importFrom(utils,download.file)
importFrom(utils,read.delim)
importFrom(utils,setTxtProgressBar)
importFrom(utils,txtProgressBar)
importFrom(yulab.utils,get_cache_element)
importFrom(yulab.utils,get_cache_item)
importFrom(yulab.utils,initial_cache_item)
importFrom(yulab.utils,mat2df)
importFrom(yulab.utils,rm_cache_item)
importFrom(yulab.utils,update_cache_item)
importFrom(yulab.utils,yulab_msg)


================================================
FILE: NEWS
================================================
CHANGES IN VERSION 1.15.2
------------------------
 o bug fixed for 'overlap = "all"' to consider strand information <2017-12-12, Tue>

CHANGES IN VERSION 1.15.1
------------------------
 o define downstream distance via options(ChIPseeker.downstreamDistance = 3000)
   + https://support.bioconductor.org/p/103135/

CHANGES IN VERSION 1.13.1
------------------------
 o fixed issue of naming intronList <2017-07-06, Thu>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/57#issuecomment-313342399

CHANGES IN VERSION 1.12.0
------------------------
 o BioC 3.5 release <2017-04-26, Wed>

CHANGES IN VERSION 1.11.4
------------------------
 o bug fixed of intron rank <2017-04-19, Wed>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/54

CHANGES IN VERSION 1.11.3
------------------------
 o bug fixed of dropAnno <2017-04-10, Mon>
 o bug fixed of peak width generated by shuffle <2017-03-31, Fri>
   + <https://github.com/GuangchuangYu/ChIPseeker/issues/51>

CHANGES IN VERSION 1.11.2
------------------------
 o optimize getGeneAnno <2016-12-21, Wed>
 o change plotAnnoBar and plotDistToTSS according to stacking bar order change in ggplot2 (v2.2.0) <2016-12-16, Fri>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/47
   + https://blog.rstudio.org/2016/11/14/ggplot2-2-2-0/

CHANGES IN VERSION 1.11.1
------------------------
 o update startup message <2016-11-09, Wed>

CHANGES IN VERSION 1.10.0
------------------------
 o BioC 3.4 released <2016-10-18, Tue>

CHANGES IN VERSION 1.9.8
------------------------
 o plotAvgProf/plotAvgProf2 order of panel by names of input tagMatrix List <2016-09-25, Sun>
 o test ENSEMBL ID using '^ENS' instead of '^ENSG' <2016-09-20, Tue>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/41

CHANGES IN VERSION 1.9.7
------------------------
 o unit test <2016-08-16, Tue>

CHANGES IN VERSION 1.9.6
------------------------
 o update vignette <2016-08-16, Tue>

CHANGES IN VERSION 1.9.5
------------------------
 o when TxDb doesn't have gene_id information, converting gene ID (ensembl/entrez and symbol) will be omitted instead of throw error. <2016-08-02, Tue>
   + https://www.biostars.org/p/204142
 o bug fixed if testing targetPeak is a list of GRanges objects in enrichPeakOverlap function <2016-07-20, Wed>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/37
   + https://github.com/GuangchuangYu/ChIPseeker/issues/36
 o fixed typo in determine gene ID type <2016-06-21, Tue>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/28#issuecomment-227212519
 o move upsetplot generics to DOSE and import from DOSE to prevent function name conflict <2016-06-14, Tue>

CHANGES IN VERSION 1.9.4
------------------------
 o bug fixed <2016-06-08, Wed>
   + https://github.com/GuangchuangYu/ChIPseeker/issues/17#issuecomment-224407402
   + https://github.com/GuangchuangYu/ChIPseeker/pull/24/files

CHANGES IN VERSION 1.9.3
------------------------
 o use byte compiler <2016-05-18, Wed>
 o https://github.com/Bioconductor-mirror/ChIPseeker/commit/f1ada57b9c66a1a44355bbbbdaf5b0a88e10cf7d

CHANGES IN VERSION 1.9.2
------------------------
 o name tagMatrix in plotAvgProf automatically if missing <2016-05-12, Thu>
 o https://github.com/Bioconductor-mirror/ChIPseeker/commit/d5f16b2bc01725e30282c3acb33007ef521a514c

CHANGES IN VERSION 1.9.1
------------------------
 o bug fixed in getNearestFeatureIndicesAndDistances <2016-05-11, Wed>
   + correct metadata in dummy NA feature

CHANGES IN VERSION 1.8.0
------------------------
 o BioC 3.3 released <2016-05-05, Thu>

CHANGES IN VERSION 1.7.15
------------------------
 o update GEO data <2016-03-21, Mon>

CHANGES IN VERSION 1.7.14
------------------------
 o list_to_dataframe works with data frames that have different colnames <2016-03-10, Thu>

CHANGES IN VERSION 1.7.13
------------------------
 o support annotate peaks with custom regions via passing TxDb=user_defined_GRanges to annotatePeak <2016-03-06, Sun>

CHANGES IN VERSION 1.7.12
------------------------
 o fixed R check <2016-03-05, Sat>
 o implement list_to_dataframe that mimic ldply and remove ldply dependency <2016-03-05, Sat>

CHANGES IN VERSION 1.7.11
------------------------
 o fixed issue in testing list in covplot introduced in 1.7.9 <2016-03-02, Wed>

CHANGES IN VERSION 1.7.10
------------------------
 o determined gene ID type if TxDb doesn't contain corresponding metadata <2016-03-01, Tue>
   + fixed https://github.com/GuangchuangYu/ChIPseeker/issues/28

CHANGES IN VERSION 1.7.9
------------------------
 o covplot support GRangesList <2016-02-24, Wed>
 o update ReactomePA citation info <2016-02-17, Wed>

CHANGES IN VERSION 1.7.8
------------------------
 o fixed BUG of Peaks upstream first or downstream last gene not annotated <2016-01-20, Wed>
   + contributed by Michael Kluge
   + see https://github.com/GuangchuangYu/ChIPseeker/pull/24

CHANGES IN VERSION 1.7.7
------------------------
 o bug fixed in newly introduced parameter 'overlap'. solve NA issue. <2016-01-13, Wed>

CHANGES IN VERSION 1.7.6
------------------------
 o introduce 'overlap' parameter in annotatePeak, by default overlap="TSS" and only overlap with TSS will be reported as the nearest gene.
   if overlap="all", then gene overlap with peak will be reported as nearest gene, no matter the overlap is at TSS region or not. <2016-01-12, Tue>
 o bug fixed in find overlap with peaks have strand info. <2016-01-12, Tue>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/23

CHANGES IN VERSION 1.7.5
------------------------
 o add paramters, sameStrand,ignoreOverlap, ignoreUpstream and ignoreDownstream in annotatePeak <2016-01-10, Sun>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/17
 o bug fixed in peak orientation <2016-01-10, Sun>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/22

CHANGES IN VERSION 1.7.4
------------------------
 o stop if input list of csAnno object has no name attribute
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/21
   + plotAnnoBar
   + plotDistToTSS
 o [covplot] xlim now not only restrict the window of data but also set the limit of the graphic object <2015-12-30, Wed>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/20

CHANGES IN VERSION 1.7.3
------------------------
 o fixed R check <2015-12-29, Tue>

CHANGES IN VERSION 1.7.2
------------------------
 o use geom_rect instead of geom_segment in covplot <2015-11-30, Mon>
 o open lower parameter (by default =1) to specific lower cutoff of coverage signal <2015-11-29, Sun>
 o fixed covplot to work with None RleViews of specific chromosome <2015-11-29, Sun>
 o addFlankGeneInfo now works with level="gene" <2015-11-19, Thu>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/18

CHANGES IN VERSION 1.7.1
------------------------
 o fixed extracting ID type from TxDb object, since the change of metadata(TxDb). now using grep to extract. <2015-10-27, Tue>
 o add vp parameter to set viewport of vennpie on top of upsetplot by user request <2015-10-26, Mon>
   + see http://ygc.name/2015/07/28/upsetplot-in-chipseeker/#comment-19470
 o getBioRegion function <2015-10-20, Tue>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/16

CHANGES IN VERSION 1.7.0
------------------------
 o BioC 3.3 branch

CHANGES IN VERSION 1.5.11
------------------------
 o remove ellipsis parameter in enrichPeakOverlap function and extend it to support GRanges objects <2015-10-08, Thu>
    + see https://support.bioconductor.org/p/73069/
 o fixed the issue, https://github.com/GuangchuangYu/ChIPseeker/issues/13 <2015-10-05, Mon>
 o update GEO info, now contains >18,000 bed file information <2015-09-24, Thu>

CHANGES IN VERSION 1.5.10
------------------------
 o dropAnno function, eg. drop nearest gene annotation that far from TSS (>10k). <2015-09-17, Thu>
   + see https://github.com/GuangchuangYu/ChIPseeker/issues/9
   + add parameter distanceToTSS_cutoff to enrichAnnoOverlap
 o use base::subset in plotDistToTSS instead of subsetting data within geom_bar <2015-09-17, Thu>
   + see https://github.com/hadley/ggplot2/issues/1295
   + subset parameter in layer will be removed in next release of ggplot2.

CHANGES IN VERSION 1.5.9
------------------------
 o bug fixed of enrichAnnoOverlap <2015-08-26, Wed>
 o change parameter order.matrix to order.by in upsetplot to meet the change of UpSetR pkg <2015-08-26, Wed>

CHANGES IN VERSION 1.5.8
------------------------
 o better implementation of getFirstHitIndex.  <2015-07-29, Wed>
   + contributed by Herve Pages.
   + see https://support.bioconductor.org/p/70432/#70545.

CHANGES IN VERSION 1.5.7
------------------------
 o add vennpie parameter in upsetplot <2015-07-20, Mon>
 o upsetplot function for csAnno object <2015-07-20, Mon>

CHANGES IN VERSION 1.5.6
------------------------
 o update citation info <2015-07-09, Thu>
 o BED file +1 shift for BED coordinate system start at 0 <2015-07-07, Tue>

CHANGES IN VERSION 1.5.5
------------------------
 o seq2gene for linking genomic regions to genes by many-to-many mapping. <2015-06-29, Mon>

CHANGES IN VERSION 1.5.4
------------------------
 o add pseudocount in enrichPeakOverlap to prevent 0 pvalue <2015-05-22, Fri>

CHANGES IN VERSION 1.5.3
------------------------
 o convert the vignette from Rnw to Rmd format <2015-05-17, Sun>

CHANGES IN VERSION 1.5.1
------------------------
 o minor bug fixed in getChrCov <2015-04-27, Mon>

CHANGES IN VERSION 1.3.15
------------------------
 o update vignette <2015-03-31, Tue>

CHANGES IN VERSION 1.3.14
------------------------
 o add pool parameter in enrichPeakOverlap <2015-03-30, Mon>

CHANGES IN VERSION 1.3.13
------------------------
 o update enrichPeakOverlap to support nShuffle = 0, which now will report only overlay with pvalue = NA <2015-03-29, Sun>
 o add facet and free_y parameter for plotAvgProf and plotAvgProf2 <2015-03-29, Sun>
 o update docs <2015-03-29, Sun>
 o update plotAvgProf and plotAvgProf2 to fully supporting confidence interval,
   see https://github.com/GuangchuangYu/ChIPseeker/pull/6 <2015-03-29, Sun>

CHANGES IN VERSION 1.3.12
------------------------
 o add confidence interval for plotAvgProf, see https://github.com/GuangchuangYu/ChIPseeker/issues/3 <2015-03-26, Thu>

CHANGES IN VERSION 1.3.11
------------------------
 o add citation <2015-03-16, Mon>

CHANGES IN VERSION 1.3.10
------------------------
 o update GEO data <2015-03-03, Tue>

CHANGES IN VERSION 1.3.9
------------------------
 o add parameter *genomicAnnotationPriority* for annotatePeak function <2015-02-27, Fri>

CHANGES IN VERSION 1.3.8
------------------------
 o add DOSE citation <2015-02-13, Fri>

CHANGES IN VERSION 1.3.7
------------------------
 o bug fixed in plotDistToTSS <2015-02-06, Fri>

CHANGES IN VERSION 1.3.6
------------------------
 o when peak is exactly located at gene end and near the end of chromosome,
   NA will be generated and throw error when assigning downstream of gene end.
   This bug has been fixed <2015-02-03, Tue>

CHANGES IN VERSION 1.3.5
------------------------
 o bug fixed in getNearestFeatureIndicesAndDistances when peak in the
   very begining or end of the chromosome <2015-01-30, Fri>

CHANGES IN VERSION 1.3.4
------------------------
 o bug fixed for introducing dplyr in plotDistToTSS <2015-01-28, Wed>

CHANGES IN VERSION 1.3.3
------------------------
 o update vignette to use BiocStyle::latex() <2015-01-26, Mon>

CHANGES IN VERSION 1.3.2
------------------------
 o fixed import issue to meet the changes of AnnotationDbi and S4Vectors <2015-01-22, Thu>

CHANGES IN VERSION 1.1.21
------------------------
 o use data.table instead of data.frame to optimize covplot <2014-10-06, Mon>

CHANGES IN VERSION 1.1.20
------------------------
 o update annotatePeak to store the seqinfo information <2014-09-30, Tue>
 o modified runValue(x) to sapply(x, runValue) <2014-09-30, Tue>

CHANGES IN VERSION 1.1.19
------------------------
 o implement csAnno S4 object <2014-09-28, Sun>
 o modify plot function for csAnno instance <2014-09-28, Sun>
 o implement vennpie function <2014-09-28, Sun>

CHANGES IN VERSION 1.1.17
------------------------
 o deprecate plotChrCov to new function covplot <2014-08-18, Mon>
 o add new paramter chrs and xlim to covplot <2014-08-18, Mon>

CHANGES IN VERSION 1.1.16
------------------------
 o optimize plotChrCov, running time reduce drastically <2014-08-15, Fri>

CHANGES IN VERSION 1.1.15
------------------------
 o remove un-mappable peak to prevent fail in peak annotation <2014-08-14, Thu>

CHANGES IN VERSION 1.1.14
------------------------
 o bug fixed in plotDistToTSS <2014-08-14, Thu>

CHANGES IN VERSION 1.1.13
------------------------
 o change TranscriptDb to TxDb according to GenomicFeatures <2014-07-29, Tue>

CHANGES IN VERSION 1.1.12
------------------------
 o bug fixed in plotChrCov <2014-07-21, Mon>

CHANGES IN VERSION 1.1.10
------------------------
 o bug fixed in calculating distances from peak end <2014-06-18, Wed>

CHANGES IN VERSION 1.1.9
------------------------
 o add level parameter to annotatePeak, and set it to "transcript" by default.
   Now annotatePeak will annotate peaks in transcript level
   except user specify level = "gene" <2014-06-16, Mon>
 o add addFlankGeneInfo parameter to annotatePeak.
   If it set to true, all features within the flankDistance will be annotated. <2014-06-16, Mon>

CHANGES IN VERSION 1.1.8
------------------------
 o bug fixed when peak overlap with feature <2014-06-11, Wed>
 o optimize for getting overlap features of peaks <2014-06-11, Wed>
 o update plotAnnoPie, separate the pie and legend to prevent label overlap <2014-06-12, Thu>

CHANGES IN VERSION 1.1.7
------------------------
 o bug fixed in calculating distanceToTSS <2014-06-03, Tue>

CHANGES IN VERSION 1.1.6
------------------------
 o add chainFile parameter in enrichAnnoOverlap and enrichPeakOverlap to support different genome version comparision <2014-06-01, Sun>
 o fixed color bug in peakHeatmap.internal2 and plotAnnoBar <2014-06-02, Mon>
 o update vignettes <2014-06-02, Mon>

CHANGES IN VERSION 1.1.5
------------------------
 o export getPromoters and getTagMatrix <2014-05-31, Sat>
 o rename plotAvgProf to plotAvgProf2 and implement plotAvgProf based on tagMatrix  <2014-05-31, Sat>
 o implement tagHeatmap for visualize heatmap of the tagMatrix or a list of tagMatrix <2014-05-31, Sat>
 o implement shuffle function to generate a random ChIP data based on a real one <2014-05-31, Sat>
 o implement enrichPeakOverlap to calcuate significant of ChIP experiments based on the genome coordinations <2014-05-31, Sat>
 o implement enrichAnnoOverlap to calculate significant of ChIP experiments based on their nearest gene annotation <2014-05-31, Sat>
 o incorporate GEO database for mining significant overlap of ChIP data <2014-05-31, Sat>
   + getGEOspecies summarize the collected data by species
   + getGEOgenomeVersion summarize the colleted data by genome version
   + getGEOInfo extract the information by genome version query
   + downloadGEObedFiles download all bed files of a particular genome version
   + downloadGSMbedFiles download the bed files of the input GSM list.

CHANGES IN VERSION 1.1.4
------------------------
 o in the annotation column of output of annotatePeak function,
	if Exon/Intron, the output change to 'Transcript_Name/GeneID, Exon no. of total_no.' <2014-05-14, Wed>

CHANGES IN VERSION 1.1.3
------------------------
 o bug fixed when metadata(TranscriptDb) contained NA <2014-04-30, Wed>
 o support ID type of Ensembl in annotatePeak (Entrez was supported) <2014-04-30, Wed>

CHANGES IN VERSION 1.1.2
------------------------
 o implemented plotChrCov <2014-04-25, Fri>
 o implemented plotAvgProf and peakHeatmap <2014-04-24, Thu>

CHANGES IN VERSION 1.1.1
------------------------
 o output of annotatePeak now contain chromosome length information <2014-04-22, Tue>
 o re-implement plotAnnoPie to use ordinary pie plot instead of pie3D <2014-04-21, Mon>

CHANGES IN VERSION 1.0.0
------------------------
 o initial version with the following functions implemented:
   + annotatePeak
   + overlap
   + plotAnnoBar
   + plotAnnoPie
   + plotDistToTSS
   + readPeakFile
   + vennplot
   + vennplot.peakfile


================================================
FILE: NEWS.md
================================================
# ChIPseeker 1.48.0

+ Bioconductor RELEASE_3_23 (2026-04-29, Wed)

# ChIPseeker 1.47.1

+ fixed issue in 'test-txdb.R' as 'TxDb.Hsapiens.UCSC.hg19.knownGene' changes its transcript ID from UCSC (e.g., uc002qsd.4) to Ensembl (e.g., ENST00000487630.1_3) (2025-11-04, Tue)

# ChIPseeker 1.46.0

+ Bioconductor RELEASE_3_22 (2025-11-01, Sat)

# ChIPseeker 1.45.2

+ new cache mechanism from 'yulab.utils' (2025-10-15, Wed)

# ChIPseeker 1.44.0

+ Bioconductor RELEASE_3_21 (2025-04-17, Thu)

# ChIPseeker 1.42.0

+ Bioconductor RELEASE_3_20 (2024-10-30, Wed)

# ChIPseeker 1.41.3

+ Better `covplot()`. Support universal chromosome names, and keep the default order of multiple peaks when plot a list of `GRanges` object.
+ Robust `generate_colors()`. Edit the logical of decision, and can validate color code automatically.
+ Extend dplyr verbs (`filter()`, `mutate()`, `arrange()`, `rename()`) to peak (`GRanges` object or `data.frame`), see #242.

# ChIPseeker 1.41.2

+ Enhancement of `plotDistToTSS()`, see #241.

# ChIPseeker 1.41.1

+ use `yulab.utils::yulab_msg()` for startup message (2024-07-26, Fri)

# ChIPseeker 1.40.0

+ Bioconductor RELEASE_3_19 (2024-05-15, Wed)

# ChIPseeker 1.38.0

+ Bioconductor RELEASE_3_18 (2023-10-25, Wed)

# ChIPseeker 1.36.0

+ Bioconductor RELEASE_3_17 (2023-05-03, Wed)

# ChIPseeker 1.35.3

+ fixed R check by removing calling `BiocStyle::Biocpkg()` in vignette, instead we use `yulab.utils::Biocpkg()` (2023-04-11, Tue)

# ChIPseeker 1.35.2

+ fixed R check by adding 'prettydoc' to Suggests (2023-04-04, Tue)

# ChIPseeker 1.35.1

+ use `ggplot` to plot heatmap (2022-12-30, Fri, #203)
+ update startup message to display the 'Current Protocols (2022)' paper. 

# ChIPseeker 1.34.0

+ Bioconductor RELEASE_3_16 (2022-11-02, Wed)


# ChIPseeker 1.33.4

+ add citation Q. Wang (2022) (2022-10-29, Sat)

# ChIPseeker 1.33.3

+ allows passing user defined color to `vennpie()` (2022-10-20, Thu, #202, #207)
+ add `columns` paramter to `annotatePeak()` to better support passing `EnsDb` to `annoDb` (#193, #205)
+ export `getAnnoStat()` (#200, #204)

# ChIPseeker 1.33.2

+ supports `by = "ggVennDiagram"` in `vennplot` function (2022-09-13, Tue)

# ChIPseeker 1.33.1

+ `plotPeakProf()` allows passing GRanges object or a list of GRanges objects to TxDb parameter (2022-06-04, Sat)
+ add test files for `getTagMatrix()` and `plotTagMatrix()`
+ `getBioRegion()` supports UTR regions (3'UTR + 5'UTR)
+ `makeBioRegionFromGranges()` supports generating windoes from self-made GRanges object
+ allow specify colors in `covplot()` (2022-05-09, Mon, #185, #188)

# ChIPseeker 1.32.0

+ Bioconductor 3.15 release

# ChIPseeker 1.31.4

+ `readPeakFile` now supports `.broadPeak` and `.gappedPeak` files (2021-12-17, Fri, #173) 

# ChIPseeker 1.31.3

+ bug fixed of determining promoter region in minus strand (2021-12-16, Thu, #172)

# ChIPseeker 1.31.2

+ update vignette

# ChIPseeker 1.31.1

+ bug fixed to take strand information (2021-11-10, Wed, #167)

# ChIPseeker 1.30.0

+ Bioconductor 3.14 release

# ChIPseeker 1.29.2

+ extend functions for plotting peak profiles to support other types of bioregions (2021-10-15, Fri, @MingLi-929, #156, #160, #162, #163)

# ChIPseeker 1.29.1

+ add example for `seq2gene` function (2021-05-21, Fri)

# ChIPseeker 1.28.0

+ Bioconductor 3.13 release (2021-05-20, Thu)

# ChIPseeker 1.27.5

+ update GEO data (103398/1973025 GSM) (2021-05-14, Fri)

# ChIPseeker 1.27.4

+ bug fixed in determine downstream gene (2021-04-27, Thu)
  - <https://github.com/YuLab-SMU/ChIPseeker/pull/148>
+ `getBioRegion` now supports '3UTR' and '5UTR' (2021-03-30, Tue)
  - <https://github.com/YuLab-SMU/ChIPseeker/pull/146>

# ChIPseeker 1.27.3

+ add two parameter, cex and radius, to `plotAnnoPie` (2021-03-12, Fri)
  - <https://github.com/YuLab-SMU/ChIPseeker/pull/144>

# ChIPseeker 1.27.2

+ bug fixed of `getGenomicAnnotation` (2021-03-03, Wed)
  - <https://github.com/YuLab-SMU/ChIPseeker/issues/142>

# ChIPseeker 1.27.1

+ Add support for `EnsDb` annotation databases in `annotatePeak`. 
  - <https://github.com/YuLab-SMU/ChIPseeker/pull/120>

# ChIPseeker 1.26.0

+ Bioconductor 3.12 release (2020-10-28, Wed)


# ChIPseeker 1.23.1

+ update GEO data (51079/762820 GSM) (2019-12-20, Fri)

# ChIPseeker 1.22.0

+ Bioconductor 3.10 release
 
# ChIPseeker 1.21.1

+ new implementation of `upsetplot` (2019-08-29, Thu)
  - use `ggupset`, `ggimage` and `ggplotify`
+ `subset` method for `csAnno` object (2019-08-27, Tue)

# ChIPseeker 1.20.0

+ Bioconductor 3.9 release

# ChIPseeker 1.19.1

+ add `origin_label = "TSS"` parameter to `plotAvgProf` (2018-12-12, Wed)
  - <https://github.com/GuangchuangYu/ChIPseeker/issues/91>
  
# ChIPseeker 1.18.0

+ Bioconductor 3.8 release

# ChIPseeker 1.17.2

+ add `flip_minor_strand` parameter in `getTagMatrix` (2018-08-10, Fri)
  - should set to FALSE if windows if not symetric
  
# ChIPseeker 1.17.1

+ fixed issue of `vennpie` by adding pseudo-count +1 (2018-07-21, Sat)
  - <https://www.biostars.org/p/326456/>

# ChIPseeker 1.16.0

+ Bioconductor 3.7 release

# ChIPseeker 1.15.4

+ If the required input is a named list and user input a list without name,
  set the name automatically and throw warning msg instead of error <2018-03-14,
  Wed>
    - <https://support.bioconductor.org/p/106903/#106936>
+ change `plotAvgProf`'s default y label <2018-03-14, Wed>
    - <https://github.com/GuangchuangYu/ChIPseeker/issues/76>
+ plotAnnoBar now visualize barplot according to the order of input list
  (y-axis) (2018-02-27, Tue)
    - <https://github.com/GuangchuangYu/ChIPseeker/issues/73>
+ follow renaming of RangesList class -> IntegerRangesList in IRanges v2.13.12
    - <https://github.com/GuangchuangYu/ChIPseeker/commit/b62d7922fb61e58620bbb685e4def4fb863c8e81>

# ChIPseeker 1.15.3

+ options to ignore '1st exon', '1st intron', 'downstream' and promoter
  subcategory when summarizing result and visualization (2018-01-09, Tue)
    - <https://support.bioconductor.org/p/104676/#104689>
+ throw msg of 'file not found and skip' when requested url is not available
  when downloading BED file from GEO (2017-12-28, Thu)
    - <https://support.bioconductor.org/p/104491/#104507>
+ bug fixed of getGene (2017-12-27, Wed)


================================================
FILE: R/AllGenerics.R
================================================
##' vennpie method generics
##'
##'
##' @docType methods
##' @name vennpie
##' @rdname vennpie-methods
##' @export
setGeneric("vennpie", 
  function(x, r = 0.2, cex = 1.2, ...) 
  standardGeneric("vennpie")
)


##' plotDistToTSS method generics
##'
##'
##' @docType methods
##' @name plotDistToTSS
##' @rdname plotDistToTSS-methods
##' @export
setGeneric("plotDistToTSS", 
  function(x, 
    distanceColumn="distanceToTSS",
    xlab="", ylab="Binding sites (%) (5'->3')",
    title="Distribution of transcription factor-binding loci relative to TSS", 
    ...)
  standardGeneric("plotDistToTSS")
)

##' plotAnnoBar method generics
##'
##'
##' @docType methods
##' @name plotAnnoBar
##' @rdname plotAnnoBar-methods
##' @export
setGeneric("plotAnnoBar", 
  function(x,
    xlab="",
    ylab="Percentage(%)",
    title="Feature Distribution",
    ...)
  standardGeneric("plotAnnoBar")
)


##' plotAnnoPie method generics
##'
##'
##' @docType methods
##' @name plotAnnoPie
##' @rdname plotAnnoPie-methods
##' @export
setGeneric("plotAnnoPie", 
  function(x, 
    ndigit=2,
    cex=0.9,
    col=NA,
    legend.position="rightside",
    pie3D=FALSE,
    radius=0.8,
    ...)
  standardGeneric("plotAnnoPie")
)


================================================
FILE: R/ChIPseeker-package.R
================================================
#' @keywords internal
"_PACKAGE"



##' Information Datasets
##' 
##' ucsc genome version, precalcuated data and gsm information
##' 
##' @name info
##' @aliases ucsc_release
##' gsminfo
##' tagMatrixList
##' @docType data
##' @keywords datasets
NULL

##' Name of the ChIPseeker cache environment (internal static variable)
##' @format character vector 
ChIPseekerCache <- "ChIPseekerEnv"



================================================
FILE: R/GEO.R
================================================
########################################
##                                    ##
## data last update: Mar 03, 2015     ##
##                                    ##
########################################



##' accessing species statistics collecting from GEO database
##'
##'
##' @title getGEOspecies
##' @return data.frame
##' @author G Yu
##' @export
getGEOspecies <- function() {
    gsminfo <- get_gsminfo()
    species <- gsminfo$organism
    res <- as.data.frame(table(species))
    return(res)
}

##' get genome version statistics collecting from GEO ChIPseq data
##'
##'
##' @title getGEOgenomeVersion
##' @return data.frame
##' @author G Yu
##' @export
getGEOgenomeVersion <- function() {
    gsminfo <- get_gsminfo()
    gv <- gsminfo[, c("organism",
                      "genomeVersion")]
    genomeVersion <- gv$genomeVersion
    res <- as.data.frame(table(genomeVersion))
    gv <- unique(gv)

    res <- merge(gv, res, by.x="genomeVersion", by.y="genomeVersion", all.y=TRUE)
    res <- res[, c("organism", "genomeVersion", "Freq")]
    return(res)
}

##' get subset of GEO information by genome version keyword
##'
##'
##' @title getGEOInfo
##' @param genome genome version
##' @param simplify simplify result or not
##' @return data.frame
##' @author G Yu
##' @export
getGEOInfo <- function(genome, simplify =TRUE) {
    gsminfo <- get_gsminfo()
    genomeVersion <- NULL ## to satisfy codetools
    res <- subset(gsminfo, subset = genomeVersion == genome)
    if (simplify) {
        res <- res[,c("series_id", "gsm", "organism", "title", "supplementary_file", "genomeVersion", "pubmed_id")]
    }
    return(res)
}

##' download all BED files of a particular genome version
##'
##'
##' @title downloadGEObedFiles
##' @param genome genome version
##' @param destDir destination folder
##' @return NULL
##' @author G Yu
##' @export
downloadGEObedFiles <- function(genome, destDir=getwd()) {
    info <- getGEOInfo(genome)
    downloadGEO.internal(info, destDir)
}

##' download BED supplementary files of a list of GSM accession numbers
##'
##'
##' @title downloadGSMbedFiles
##' @param GSM GSM accession numbers
##' @param destDir destination folder
##' @return NULL
##' @author G Yu
##' @export
downloadGSMbedFiles <- function(GSM, destDir=getwd()) {
    gsminfo <- get_gsminfo()
    info <- gsminfo[gsminfo$gsm %in% GSM,]
    downloadGEO.internal(info, destDir)
}

##' @importFrom utils download.file
downloadGEO.internal <- function(info, destDir) {
    fnames <- as.character(info$supplementary_file)
    destfiles <- sub(".*\\/", paste(destDir, "/", sep=""), fnames)
    names(destfiles) <- NULL

    for (i in seq_along(fnames)) {
        if ( ! file.exists(destfiles[i]) )
            tryCatch(download.file(fnames[i],
                          destfile=destfiles[i],
                          mode="wb"),
                     error = function(e) message(fnames[i], ': file not found and skip'))
    }
}

##' @importFrom utils data
## @importFrom GEOmetadb
## @importFrom RSQLite dbConnect
## @importFrom RSQLite dbGetQuery
prepareGSMInfo <- function() {
    pkg <- "GEOmetadb"
    require(pkg, character.only=TRUE)
    getSQLiteFile <- eval(parse(text="getSQLiteFile"))
    ## get the latest version of sql file
    is.dl <- tryCatch(getSQLiteFile(), error = function(e) NULL)

    if (is.null(is.dl)) {
        url <- 'http://starbuck1.s3.amazonaws.com/sradb/GEOmetadb.sqlite.gz'
        HEAD <- eval(parse(text = "httr::HEAD"))
        hh <- HEAD(url)
        size <- hh$headers[["content-length"]]
        cmd <- paste('wget -c', url)
        while(file.info("GEOmetadb.sqlite.gz")$size < size) {
            system(cmd)
        }
        if (file.exists('GEOmetadb.sqlite') && file.exists('GEOmetadb.sqlite.gz')) {
            file.remove("GEOmetadb.sqlite")
        }
        system('gunzip GEOmetadb.sqlite.gz')
    }

    GEOmetadbFile="GEOmetadb.sqlite"
    file.info(GEOmetadbFile)

    sqlpkg <- "RSQLite"
    require(sqlpkg, character.only=TRUE)
    dbConnect <- eval(parse(text="dbConnect"))
    dbGetQuery <- eval(parse(text="dbGetQuery"))
    SQLite <- eval(parse(text="SQLite"))

    con <- dbConnect(SQLite(),GEOmetadbFile)
    ## dbListTables(con)


    pkg <- "GEOquery"
    require(pkg, character.only=TRUE)
    getGEO <- eval(parse(text="getGEO"))
    Meta <- eval(parse(text="Meta"))

    ## get all GPL IDs
    ## download soft using gpl = getGEO("GPLXXX")
    ## using Meta(gpl) find the technology match sequencing
    ## get all gsm IDs
    ## parse it

    gpl <- dbGetQuery(con, 'select gpl, technology from gpl')
    gpl <- gpl[gpl[,2] == "high-throughput sequencing",1]
    gpl <- gpl[!is.na(gpl)]

    ## save the processedGSM vector that contain all the GSM that have been processed.
    ## next time when preparing GSMInfo, filter those have been processed before.
    load(system.file("extdata/processedGSM.rda", package="ChIPseeker"))
    processedGSM <- get("processedGSM")
    newGSM <- c()

    gpldir <- "GPL"
    if (!file.exists(gpldir)) {
        dir.create(gpldir)
    }

    for (gid in gpl) {
        gg <- tryCatch(getGEO(gid, destdir=gpldir), error=function(e) NULL)
        if (is.null(gg)) {
            next
        }
        gsm <- Meta(gg)$sample_id
        gsm <- gsm[! (gsm %in% processedGSM) ]
        if (length(gsm) == 0) {
            next
        }
        newGSM <- c(newGSM, gsm)

        sf <- batchGetGSMsuppFile(gsm)
        if (!is.null(sf)) {
            save(sf, file=paste(gid, "_sf.rda", sep=""))
        }
    }

    processedGSM <- c(processedGSM, newGSM)
    processedGSM <- unique(processedGSM)
    save(processedGSM, file="../processedGSM.rda", compress="xz")


    sfiles <- list.files(pattern="_sf.rda")
    res <- data.frame(gsm=NULL, remoteFile=NULL)
    for (ff in sfiles) {
        load(ff)
        if (!is.null(sf)) {
            res <- rbind(res, sf)
        }
    }
    colnames(res)[2] <- "supplementary_file"


    GSMInfo <- lapply(unique(as.character(res$gsm)), function(i) {
        dbGetQuery(con,paste("select gsm,series_id,gpl,organism_ch1,title,characteristics_ch1,source_name_ch1,extract_protocol_ch1,description,data_processing,submission_date ",
                             "from gsm where gsm='", i, "'", sep=""))
    })

    GSMInfo <- do.call("rbind", GSMInfo)

    colnames(GSMInfo) <- sub("_ch1", "", colnames(GSMInfo))

    gsminfo <- merge(GSMInfo, res, by.x="gsm", by.y="gsm")

    tryCatch(utils::data("ucsc_release", package="ChIPseeker"))
    ucsc_release <- get("ucsc_release")

    genVer <- lapply(1:nrow(gsminfo), function(i)
                     getGenomicVersion(ucsc_release,
                                       gsminfo[i, "data_processing"],
                                       gsminfo[i, "organism"],
                                       gsminfo[i, "supplementary_file"])
                     )

    gsminfo$genomeVersion <- unlist(genVer)

    gse <- as.character(gsminfo$series_id)
    pubmed <- lapply(gse, function(i) {
        dbGetQuery(con,paste("select gse,pubmed_id ",
                             "from gse where gse='", i, "'", sep=""))
    })
    pm <- do.call(rbind, pubmed)
    pm <- unique(pm)
    gsminfo <- merge(gsminfo, pm, by.x="series_id", by.y="gse", all.x=TRUE)

    ## remove non-ASCII characters
    for(i in 1:ncol(gsminfo)) {
        gsminfo[,i] = iconv(gsminfo[,i], "latin1", "ASCII", sub="")
    }
    gsminfo2 <- gsminfo
    rm(gsminfo)

    utils::data("gsminfo", package="ChIPseeker")
    gsminfo <- get("gsminfo")
    gsminfo <- rbind(gsminfo, gsminfo2)
    gsminfo <- unique(gsminfo)

    save(gsminfo, file="../gsminfo.rda", compress="xz")
}


getGenomicVersion <- function(ucsc_release, data_processing, organism, supplementary_file) {
    data_processing <- as.character(data_processing)
    organism <- as.character(organism)
    supplementary_file <- as.character(supplementary_file)

    species <- NULL
    gs <- subset(ucsc_release, subset = species == organism)
    if (nrow(gs) == 0) return(NA)

    genMatch <- unlist(sapply(gs$ucsc_version, grep, data_processing))
    if (length(genMatch) == 0) {
        genMatch <- unlist(sapply(gs$ucsc_version, grep, supplementary_file))
        if (length(genMatch) == 0) {
            return(NA)
        }
    }

    genVer <- names(genMatch)
    if (length(genVer) > 1) {
        genVer <- genVer[1]
    }

    return(genVer)
}

## getGSE_ENCODE <- function() {
##     encode=readLines("http://www.ncbi.nlm.nih.gov/geo/info/ENCODE.html")
##     encode.chipseq <- encode[grep("ChIP-Seq", encode)]
##     ## require(gsubfn)
##     ## gse <- sapply(encode.chipseq, function(i) {
##     ##     res <- strapply(i, "(GSE\\d+)")
##     ##     unique(unlist(res))
##     ## })
##     gse <- sapply(encode.chipseq, gsub, pattern='.*(GSE\\d+).*', replacement='\\1')
##     names(gse) <- NULL
##     return(gse)
## }

## GSE2GSM <- function(GSE) {
##     info <- getGEO(GSE, GSEMatrix=FALSE)
##     metaInfo <- Meta(info)
##     gsm <- metaInfo$sample_id
##     return(gsm)
## }

##' @importFrom parallel mclapply
##' @importFrom parallel detectCores
batchGetGSMsuppFile <- function(gsm) {
    suppfiles <- mclapply(seq_along(gsm), function(i) {
        cat("processing ", gsm[i], "\t",  i , " of ", length(gsm), "\n")
        tryCatch(getGSMsuppFile(gsm[i]), error=function(e) NULL)
    }, mc.cores=detectCores())

    suppfiles <- suppfiles[!unlist(lapply(suppfiles, is.null))]

    sf <- do.call("rbind", suppfiles)
    return(sf)
}

getGSMsuppFile <- function(GSM) {
    pkg <- "GEOquery"
    require(pkg, character.only=TRUE)
    getGEO <- eval(parse(text="getGEO"))
    Meta <- eval(parse(text="Meta"))

    destdir="geo_soft"
    if (!file.exists(destdir)) {
        dir.create(destdir)
    }
    info <- getGEO(GSM, GSEMatrix=FALSE, destdir=destdir)
    ## http://www.ncbi.nlm.nih.gov/geo/info/soft2.html
    metaInfo <- Meta(info)

    ## suppmentary file names
    fnames <- unlist(metaInfo[grep("supplementary", names(metaInfo))])
    names(fnames) <- NULL
    i <- c(grep("bed.gz", fnames), grep("Peak.gz", fnames), grep("bedGraph.gz", fnames))

    if (length(i) == 0) {
        message("No bed files found")
        return(NULL)
    }
    fnames <- fnames[i]
    res <- data.frame(gsm=GSM, remoteFile = fnames)
    return(res)
}

get_gsminfo <- function() {
    tryCatch(utils::data("gsminfo", package="ChIPseeker"))
    gsminfo <- get("gsminfo")
    return(gsminfo)
}


================================================
FILE: R/addGeneAnno.R
================================================
##' get gene annotation, symbol, gene name etc.
##'
##'
##' @title getGeneAnno
##' @param annoDb annotation package
##' @param geneID query geneID
##' @param type gene ID type
##' @param columns names of columns to be obtained from database
##' @return data.frame
##' @importFrom AnnotationDbi select
##' @author G Yu
getGeneAnno <- function(annoDb, geneID, type, columns){
    kk <- unlist(geneID)
    require(annoDb, character.only = TRUE)
    annoDb <- eval(parse(text=annoDb))

    if (type == "Entrez Gene ID") {
        kt <- "ENTREZID"
    } else if (type =="Ensembl gene ID" || type == "Ensembl Gene ID") {
        kt <- "ENSEMBL"
    } else {
        message("geneID type is not supported...\tPlease report it to developer...\n")
        return(NA)
    }

    i <- which(!is.na(kk))
    kk <- gsub("\\.\\d+$", "", kk)
    ann <- tryCatch(
        suppressWarnings(select(annoDb,
                                keys=unique(kk[i]),
                                keytype=kt,
                                columns=columns)),
        error = function(e) NULL)

    if (is.null(ann)) {
        warning("ID type not matched, gene annotation will not be added...")
        return(NA)
    }
    idx <- getFirstHitIndex(ann[,kt])
    ann <- ann[idx,]

    ## idx <- unlist(sapply(kk, function(x) which(x==ann[,kt])))
    ## res <- matrix(NA, ncol=ncol(ann), nrow=length(kk)) %>% as.data.frame
    ## colnames(res) <- colnames(ann)
    ## res[i,] <- ann[idx,]

    rownames(ann) <- ann[, kt]
    res <- ann[as.character(kk),]

    return(res)
}


addGeneAnno <- function(peak.gr, annoDb, type, columns) {
    geneAnno <- getGeneAnno(annoDb, peak.gr$geneId, type, columns)
    if (! all(is.na(geneAnno))) {
        for(cn in colnames(geneAnno)[-1]) {
            mcols(peak.gr)[[cn]] <- geneAnno[, cn]
        }
    }
    return(peak.gr)
}



================================================
FILE: R/annotatePeak.R
================================================
##' Annotate peaks
##'
##'
##' @title annotatePeak
##' @param peak peak file or GRanges object
##' @param tssRegion Region Range of TSS
##' @param TxDb TxDb or EnsDb annotation object
##' @param level one of transcript and gene
##' @param assignGenomicAnnotation logical, assign peak genomic annotation or not
##' @param genomicAnnotationPriority genomic annotation priority
##' @param annoDb annotation package
##' @param addFlankGeneInfo logical, add flanking gene information from the peaks
##' @param flankDistance distance of flanking sequence
##' @param sameStrand logical, whether find nearest/overlap gene in the same strand
##' @param ignoreOverlap logical, whether ignore overlap of TSS with peak
##' @param ignoreUpstream logical, if True only annotate gene at the 3' of the peak.
##' @param ignoreDownstream logical, if True only annotate gene at the 5' of the peak.
##' @param overlap one of 'TSS' or 'all', if overlap="all", then gene overlap with peak will be reported as nearest gene, no matter the overlap is at TSS region or not.
##' @param verbose print message or not
##' @param columns names of columns to be obtained from database
##' @return data.frame or GRanges object with columns of:
##'
##' all columns provided by input.
##'
##' annotation: genomic feature of the peak, for instance if the peak is
##' located in 5'UTR, it will annotated by 5'UTR. Possible annotation is
##' Promoter-TSS, Exon, 5' UTR, 3' UTR, Intron, and Intergenic.
##'
##' geneChr: Chromosome of the nearest gene
##'
##' geneStart: gene start
##'
##' geneEnd: gene end
##'
##' geneLength: gene length
##'
##' geneStrand: gene strand
##'
##' geneId: entrezgene ID
##'
##' distanceToTSS: distance from peak to gene TSS
##'
##' if annoDb is provided, extra column will be included:
##'
##' ENSEMBL: ensembl ID of the nearest gene
##'
##' SYMBOL: gene symbol
##'
##' GENENAME: full gene name
##' @import BiocGenerics S4Vectors GenomeInfoDb
##' @examples
##' \dontrun{
##' require(TxDb.Hsapiens.UCSC.hg19.knownGene)
##' txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
##' peakfile <- system.file("extdata", "sample_peaks.txt", package="ChIPseeker")
##' peakAnno <- annotatePeak(peakfile, tssRegion=c(-3000, 3000), TxDb=txdb)
##' peakAnno
##' }
##' @seealso \code{\link{plotAnnoBar}} \code{\link{plotAnnoPie}} \code{\link{plotDistToTSS}}
##' @export
##' @author G Yu
annotatePeak <- function(peak,
                         tssRegion=c(-3000, 3000),
                         TxDb=NULL,
                         level = "transcript",
                         assignGenomicAnnotation=TRUE,
                         genomicAnnotationPriority = c("Promoter", "5UTR", "3UTR", "Exon", "Intron", "Downstream", "Intergenic"),
                         annoDb=NULL,
                         addFlankGeneInfo=FALSE,
                         flankDistance=5000,
                         sameStrand = FALSE,
                         ignoreOverlap=FALSE,
                         ignoreUpstream=FALSE,
                         ignoreDownstream=FALSE,
                         overlap = "TSS",
                         verbose=TRUE,
                         columns=c("ENTREZID", "ENSEMBL", "SYMBOL", "GENENAME")) {

    is_GRanges_of_TxDb <- FALSE
    if (is(TxDb, "GRanges")) {
        is_GRanges_of_TxDb <- TRUE
        assignGenomicAnnotation <- FALSE
        annoDb <- NULL
        addFlankGeneInfo <- FALSE
        message("#\n#.. 'TxDb' is a self-defined 'GRanges' object...\n#")
        message("#.. Some parameters of 'annotatePeak' will be disable,")
        message("#.. including:")
        message("#..\tlevel, assignGenomicAnnotation, genomicAnnotationPriority,")
        message("#..\tannoDb, addFlankGeneInfo and flankDistance.")
        message("#\n#.. Some plotting functions are designed for visualizing genomic annotation")
        message("#.. and will not be available for the output object.\n#")
    }

    if (is_GRanges_of_TxDb) {
        level <- "USER_DEFINED"
    } else {
        level <- match.arg(level, c("transcript", "gene"))
    }

    if (assignGenomicAnnotation && all(genomicAnnotationPriority %in% c("Promoter", "5UTR", "3UTR", "Exon", "Intron", "Downstream", "Intergenic")) == FALSE) {
        stop('genomicAnnotationPriority should be any order of c("Promoter", "5UTR", "3UTR", "Exon", "Intron", "Downstream", "Intergenic")')
    }

    if ( is(peak, "GRanges") ){
        ## this test will be TRUE
        ## when peak is an instance of class/subclass of "GRanges"
        input <- "gr"
        peak.gr <- peak
    } else {
        input <- "file"
        peak.gr <- loadPeak(peak, verbose)
    }

    peakNum <- length(peak.gr)

    if (verbose)
        cat(">> preparing features information...\t\t",
            format(Sys.time(), "%Y-%m-%d %X"), "\n")

    if (is_GRanges_of_TxDb) {
        features <- TxDb
    } else {
        TxDb <- loadTxDb(TxDb)

        if (level=="transcript") {
            features <- getGene(TxDb, by="transcript")
        } else {
            features <- getGene(TxDb, by="gene")
        }
    }
    if (verbose)
        cat(">> identifying nearest features...\t\t",
            format(Sys.time(), "%Y-%m-%d %X"), "\n")

    ## nearest features
    idx.dist <- getNearestFeatureIndicesAndDistances(peak.gr, features,
                                                     sameStrand, ignoreOverlap,
                                                     ignoreUpstream,ignoreDownstream,
                                                     overlap=overlap)

    if (verbose)
        cat(">> calculating distance from peak to TSS...\t",
            format(Sys.time(), "%Y-%m-%d %X"), "\n")
    ## distance
    distance <- idx.dist$distance

    ## update peak, remove un-map peak if exists.
    peak.gr <- idx.dist$peak

    ## annotation
    if (assignGenomicAnnotation == TRUE) {
        if (verbose)
            cat(">> assigning genomic annotation...\t\t",
                format(Sys.time(), "%Y-%m-%d %X"), "\n")

        anno <- getGenomicAnnotation(peak.gr, distance, tssRegion, TxDb, level, genomicAnnotationPriority, sameStrand=sameStrand)
        annotation <- anno[["annotation"]]
        detailGenomicAnnotation <- anno[["detailGenomicAnnotation"]]
    } else {
        annotation <- NULL
        detailGenomicAnnotation <- NULL
    }

    ## append annotation to peak.gr
    if (!is.null(annotation))
        mcols(peak.gr)[["annotation"]] <- annotation


    has_nearest_idx <- which(idx.dist$index <= length(features))
    nearestFeatures <- features[idx.dist$index[has_nearest_idx]]

    ## duplicated names since more than 1 peak may annotated by only 1 gene
    names(nearestFeatures) <- NULL
    nearestFeatures.df <- as.data.frame(nearestFeatures)
    if (is_GRanges_of_TxDb) {
        colnames(nearestFeatures.df)[1:5] <- c("geneChr", "geneStart", "geneEnd",
                                          "geneLength", "geneStrand")
    } else if (level == "transcript") {
        if (is(TxDb, "EnsDb")) {
            nearestFeatures.df <- nearestFeatures.df[, c("seqnames", "start",
                                                         "end", "width",
                                                         "strand", "gene_id",
                                                         "tx_id", "tx_biotype"),
                                                     drop = FALSE]
            colnames(nearestFeatures.df) <- c(
                "geneChr", "geneStart", "geneEnd", "geneLength", "geneStrand",
                "geneId", "transcriptId", "transcriptBiotype")
        } else {
            colnames(nearestFeatures.df) <- c("geneChr", "geneStart", "geneEnd",
                                              "geneLength", "geneStrand",
                                              "geneId", "transcriptId")
            nearestFeatures.df$geneId <- TXID2EG(
                as.character(nearestFeatures.df$geneId), geneIdOnly=TRUE)
        }
    } else {
        if (is(TxDb, "EnsDb")) {
            nearestFeatures.df <- nearestFeatures.df[, c("seqnames", "start",
                                                         "end", "width",
                                                         "strand", "gene_id",
                                                         "gene_biotype"),
                                                     drop = FALSE]
            colnames(nearestFeatures.df) <- c("geneChr", "geneStart", "geneEnd",
                                              "geneLength", "geneStrand",
                                              "geneId", "geneBiotype")
        } else
            colnames(nearestFeatures.df) <- c("geneChr", "geneStart", "geneEnd",
                                              "geneLength", "geneStrand",
                                              "geneId")
    }

    for(cn in colnames(nearestFeatures.df)) {
        mcols(peak.gr)[[cn]][has_nearest_idx] <- unlist(nearestFeatures.df[, cn])
    }

    mcols(peak.gr)[["distanceToTSS"]] <- distance

    if (!is.null(annoDb)) {
        if (verbose)
            cat(">> adding gene annotation...\t\t\t",
                format(Sys.time(), "%Y-%m-%d %X"), "\n")
        .idtype <- IDType(TxDb)
        if (length(.idtype) == 0 || is.na(.idtype) || is.null(.idtype)) {
            n <- length(peak.gr)
            if (n > 100)
                n <- 100
            sampleID <- peak.gr$geneId[1:n]

            if (all(grepl('^ENS', sampleID))) {
                .idtype <- "Ensembl Gene ID"
            } else if (all(grepl('^\\d+$', sampleID))) {
                .idtype <- "Entrez Gene ID"
            } else {
                warning("Unknown ID type, gene annotation will not be added...")
                .idtype <- NA
            }
        }

        if (!is.na(.idtype)) {
            peak.gr %<>% addGeneAnno(annoDb, .idtype, columns)
        }
    }

    if (addFlankGeneInfo == TRUE) {
        if (verbose)
            cat(">> adding flank feature information from peaks...\t",
                format(Sys.time(), "%Y-%m-%d %X"), "\n")

        flankInfo <- getAllFlankingGene(peak.gr, features, level, flankDistance)

        if (level == "transcript") {
            mcols(peak.gr)[["flank_txIds"]] <- NA
            mcols(peak.gr)[["flank_txIds"]][flankInfo$peakIdx] <- flankInfo$flank_txIds
        }

        mcols(peak.gr)[["flank_geneIds"]] <- NA
        mcols(peak.gr)[["flank_gene_distances"]] <- NA

        mcols(peak.gr)[["flank_geneIds"]][flankInfo$peakIdx] <- flankInfo$flank_geneIds
        mcols(peak.gr)[["flank_gene_distances"]][flankInfo$peakIdx] <- flankInfo$flank_gene_distances

    }

    if (!is_GRanges_of_TxDb) {
        if(verbose)
            cat(">> assigning chromosome lengths\t\t\t",
                format(Sys.time(), "%Y-%m-%d %X"), "\n")

        peak.gr@seqinfo <- seqinfo(TxDb)[names(seqlengths(peak.gr))]
    }

    if(verbose)
        cat(">> done...\t\t\t\t\t",
            format(Sys.time(), "%Y-%m-%d %X"), "\n")

    if (assignGenomicAnnotation) {
        res <- new("csAnno",
                   anno = peak.gr,
                   tssRegion = tssRegion,
                   level=level,
                   hasGenomicAnnotation = TRUE,
                   detailGenomicAnnotation=detailGenomicAnnotation,
                   annoStat=getGenomicAnnoStat(peak.gr),
                   peakNum=peakNum
                   )
    } else {
        res <- new("csAnno",
                   anno = peak.gr,
                   tssRegion = tssRegion,
                   level=level,
                   hasGenomicAnnotation = FALSE,
                   peakNum=peakNum
                   )
    }

    return(res)
}


##' dropAnno
##'
##' drop annotation exceeding distanceToTSS_cutoff
##' @title dropAnno
##' @param csAnno output of annotatePeak
##' @param distanceToTSS_cutoff distance to TSS cutoff
##' @return csAnno object
##' @export
##' @author Guangchuang Yu
dropAnno <- function(csAnno, distanceToTSS_cutoff=10000) {
    idx <- which(abs(mcols(csAnno@anno)[["distanceToTSS"]]) < distanceToTSS_cutoff)
    csAnno@anno <- csAnno@anno[idx]
    csAnno@peakNum <- length(idx)
    if (csAnno@hasGenomicAnnotation) {
        csAnno@annoStat <- getGenomicAnnoStat(csAnno@anno)
        csAnno@detailGenomicAnnotation = csAnno@detailGenomicAnnotation[idx,]
    }
    csAnno
}


================================================
FILE: R/covplot.R
================================================

##' plot peak coverage
##'
##' 
##' @title covplot
##' @param peak peak file or GRanges object
##' @param weightCol weight column of peak
##' @param xlab xlab
##' @param ylab ylab
##' @param title title
##' @param chrs selected chromosomes to plot, all chromosomes by default
##' @param xlim ranges to plot, default is whole chromosome
##' @param lower lower cutoff of coverage signal
##' @param fill_color specify the color/palette for the plot. Order matters
##' @return ggplot2 object
##' @import GenomeInfoDb
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 geom_segment
##' @importFrom ggplot2 geom_blank
##' @importFrom ggplot2 geom_rect
##' @importFrom ggplot2 facet_grid
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 theme_classic
##' @importFrom ggplot2 element_text
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 xlim
##' @importFrom ggplot2 ggtitle
##' @export
##' @author G Yu
covplot <- function(peak, weightCol=NULL,
                    xlab  = "Chromosome Size (bp)",
                    ylab  = "",
                    title = "ChIP Peaks over Chromosomes",
                    chrs  = NULL,
                    xlim  = NULL,
                    lower = 1,
                    fill_color = "black") {
    isList <- is.list(peak)
    if(!isList) {  # Note: don't support data.frame
        tm <- getChrCov(peak = peak, weightCol = weightCol, chrs = chrs, xlim = xlim, lower = lower)
    } else {
        ltm <- lapply(peak, getChrCov, weightCol = weightCol, chrs = chrs, xlim = xlim, lower = lower)
        if (is.null(names(ltm))) {
            nn <- paste0("peak", seq_along(ltm))
            warning("input is not a named list, set the name automatically to ", paste(nn, collapse = ' '))
            names(ltm) <- nn
        }
        tm <- dplyr::bind_rows(ltm, .id = ".id")
        chr.sorted <- sortChrName(as.character(unique(tm$chr)))
        tm$chr <- factor(tm$chr, levels = chr.sorted)
    }
    
    chr <- start <- end <- value <- .id <- NULL
    
    if(length(tm$chr) == 0){
        p <- ggplot(data.frame(x = 1)) + geom_blank()
    } else {
        p <- ggplot(tm, aes(start, value))
        
        ## p <- p + geom_segment(aes(x=start, y=0, xend=end, yend= value))
        if (isList) {
            if (length(fill_color) == length(peak) && all(is_valid_color(fill_color))){
                cols = fill_color
            } else {
                cols = generate_colors(fill_color, n = length(peak))
            }
            p <- p + geom_rect(aes(xmin = start, ymin = 0, xmax = end, ymax = value, fill = .id, color = .id)) +
                scale_color_manual(values = cols) +
                scale_fill_manual(values = cols)
        } else {
            p <- p + geom_rect(aes(xmin = start, ymin = 0, xmax = end, ymax = value), fill = fill_color, color = fill_color)
        }
        
        if(length(unique(tm$chr)) > 1) {
            p <- p + facet_grid(chr ~., scales="free")
        }
        
    }
    
    p <- p + theme_classic()
    p <- p + labs(x = xlab, y = ylab, title = title, fill = NULL, color = NULL)
    p <- p + scale_y_continuous(expand = c(0,0))
    p <- p + theme(strip.text.y=element_text(angle=360))
    p <- p + scale_x_continuous(labels = scales::label_number(scale_cut = scales::cut_si("")))
    
    if (!is.null(xlim) && !all(is.na(xlim)) && is.numeric(xlim) && length(xlim) == 2) {
        p <- p + xlim(xlim)
    }
    
    return(p)
}

##' @import S4Vectors IRanges
##' @importFrom dplyr group_by
##' @importFrom dplyr summarise
##' @importFrom magrittr %>%
getChrCov <- function(peak, weightCol, chrs, xlim, lower=1) {
    if (is(peak, "GRanges")) {
        peak.gr <- peak
    } else if (file.exists(peak)) {
        peak.gr <- readPeakFile(peak, as="GRanges")
    } else {
        stop("peak should be a GRanges object or a peak file...")
    }

    if ( is.null(weightCol)) {
        peak.cov <- coverage(peak.gr)
    } else {
        weight <- mcols(peak.gr)[[weightCol]]
        peak.cov <- coverage(peak.gr, weight=weight)
    }

    cov <- lapply(peak.cov, IRanges::slice, lower=lower)

    get.runValue <- function(x) {
        y <- runValue(x)
        sapply(y@listData, mean)
        ## value <- x@subject@values
        ## value[value != 0]
    }

    chr <- start <- end <- cnt <- NULL
    
    ldf <- lapply(1:length(cov), function(i) {
        x <- cov[[i]]
        if (length(x@ranges) == 0) {
            msg <- paste0(names(cov[i]),
                          " dosen't contain signal higher than ",
                          lower)
            message(msg)
            return(NA)
        }
        data.frame(chr   = names(cov[i]),
                   start = start(x),
                   end   = end(x),
                   cnt   = get.runValue(x)
                                        # the following versions are more slower
                                        # unlist(runValue(x)) 
                                        # sapply(x, runValue)
                   )
    })

    ldf <- ldf[!is.na(ldf)]
    df <- do.call("rbind", ldf)
    
    chr.sorted <- sortChrName(as.character(unique(df$chr)))
    df$chr <- factor(df$chr, levels=chr.sorted)
    if (!is.null(chrs) && !all(is.na(chrs)) && all(chrs %in% chr.sorted)) {
        df <- df[df$chr %in% chrs, ]
    }
    if (!is.null(xlim) && !all(is.na(xlim)) && is.numeric(xlim) && length(xlim) == 2) {
        df <- df[df$start >= xlim[1] & df$end <= xlim[2],]
    }

    df2 <- group_by(df, chr, start, end) %>% summarise(value=sum(cnt), .groups = "drop")
    return(df2)
}

# a simple `stringr::str_sort(numeric=TRUE)` implementation
sortChrName <- function(chr.name, decreasing = FALSE) {
    ## universal sort function, support organisms other than human
    chr_part <- sub("^(\\D*)(\\d*)$", "\\1", chr.name)
    num_part <- as.numeric(sub("^(\\D*)(\\d*)$", "\\2", chr.name))
    chr.name[order(chr_part, num_part, decreasing = decreasing)]
}




================================================
FILE: R/csAnno.R
================================================
##' Class "csAnno"
##' This class represents the output of ChIPseeker Annotation
##'
##'
##' @name csAnno-class
##' @aliases csAnno-class
##' show,csAnno-method vennpie,csAnno-method
##' plotDistToTSS,csAnno-method plotAnnoBar,csAnno-method
##' plotAnnoPie,csAnno-method upsetplot,csAnno-method
##' subset,csAnno-method
##'
##' @docType class
##' @slot anno annotation
##' @slot tssRegion TSS region
##' @slot level transcript or gene
##' @slot hasGenomicAnnotation logical
##' @slot detailGenomicAnnotation Genomic Annotation in detail
##' @slot annoStat annotation statistics
##' @slot peakNum number of peaks
##' @exportClass csAnno
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
##' @seealso \code{\link{annotatePeak}}
##' @keywords classes
setClass("csAnno",
         representation=representation(
             anno = "GRanges",
             tssRegion = "numeric",
             level = "character",
             hasGenomicAnnotation = "logical",
             detailGenomicAnnotation="data.frame",
             annoStat="data.frame",
             peakNum="numeric"
             ))


##' convert csAnno object to GRanges
##'
##'
##' @title as.GRanges
##' @param x csAnno object
##' @return GRanges object
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
##' @export
as.GRanges <- function(x) {
    if (!is(x, "csAnno"))
        stop("not supported...")
    return(x@anno)
}

##' getting status of annotation
##' 
##' 
##' @title getAnnoStat
##' @param x csAnno object
##' @export
getAnnoStat <- function(x) {
    if (!is(x, "csAnno"))
        stop("not supported...")
    return(x@annoStat)
}



##' Combine csAnno Object
##'
##'
##' https://github.com/YuLab-SMU/ChIPseeker/issues/157
##' @title combine_csAnno
##' @param x csAnno object
##' @param ... csAnno objects
##' @return csAnno object
##' @export
combine_csAnno <- function(x, ...){
    z <- list(x, ...)
    
    if(sum(vapply(z, function(x) !is(x, "csAnno"), FUN.VALUE = logical(1))) != 0){
        stop("not supported...")
    }
    
    if(length(z)<2){
        stop("need two or more csAnno object...")
    }
    
    
    if(sum(!duplicated(lapply(z, function(x) x@tssRegion[1]))) != 1 
       && sum(!duplicated(lapply(z, function(x) x@tssRegion[2]))) != 1){
        stop("the tss regions of different csAnno objects should be the same...")
    }
    
    if(sum(!duplicated(lapply(z, function(x) x@level))) != 1){
        stop("the level of different csAnno object should be the same...")
    }
    
    if(sum(!duplicated(lapply(z, function(x) x@hasGenomicAnnotation))) != 1){
        stop("the status of GenomicAnnotation should be the same...")
    }
    
    combine_tssRegion <- x@tssRegion
    combine_level <- x@level
    combine_hasGenomicAnnotation <- x@hasGenomicAnnotation
    
    combine_anno <- x@anno
    for(i in 2:length(z)){
        combine_anno <- c(combine_anno,z[[i]]@anno)
    }
    
    combine_detailGenomicAnnotation <- lapply(z, function(x) x@detailGenomicAnnotation)
    combine_detailGenomicAnnotation <- do.call("rbind",combine_detailGenomicAnnotation)
    
    combine_peakNum <- x@peakNum
    for(i in 2:length(z)){
        combine_peakNum <- combine_peakNum+z[[i]]@peakNum
    }
    
    feature <- x@annoStat$Feature
    for(i in 2:length(z)){
        if(length(feature)<length(z[[i]]@annoStat$Feature)){
            feature_levels <- levels(z[[i]]@annoStat$Feature)
            feature <- c(as.vector(feature),as.vector(z[[i]]@annoStat$Feature))
            feature <- feature[!duplicated(feature)]
            feature <- factor(feature, 
                              levels = feature_levels)
            feature <- sort(feature)
        }else{
            feature_levels <- levels(feature)
            feature <- c(as.vector(feature),as.vector(z[[i]]@annoStat$Feature))
            feature <- feature[!duplicated(feature)]
            feature <- factor(feature, 
                              levels = feature_levels)
            feature <- sort(feature)
        }
    }
    
    combine_annoStat <- data.frame(Feature=feature)
    
    for(i in 1:length(z)){
        combine_annoStat <- merge(combine_annoStat, z[[i]]@annoStat, 
                                  by = "Feature", all = T, sort = F)
        combine_annoStat[is.na(combine_annoStat)] <- 0
        combine_annoStat <- combine_annoStat[order(combine_annoStat$Feature),]
    }
    
    total <- (ncol(combine_annoStat)-1)*100
    combine_annoStat$sum <- rowSums(combine_annoStat[, 2:ncol(combine_annoStat)])
    
    
    for (i in 1:length(combine_annoStat$sum)) {
        combine_annoStat$result[i] <- (combine_annoStat$sum[i]/total)*100
    }
    
    annoStat_result <- data.frame(Feature=combine_annoStat[,1],Frequency=combine_annoStat[,ncol(combine_annoStat)])
    
    res <- new("csAnno",
               anno = combine_anno,
               tssRegion = combine_tssRegion,
               level = combine_level,
               hasGenomicAnnotation = combine_hasGenomicAnnotation,
               detailGenomicAnnotation = combine_detailGenomicAnnotation,
               annoStat = annoStat_result,
               peakNum = combine_peakNum
    )
    
    return(res)
}

##' vennpie method generics
##'
##' @name vennpie
##' @docType methods
##' @rdname vennpie-methods
##' 
##' @title vennpie method
##' @param x A \code{csAnno} instance
##' @param r initial radius
##' @param cex value to adjust legend
##' @param ... additional parameter
##' @return plot
##' @usage vennpie(x, r = 0.2, cex=1.2, ...)
##' @exportMethod vennpie
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("vennpie", signature(x="csAnno"),
          function(x, 
                   r = 0.2, 
                   cex = 1.2, 
                   ...) {
            vennpie.csAnno(x, r, cex, ...)
          }
          )


##' upsetplot method generics
##'
##' @name upsetplot
##' @docType methods
##' @rdname upsetplot-methods
##'
##' @title upsetplot method
##' @param x A \code{csAnno} instance
##' @param ... additional parameter
##' @return plot
##' @usage upsetplot(x, ...)
##' @importFrom enrichplot upsetplot
##' @exportMethod upsetplot
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("upsetplot", signature(x="csAnno"),
          function(x, ...) {
              upsetplot.csAnno(x, ...)
          }
          )

##' convert csAnno object to data.frame
##'
##'
##' @title as.data.frame.csAnno
##' @param x csAnno object
##' @param row.names row names
##' @param optional should be omitted.
##' @param ... additional parameters
##' @return data.frame
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
##' @method as.data.frame csAnno
##' @export
as.data.frame.csAnno <- function(x, row.names=NULL, optional=FALSE, ...) {
    y <- as.GRanges(x)
    if (!(is.null(row.names) || is.character(row.names)))
        stop("'row.names' must be NULL or a character vector")
    df <- as.data.frame(y)
    rownames(df) <- row.names
    return(df)
}

##' show method for \code{csAnno} instance
##'
##' @name show
##' @docType methods
##' @rdname show-methods
##' @aliases show,csAnno,ANY-method
##' @title show method
##' @param object A \code{csAnno} instance
##' @return message
##' @importFrom methods show
##' @exportMethod show
##' @usage show(object)
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("show", signature(object="csAnno"),
          function(object) {
              cat("Annotated peaks generated by ChIPseeker\n")
              cat(paste(length(object@anno), object@peakNum, sep="/"),
                  " peaks were annotated\n")
              if (object@hasGenomicAnnotation) {
                  cat("Genomic Annotation Summary:\n")
                  print(object@annoStat)
              }
          }
          )

##' plotAnnoBar method for list of \code{csAnno} instances
##'
##' @name plotAnnoBar
##' @docType methods
##' @rdname plotAnnoBar-methods
##' @aliases plotAnnoBar,list-method
##' @exportMethod plotAnnoBar
setMethod("plotAnnoBar", signature(x="list"),
          function(x,
                   xlab="",
                   ylab='Percentage(%)',
                   title="Feature Distribution",
                   ...) {
              if (is.null(names(x))) {
                  nn <- paste0("Peak", seq_along(x))
                  warning("input is not a named list, set the name automatically to ", paste(nn, collapse = " "))
                  names(x) <- nn
                  ## stop("input object should be a named list...")
              }
              anno <- lapply(x, getAnnoStat)
              ## anno.df <- ldply(anno)
              anno.df <- list_to_dataframe(anno)
              categoryColumn <- ".id"
              plotAnnoBar.data.frame(anno.df, xlab, ylab, title, categoryColumn)
          })

##' plotAnnoBar method for \code{csAnno} instance
##'
##' @name plotAnnoBar
##' @docType methods
##' @rdname plotAnnoBar-methods
##' @aliases plotAnnoBar,csAnno,ANY-method
##' @title plotAnnoBar method
##' @param x \code{csAnno} instance
##' @param xlab xlab
##' @param ylab ylab
##' @param title title
##' @param ... additional paramter
##' @return plot
##' @exportMethod plotAnnoBar
##' @usage plotAnnoBar(x, xlab="", ylab='Percentage(\%)',title="Feature Distribution", ...)
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("plotAnnoBar", signature(x="csAnno"),
          function(x,
                   xlab="",
                   ylab="Percentage(%)",
                   title="Feature Distribution",
                   ...) {
              anno.df <- getAnnoStat(x)
              categoryColumn <- 1
              plotAnnoBar.data.frame(anno.df, xlab, ylab, title, categoryColumn)
          })



##' plotAnnoPie method for \code{csAnno} instance
##'
##' @name plotAnnoPie
##' @docType methods
##' @rdname plotAnnoPie-methods
##' @aliases plotAnnoPie,csAnno,ANY-method
##' @title plotAnnoPie method
##' @param x \code{csAnno} instance
##' @param ndigit number of digit to round
##' @param cex label cex
##' @param col color
##' @param legend.position topright or other.
##' @param pie3D plot in 3D or not
##' @param radius radius of the pie
##' @param ... extra parameter
##' @return plot
##' @exportMethod plotAnnoPie
##' @usage plotAnnoPie(x,ndigit=2,cex=0.9,col=NA,legend.position="rightside",pie3D=FALSE,radius=0.8,...)
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("plotAnnoPie", signature(x="csAnno"),
          function(x,
                   ndigit=2,
                   cex=0.9,
                   col=NA,
                   legend.position="rightside",
                   pie3D=FALSE,
                   radius=0.8,
                   ...){
              plotAnnoPie.csAnno(x, ndigit, cex, col, legend.position, pie3D, radius, ...)
          })



##' plotDistToTSS method for list of \code{csAnno} instances
##'
##' @name plotDistToTSS
##' @docType methods
##' @rdname plotDistToTSS-methods
##' @aliases plotDistToTSS,list-method
##' @exportMethod plotDistToTSS
setMethod("plotDistToTSS", signature(x="list"),
          function(x, distanceColumn="distanceToTSS",
                                     xlab="", ylab="Binding sites (%) (5'->3')",
                                     title="Distribution of transcription factor-binding loci relative to TSS",                       
                     distanceBreaks=c(0, 1000, 3000, 5000, 10000, 100000),
                     palette = NULL, ...) {
              if (is.null(names(x))) {
                  nn <- paste0("Peak", seq_along(x))
                  warning("input is not a named list, set the name automatically to ", paste(nn, collapse = " "))
                  names(x) <- nn
                  ## stop("input object should be a named list...")
              }

              peakAnno <- lapply(x, as.data.frame)
              ## peakDist <- ldply(peakAnno)
              peakDist <- list_to_dataframe(peakAnno)
              categoryColumn <- ".id"
              plotDistToTSS.data.frame(peakDist, distanceColumn = distanceColumn,
                                       distanceBreaks = distanceBreaks, palette = palette,
                                       xlab = xlab, ylab = ylab, title = title, categoryColumn = categoryColumn)
          })


##' plotDistToTSS method for \code{csAnno} instance
##'
##' @name plotDistToTSS
##' @docType methods
##' @rdname plotDistToTSS-methods
##' @aliases plotDistToTSS,csAnno,ANY-method
##' @title plotDistToTSS method
##' @param distanceColumn distance column name
##' @param distanceBreaks breaks of distance, default is 'c(0, 1000, 3000, 5000, 10000, 100000)'
##' @param palette palette name for coloring different distances. Run `RColorBrewer::display.brewer.all()` to see all applicable values.
##' @param x \code{csAnno} instance
##' @param xlab xlab
##' @param ylab ylab
##' @param title title
##' @param ... additional parameter
##' @return plot
##' @exportMethod plotDistToTSS
##' @usage plotDistToTSS(x,distanceColumn="distanceToTSS", xlab="",
##' ylab="Binding sites (\%) (5'->3')",
##' title="Distribution of transcription factor-binding loci relative to TSS",...)
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
setMethod("plotDistToTSS", signature(x="csAnno"),
          function(x, distanceColumn="distanceToTSS",
                                     xlab="", ylab="Binding sites (%) (5'->3')",
                                     title="Distribution of transcription factor-binding loci relative to TSS", 
                                     distanceBreaks=c(0, 1000, 3000, 5000, 10000, 100000),
                                     palette = NULL,...) {
              peakDist <- as.data.frame(x)
              categoryColumn <- 1
              plotDistToTSS.data.frame(peakDist, distanceColumn = distanceColumn, distanceBreaks = distanceBreaks, palette = palette,
                                       xlab = xlab, ylab = ylab, title = title, categoryColumn = categoryColumn)
          })



================================================
FILE: R/dplyr-verb.R
================================================
# extend filter to Peak (GRanges class object)
#' @method filter GRanges
#' @importFrom dplyr filter
#' @export
filter.GRanges = function(.data, ..., .by = NULL, .preserve = FALSE) {
  dots = rlang::quos(...)
  as.data.frame(.data) |> 
    dplyr::filter(!!!dots, .by = .by, .preserve = .preserve) |> 
    droplevels() |> 
    GenomicRanges::makeGRangesFromDataFrame(keep.extra.columns = TRUE)
}

# extend mutate to Peak (GRanges class object)
#' @method mutate GRanges
#' @importFrom dplyr mutate
#' @export
mutate.GRanges = function(.data, ..., .by = NULL, 
                           .keep = c("all", "used", "unused", "none"),
                           .before = NULL,
                           .after = NULL) {
  dots = rlang::quos(...)
  df = as.data.frame(.data)
  
  if (!is.null(.before) && !is.null(.after)) {
    stop("You can't supply both `.before` and `.after`.")
  }
  
  if (!is.null(.before)) {
    df = df |> 
      dplyr::mutate(!!!dots, .by = .by, .keep = .keep, .before = .before)
  } else if (!is.null(.after)) {
    df = df |> 
      dplyr::mutate(!!!dots, .by = .by, .keep = .keep, .after = .after)
  } else {
    df = df |> dplyr::mutate(!!!dots, .by = .by, .keep = .keep)
  }
  
  df |> 
    GenomicRanges::makeGRangesFromDataFrame(keep.extra.columns = TRUE)
}

# S4Vectors::rename
#' @method rename GRanges
#' @importFrom rlang quos
#' @export
rename.GRanges = function(x, ...){
  dots = rlang::quos(...)
  as.data.frame(x) |> 
    dplyr::rename(!!!dots) |> 
    GenomicRanges::makeGRangesFromDataFrame(keep.extra.columns = TRUE)
}

#' @method arrange GRanges
#' @importFrom dplyr arrange
#' @export
arrange.GRanges = function(.data, ..., .by_group = FALSE){
  dots = rlang::quos(...)
  as.data.frame(.data) |> 
    dplyr::arrange(!!!dots, .by_group = .by_group) |> 
    GenomicRanges::makeGRangesFromDataFrame(keep.extra.columns = TRUE)
}




================================================
FILE: R/enrichOverlap.R
================================================
##' calcuate overlap significant of ChIP experiments based on their nearest gene annotation
##'
##'
##' @title enrichAnnoOverlap
##' @param queryPeak query bed file
##' @param targetPeak target bed file(s) or folder containing bed files
##' @param TxDb TxDb
##' @param pAdjustMethod pvalue adjustment method
##' @param chainFile chain file for liftOver
##' @param distanceToTSS_cutoff restrict nearest gene annotation by distance cutoff
##' @return data.frame
##' @importFrom stats p.adjust
##' @importFrom stats phyper
##' @export
##' @importFrom rtracklayer import.chain
##' @importFrom rtracklayer liftOver
##' @importFrom yulab.utils get_cache_element
##' @importFrom yulab.utils update_cache_item
##' @author G Yu
enrichAnnoOverlap <- function(queryPeak, targetPeak, TxDb=NULL, pAdjustMethod="BH", chainFile=NULL, distanceToTSS_cutoff=NULL) {

    TxDb <- loadTxDb(TxDb)

    query.anno <- annotatePeak(queryPeak, TxDb=TxDb,
                               assignGenomicAnnotation=FALSE, annoDb=NULL, verbose=FALSE)


    if (is(targetPeak[1], "GRanges") || is(targetPeak[[1]], "GRanges")) {
        target.gr <- targetPeak
        targetFiles <- NULL
    } else {
        targetFiles <- parse_targetPeak_Param(targetPeak)
        target.gr <- lapply(targetFiles, loadPeak)
    }

    if (!is.null(chainFile)) {
        chain <- import.chain(chainFile)
        target.gr <- lapply(target.gr, liftOver, chain=chain)
    }

    target.anno <- lapply(target.gr, annotatePeak, TxDb=TxDb,
                          assignGenomicAnnotation=FALSE, annoDb=NULL, verbose=FALSE)


    if (!is.null(distanceToTSS_cutoff)) {
        query.anno <- dropAnno(query.anno, distanceToTSS_cutoff)
        target.anno <- lapply(target.anno, dropAnno, distanceToTSS_cutoff = distanceToTSS_cutoff)
    }

    # ChIPseekerEnv <- get("ChIPseekerEnv", envir=.GlobalEnv)
    features <- get_cache_element(item = ChIPseekerCache, elements = "Transcripts")

    if(is.null(features)){
        features <- transcriptsBy(TxDb)
        features <- unlist(features)
        update_cache_item(item = ChIPseekerCache, list("Transcripts" = features))
    }

    # if ( exists("Transcripts", envir=ChIPseekerEnv, inherits=FALSE) ) {
    #     features <- get("Transcripts", envir=ChIPseekerEnv)
    # } else {
    #     features <- transcriptsBy(TxDb)
    #     features <- unlist(features)
    #     assign("Transcripts", features, envir=ChIPseekerEnv)
    # }

    ol <- lapply(target.anno, function(i) unique(intersect(as.GRanges(query.anno)$geneId, as.GRanges(i)$geneId)))
    oln <- unlist(lapply(ol, length))
    N <- length(features)
    ## white ball
    m <- length(unique(as.GRanges(query.anno)$geneId))
    ## black ball
    n <- N - m
    ## drawn
    k <- unlist(lapply(target.anno, function(i) length(unique(as.GRanges(i)$geneId))))
    p <- phyper(oln, m, n, k, lower.tail=FALSE)


    if (is(queryPeak, "GRanges")) {
        qSample <- "queryPeak"
    } else {
        qSample <- basename(queryPeak)
    }

    if (is.null(targetFiles)) {
        tSample <- names(target.gr)
        if(is.null(tSample)) {
            tSample <- paste0("targetPeak", seq_along(target.gr))
        }
    } else {
        tSample <- basename(targetFiles)
    }

    padj <- p.adjust(p, method=pAdjustMethod)
    res <- data.frame(qSample=qSample,
                      tSample=tSample,
                      qLen=length(unique(as.GRanges(query.anno)$geneId)),
                      tLen=unlist(lapply(target.anno, function(i) length(unique(as.GRanges(i)$geneId)))),
                      N_OL=oln,
                      pvalue=p,
                      p.adjust=padj)
    return(res)
}

##' calculate overlap significant of ChIP experiments based on the genome coordinations
##'
##'
##' @title enrichPeakOverlap
##' @param queryPeak query bed file or GRanges object
##' @param targetPeak target bed file(s) or folder that containing bed files or a list of GRanges objects
##' @param TxDb TxDb
##' @param pAdjustMethod pvalue adjustment method
##' @param nShuffle shuffle numbers
##' @param chainFile chain file for liftOver
##' @param pool logical, whether pool target peaks
##' @param mc.cores number of cores, see \link[parallel]{mclapply}
##' @param verbose logical
##' @return data.frame
##' @export
##' @importFrom rtracklayer import.chain
##' @importFrom rtracklayer liftOver
##' @author G Yu
enrichPeakOverlap <- function(queryPeak, targetPeak, TxDb=NULL, pAdjustMethod="BH", nShuffle=1000,
                              chainFile=NULL, pool=TRUE, mc.cores=detectCores()-1, verbose=TRUE) {
    TxDb <- loadTxDb(TxDb)
    query.gr <- loadPeak(queryPeak)
    if (is(targetPeak[1], "GRanges") || is(targetPeak[[1]], "GRanges")) {
        target.gr <- targetPeak
        targetFiles <- NULL
    } else {
        targetFiles <- parse_targetPeak_Param(targetPeak)
        target.gr <- lapply(targetFiles, loadPeak)
    }

    if (!is.null(chainFile)) {
        chain <- import.chain(chainFile)
        target.gr <- lapply(target.gr, liftOver, chain=chain)
    }

    if (pool) {
        p.ol <- enrichOverlap.peak.internal(query.gr, target.gr, TxDb, nShuffle,
                                            mc.cores=mc.cores,verbose=verbose)
    } else {
        res_list <- lapply(1:length(target.gr), function(i) {
            enrichPeakOverlap(queryPeak = queryPeak,
                              targetPeak = target.gr[i],
                              TxDb = TxDb,
                              pAdjustMethod = pAdjustMethod,
                              nShuffle = nShuffle,
                              chainFile = chainFile,
                              mc.cores = mc.cores,
                              verbose = verbose)
        })
        res <- do.call("rbind", res_list)
        return(res)
    }

    if (is.null(p.ol$pvalue)) {
        p <- padj <- NA
    } else {
        p <- p.ol$pvalue
        padj <- p.adjust(p, method=pAdjustMethod)
    }

    ol <- p.ol$overlap


    if (is(queryPeak, "GRanges")) {
        qSample <- "queryPeak"
    } else {
        ## remove path, only keep file name
        qSample <- basename(queryPeak)
    }

    if (is.null(targetFiles)) {
        tSample <- names(target.gr)
        if(is.null(tSample)) {
            tSample <- paste0("targetPeak", seq_along(target.gr))
        }
    } else {
        tSample <- basename(targetFiles)
    }

    res <- data.frame(qSample=qSample,
                      tSample=tSample,
                      qLen=length(query.gr),
                      tLen=unlist(lapply(target.gr, length)),
                      N_OL=ol,
                      pvalue=p,
                      p.adjust=padj)

    return(res)
}



##' shuffle the position of peak
##'
##'
##' @title shuffle
##' @param peak.gr GRanges object
##' @param TxDb TxDb
##' @return GRanges object
##' @export
##' @author G Yu
shuffle <- function(peak.gr, TxDb) {
    chrLens <- seqlengths(TxDb)[names(seqlengths(peak.gr))]
    nn <- as.vector(seqnames(peak.gr))
    ii <- order(nn)
    w <- width(peak.gr)
    nnt <- table(nn)
    jj <- order(names(nnt))
    nnt <- nnt[jj]
    chrLens <- chrLens[jj]
    ss <- unlist(sapply(1:length(nnt), function(i) sample(chrLens[i],nnt[i])))

    res <- GRanges(seqnames=nn[ii], ranges=IRanges(ss, width=w[ii]), strand="*")
    return(res)
}




##' @import GenomeInfoDb
##' @importFrom utils txtProgressBar
##' @importFrom utils setTxtProgressBar
##' @importFrom parallel mclapply
##' @importFrom parallel detectCores
enrichOverlap.peak.internal <- function(query.gr, target.gr, TxDb, nShuffle=1000, mc.cores=detectCores()-1, verbose=TRUE) {
    if (verbose) {
        cat(">> permutation test of peak overlap...\t\t",
            format(Sys.time(), "%Y-%m-%d %X"), "\n")
    }

    idx <- sample(1:length(target.gr), nShuffle, replace=TRUE)
    len <- unlist(lapply(target.gr, length))

    if(Sys.info()[1] == "Windows") {
        qLen <- lapply(target.gr, function(tt) {
            length(intersect(query.gr, tt))
        })
    } else {
        qLen <- mclapply(target.gr, function(tt) {
            length(intersect(query.gr, tt))
        }, mc.cores=mc.cores
                         )
    }
    qLen <- unlist(qLen)
    ## query ratio
    qr <- qLen/len

    if (nShuffle < 1) {
        res <- list(pvalue=NULL, overlap=qLen)
        return(res)
    }

    if (verbose) {
        pb <- txtProgressBar(min=0, max=nShuffle, style=3)
    }
    if(Sys.info()[1] == "Windows") {
        rr <- lapply(seq_along(idx), function(j) {
            if (verbose) {
                setTxtProgressBar(pb, j)
            }
            i <- idx[j]
            tarShuffle <- shuffle(target.gr[[i]], TxDb)
            length(intersect(query.gr, tarShuffle))/len[i]
        })
    } else {
        rr <- mclapply(seq_along(idx), function(j) {
            if (verbose) {
                setTxtProgressBar(pb, j)
            }
            i <- idx[j]
            tarShuffle <- shuffle(target.gr[[i]], TxDb)
            length(intersect(query.gr, tarShuffle))/len[i]
        }, mc.cores=mc.cores
                       )
    }

    if (verbose) {
        close(pb)
    }

    rr <- unlist(rr) ## random ratio

    ## p <- lapply(qr, function(q) mean(rr>q))
    p <- lapply(qr, function(q) (sum(rr>q)+1)/(length(rr)+1))
    res <- list(pvalue=unlist(p), overlap=qLen)
    return(res)
}



================================================
FILE: R/getFlankingGene.R
================================================

##' @import IRanges
##' @importFrom dplyr mutate
##' @importFrom dplyr group_by
getAllFlankingGene <- function(peak.gr, features, level="transcript", distance=5000) {
    peak.gr2 <- peak.gr
    start(ranges(peak.gr)) = start(ranges(peak.gr)) - distance
    end(ranges(peak.gr)) = end(ranges(peak.gr)) + distance
    hit <- findOverlaps(peak.gr, unstrand(features))
    qh <- queryHits(hit)
    sh <- subjectHits(hit)
    
    featureHit <- features[sh]
    names(featureHit)=NULL
    hitInfo <- as.data.frame(featureHit)

    if (level == "transcript") {
        eg <- TXID2EG(featureHit$tx_id, geneIdOnly=TRUE)
        hitInfo$geneId <- eg
    } else {
        cn <- colnames(hitInfo)
        colnames(hitInfo)[cn == "gene_id"] <- "geneId"
    }


    hitInfo$peakIdx <- qh

    overlapHit <- findOverlaps(peak.gr2, unstrand(featureHit))
    hitInfo$distance <- NA
    hitInfo$distance[subjectHits(overlapHit)] <- 0

    psD <- ifelse(strand(featureHit) == "+",
                  start(peak.gr2[qh]) - start(featureHit),
                  end(featureHit)-end(peak.gr2[qh]))
    
    peD <- ifelse(strand(featureHit) == "+",
                  end(peak.gr2[qh]) - start(featureHit),
                  end(featureHit)-start(peak.gr2[qh]))

    idx <- abs(psD) > abs(peD)
    dd <- psD
    dd[idx] <- peD[idx]

    ii <- is.na(hitInfo$distance)
    hitInfo$distance[ii] <- dd[ii]

    peakIdx <- tx_name <- geneId <- distance <- NULL

    if (level == "transcript") {
        hitInfo2 <- group_by(hitInfo, peakIdx) %>%
            mutate(flank_txIds=paste(tx_name, collapse=";"),
                   flank_geneIds=paste(geneId, collapse=";"),
                   flank_gene_distances=paste(distance, collapse=";"))
        res <- hitInfo2[,c("peakIdx", "flank_txIds", "flank_geneIds", "flank_gene_distances")]
        res$flank_txIds <- as.character(res$flank_txIds)
    } else {
        hitInfo2 <- group_by(hitInfo, peakIdx) %>%
            mutate(flank_geneIds=paste(geneId, collapse=";"),
                   flank_gene_distances=paste(distance, collapse=";"))
        res <- hitInfo2[,c("peakIdx", "flank_geneIds", "flank_gene_distances")]
    }

    res <- unique(res)
    res$flank_geneIds <- as.character(res$flank_geneIds)
    res$flank_gene_distances <- as.character(res$flank_gene_distances)
    
    return(res)    
}


================================================
FILE: R/getGenomicAnnotation.R
================================================
updateGenomicAnnotation <- function(peaks, genomicRegion, type, anno, sameStrand=FALSE) {
    hits <- getGenomicAnnotation.internal(peaks, genomicRegion, type, sameStrand=sameStrand)
    if (length(hits) > 1) {
        hitIndex <- hits$queryIndex
        anno[["annotation"]][hitIndex] <- hits$annotation
        anno[["detailGenomicAnnotation"]][hitIndex, type] <- TRUE
    }
    return(anno)
}


##' get Genomic Annotation of peaks
##'
##'
##' @title getGenomicAnnotation
##' @param peaks peaks in GRanges object
##' @param distance distance of peak to TSS
##' @param tssRegion tssRegion, default is -3kb to +3kb
##' @param TxDb TxDb object
##' @param level one of gene or transcript
##' @param genomicAnnotationPriority genomic Annotation Priority
##' @param sameStrand whether annotate gene in same strand
##' @importFrom GenomicFeatures threeUTRsByTranscript
##' @importFrom GenomicFeatures fiveUTRsByTranscript
##' @importFrom yulab.utils get_cache_element
##' @importFrom yulab.utils update_cache_item
##' @return character vector
##' @author G Yu
getGenomicAnnotation <- function(peaks,
                                 distance,
                                 tssRegion=c(-3000, 3000),
                                 TxDb,
                                 level,
                                 genomicAnnotationPriority,
                                 sameStrand = FALSE
                                 ) {

    ##
    ## since some annotation overlap,
    ## a priority is assign based on *genomicAnnotationPriority*
    ## use the following priority by default:
    ##
    ## 1. Promoter
    ## 2. 5' UTR
    ## 3. 3' UTR
    ## 4. Exon
    ## 5. Intron
    ## 6. Downstream
    ## 7. Intergenic
    ##



    .ChIPseekerEnv(TxDb, item = ChIPseekerCache)
    # ChIPseekerEnv <- get("ChIPseekerEnv", envir=.GlobalEnv)

    annotation <- rep(NA, length(distance))

    flag <- rep(FALSE, length(distance))
    detailGenomicAnnotation <- data.frame(
        genic=flag,
        Intergenic=flag,
        Promoter=flag,
        fiveUTR=flag,
        threeUTR=flag,
        Exon=flag,
        Intron=flag,
        downstream=flag,
        distal_intergenic=flag)

    anno <- list(annotation=annotation,
                 detailGenomicAnnotation=detailGenomicAnnotation)

    genomicAnnotationPriority <- rev(genomicAnnotationPriority)
    for (AP in genomicAnnotationPriority) {
        if (AP == "Intron") {
            ## Introns
            # intronList <- get_intronList(ChIPseekerEnv)
            intronList <- get_intronList(item = ChIPseekerCache)
            anno <- updateGenomicAnnotation(peaks, intronList, "Intron", anno, sameStrand=sameStrand)
        } else if (AP == "Exon") {
            ## Exons
            # exonList <- get_exonList(ChIPseekerEnv)
            exonList <- get_exonList(item = ChIPseekerCache)
            anno <- updateGenomicAnnotation(peaks, exonList, "Exon", anno, sameStrand=sameStrand)
        } else if (AP == "3UTR") {
            ## 3' UTR Exons
            threeUTRList <- get_cache_element(item = ChIPseekerCache, elements = "threeUTRList")

            if(is.null(threeUTRList)){
                threeUTRList <- threeUTRsByTranscript(TxDb)
                update_cache_item(item = ChIPseekerCache, list("threeUTRList" = threeUTRList))
            }

            # if ( exists("threeUTRList", envir=ChIPseekerEnv, inherits=FALSE) ) {
            #     threeUTRList <- get("threeUTRList", envir=ChIPseekerEnv)
            # } else {
            #     threeUTRList <- threeUTRsByTranscript(TxDb)
            #     assign("threeUTRList", threeUTRList, envir=ChIPseekerEnv)
            # }
            anno <- updateGenomicAnnotation(peaks, threeUTRList, "threeUTR", anno, sameStrand=sameStrand)
        } else if (AP == "5UTR") {
            ## 5' UTR Exons
            fiveUTRList <- get_cache_element(item = ChIPseekerCache, elements = "fiveUTRList")

            if(is.null(fiveUTRList)){
                fiveUTRList <- fiveUTRsByTranscript(TxDb)
                update_cache_item(item = ChIPseekerCache, list("fiveUTRList" = fiveUTRList))
            }

            # if ( exists("fiveUTRList", envir=ChIPseekerEnv, inherits=FALSE) ) {
            #     fiveUTRList <- get("fiveUTRList", envir=ChIPseekerEnv)
            # } else {
            #     fiveUTRList <- fiveUTRsByTranscript(TxDb)
            #     assign("fiveUTRList", fiveUTRList, envir=ChIPseekerEnv)
            # }
            anno <- updateGenomicAnnotation(peaks, fiveUTRList, "fiveUTR", anno, sameStrand=sameStrand)
        } else if (AP == "Promoter") {
            annotation <- anno[["annotation"]]
            ## detailGenomicAnnotation <- anno[["detailGenomicAnnotation"]]

            ## TSS
            tssIndex <- distance >= tssRegion[1] & distance <= tssRegion[2]
            annotation[tssIndex] <- "Promoter"
            anno$detailGenomicAnnotation[tssIndex, "Promoter"] <- TRUE

            pm <- max(abs(tssRegion))
            if (pm/1000 >= 2) {
                dd <- seq(1:ceiling(pm/1000))*1000
                for (i in 1:length(dd)) {
                    if (i == 1) {
                        lbs <- paste("Promoter", " (<=", dd[i]/1000, "kb)", sep="")
                        annotation[abs(distance) <= dd[i] &
                                   annotation == "Promoter"] <- lbs
                    } else {
                        lbs <- paste("Promoter", " (", dd[i-1]/1000, "-", dd[i]/1000, "kb)", sep="")
                        annotation[abs(distance) <= dd[i] &
                                   abs(distance) > dd[i-1] &
                                   annotation == "Promoter"] <- lbs
                    }
                }
            }
            anno[["annotation"]] <- annotation
        } else {
            ## Intergenic
            annotation[is.na(annotation)] <- "Intergenic"
            anno[["annotation"]] <- annotation
        }
    }

    annotation <- anno[["annotation"]]
    detailGenomicAnnotation <- anno[["detailGenomicAnnotation"]]
    genicIndex <- which(apply(detailGenomicAnnotation[, c("Exon", "Intron")], 1, any))
    detailGenomicAnnotation[-genicIndex, "Intergenic"] <- TRUE
    detailGenomicAnnotation[genicIndex, "genic"] <- TRUE

    ## intergenicIndex <- anno[["annotation"]] == "Intergenic"
    ## anno[["detailGenomicAnnotation"]][intergenicIndex, "Intergenic"] <- TRUE
    ## anno[["detailGenomicAnnotation"]][!intergenicIndex, "genic"] <- TRUE


    features <- getGene(TxDb, by=level)

    ## nearest from gene end
    if (sameStrand) {
        idx <- follow(peaks, features)
    } else {
        idx <- follow(peaks, unstrand(features))
    }
    
    na.idx <- which(is.na(idx))
    if (length(na.idx)) {
        idx <- idx[-na.idx]
        peaks <- peaks[-na.idx]
    }
    
    peF <- features[idx]
    dd <- ifelse(strand(peF) == "+",
		 start(peaks) - end(peF),
		 end(peaks) - start(peF))
    
    if (length(na.idx)) {
        dd2 <- numeric(length(idx) + length(na.idx))
        dd2[-na.idx] <- dd
    } else {
        dd2 <- dd
    }

    dsd <- getOption("ChIPseeker.downstreamDistance")
    if (is.null(dsd))
	    dsd <- 3000 ## downstream 3k by default

    ## downstream within dsd
    if(dsd/1000<=1){
        j <- which(annotation == "Intergenic" & abs(dd2) <= dsd & dd2 != 0)
        if(length(j)>0){
            lbs <- paste("Downstream (<=", dsd, "bp)", sep="")
            annotation[j] <- lbs
        }
    }else{
        
        ## downstream within 0-dsd/1000 kb
        for(i in 1:(dsd/1000)){
            j <- which(annotation == "Intergenic" & abs(dd2) <= i*1000 & dd2 != 0)
            if (length(j) > 0){
                if (i == 1){
                    lbs <- "Downstream (<1kb)"
                }else{
                    lbs <- paste("Downstream (", i-1, "-", i, "kb)", sep="")
                }
		annotation[j] <- lbs
            }
        }
        
        ## downstream (dsd/1000) kb - dsd bp
        z <- which(annotation == "Intergenic" & abs(dd2) <= dsd & dd2 != 0)
        if(length(z)>0){
            lbs <- paste("Downstream (",dsd/1000,"kb-", dsd, "bp)", sep="")
            annotation[z] <- lbs
        }
    }
    annotation[which(annotation == "Intergenic")] = "Distal Intergenic"

    downstreamIndex <- dd2 > 0 & dd2 < dsd
    detailGenomicAnnotation[downstreamIndex, "downstream"] <- TRUE
    detailGenomicAnnotation[which(annotation == "Distal Intergenic"), "distal_intergenic"] <- TRUE
    return(list(annotation=annotation, detailGenomicAnnotation=detailGenomicAnnotation))
}


##' @import BiocGenerics S4Vectors IRanges
getGenomicAnnotation.internal <- function(peaks, genomicRegion, type, sameStrand=FALSE){
    GRegion <- unlist(genomicRegion)
    GRegionLen <- elementNROWS(genomicRegion)

    names(GRegionLen) <- names(genomicRegion)
    GRegion$gene_id <- rep(names(genomicRegion), times=GRegionLen)


    if (type == "Intron") {
        gr2 <- GRegion[!duplicated(GRegion$gene_id)]
        strd <- as.character(strand(gr2))
        len <- GRegionLen[GRegionLen != 0]

        GRegion$intron_rank <- lapply(seq_along(strd), function(i) {
            rank <- seq(1, len[i])
            if (strd[i] == '-')
                rank <- rev(rank)
            return(rank)
        }) %>% unlist
    }

    if (type == "Intron" || type =="Exon") {
        nn <- TXID2EG(names(genomicRegion))
        names(GRegionLen) <- nn
        GRegion$gene_id <- rep(nn, times=GRegionLen)
    }

    ## find overlap
    if (sameStrand) {
        GRegionHit <- findOverlaps(peaks, GRegion)
    } else {
        GRegionHit <- findOverlaps(peaks, unstrand(GRegion))
    }

    if (length(GRegionHit) == 0) {
        return(NA)
    }
    qh <- queryHits(GRegionHit)
    hit.idx <- getFirstHitIndex(qh)
    GRegionHit <- GRegionHit[hit.idx]
    queryIndex <- queryHits(GRegionHit)
    subjectIndex <- subjectHits(GRegionHit)

    hits <- GRegion[subjectIndex]
    geneID <- hits$gene_id

    if (type == "Intron") {
        anno <- paste(type, " (", geneID, ", intron ", hits$intron_rank,
                      " of ", GRegionLen[geneID], ")", sep="")
    } else if (type == "Exon") {
        anno <- paste(type, " (", geneID, ", exon ", hits$exon_rank,
                      " of ", GRegionLen[geneID], ")", sep="")
    } else if (type == "fiveUTR") {
        anno <- "5' UTR"
    } else if (type == "threeUTR") {
        anno <- "3' UTR"
    } else {
        anno <- type
    }
    res <- list(queryIndex=queryIndex, annotation=anno, gene=geneID)
    return(res)
}


================================================
FILE: R/getNearestFeatureIndicesAndDistances.R
================================================
##' get index of features that closest to peak and calculate distance
##'
##'
##' @title getNearestFeatureIndicesAndDistances
##' @param peaks peak in GRanges
##' @param features features in GRanges
##' @param sameStrand logical, whether find nearest gene in the same strand
##' @param ignoreOverlap logical, whether ignore overlap of TSS with peak
##' @param ignoreUpstream logical, if True only annotate gene at the 3' of the peak.
##' @param ignoreDownstream logical, if True only annotate gene at the 5' of the peak.
##' @param overlap one of "TSS" or "all"
##' @return list
##' @import BiocGenerics IRanges GenomicRanges
##' @author G Yu
getNearestFeatureIndicesAndDistances <- function(peaks, features,
                                                 sameStrand = FALSE,
                                                 ignoreOverlap=FALSE,
                                                 ignoreUpstream=FALSE,
                                                 ignoreDownstream=FALSE,
                                                 overlap = "TSS") {

    overlap <- match.arg(overlap, c("TSS", "all"))

    if (!ignoreOverlap && overlap == "all") {
        overlap_hit <- findOverlaps(peaks, unstrand(features))
    }

    ## peaks only conatin all peak records, in GRanges object
    ## feature is the annotation in GRanges object

    ## only keep start position based on strand
    ## start(features) <- end(features) <- ifelse(strand(features) == "+", start(features), end(features))
    features <- resize(features, width=1) # faster

    ## add dummy NA feature for peaks that are at the last or first feature
    ## suggested by Michael Kluge
    features.bak <- features
    seqlevels(features) <- c(seqlevels(features), "chrNA")
    dummy <- GRanges("chrNA", IRanges(1,1))

    ## dummy$tx_id <- -1
    ## dummy$tx_name <- "NA"

    cns <- names(mcols(features))
    for (cn in cns) {
        if (grepl('id', cn)) {
            mcols(dummy)[[cn]] <- -1
        } else {
            mcols(dummy)[[cn]] <- NA
        }
    }

    features <- append(features, dummy)
    dummyID <- length(features)

    if (sameStrand) {
        ## nearest from peak start
        ps.idx <- follow(peaks, features)

        ## nearest from peak end
        pe.idx <- precede(peaks, features)
    } else {
        ps.idx <- follow(peaks, unstrand(features))
        pe.idx <- precede(peaks, unstrand(features))
    }

    na.idx <- is.na(ps.idx) & is.na(pe.idx)
    if (sum(na.idx) > 0) { ## suggested by Thomas Schwarzl
        ps.idx <- ps.idx[!na.idx]
        pe.idx <- pe.idx[!na.idx]
        ##peaks <- peaks[!na.idx]
    }

    # set NA values to dummy value if only one entry is affected
    ps.idx[is.na(ps.idx)] <- dummyID
    pe.idx[is.na(pe.idx)] <- dummyID

    ## features from nearest peak start
    psF <- features[ps.idx]

    ## feature distances from peak start
    psD <- ifelse(strand(psF) == "+", 1, -1) *
        (start(peaks[!na.idx]) - start(psF))
    psD[ps.idx == dummyID] <- Inf # ensure that there is even no match if a seq with name "chrNA" exists

    ## features from nearest peak end
    peF <- features[pe.idx]
    ## feature distances from peak end
    peD <- ifelse(strand(peF) == "+", 1, -1) *
        (end(peaks[!na.idx]) - start(peF))
    peD[pe.idx == dummyID] <- Inf # ensure that there is even no match if a seq with name "chrNA" exists

    ## restore the old feature object
    features <- features.bak

    pse <- data.frame(ps=psD, pe=peD)
    if (ignoreUpstream) {
        j <- rep(2, nrow(pse))
    } else if (ignoreDownstream) {
        j <- rep(1, nrow(pse))
    } else {
        j <- apply(pse, 1, function(i) which.min(abs(i)))
    }

    ## index
    idx <- ps.idx
    idx[j==2] <- pe.idx[j==2]

    ## distance
    dd <- psD
    dd[j==2] <- peD[j==2]

    index <- distanceToTSS <- rep(NA, length(peaks))
    distanceToTSS[!na.idx] <- dd
    index[!na.idx] <- idx

    if (!ignoreOverlap) {
        ## hit <- findOverlaps(peaks, unstrand(features))

        if (overlap == "all") {
            hit <- overlap_hit
            if ( length(hit) != 0 ) {
                qh <- queryHits(hit)
                hit.idx <- getFirstHitIndex(qh)
                hit <- hit[hit.idx]
                peakIdx <- queryHits(hit)
                featureIdx <- subjectHits(hit)

                index[peakIdx] <- featureIdx
                distance_both_end <- data.frame(start=start(peaks[peakIdx]) - start(features[featureIdx]),
                                          end = end(peaks[peakIdx]) - start(features[featureIdx]))
                distance_idx <- apply(distance_both_end, 1, function(i) which.min(abs(i)))
                distance_minimal <- distance_both_end[,1]
                distance_minimal[distance_idx == 2] <- distance_both_end[distance_idx==2, 2]

                distanceToTSS[peakIdx] <- distance_minimal * ifelse(strand(features[featureIdx]) == "+", 1, -1)

            }
        }

        hit <- findOverlaps(peaks, unstrand(features))

        if ( length(hit) != 0 ) {
            qh <- queryHits(hit)
            hit.idx <- getFirstHitIndex(qh)
            hit <- hit[hit.idx]
            peakIdx <- queryHits(hit)
            featureIdx <- subjectHits(hit)

            index[peakIdx] <- featureIdx
            distanceToTSS[peakIdx] <- 0
        }

    }

    j <- is.na(distanceToTSS) | is.na(index)

    res <- list(index=index[!j],
                distance=distanceToTSS[!j],
                peak=peaks[!j])

    return(res)
}

isPeakFeatureOverlap <- function(peak, feature) {
    peakRange <- ranges(peak)
    featureRange <- ranges(feature)
    x <- intersect(peakRange, featureRange)
    return(length(x) != 0)
}


================================================
FILE: R/plotAnno.R
================================================
##' plot feature distribution based on their chromosome region
##'
##' plot chromosome region features
##' @title plotAnnoBar.data.frame
##' @rdname plotAnnoBar
##' @param anno.df annotation stats
##' @param xlab xlab
##' @param ylab ylab
##' @param title plot title
##' @param categoryColumn category column
##' @return bar plot that summarize genomic features of peaks
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 aes_string
##' @importFrom ggplot2 geom_bar
##' @importFrom ggplot2 coord_flip
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_fill_manual
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 ggtitle
##' @importFrom ggplot2 guide_legend
##' @seealso \code{\link{annotatePeak}} \code{\link{plotAnnoPie}}
##' @author Guangchuang Yu \url{https://yulab-smu.top}
plotAnnoBar.data.frame <- function(anno.df,
                                   xlab="",
                                   ylab="Percentage(%)",
                                   title="Feature Distribution",
                                   categoryColumn) {


    anno.df$Feature <- factor(anno.df$Feature, levels = rev(levels(anno.df$Feature)))

    p <- ggplot(anno.df, aes_string(x = categoryColumn,
                                    fill = "Feature",
                                    y = "Frequency"))

    p <- p + geom_bar(stat="identity") + coord_flip() + theme_bw()
    p <- p + ylab(ylab) + xlab(xlab) + ggtitle(title)

    if (categoryColumn == 1) {
        p <- p + scale_x_continuous(breaks=NULL)
        p <- p+scale_fill_manual(values=rev(getCols(nrow(anno.df))), guide=guide_legend(reverse=TRUE))
    } else {
        p <- p+scale_fill_manual(values=rev(getCols(length(unique(anno.df$Feature)))), guide=guide_legend(reverse=TRUE))
    }

    return(p)
}

##' pieplot from peak genomic annotation
##'
##'
##' @title plotAnnoPie
##' @rdname plotAnnoPie
##' @param x csAnno object
##' @param ndigit number of digit to round
##' @param cex label cex
##' @param col color
##' @param legend.position topright or other.
##' @param pie3D plot in 3D or not
##' @param radius radius of Pie
##' @param ... extra parameter
##' @return pie plot of peak genomic feature annotation
##' @examples
##' \dontrun{
##' require(TxDb.Hsapiens.UCSC.hg19.knownGene)
##' txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
##' peakfile <- system.file("extdata", "sample_peaks.txt", package="chipseeker")
##' peakAnno <- annotatePeak(peakfile, TxDb=txdb)
##' plotAnnoPie(peakAnno)
##' }
##' @seealso \code{\link{annotatePeak}} \code{\link{plotAnnoBar}}
##' @export
##' @author Guangchuang Yu \url{https://yulab-smu.top}
plotAnnoPie.csAnno <- function(x,
                        ndigit=2,
                        cex=0.8,
                        col=NA,
                        legend.position="rightside",
                        pie3D=FALSE,
                        radius=0.8,
                        ...){

    anno.df <- getAnnoStat(x)
    if (is.na(col[1])) {
        col <- getCols(nrow(anno.df))
    }

    if (pie3D)
        annoPie3D(anno.df, ndigit=ndigit, cex=cex, col=col, ...)

    annoPie(anno.df, ndigit=ndigit, cex=cex, col=col, legend.position=legend.position, radius=radius, ...)
 }

##' @importFrom grDevices colorRampPalette
##' @importFrom graphics par
##' @importFrom graphics layout
##' @importFrom graphics pie
##' @importFrom graphics legend
##' @importFrom graphics plot.new
annoPie <- function(anno.df, ndigit=2, cex=0.8, col=NA, legend.position, radius=0.8, ...) {
    if ( ! all(c("Feature", "Frequency") %in% colnames(anno.df))) {
        stop("check your input...")
    }

    if (legend.position == "rightside") {
        labels=paste(anno.df$Feature, " (",
            round(anno.df$Frequency/sum(anno.df$Frequency)*100, ndigit),
            "%)", sep="")

        par(mai = c(0,0,0,0))
        layout(matrix(c(1,2), ncol=2), widths=c(0.6,0.4))
        pie(anno.df$Frequency, labels=NA, cex=cex, col=col, ...)
        plot.new()
        legend("center", legend = labels, fill = col, bty = "n", cex = cex)
    } else {
        par(mai = c(0,0,0,0))
        pie(anno.df$Frequency,
            ##     ## labels=paste(round(anno.df$Frequency/sum(anno.df$Frequency)*100, 2), "%", sep=""),
            labels=paste(anno.df$Feature, " (",
                round(anno.df$Frequency/sum(anno.df$Frequency)*100, ndigit),
                "%)", sep=""),
            cex=cex,
            col=col,
            radius=radius,
            ...
            )
    }
}

## @param ndigit ndigit
## @param radius the radius of the pie
## @param explode the amount to "explode" the pie
## @param labelcex label font size
## @importFrom plotrix pie3D
annoPie3D <- function(anno.df,
                      ndigit=2,
                      cex=1,
                      ...){

    ## anno.df <- getGenomicAnnoStat(peakAnno)

    pkg <- "plotrix"
    require(pkg, character.only=TRUE)
    pie3D <- eval(parse(text="pie3D"))
    pie3D(anno.df$Frequency,
          labels=paste(
              anno.df$Feature,
              "(",
              paste(round(anno.df$Frequency, ndigit), "%", sep=""),
              ")",
              sep=""),
          labelcex=cex,
          col=col,
          ...)
}

getGenomicAnnoStat <- function(peakAnno) {
    if(inherits(peakAnno,"GRanges"))
        peakAnno <- as.data.frame(peakAnno)
    anno <- peakAnno$annotation
    ## anno <- sub(" \\(.+", "", anno)

    e1 <- getOption("ChIPseeker.ignore_1st_exon")
    i1 <- getOption("ChIPseeker.ignore_1st_intron")
    ids <- getOption("ChIPseeker.ignore_downstream")

    if (is.null(e1) || !e1) {
        e1lab <- "1st Exon"
        anno[grep("exon 1 of", anno)] <- e1lab
        exonlab <- "Other Exon"
    } else {
        e1lab <- NULL
        exonlab <- "Exon"
    }

    if (is.null(i1) || !i1) {
        i1lab <- "1st Intron"
        anno[grep("intron 1 of", anno)] <- i1lab
        intronlab <- "Other Intron"
    } else {
        i1lab <- NULL
        intronlab <- "Intron"
    }

    anno[grep("Exon \\(", anno)] <- exonlab
    anno[grep("Intron \\(", anno)] <- intronlab

    if (is.null(ids) || !ids) {
        dsd <- getOption("ChIPseeker.downstreamDistance")
        if (is.null(dsd))
            dsd <- 3000 ## downstream 3k by default
        if (dsd > 1000) {
            dsd <- round(dsd/1000, 1)
            dsd <- paste0(dsd, "kb")
        }
        dslab <- paste0("Downstream (<=", dsd, ")")

        anno[grep("Downstream", anno)] <- dslab
        iglab <- "Distal Intergenic"
    } else {
        dslab <- NULL
        iglab <- "Intergenic"
        anno[grep("Downstream", anno)] <- iglab
    }
    anno[grep("^Distal", anno)] <- iglab

    lvs <- c(
        "5' UTR",
        "3' UTR",
        e1lab,
        exonlab,
        i1lab,
        intronlab,
        dslab,
        iglab
    )

    promoter <- unique(anno[grep("Promoter", anno)])
    ip <- getOption("ChIPseeker.ignore_promoter_subcategory")
    if ((is.null(ip) || !ip) && (length(promoter) > 0)) {
        plab <- sort(as.character(promoter))
    } else {
        plab <- "Promoter"
        anno[grep("^Promoter", anno)] <- plab
    }
    lvs <- c(plab, lvs)

    ## count frequency
    anno.table <- table(anno)

    ## calculate ratio
    anno.ratio <- anno.table/ sum(anno.table) * 100
    anno.df <- as.data.frame(anno.ratio)
    colnames(anno.df) <- c("Feature", "Frequency")

    anno.df$Feature <- factor(anno.df$Feature, levels=lvs[lvs %in% anno.df$Feature])
    anno.df <- anno.df[order(anno.df$Feature),]
    return(anno.df)
}






================================================
FILE: R/plotDistToTSS.R
================================================
merge_two_si = function(x1, x2){
  if (length(unique(gsub("^[0-9]+","",c(x1, x2)))) == 1){
    return(paste0(gsub("[^0-9]*$","",x1), "-", x2))
  } else {
    return(paste0(x1, "-", x2))
  }
}

generate_break_lbs = function(breaks) {
  lbs = c()
  
  # break labels
  break_labels = scales::label_number(scale_cut = scales::cut_si(unit = "b"))(breaks)
  break_labels = gsub(" b$"," bp", break_labels)

  # category labels
  for (i in 2:length(breaks)) {
    if (i == length(breaks)) {
      lbs = c(lbs, paste0(">", break_labels[i-1]))
    } else {
      lbs = c(lbs, merge_two_si(break_labels[i-1], break_labels[i]))
    }
  }
  
  return(lbs)
}

generate_colors = function(palette = NULL, n) {
  # old color in version <= 1.41.1
  old_color = c("#9ecae1", "#3182bd", "#C7A76C", "#86B875", "#39BEB1", "#CD99D8")
  if (is.null(palette)){
    brewer_cols = old_color
  } else if (length(palette) == 1 && is_valid_palette(palette)){
    brewer_cols = RColorBrewer::brewer.pal(
      name = palette, 
      n = RColorBrewer::brewer.pal.info[palette, "maxcolors"]
    ) |> rev()     
  } else if (all(is_valid_color(palette))){
    brewer_cols = palette
  }
  else {
    warning("Your palette is non-valid, switching to default...")
    brewer_cols = old_color
  }
  
  if (length(brewer_cols) >= n) {
    cols = brewer_cols[1:length(brewer_cols)]
  } else {
    cols = grDevices::colorRampPalette(brewer_cols)(n)
  }
  
  return(cols)
}

is_valid_palette = function(palette){
  palette %in% rownames(RColorBrewer::brewer.pal.info)
}

is_valid_color = function(color){
  tryCatch({
    grDevices::col2rgb(color)
    TRUE
  }, error = function(e) {
    FALSE
  })
}

##' plot feature distribution based on the distances to the TSS
##'
##'
##' @title plotDistToTSS.data.frame
##' @param peakDist peak annotation
##' @param distanceColumn column name of the distance from peak to nearest gene
##' @param distanceBreaks default is 'c(0, 1000, 3000, 5000, 10000, 100000)'
##' @param palette palette name for coloring different distances. Run `RColorBrewer::display.brewer.all()` to see all applicable values.
##' @param xlab x label
##' @param ylab y lable
##' @param title figure title
##' @param categoryColumn category column, default is ".id"
##' @return bar plot that summarize distance from peak to
##' TSS of the nearest gene.
##' @importFrom magrittr %<>%
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 aes
##' @importFrom ggplot2 aes_string
##' @importFrom ggplot2 geom_bar
##' @importFrom ggplot2 geom_hline
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 coord_flip
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 ggtitle
##' @importFrom ggplot2 geom_hline
##' @importFrom ggplot2 scale_y_continuous
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_fill_brewer
##' @importFrom ggplot2 scale_fill_hue
##' @importFrom ggplot2 scale_fill_manual
##' @importFrom ggplot2 geom_text
##' @importFrom rlang .data
##' @examples
##' \dontrun{
##' require(TxDb.Hsapiens.UCSC.hg19.knownGene)
##' txdb <- TxDb.Hsapiens.UCSC.hg19.knownGene
##' peakfile <- system.file("extdata", "sample_peaks.txt", package="ChIPseeker")
##' peakAnno <- annotatePeak(peakfile, TxDb=txdb)
##' plotDistToTSS(peakAnno)
##' }
##' @seealso \code{\link{annotatePeak}}
##' @author Guangchuang Yu \url{https://guangchuangyu.github.io}
plotDistToTSS.data.frame <- function(peakDist,
                                     distanceColumn="distanceToTSS",
                                     distanceBreaks=c(0, 1000, 3000, 5000, 10000, 100000),
                                     palette = NULL,
                                     xlab="",
                                     ylab="Binding sites (%) (5'->3')",
                                     title="Distribution of transcription factor-binding loci relative to TSS",
                                     categoryColumn = ".id") {

    distanceBreaks = sort(distanceBreaks)
    hasZero = sum(distanceBreaks == 0)
    if (!hasZero) distanceBreaks = c(0, distanceBreaks)
    hasInf = sum(is.infinite(distanceBreaks))
    if (!hasInf) distanceBreaks = c(distanceBreaks, Inf)
    lbs = generate_break_lbs(distanceBreaks)
    peakDist$Feature = cut(abs(peakDist[[distanceColumn]]), 
                           breaks = distanceBreaks,
                           labels = lbs,
                           include.lowest = TRUE)

    ## sign containing -1 and 1 for upstream and downstream
    peakDist$sign <- sign(peakDist[,distanceColumn])

    ## count frequencies
    if (categoryColumn == 1) {
      peakDist = peakDist |> 
        summarise(freq = length(.data$Feature), .by = c("Feature", "sign")) |> 
        mutate(freq = .data$freq/sum(.data$freq) * 100)
    } else {
      peakDist = peakDist |> 
        summarise(freq = length(.data$Feature), .by = c(categoryColumn, "Feature", "sign")) |> 
        mutate(freq = .data$freq/sum(.data$freq) * 100, .by = categoryColumn)
    }

    if (any(peakDist$sign == 0)) {
        zeroDist <- peakDist[peakDist$sign == 0,]
        zeroDist$freq <- zeroDist$freq/2
        zeroDist$sign <- -1
        peakDist[peakDist$sign == 0,] <- zeroDist
        zeroDist$sign <- 1
        peakDist <- rbind(peakDist, zeroDist)
    }

    if (categoryColumn == 1) {
        peakDist %<>% group_by(.data$Feature, .data$sign) %>%
            summarise(freq = sum(.data$freq))

        totalFreq <- peakDist %>% group_by(.data$sign) %>%
            summarise(total = sum(.data$freq))
    } else {
        peakDist %<>% group_by(.data$.id, .data$Feature, .data$sign) %>%
            summarise(freq = sum(.data$freq))
        totalFreq <- peakDist %>% group_by(.data$.id, .data$sign) %>%
            summarise(total = sum(.data$freq))
    }


    ## preparing ylim and y tick labels
    ds = max(totalFreq$total[totalFreq$sign == 1])
    dslim = ceiling(ds/10) * 10
    us = max(totalFreq$total[totalFreq$sign == -1])
    uslim = ceiling(us/10) * 10
    ybreaks <- seq(-uslim, dslim, by=10)
    ylbs <- abs(ybreaks)
    ylbs[ylbs == 0] <- "TSS"

    peakDist$Feature <- factor(peakDist$Feature, levels=rev(levels(peakDist$Feature)))
    if (categoryColumn == 1) {
        p <- ggplot(peakDist, aes(x=1, fill=.data$Feature))
    } else {
        p <- ggplot(peakDist, aes(x=.data[[categoryColumn]], fill=.data$Feature))
    }

    p <- p + geom_bar(data=subset(peakDist, sign==1), aes(y=.data$freq), stat="identity") +
        geom_bar(data=subset(peakDist, sign==-1), aes(y=-.data$freq), stat="identity")

    p <- p + geom_hline(yintercept = 0, colour = "black") +
        coord_flip() + theme_bw() +
            scale_y_continuous(breaks=ybreaks,labels=ylbs)

    p <- p + ylab(ylab) + xlab(xlab) + ggtitle(title)

    if (categoryColumn == 1) {
        p <- p + scale_x_continuous(breaks=NULL)
    }

    cols <- generate_colors(palette = palette, n = length(lbs))
    p <- p + scale_fill_manual(values=rev(cols), guide=guide_legend(reverse=TRUE))

    return(p)
}


================================================
FILE: R/plotTagMatrix.R
================================================
##' plot the profile of peaks
##'`
##' \code{plotPeakProf_MultiWindows()} is almost the same as \code{plotPeakProf2()}, having
##' the main difference of accepting two or more granges objects. Accepting more
##' granges objects can help compare the same peaks in different windows.
##' 
##' \code{TxDb} parameter can accept txdb object.
##' But many regions can not be obtained by txdb object. In this case,
##' Users can provide self-made granges served the same role 
##' as txdb object and pass to \code{TxDb} object.
##' 
##' \code{by} the features of interest. 
##' 
##' (1) if users use \code{txdb}, \code{by} can be one of 'gene', 'transcript', 'exon', 
##' 'intron' , '3UTR' , '5UTR', 'UTR'. These features can be obtained by functions from txdb object.
##' 
##' (2) if users use self-made granges object, \code{by} can be everything. Because this \code{by}
##' will not pass to functions to get features, which is different from the case of using 
##' txdb object. This \code{by} is only used to made labels showed in picture.
##' 
##' \code{type} means the property of the region. one of the "start site",
##' "end site" and "body".
##' 
##' \code{upstream} and \code{downstream} parameter have different usages:
##' 
##' (1) if \code{type == 'body'}, \code{upstream} and \code{downstream} can use to extend 
##' the flank of body region.
##' 
##' (2) if \code{type == 'start_site'/'end_site'}, \code{upstream} and \code{downstream} refer to
##' the upstream and downstream of the start_site or the end_site.
##' 
##' \code{weightCol} refers to column in peak file. This column acts as a weight value. Details
##' see \url{https://github.com/YuLab-SMU/ChIPseeker/issues/15}
##' 
##' \code{nbin} refers to the number of bins. \code{getTagMatrix()} provide a binning method
##' to get the tag matrix.
##' 
##' There are two ways input a list of window.
##' 
##' (1) Users can input a list of self-made granges objects
##' 
##' (2) Users can input a list of \code{by} and only one \code{type}. In this way, 
##' \code{plotPeakProf_MultiWindows()} can made a list of window from txdb object based on \code{by} and \code{type}.
##' 
##' Warning: 
##' 
##' (1) All of these window should be the same type. It means users can only
##' compare a list of "start site"/"end site"/"body region" with the same upstream
##' and downstream.
##' 
##' (2) So it will be only one \code{type} and several \code{by}.
##' 
##' (3) Users can make window by txdb object or self-made granges object. Users can only
##' choose one of 'gene', 'transcript', 'exon', 'intron' , '3UTR' , '5UTR' or 'UTR' in the
##' way of using txdb object. User can input any \code{by} in the way of using 
##' self-made granges object.
##' 
##' (4) Users can mingle the \code{by} designed for the two ways. \code{plotPeakProf_MultiWindows} can
##' accpet the hybrid \code{by}. But the above rules should be followed.
##' 
##' \url{https://github.com/YuLab-SMU/ChIPseeker/issues/189}
##'
##' @title plotPeakProf_MultiWindows
##' 
##' @param tagMatrix tagMatrix or a list of tagMatrix
##' @param peak peak file or GRanges object
##' @param weightCol column name of weight
##' @param TxDb TxDb object or self-made granges objects
##' @param upstream upstream position
##' @param downstream downstream position
##' @param by feature of interest
##' @param type one of "start_site", "end_site", "body"
##' @param windows_name the name for each window, which will also be showed in the picture as labels
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param verbose print message or not
##' @param nbin the amount of bines 
##' @param ignore_strand ignore the strand information or not
##' @param ... additional parameter
##' @return ggplot object
##' @importFrom methods is
##' @importFrom methods as
##' @importFrom methods missingArg
##' @importFrom methods new
##' @export
plotPeakProf <- function(tagMatrix = NULL,
                         peak,
                         upstream,
                         downstream,
                         conf,
                         by,
                         type,
                         windows_name = NULL,
                         weightCol = NULL,
                         TxDb = NULL,
                         xlab = "Genomic Region (5'->3')",
                         ylab = "Peak Count Frequency",
                         facet = "row",
                         free_y = TRUE,
                         verbose = TRUE,
                         nbin = NULL,
                         ignore_strand = FALSE,
                         ...){
  
  if(is.null(tagMatrix)){
    
    conf <- if(missingArg(conf)) NA else conf
    upstream <- if(missingArg(upstream)) NULL else upstream
    downstream <- if(missingArg(downstream)) NULL else downstream
    
    if(length(by) == 1){
      
      plotPeakProf2(peak = peak, 
                    upstream = upstream, 
                    downstream = downstream,
                    conf = conf,
                    by = by,
                    type = type,
                    weightCol = weightCol, 
                    TxDb = TxDb,
                    xlab = xlab,
                    ylab = ylab,
                    facet = facet,
                    free_y = free_y,
                    verbose = verbose, 
                    nbin = nbin,
                    ignore_strand = ignore_strand,
                    ...)
      
    }else{
      
      if(is.null(windows_name) && !is.null(names(TxDb)))
        windows_name <- names(TxDb)
      
      plotPeakProf_MultiWindows(peak = peak,
                                upstream = upstream,
                                downstream = downstream,
                                conf = conf,
                                by = by,
                                type = type,
                                windows_name = windows_name,
                                weightCol = weightCol,
                                TxDb = TxDb,
                                xlab = xlab,
                                ylab = ylab,
                                facet = facet,
                                free_y = free_y,
                                verbose = verbose,
                                nbin = nbin,
                                ignore_strand = ignore_strand,
                                ...)
      
    }
    
  }else{
    
    if(is(tagMatrix, "list")){
      upstream <- attr(tagMatrix[[1]], 'upstream')
      downstream <- attr(tagMatrix[[1]], 'downstream')
      label <- attr(tagMatrix[[1]], 'label')
      attr(tagMatrix, 'type') <- attr(tagMatrix[[1]], 'type')
      attr(tagMatrix, 'is.binning') <- attr(tagMatrix[[1]], 'is.binning')
      
    }else{
      upstream <- attr(tagMatrix, 'upstream')
      downstream <- attr(tagMatrix, 'downstream')
      label <- attr(tagMatrix, 'label')
    }
    
    
    if(attr(tagMatrix, 'is.binning')){
      
      if (!(missingArg(conf) || is.na(conf))){
        
        plotAvgProf.binning(tagMatrix = tagMatrix, 
                            xlab = xlab,
                            ylab = ylab,
                            conf = conf,
                            facet = facet, 
                            free_y = free_y,
                            upstream = upstream,
                            downstream = downstream,
                            label = label,
                            ...)
        
      }else{
        
        plotAvgProf.binning(tagMatrix = tagMatrix, 
                            xlab = xlab,
                            ylab = ylab,
                            facet = facet, 
                            free_y = free_y,
                            upstream = upstream,
                            downstream = downstream,
                            label = label,
                            ...)
        
      }
      
      
    }else{
      
      xlim <- c(-upstream, downstream)
      
      if (!(missingArg(conf) || is.na(conf))){
        
        plotAvgProf (tagMatrix = tagMatrix, 
                     xlim = xlim,
                     xlab = xlab,
                     ylab = ylab,
                     conf = conf,
                     facet = facet, 
                     free_y = free_y,
                     origin_label = label,
                     ...)
        
      }else{
        
        plotAvgProf (tagMatrix = tagMatrix, 
                     xlim = xlim,
                     xlab = xlab,
                     ylab = ylab,
                     facet = facet, 
                     free_y = free_y,
                     origin_label = label,
                     ...)
        
      }
      
      
    }
  }
  
}


##' plot the profile of peaks
##'
##'
##' @title plotAvgProf
##' @param tagMatrix tagMatrix or a list of tagMatrix
##' @param xlim xlim
##' @param xlab x label
##' @param ylab y label
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param origin_label label of the center
##' @param verbose print message or not
##' @param ... additional parameter
##' @return ggplot object
##' @author G Yu; Y Yan
##' @export
plotAvgProf <- function(tagMatrix, xlim,
                        xlab="Genomic Region (5'->3')",
                        ylab = "Peak Count Frequency",
                        conf,
                        facet="none", 
                        free_y = TRUE, 
                        origin_label = "TSS",
                        verbose = TRUE,
                        ...) {
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)

  if (verbose) {
      cat(">> plotting figure...\t\t\t",
          format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  
  conf <- if(missingArg(conf)) NA else conf
  
  if (!(missingArg(conf) || is.na(conf))){
    p <- plotAvgProf.internal(tagMatrix = tagMatrix, 
                              conf = conf, 
                              xlim = xlim,
                              xlab = xlab, 
                              ylab = ylab,
                              facet = facet, 
                              free_y = free_y, 
                              origin_label = origin_label,
                              ...)
  } else {
    p <- plotAvgProf.internal(tagMatrix, 
                              xlim = xlim,
                              xlab = xlab, 
                              ylab = ylab,
                              facet = facet, 
                              free_y = free_y, 
                              origin_label = origin_label,
                              ...)
  }
  return(p)
}


##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 geom_line
##' @importFrom ggplot2 geom_vline
##' @importFrom ggplot2 geom_ribbon
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_color_manual
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 element_blank
##' @importFrom ggplot2 facet_grid
plotAvgProf.internal <- function(tagMatrix, conf,
                                 xlim = c(-3000,3000),
                                 xlab = "Genomic Region (5'->3')",
                                 ylab = "Peak Count Frequency",
                                 facet="none", 
                                 free_y = TRUE,
                                 origin_label, 
                                 ...) {
  
  listFlag <- FALSE
  if (is(tagMatrix, "list")) {
    if ( is.null(names(tagMatrix)) ) {
      nn <- paste0("peak", seq_along(tagMatrix))
      warning("input is not a named list, set the name automatically to ", paste(nn, collapse=' '))
      names(tagMatrix) <- nn
      ## stop("tagMatrix should be a named list...")
    }
    listFlag <- TRUE
  }
  
  if ( listFlag ) {
    facet <- match.arg(facet, c("none", "row", "column"))
    if ( (xlim[2]-xlim[1]+1) != ncol(tagMatrix[[1]]) ) {
      stop("please specify appropreate xcoordinations...")
    }
  } else {
    if ( (xlim[2]-xlim[1]+1) != ncol(tagMatrix) ) {
      stop("please specify appropreate xcoordinations...")
    }
  }
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  ##
  conf <- if(missingArg(conf)) NA else conf
  
  pos <- value <- .id <- Lower <- Upper <- NULL
  
  if ( listFlag ) {
    tagCount <- lapply(tagMatrix, function(x) getTagCount(x, xlim = xlim, conf = conf, ...))
    tagCount <- list_to_dataframe(tagCount)
    tagCount$.id <- factor(tagCount$.id, levels=names(tagMatrix))
    p <- ggplot(tagCount, aes(pos, group=.id, color=.id))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper, fill = .id),
                           linetype = 0, alpha = 0.2)
    }
  } else {
    tagCount <- getTagCount(tagMatrix, xlim = xlim, conf = conf, ...)
    p <- ggplot(tagCount, aes(pos))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper),
                           linetype = 0, alpha = 0.2)
    }
  }
  
  p <- p + geom_line(aes(y = value))
  
  if ( 0 > xlim[1] && 0 < xlim[2] ) {
    p <- p + geom_vline(xintercept=0,
                        linetype="longdash")
    p <- p + scale_x_continuous(breaks=c(xlim[1], floor(xlim[1]/2),
                                         0,
                                         floor(xlim[2]/2), xlim[2]),
                                labels=c(paste0(xlim[1],"bp"), paste0(floor(xlim[1]/2),"bp"),
                                         origin_label, 
                                         paste0(floor(xlim[2]/2),"bp"), paste0(xlim[2], "bp")))
  }
  
  if (listFlag) {
    cols <- getCols(length(tagMatrix))
    p <- p + scale_color_manual(values=cols)
    if (facet == "row") {
      if (free_y) {
        p <- p + facet_grid(.id ~ ., scales = "free_y")
      } else {
        p <- p + facet_grid(.id ~ .)
      }
    } else if (facet == "column") {
      if (free_y) {
        p <-  p + facet_grid(. ~ .id, scales = "free_y")
      } else {
        p <-  p + facet_grid(. ~ .id)
      }
    }
  }
  p <- p+xlab(xlab)+ylab(ylab)
  p <- p + theme_bw() + theme(legend.title=element_blank())
  if(facet != "none") {
    p <- p + theme(legend.position="none")
  }
  return(p)
}

##' plot the profile of peaks that align to flank sequences of TSS
##'
##' This function is the old function of \code{plotPeakProf2}. It can
##' only plot the start site region of gene.
##'
##' @title plotAvgProf
##' @param peak peak file or GRanges object
##' @param weightCol column name of weight
##' @param TxDb TxDb object
##' @param upstream upstream position
##' @param downstream downstream position
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param verbose print message or not
##' @param ignore_strand ignore the strand information or not
##' @param ... additional parameter
##' @return ggplot object
##' @export
##' @author G Yu, Ming L
plotAvgProf2 <- function(peak, weightCol = NULL, TxDb = NULL,
                         upstream = 1000, downstream = 1000,
                         xlab = "Genomic Region (5'->3')",
                         ylab = "Peak Count Frequency",
                         conf,
                         facet = "none",
                         free_y = TRUE,
                         verbose = TRUE, 
                         ignore_strand = FALSE,
                         ...) {
  
  plotPeakProf2(peak = peak, 
                upstream = upstream, 
                downstream = downstream,
                conf,
                by = "gene",
                type = "start_site",
                weightCol = weightCol, 
                TxDb = TxDb,
                xlab = xlab,
                ylab = ylab,
                facet = facet,
                free_y = free_y,
                verbose = verbose, 
                ignore_strand = ignore_strand,
                ...)
  
}

##' plot the profile of peaks  by binning
##' 
##' 
##' @title plotAvgProf.binning
##' @param tagMatrix tagMatrix or a list of tagMatrix
##' @param xlab x label
##' @param ylab y label 
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled 
##' @param upstream rel object reflects the percentage of flank extension, e.g rel(0.2)
##'                 integer reflects the actual length of flank extension or TSS region
##'                 NULL reflects the gene body with no extension
##' @param downstream rel object reflects the percentage of flank extension, e.g rel(0.2)
##'                   integer reflects the actual length of flank extension or TSS region
##'                   NULL reflects the gene body with no extension
##' @param label label
##' @param ... additional parameter
##' @return ggplot object
##' @importFrom ggplot2 rel
plotAvgProf.binning <- function(tagMatrix, 
                                xlab = "Genomic Region (5'->3')",
                                ylab = "Peak Count Frequency",
                                conf,
                                facet ="none", 
                                free_y = TRUE,
                                upstream = NULL,
                                downstream = NULL,
                                label,
                                ...) {
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  conf <- if(missingArg(conf)) NA else conf
  
  if (!(missingArg(conf) || is.na(conf))){
    p <- plotAvgProf.binning.internal(tagMatrix , 
                                      conf = conf, 
                                      xlab = xlab, 
                                      ylab = ylab,
                                      facet = facet, 
                                      free_y = free_y,
                                      upstream = upstream,
                                      downstream = downstream,
                                      label = label,
                                      ...)
  } else {
    p <- plotAvgProf.binning.internal(tagMatrix , 
                                      xlab = xlab, 
                                      ylab = ylab,
                                      facet = facet, 
                                      free_y = free_y, 
                                      upstream = upstream,
                                      downstream = downstream,
                                      label = label,
                                      ...)
  }
  return(p)
}


##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 geom_line
##' @importFrom ggplot2 geom_vline
##' @importFrom ggplot2 geom_ribbon
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_color_manual
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 element_blank
##' @importFrom ggplot2 facet_grid
##' @importFrom ggplot2 rel
plotAvgProf.binning.internal <- function(tagMatrix, 
                                         conf,
                                         xlab = "Genomic Region (5'->3')",
                                         ylab = "Peak Count Frequency",
                                         facet="none", 
                                         free_y = TRUE,
                                         upstream = NULL,
                                         downstream = NULL,
                                         label,
                                         ...) {
  
  listFlag <- FALSE
  if (is(tagMatrix, "list")) {
    if ( is.null(names(tagMatrix )) ) {
      nn <- paste0("peak", seq_along(tagMatrix ))
      warning("input is not a named list, set the name automatically to ", paste(nn, collapse=' '))
      names(tagMatrix) <- nn
      ## stop("tagMatrix should be a named list...")
    }
    listFlag <- TRUE
  }
  
  if(listFlag){
    nbin <- dim(tagMatrix[[1]])[2]
  }else{
    nbin <- dim(tagMatrix)[2]
  }
  xlim <- c(1,nbin)
  
  if ( listFlag ) {
    facet <- match.arg(facet, c("none", "row", "column"))
  }
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  ##
  conf <- if(missingArg(conf)) NA else conf
  
  pos <- value <- .id <- Lower <- Upper <- NULL
  
  if ( listFlag ) {
    tagCount <- lapply(tagMatrix , function(x) getTagCount(x, xlim = xlim, conf = conf, ...))
    tagCount <- list_to_dataframe(tagCount)
    tagCount$.id <- factor(tagCount$.id, levels=names(tagMatrix ))
    p <- ggplot(tagCount, aes(pos, group=.id, color=.id))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper, fill = .id),
                           linetype = 0, alpha = 0.2)
    }
  } else {
    tagCount <- getTagCount(tagMatrix , xlim = xlim, conf = conf, ...)
    p <- ggplot(tagCount, aes(pos))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper),
                           linetype = 0, alpha = 0.2)
    }
  }
  
  p <- p + geom_line(aes(y = value))
  
  ## x_scale for genebody
  if(attr(tagMatrix, 'type') == 'body'){
    ## x_scale for gene body with no flank extension
    if(is.null(upstream)){
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*0.25),
                                           floor(nbin*0.5),
                                           floor(nbin*0.75),
                                           nbin),
                                  labels=c(label[1], 
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2]))
    }
    
    
    ## x_scale for flank extension by relative value
    if(inherits(upstream, 'rel')){
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(as.numeric(upstream)*100/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+25)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+50)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+75)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+100)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           nbin),
                                  labels=c(paste0("-",as.numeric(upstream)*100,"%"), 
                                           label[1],
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2],
                                           paste0("+",as.numeric(downstream)*100,"%")))
      p <- p + geom_vline(xintercept=floor(nbin*(as.numeric(upstream)*100/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                          linetype="longdash")
      
      p <- p + geom_vline(xintercept=floor(nbin*((as.numeric(upstream)*100+100)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                          linetype="longdash")
    }
    
    ## x_scale for flank extension by absolute value
    if(!is.null(upstream) & !inherits(upstream, 'rel')){
      
      upstreamPer <- floor(upstream/1000)*0.1
      downstreamPer <- floor(downstream/1000)*0.1
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(upstreamPer/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.25)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.5)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.75)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+1)/(1+upstreamPer+downstreamPer))),
                                           nbin),
                                  labels=c(paste0("-",upstream,"bp"), 
                                           label[1],
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2],
                                           paste0(downstream,"bp")))
      p <- p + geom_vline(xintercept=floor(nbin*(upstreamPer/(1+upstreamPer+downstreamPer))),
                          linetype="longdash")
      
      p <- p + geom_vline(xintercept=floor(nbin*((upstreamPer+1)/(1+upstreamPer+downstreamPer))),
                          linetype="longdash")
    }
  }
  
  
  ## x_scale for start region
  if(attr(tagMatrix, 'type') != 'body'){
    
    p <- p + scale_x_continuous(breaks=c(1, 
                                         floor(nbin*0.25),
                                         floor(nbin*0.5),
                                         floor(nbin*0.75),
                                         nbin),
                                labels=c(paste0("-",upstream,"bp"), 
                                         paste0("-",floor(upstream*0.5),"bp"),
                                         label,
                                         paste0(floor(downstream*0.5),"bp"),
                                         paste0(downstream,"bp")))
    
    p <- p + geom_vline(xintercept=floor(nbin*0.5),
                        linetype="longdash")
  }
  
  
  if (listFlag) {
    cols <- getCols(length(tagMatrix))
    p <- p + scale_color_manual(values=cols)
    if (facet == "row") {
      if (free_y) {
        p <- p + facet_grid(.id ~ ., scales = "free_y")
      } else {
        p <- p + facet_grid(.id ~ .)
      }
    } else if (facet == "column") {
      if (free_y) {
        p <-  p + facet_grid(. ~ .id, scales = "free_y")
      } else {
        p <-  p + facet_grid(. ~ .id)
      }
    }
  }
  p <- p+xlab(xlab)+ylab(ylab)
  p <- p + theme_bw() + theme(legend.title=element_blank())
  if(facet != "none") {
    p <- p + theme(legend.position="none")
  }
  return(p)
}


##' plot the profile of peaks automatically
##'
##' \code{peak} stands for the peak file. 
##' 
##' \code{by} the features of interest. 
##' 
##' (1) if users use \code{txdb}, \code{by} can be one of 'gene', 'transcript', 'exon', 
##' 'intron' , '3UTR' , '5UTR', 'UTR'. These features can be obtained by functions from txdb object.
##' 
##' (2) if users use self-made granges object, \code{by} can be everything. Because this \code{by}
##' will not pass to functions to get features, which is different from the case of using 
##' txdb object. This \code{by} is only used to made labels showed in picture.
##' 
##' \code{type} means the property of the region. one of the "start site",
##' "end site" and "body".
##' 
##' \code{upstream} and \code{downstream} parameter have different usages:
##' 
##' (1) if \code{type == 'body'}, \code{upstream} and \code{downstream} can use to extend 
##' the flank of body region.
##' 
##' (2) if \code{type == 'start_site'/'end_site'}, \code{upstream} and \code{downstream} refer to
##' the upstream and downstream of the start_site or the end_site.
##' 
##' \code{weightCol} refers to column in peak file. This column acts as a weight vaule. Details
##' see \url{https://github.com/YuLab-SMU/ChIPseeker/issues/15}
##' 
##' \code{nbin} refers to the number of bins, providing a binning method
##' to get the tag matrix.
##' 
##' \code{TxDb} parameter can accept txdb object.
##' But many regions can not be obtained by txdb object. In this case,
##' Users can provide self-made granges served the same role 
##' as txdb object and pass to \code{TxDb} object.
##' 
##' \code{plotPeakProf2()} is different from the \code{plotPeakProf()}. \code{plotPeakProf2()} do not
##' need to provide \code{window} parameter, which means \code{plotPeakProf2()} will call relevent
##' functions to make \code{window} automatically.
##'
##' @title plotPeakProf2
##' @param peak peak file or GRanges object
##' @param weightCol column name of weight
##' @param TxDb TxDb object, or self-made granges object
##' @param upstream upstream position
##' @param downstream downstream position
##' @param by e.g. 'gene', 'transcript', 'exon' or features of interest(e.g. "enhancer")
##' @param type one of "start_site", "end_site", "body"
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param verbose print message or not
##' @param nbin the amount of nbines 
##' @param ignore_strand ignore the strand information or not
##' @param ... additional parameter
##' @return ggplot object
##' @export
##' @author G Yu, Ming Li
plotPeakProf2 <- function(peak, 
                          upstream, 
                          downstream,
                          conf,
                          by,
                          type,
                          weightCol = NULL, 
                          TxDb = NULL,
                          xlab = "Genomic Region (5'->3')",
                          ylab = "Peak Count Frequency",
                          facet = "none",
                          free_y = TRUE,
                          verbose = TRUE, 
                          nbin = NULL,
                          ignore_strand = FALSE,
                          ...){
  
  conf <- if(missingArg(conf)) NA else conf
  upstream <- if(missingArg(upstream)) NULL else upstream
  downstream <- if(missingArg(downstream)) NULL else downstream
  
  if ( is(peak, "list") ) {
    tagMatrix <- lapply(peak, getTagMatrix, 
                        upstream = upstream,
                        downstream = downstream, 
                        type = type,
                        TxDb = TxDb,
                        by = by,
                        weightCol = weightCol, 
                        nbin = nbin,
                        verbose = verbose,
                        ignore_strand = ignore_strand)
  } else {
    tagMatrix <- getTagMatrix(peak = peak, 
                              upstream = upstream,
                              downstream = downstream, 
                              type = type,
                              by = by,
                              TxDb = TxDb,
                              weightCol = weightCol, 
                              nbin = nbin,
                              verbose = verbose,
                              ignore_strand = ignore_strand)
  }
  
  
  if (!(missingArg(conf) || is.na(conf))){
    p <- plotPeakProf(tagMatrix = tagMatrix,
                      conf = conf,
                      xlab = xlab,
                      ylab = ylab,
                      facet = facet, 
                      free_y = free_y,
                      ...)
    
  } else {
    p <- plotPeakProf(tagMatrix = tagMatrix,
                      xlab = xlab,
                      ylab = ylab,
                      facet= facet, 
                      free_y = free_y,
                      ...)
  }
  return(p)
  
}


##' plot the profile of peaks in two or more windows
##'
##'
##' This function comes from \url{https://github.com/YuLab-SMU/ChIPseeker/issues/189}
##'`
##' \code{plotPeakProf_MultiWindows()} is almost the same as \code{plotPeakProf2()}, having
##' the main difference of accepting two or more granges objects. Accepting more
##' granges objects can help compare the same peaks in different windows.
##' 
##' \code{TxDb} parameter can accept txdb object.
##' But many regions can not be obtained by txdb object. In this case,
##' Users can provide self-made granges served the same role 
##' as txdb object and pass to \code{TxDb} object.
##' 
##' \code{by} the features of interest. 
##' 
##' (1) if users use \code{txdb}, \code{by} can be one of 'gene', 'transcript', 'exon', 
##' 'intron' , '3UTR' , '5UTR', 'UTR'. These features can be obtained by functions from txdb object.
##' 
##' (2) if users use self-made granges object, \code{by} can be everything. Because this \code{by}
##' will not pass to functions to get features, which is different from the case of using 
##' txdb object. This \code{by} is only used to made labels showed in picture.
##' 
##' \code{type} means the property of the region. one of the "start site",
##' "end site" and "body".
##' 
##' \code{upstream} and \code{downstream} parameter have different usages:
##' 
##' (1) if \code{type == 'body'}, \code{upstream} and \code{downstream} can use to extend 
##' the flank of body region.
##' 
##' (2) if \code{type == 'start_site'/'end_site'}, \code{upstream} and \code{downstream} refer to
##' the upstream and downstream of the start_site or the end_site.
##' 
##' \code{weightCol} refers to column in peak file. This column acts as a weight value. Details
##' see \url{https://github.com/YuLab-SMU/ChIPseeker/issues/15}
##' 
##' \code{nbin} refers to the number of bins. \code{getTagMatrix()} provide a binning method
##' to get the tag matrix.
##' 
##' There are two ways input a list of window.
##' 
##' (1) Users can input a list of self-made granges objects
##' 
##' (2) Users can input a list of \code{by} and only one \code{type}. In this way, 
##' \code{plotPeakProf_MultiWindows()} can made a list of window from txdb object based on \code{by} and \code{type}.
##' 
##' Warning: 
##' 
##' (1) All of these window should be the same type. It means users can only
##' compare a list of "start site"/"end site"/"body region" with the same upstream
##' and downstream.
##' 
##' (2) So it will be only one \code{type} and several \code{by}.
##' 
##' (3) Users can make window by txdb object or self-made granges object. Users can only
##' choose one of 'gene', 'transcript', 'exon', 'intron' , '3UTR' , '5UTR' or 'UTR' in the
##' way of using txdb object. User can input any \code{by} in the way of using 
##' self-made granges object.
##' 
##' (4) Users can mingle the \code{by} designed for the two ways. \code{plotPeakProf_MultiWindows} can
##' accpet the hybrid \code{by}. But the above rules should be followed.
##' 
##'
##' @title plotPeakProf_MultiWindows
##' @param peak peak file or GRanges object
##' @param weightCol column name of weight
##' @param TxDb TxDb object or self-made granges objects
##' @param upstream upstream position
##' @param downstream downstream position
##' @param by feature of interest
##' @param type one of "start_site", "end_site", "body"
##' @param windows_name the name for each window, which will also be showed in the picture as labels
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param verbose print message or not
##' @param nbin the amount of bines 
##' @param ignore_strand ignore the strand information or not
##' @param ... additional parameter
##' @return ggplot object
plotPeakProf_MultiWindows <- function(peak,
                                      upstream,
                                      downstream,
                                      conf,
                                      by,
                                      type,
                                      windows_name = NULL,
                                      weightCol = NULL,
                                      TxDb = NULL,
                                      xlab = "Genomic Region (5'->3')",
                                      ylab = "Peak Count Frequency",
                                      facet = "row",
                                      free_y = TRUE,
                                      verbose = TRUE,
                                      nbin = NULL,
                                      ignore_strand = FALSE,
                                      ...){
  
  conf <- if(missingArg(conf)) NA else conf
  upstream <- if(missingArg(upstream)) NULL else upstream
  downstream <- if(missingArg(downstream)) NULL else downstream
  
  ## check type
  if(length(type) != 1){
    stop("It should be only one type...")
  }
  
  ## make the window name
  if (is.null(windows_name)) {
    nn <- by
    warning("set the name automatically to ", paste(nn, collapse=' '))
    windows_name <- nn
  }else{
    if (length(windows_name) != length(by)) {
      stop("the length of the window name and the by should be equal...")
    }
  }
  
  
  if ( is(peak, "list") ) {
    tagMatrix <- lapply(peak, getTagMatrix2,
                        upstream=upstream,
                        downstream=downstream,
                        windows_name=windows_name,
                        type=type,
                        by=by,
                        TxDb=TxDb,
                        weightCol = weightCol, 
                        nbin = nbin,
                        verbose = verbose,
                        ignore_strand= ignore_strand)
  } else {
    tagMatrix <- getTagMatrix2(peak=peak, 
                               upstream=upstream,
                               downstream=downstream,
                               windows_name=windows_name,
                               type=type,
                               by=by,
                               TxDb=TxDb,
                               weightCol = weightCol, 
                               nbin = nbin,
                               verbose = verbose,
                               ignore_strand= ignore_strand)
  }
  
  if (!(missingArg(conf) || is.na(conf))){
    p <- plotMultiProf(tagMatrix = tagMatrix,
                       conf = conf,
                       xlab = xlab,
                       ylab = ylab,
                       facet = facet, 
                       free_y = free_y,
                       ...)
    
  } else {
    p <- plotMultiProf(tagMatrix = tagMatrix,
                       xlab = xlab,
                       ylab = ylab,
                       facet= facet, 
                       free_y = free_y,
                       ...)
  }
  return(p)
  
}


##' internal function for plotPeakProf_MultiWindows
##' 
##' @param tagMatrix tagMatrix
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param ... additional parameter
plotMultiProf <- function(tagMatrix,
                          conf,
                          xlab="Genomic Region (5'->3')",
                          ylab = "Peak Count Frequency",
                          facet="none", 
                          free_y = TRUE,
                          ...){
  
  
  if(is(tagMatrix[[1]][[1]],"matrix")){
    upstream <- attr(tagMatrix[[1]][[1]], 'upstream')
    downstream <- attr(tagMatrix[[1]][[1]], 'downstream')
    # attr(tagMatrix, 'type') <- attr(tagMatrix[[1]][[1]], 'type')
    # attr(tagMatrix, 'is.binning') <- attr(tagMatrix[[1]][[1]], 'is.binning')
    binFlag <- attr(tagMatrix[[1]][[1]], 'is.binning')
    type <- attr(tagMatrix[[1]][[1]], 'type')
    
  }else{
    upstream <- attr(tagMatrix[[1]], 'upstream')
    downstream <- attr(tagMatrix[[1]], 'downstream')
    binFlag <- attr(tagMatrix[[1]], 'is.binning')
    type <- attr(tagMatrix[[1]], 'type')
  }
  
  if(type == "body"){
    
    label <- c("SS","TS")
    
  }else if(type == "start_site"){
    
    label <- "SS"
    
  }else{
    
    label <- "TS"
    
  }
  
  
  if(binFlag){
    
    if (!(missingArg(conf) || is.na(conf))){
      
      plotMultiProf.binning(tagMatrix = tagMatrix, 
                            xlab = xlab,
                            ylab = ylab,
                            conf = conf,
                            facet = facet, 
                            free_y = free_y,
                            upstream = upstream,
                            downstream = downstream,
                            label = label,
                            ...)
      
    }else{
      
      plotMultiProf.binning(tagMatrix = tagMatrix, 
                            xlab = xlab,
                            ylab = ylab,
                            facet = facet, 
                            free_y = free_y,
                            upstream = upstream,
                            downstream = downstream,
                            label = label,
                            ...)
    }
    
    
  }else{
    
    xlim <- c(-upstream, downstream)
    
    if (!(missingArg(conf) || is.na(conf))){
      
      plotMultiProf.normal(tagMatrix = tagMatrix, 
                           xlim = xlim,
                           xlab = xlab,
                           ylab = ylab,
                           conf = conf,
                           facet = facet, 
                           free_y = free_y,
                           origin_label = label,
                           ...)
      
    }else{
      
      plotMultiProf.normal(tagMatrix = tagMatrix, 
                           xlim = xlim,
                           xlab = xlab,
                           ylab = ylab,
                           facet = facet, 
                           free_y = free_y,
                           origin_label = label,
                           ...)
    }
  }
  
}

##' internal function
##' 
##' @param tagMatrix tagMatrix
##' @param xlim xlim
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param origin_label the label of the center
##' @param verbose print message or not
##' @param ... additional parameter
plotMultiProf.normal <- function(tagMatrix, xlim,
                                 xlab="Genomic Region (5'->3')",
                                 ylab = "Peak Count Frequency",
                                 conf,
                                 facet="none", 
                                 free_y = TRUE, 
                                 origin_label = "TSS",
                                 verbose = TRUE,
                                 ...) {
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  
  if (verbose) {
    cat(">> plotting figure...\t\t\t",
        format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  
  conf <- if(missingArg(conf)) NA else conf
  
  if (!(missingArg(conf) || is.na(conf))){
    
    p <- plotMultiProf.normal.internal(tagMatrix = tagMatrix, 
                                       conf = conf, 
                                       xlim = xlim,
                                       xlab = xlab, 
                                       ylab = ylab,
                                       facet = facet, 
                                       free_y = free_y, 
                                       origin_label = origin_label,
                                       ...)
    
    
  } else {
    
    p <- plotMultiProf.normal.internal(tagMatrix, 
                                       xlim = xlim,
                                       xlab = xlab, 
                                       ylab = ylab,
                                       facet = facet, 
                                       free_y = free_y, 
                                       origin_label = origin_label,
                                       ...)
  }
  return(p)
}

##' internal function
##' 
##' 
##' @param tagMatrix tagMatrix
##' @param xlim xlim
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param origin_label the label of the center
##' @param ... additional parameter
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 geom_line
##' @importFrom ggplot2 geom_vline
##' @importFrom ggplot2 geom_ribbon
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_color_manual
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 element_blank
##' @importFrom ggplot2 facet_grid
plotMultiProf.normal.internal <- function(tagMatrix, conf,
                                          xlim = c(-3000,3000),
                                          xlab = "Genomic Region (5'->3')",
                                          ylab = "Peak Count Frequency",
                                          facet="row", 
                                          free_y = TRUE,
                                          origin_label, 
                                          ...) {
  
  listFlag <- FALSE
  if (is.null(attr(tagMatrix[[1]],'upstream'))) {
    if ( is.null(names(tagMatrix)) ) {
      nn <- paste0("peak", seq_along(tagMatrix))
      warning("input is not a named list, set the name automatically to ", paste(nn, collapse=' '))
      names(tagMatrix) <- nn
      ## stop("tagMatrix should be a named list...")
    }
    listFlag <- TRUE
  }
  
  if ( listFlag ) {
    facet <- match.arg(facet, c("none", "row", "column"))
    if ( (xlim[2]-xlim[1]+1) != ncol(tagMatrix[[1]][[1]]) ) {
      stop("please specify appropreate xcoordinations...")
    }
  } else {
    if ( (xlim[2]-xlim[1]+1) != ncol(tagMatrix[[1]]) ) {
      stop("please specify appropreate xcoordinations...")
    }
  }
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  ##
  conf <- if(missingArg(conf)) NA else conf
  
  pos <- value <- .id <- Lower <- Upper <- NULL
  
  if ( listFlag ) {
    
    tagCount <- lapply(as.list(names(tagMatrix)), function(x){
      
      tmp <- tagMatrix[[x]]
      tagCount_tmp <- lapply(as.list(names(tmp)),function(x){
        result <- getTagCount(tmp[[x]], xlim = xlim, conf = conf, ...)
        result$type <- x
        
        return(result)
      })
      tagCount_tmp <- list_to_dataframe(tagCount_tmp)
      return(tagCount_tmp)
      
    })
    
    names(tagCount) <- names(tagMatrix)
    tagCount <- list_to_dataframe(tagCount)
    tagCount$.id <- factor(tagCount$.id, levels=names(tagMatrix))
    p <- ggplot(tagCount, aes(pos, group=type, color=type))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper, fill = type),
                           linetype = 0, alpha = 0.2)
    }
    
  } else {
    
    tagCount <- lapply(as.list(names(tagMatrix)), function(x){
      
      result <- getTagCount(tagMatrix[[x]], xlim = xlim, conf = conf, ...)
      result$type <- x
      
      return(result)
    })
    
    tagCount <- do.call("rbind",tagCount)
    
    p <- ggplot(tagCount, aes(x = pos))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper,fill = type),
                           linetype = 0, alpha = 0.2)
    }
  }
  
  p <- p + geom_line(aes(y = value,color = type))
  
  if ( 0 > xlim[1] && 0 < xlim[2] ) {
    p <- p + geom_vline(xintercept=0,
                        linetype="longdash")
    p <- p + scale_x_continuous(breaks=c(xlim[1], floor(xlim[1]/2),
                                         0,
                                         floor(xlim[2]/2), xlim[2]),
                                labels=c(paste0(xlim[1],"bp"), paste0(floor(xlim[1]/2),"bp"),
                                         origin_label, 
                                         paste0(floor(xlim[2]/2),"bp"), paste0(xlim[2], "bp")))
  }
  
  if (listFlag) {
    # cols <- getCols(length(tagMatrix[[1]]))
    # p <- p + scale_color_manual(values=cols)
    if (facet == "row") {
      if (free_y) {
        p <- p + facet_grid(.id ~ ., scales = "free_y")
      } else {
        p <- p + facet_grid(.id ~ .)
      }
    } else if (facet == "column") {
      if (free_y) {
        p <-  p + facet_grid(. ~ .id, scales = "free_y")
      } else {
        p <-  p + facet_grid(. ~ .id)
      }
    }
  }
  
  p <- p+xlab(xlab)+ylab(ylab)
  p <- p + theme_bw() + theme(legend.title=element_blank())

  # if(facet != "none") {
  #   p <- p + theme(legend.position="none")
  # }
  
  return(p)
}

##' internal function
##' 
##' @param tagMatrix tagMatrix
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param upstream the upstream extension
##' @param downstream the downstream extension
##' @param label the label of the center
##' @param ... additional parameter
plotMultiProf.binning <- function(tagMatrix, 
                                  xlab = "Genomic Region (5'->3')",
                                  ylab = "Peak Count Frequency",
                                  conf,
                                  facet ="none", 
                                  free_y = TRUE,
                                  upstream = NULL,
                                  downstream = NULL,
                                  label,
                                  ...) {
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  conf <- if(missingArg(conf)) NA else conf
  
  if (!(missingArg(conf) || is.na(conf))){
    p <- plotMultiProf.binning.internal(tagMatrix , 
                                        conf = conf, 
                                        xlab = xlab, 
                                        ylab = ylab,
                                        facet = facet, 
                                        free_y = free_y,
                                        upstream = upstream,
                                        downstream = downstream,
                                        label = label,
                                        ...)
  } else {
    p <- plotMultiProf.binning.internal(tagMatrix , 
                                        xlab = xlab, 
                                        ylab = ylab,
                                        facet = facet, 
                                        free_y = free_y, 
                                        upstream = upstream,
                                        downstream = downstream,
                                        label = label,
                                        ...)
  }
  return(p)
}

##' internal function
##' 
##' @param tagMatrix tagMatrix
##' @param xlab xlab
##' @param ylab ylab
##' @param conf confidence interval
##' @param facet one of 'none', 'row' and 'column'
##' @param free_y if TRUE, y will be scaled by AvgProf
##' @param upstream the upstream extension
##' @param downstream the downstream extension
##' @param label the label of the center
##' @param ... additional parameter
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 geom_line
##' @importFrom ggplot2 geom_vline
##' @importFrom ggplot2 geom_ribbon
##' @importFrom ggplot2 scale_x_continuous
##' @importFrom ggplot2 scale_color_manual
##' @importFrom ggplot2 xlab
##' @importFrom ggplot2 ylab
##' @importFrom ggplot2 theme_bw
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 element_blank
##' @importFrom ggplot2 facet_grid
##' @importFrom ggplot2 rel
plotMultiProf.binning.internal <- function(tagMatrix, 
                                           conf,
                                           xlab = "Genomic Region (5'->3')",
                                           ylab = "Peak Count Frequency",
                                           facet="none", 
                                           free_y = TRUE,
                                           upstream = NULL,
                                           downstream = NULL,
                                           label,
                                           ...) {
  
  listFlag <- FALSE
  if (is(tagMatrix[[1]][[1]],"matrix")) {
    if ( is.null(names(tagMatrix)) ) {
      nn <- paste0("peak", seq_along(tagMatrix))
      warning("input is not a named list, set the name automatically to ", paste(nn, collapse=' '))
      names(tagMatrix) <- nn
      ## stop("tagMatrix should be a named list...")
    }
    listFlag <- TRUE
  }
  
  if(listFlag){
    nbin <- dim(tagMatrix[[1]][[1]])[2]
    type <- attr(tagMatrix[[1]][[1]], 'type')
  }else{
    nbin <- dim(tagMatrix[[1]])[2]
    type <- attr(tagMatrix[[1]], 'type')
  }
  xlim <- c(1,nbin)
  
  if ( listFlag ) {
    facet <- match.arg(facet, c("none", "row", "column"))
  }
  
  ## S4Vectors change the behavior of ifelse
  ## see https://support.bioconductor.org/p/70871/
  ##
  ## conf <- ifelse(missingArg(conf), NA, conf)
  ##
  conf <- if(missingArg(conf)) NA else conf
  
  pos <- value <- .id <- Lower <- Upper <- NULL
  
  if ( listFlag ) {
    
    tagCount <- lapply(as.list(names(tagMatrix)), function(x){
      
      tmp <- tagMatrix[[x]]
      tagCount_tmp <- lapply(as.list(names(tmp)),function(x){
        result <- getTagCount(tmp[[x]], xlim = xlim, conf = conf, ...)
        result$type <- x
        
        return(result)
      })
      tagCount_tmp <- list_to_dataframe(tagCount_tmp)
      return(tagCount_tmp)
      
    })
    
    names(tagCount) <- names(tagMatrix)
    tagCount <- list_to_dataframe(tagCount)
    tagCount$.id <- factor(tagCount$.id, levels=names(tagMatrix))
    p <- ggplot(tagCount, aes(pos, group=type, color=type))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper, fill = type),
                           linetype = 0, alpha = 0.2)
    }
    
  } else {
    
    tagCount <- lapply(as.list(names(tagMatrix)), function(x){
      
      result <- getTagCount(tagMatrix[[x]], xlim = xlim, conf = conf, ...)
      result$type <- x
      
      return(result)
    })
    
    tagCount <- do.call("rbind",tagCount)
    
    p <- ggplot(tagCount, aes(pos,group=type,color=type))
    if (!(is.na(conf))) {
      p <- p + geom_ribbon(aes(ymin = Lower, ymax = Upper,fill = type),
                           linetype = 0, alpha = 0.2)
    }
  }
  
  p <- p + geom_line(aes(y = value,color = type))
  
  ## x_scale for genebody
  if(type == 'body'){
    ## x_scale for gene body with no flank extension
    if(is.null(upstream)){
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*0.25),
                                           floor(nbin*0.5),
                                           floor(nbin*0.75),
                                           nbin),
                                  labels=c(label[1], 
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2]))
    }
    
    
    ## x_scale for flank extension by relative value
    if(inherits(upstream, 'rel')){
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(as.numeric(upstream)*100/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+25)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+50)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+75)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+100)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           nbin),
                                  labels=c(paste0("-",as.numeric(upstream)*100,"%"), 
                                           label[1],
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2],
                                           paste0("+",as.numeric(downstream)*100,"%")))
      p <- p + geom_vline(xintercept=floor(nbin*(as.numeric(upstream)*100/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                          linetype="longdash")
      
      p <- p + geom_vline(xintercept=floor(nbin*((as.numeric(upstream)*100+100)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                          linetype="longdash")
    }
    
    ## x_scale for flank extension by absolute value
    if(!is.null(upstream) & !inherits(upstream, 'rel')){
      
      upstreamPer <- floor(upstream/1000)*0.1
      downstreamPer <- floor(downstream/1000)*0.1
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(upstreamPer/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.25)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.5)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.75)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+1)/(1+upstreamPer+downstreamPer))),
                                           nbin),
                                  labels=c(paste0("-",upstream,"bp"), 
                                           label[1],
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2],
                                           paste0(downstream,"bp")))
      p <- p + geom_vline(xintercept=floor(nbin*(upstreamPer/(1+upstreamPer+downstreamPer))),
                          linetype="longdash")
      
      p <- p + geom_vline(xintercept=floor(nbin*((upstreamPer+1)/(1+upstreamPer+downstreamPer))),
                          linetype="longdash")
    }
  }
  
  
  ## x_scale for start region
  if(type != 'body'){
    
    p <- p + scale_x_continuous(breaks=c(1, 
                                         floor(nbin*0.25),
                                         floor(nbin*0.5),
                                         floor(nbin*0.75),
                                         nbin),
                                labels=c(paste0("-",upstream,"bp"), 
                                         paste0("-",floor(upstream*0.5),"bp"),
                                         label,
                                         paste0(floor(downstream*0.5),"bp"),
                                         paste0(downstream,"bp")))
    
    p <- p + geom_vline(xintercept=floor(nbin*0.5),
                        linetype="longdash")
  }
  
  
  if (listFlag) {
    
    if (facet == "row") {
      if (free_y) {
        p <- p + facet_grid(.id ~ ., scales = "free_y")
      } else {
        p <- p + facet_grid(.id ~ .)
      }
    } else if (facet == "column") {
      if (free_y) {
        p <-  p + facet_grid(. ~ .id, scales = "free_y")
      } else {
        p <-  p + facet_grid(. ~ .id)
      }
    }
  }
  p <- p+xlab(xlab)+ylab(ylab)
  p <- p + theme_bw() + theme(legend.title=element_blank())
  # if(facet != "none") {
  #   p <- p + theme(legend.position="none")
  # }
  return(p)
}


##' plot the heatmap of tagMatrix
##'
##'
##' @title tagHeatmap
##' @param tagMatrix tagMatrix or a list of tagMatrix
##' @param xlab xlab
##' @param ylab ylab
##' @param title title
##' @param palette palette to be filled in,details see \link[ggplot2]{scale_colour_brewer}
##' @param nrow the nrow of plotting a list of peak
##' @param ncol the ncol of plotting a list of peak
##' @return figure
##' @export
##' @author G Yu
tagHeatmap <- function(tagMatrix, 
                       xlab="", 
                       ylab="", 
                       title=NULL, 
                       palette="RdBu",
                       nrow = NULL,
                       ncol = NULL) {
  listFlag <- FALSE
  if (is(tagMatrix, "list")) {
    listFlag <- TRUE
  }
  peakHeatmap.internal2(tagMatrix = tagMatrix, 
                        listFlag = listFlag, 
                        palette = palette, 
                        xlab = xlab, 
                        ylab = ylab, 
                        title = title,
                        ncol = ncol,
                        nrow = nrow)
}

##' plot the heatmap of peaks 
##'
##'
##' @title peakHeatmap
##' @param peak peak file or GRanges object
##' @param weightCol column name of weight
##' @param TxDb TxDb object
##' @param upstream upstream position
##' @param downstream downstream position
##' @param xlab xlab
##' @param ylab ylab
##' @param title title
##' @param palette palette to be filled in,details see \link[ggplot2]{scale_colour_brewer}
##' @param verbose print message or not
##' @param by one of 'gene', 'transcript', 'exon', 'intron' , '3UTR' , '5UTR', 'UTR'
##' @param type one of "start_site", "end_site", "body"
##' @param nbin the amount of nbines 
##' @param ignore_strand ignore the strand information or not
##' @param windows a collection of region
##' @param nrow the nrow of plotting a list of peak
##' @param ncol the ncol of plotting a list of peak
##' @return figure
##' @export
##' @author G Yu
peakHeatmap <- function(peak, weightCol=NULL, TxDb=NULL,
                        upstream=1000, downstream=1000,
                        xlab="", ylab="", title=NULL,
                        palette=NULL, verbose=TRUE,
                        by="gene", type="start_site",
                        nbin = NULL,ignore_strand = FALSE,
                        windows,ncol = NULL, nrow = NULL) {
  listFlag <- FALSE
  if ( is(peak, "list") ) {
    listFlag <- TRUE
    if (is.null(names(peak)))
      stop("peak should be a peak file or a name list of peak files...")
  }
  
  if (verbose) {
    cat(">> preparing promoter regions...\t",
        format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  
  if (verbose) {
    cat(">> preparing tag matrix...\t\t",
        format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  
  if(missing(windows)){
    windows <- getBioRegion(TxDb=TxDb,
                            upstream=upstream,
                            downstream=downstream,
                            by=by,
                            type=type)
  }
  
  
  if (listFlag) {
    tagMatrix <- lapply(peak, getTagMatrix, 
                        weightCol=weightCol, 
                        windows = windows,
                        upstream=upstream,
                        downstream=downstream,
                        TxDb = TxDb,
                        nbin = nbin,
                        verbose = verbose,
                        ignore_strand= ignore_strand)
    
    names(tagMatrix) <- names(peak)
    
  } else {
    tagMatrix <- getTagMatrix(peak, 
                              weightCol=weightCol, 
                              windows = windows,
                              TxDb = TxDb,
                              upstream=upstream,
                              downstream=downstream,
                              nbin = nbin,
                              verbose = verbose,
                              ignore_strand= ignore_strand)
  }
  
  if (verbose) {
    cat(">> generating figure...\t\t",
        format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  
  xlim <- NULL
  
  p <- peakHeatmap.internal2(tagMatrix = tagMatrix,
                             listFlag = listFlag, 
                             palette = palette, 
                             xlab = xlab,
                             ylab = ylab, 
                             title = title,
                             nrow = nrow,
                             ncol = ncol)
  
  if (verbose) {
    cat(">> done...\t\t\t",
        format(Sys.time(), "%Y-%m-%d %X"), "\n")
  }
  invisible(tagMatrix)
  p
}

##' @importFrom aplot plot_list
peakHeatmap.internal2 <- function(tagMatrix, 
                                  listFlag, 
                                  palette, 
                                  xlab, 
                                  ylab, 
                                  title,
                                  nrow,
                                  ncol) {
  if ( is.null(xlab) || is.na(xlab))
    xlab <- ""
  if ( is.null(ylab) || is.na(ylab))
    ylab <- ""
  
  if (listFlag) {
    nc <- length(tagMatrix)
    if ( is.null(palette) || is.na(palette) ) {
      palette <- getPalette(nc)
    } else if (length(palette) != nc) {
      palette <- rep(palette[1], nc)
    } else {
      palette <- palette
    }
    
    if (is.null(title) || is.na(title))
      title <- names(tagMatrix)
    if (length(xlab) != nc) {
      xlab <- rep(xlab[1], nc)
    }
    if (length(ylab) != nc) {
      ylab <- rep(ylab[1], nc)
    }
    if (length(title) != nc) {
      title <- rep(title[1], nc)
    }
    
    tmp <- list()
    
    for (i in 1:nc) {
      
      p <- peakHeatmap.internal(tagMatrix = tagMatrix[[i]], 
                                palette = palette[i], 
                                xlab = xlab[i], 
                                ylab = ylab[i], 
                                title= title[i])
      
      p <- p + theme(plot.title = element_text(hjust = 0.5))
      
      tmp[[i]] <- p
    }
    
    if(is.null(nrow) && is.null(ncol))
      nrow <- 1
    
    p <- plot_list(gglist = tmp,
                   ncol = ncol,
                   nrow = nrow)
    return(p)
    
  } else {
    if (is.null(palette) || is.na(palette))
      palette <- "RdBu"
    if (is.null(title) || is.na(title))
      title <- ""
    peakHeatmap.internal(tagMatrix = tagMatrix, 
                         palette = palette, 
                         xlab = xlab, 
                         ylab = ylab, 
                         title = title)
  }
}


##' @import BiocGenerics
##' @importFrom yulab.utils mat2df
##' @importFrom ggplot2 ggplot
##' @importFrom ggplot2 aes
##' @importFrom ggplot2 geom_tile
##' @importFrom ggplot2 scale_fill_distiller
##' @importFrom ggplot2 theme
##' @importFrom ggplot2 element_blank
##' @importFrom ggplot2 labs
##' @importFrom ggplot2 scale_x_continuous
peakHeatmap.internal <- function(tagMatrix, 
                                 palette="RdBu", 
                                 xlab="", 
                                 ylab="",
                                 title="") {
  
  upstream <- attr(tagMatrix, "upstream")
  downstream <- attr(tagMatrix, "downstream")
  binning_Flag <- attr(tagMatrix,"is.binning")
  type <- attr(tagMatrix,"type")
  
  body_Flag <- FALSE
  if(type == "body"){
    body_Flag <- TRUE
    label <- attr(tagMatrix,"label")
  }
  
  if(binning_Flag){
    nbin <- dim(tagMatrix)[2]
  }
  
  tagMatrix <- t(apply(tagMatrix, 1, function(x) x/max(x)))
  ii <- order(rowSums(tagMatrix))
  tagMatrix <- tagMatrix[ii,]
  
  colnames(tagMatrix) <- seq_len(dim(tagMatrix)[2])
  rownames(tagMatrix) <- seq_len(dim(tagMatrix)[1])
  
  tagMatrix <- mat2df(tagMatrix)
  colnames(tagMatrix) <- c("values","sample_ID","coordinate")

  sample_ID <- coordinate <- NULL
  
  p <- ggplot(tagMatrix, aes(x = coordinate,y = sample_ID)) + 
    geom_tile(aes(fill = values)) +
    scale_fill_distiller(palette = palette)  +
    theme(axis.text.y=element_blank(),
          axis.ticks.y=element_blank(),
          axis.line.y = element_blank(),
          panel.grid=element_blank(),
          panel.background = element_blank()) +
    labs(x = xlab, y = ylab, title = title)

  if(body_Flag){
    
    if(inherits(upstream, 'rel')){
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(as.numeric(upstream)*100/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+25)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+50)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+75)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           floor(nbin*((as.numeric(upstream)*100+100)/(100+(as.numeric(upstream)+as.numeric(downstream))*100))),
                                           nbin),
                                  labels=c(paste0("-",as.numeric(upstream)*100,"%"), 
                                           label[1],
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2],
                                           paste0("+",as.numeric(downstream)*100,"%")))
    }
    
    if(is.null(upstream)){
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*0.25),
                                           floor(nbin*0.5),
                                           floor(nbin*0.75),
                                           nbin),
                                  labels=c(label[1], 
                                           "25%",
                                           "50%",
                                           "75%",
                                           label[2]))
    }
    
    if(!is.null(upstream) && !inherits(upstream, 'rel')){
      
      upstreamPer <- floor(upstream/1000)*0.1
      downstreamPer <- floor(downstream/1000)*0.1
      
      p <- p + scale_x_continuous(breaks=c(1, 
                                           floor(nbin*(upstreamPer/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.25)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.5)/(1+upstreamPer+downstreamPer))),
                                           floor(nbin*((upstreamPer+0.75)/(1+upstreamPer

Download .txt

gitextract_66qccjqf/

├── .Rbuildignore
├── .github/
│   └── issue_template.md
├── .gitignore
├── .svnignore
├── .travis.yml
├── CONDUCT.md
├── DESCRIPTION
├── GEODATA
├── Makefile
├── NAMESPACE
├── NEWS
├── NEWS.md
├── R/
│   ├── AllGenerics.R
│   ├── ChIPseeker-package.R
│   ├── GEO.R
│   ├── addGeneAnno.R
│   ├── annotatePeak.R
│   ├── covplot.R
│   ├── csAnno.R
│   ├── dplyr-verb.R
│   ├── enrichOverlap.R
│   ├── getFlankingGene.R
│   ├── getGenomicAnnotation.R
│   ├── getNearestFeatureIndicesAndDistances.R
│   ├── plotAnno.R
│   ├── plotDistToTSS.R
│   ├── plotTagMatrix.R
│   ├── readPeakFile.R
│   ├── seq2gene.R
│   ├── subset.R
│   ├── tagMatrix.R
│   ├── upsetplot.R
│   ├── utilities.R
│   ├── vennpie.R
│   ├── vennplot.R
│   └── zzz.R
├── README.Rmd
├── README.md
├── appveyor.yml
├── data/
│   ├── gsminfo.rda
│   ├── tagMatrixList.rda
│   └── ucsc_release.rda
├── inst/
│   ├── CITATION
│   ├── extdata/
│   │   ├── processedGSM.rda
│   │   └── sample_peaks.txt
│   └── test-plot/
│       ├── test-plotPeakProf.R
│       └── test-plotTagMatrix.R
├── man/
│   ├── ChIPseeker-package.Rd
│   ├── ChIPseekerCache.Rd
│   ├── annotatePeak.Rd
│   ├── as.GRanges.Rd
│   ├── as.data.frame.csAnno.Rd
│   ├── check_upstream_and_downstream.Rd
│   ├── combine_csAnno.Rd
│   ├── covplot.Rd
│   ├── csAnno-class.Rd
│   ├── dot-ChIPseekerEnv.Rd
│   ├── dotFun.Rd
│   ├── downloadGEObedFiles.Rd
│   ├── downloadGSMbedFiles.Rd
│   ├── dropAnno.Rd
│   ├── enrichAnnoOverlap.Rd
│   ├── enrichPeakOverlap.Rd
│   ├── getAnnoStat.Rd
│   ├── getBioRegion.Rd
│   ├── getGEOInfo.Rd
│   ├── getGEOgenomeVersion.Rd
│   ├── getGEOspecies.Rd
│   ├── getGeneAnno.Rd
│   ├── getGenomicAnnotation.Rd
│   ├── getNearestFeatureIndicesAndDistances.Rd
│   ├── getPromoters.Rd
│   ├── getSampleFiles.Rd
│   ├── getTagMatrix.Rd
│   ├── getTagMatrix.binning.internal.Rd
│   ├── getTagMatrix.internal.Rd
│   ├── getTagMatrix2.Rd
│   ├── getTagMatrix2.binning.internal.Rd
│   ├── getTagMatrix2.internal.Rd
│   ├── info.Rd
│   ├── makeBioRegionFromGranges.Rd
│   ├── make_label.Rd
│   ├── overlap.Rd
│   ├── peakHeatmap.Rd
│   ├── peakHeatmap_multiple_Sets.Rd
│   ├── peak_Profile_Heatmap.Rd
│   ├── plotAnnoBar-methods.Rd
│   ├── plotAnnoBar.Rd
│   ├── plotAnnoPie-methods.Rd
│   ├── plotAnnoPie.Rd
│   ├── plotAvgProf.Rd
│   ├── plotAvgProf.binning.Rd
│   ├── plotAvgProf2.Rd
│   ├── plotDistToTSS-methods.Rd
│   ├── plotDistToTSS.data.frame.Rd
│   ├── plotMultiProf.Rd
│   ├── plotMultiProf.binning.Rd
│   ├── plotMultiProf.binning.internal.Rd
│   ├── plotMultiProf.normal.Rd
│   ├── plotMultiProf.normal.internal.Rd
│   ├── plotPeakProf.Rd
│   ├── plotPeakProf2.Rd
│   ├── plotPeakProf_MultiWindows.Rd
│   ├── readPeakFile.Rd
│   ├── reexports.Rd
│   ├── seq2gene.Rd
│   ├── show-methods.Rd
│   ├── shuffle.Rd
│   ├── tagHeatmap.Rd
│   ├── upsetplot-methods.Rd
│   ├── vennpie-methods.Rd
│   ├── vennplot.Rd
│   └── vennplot.peakfile.Rd
├── tests/
│   ├── testthat/
│   │   ├── test-bed.R
│   │   ├── test-getTagMatrix.R
│   │   └── test-txdb.R
│   └── testthat.R
└── vignettes/
    ├── ChIPseeker.Rmd
    └── ChIPseeker.bib

Download .json

Condensed preview — 119 files, each showing path, character count, and a content snippet. Download the .json file or copy for the full structured content (470K chars).

[
  {
    "path": ".Rbuildignore",
    "chars": 123,
    "preview": ".travis.yml\n.svnignore\n.gitignore\n^.*\\.DS_Store\nMakefile\nREADME.Rmd\nappveyor.yml\nGEODATA\ndocs\nmkdocs\n.github\n^CONDUCT\\.m"
  },
  {
    "path": ".github/issue_template.md",
    "chars": 855,
    "preview": "### Prerequisites\n\n+ [ ] Have you read [Feedback](https://guangchuangyu.github.io/chipseeker/#feedback) and follow the ["
  },
  {
    "path": ".gitignore",
    "chars": 157,
    "preview": ".DS_Store\ndata/.DS_Store\ninst/.DS_Store\ninst/extdata/.DS_Store\ninst/extdata/GEO_sample_data/.DS_Store\n.svn\n*~\ndocs/__ini"
  },
  {
    "path": ".svnignore",
    "chars": 61,
    "preview": ".git\n*.Rhistory\n.travis.yml\nappveyor.yml\ndocs\nmkdocs\n.github\n"
  },
  {
    "path": ".travis.yml",
    "chars": 729,
    "preview": "## reference: http://docs.travis-ci.com/user/languages/r/\n\nlanguage: r\nr: bioc-devel\n  \ncache: packages\nbioc_required: t"
  },
  {
    "path": "CONDUCT.md",
    "chars": 1387,
    "preview": "# Contributor Code of Conduct\n\nAs contributors and maintainers of this project, we pledge to respect all people who \ncon"
  },
  {
    "path": "DESCRIPTION",
    "chars": 2998,
    "preview": "Package: ChIPseeker\nType: Package\nTitle: ChIPseeker for ChIP peak Annotation, Comparison, and Visualization\nVersion: 1.4"
  },
  {
    "path": "GEODATA",
    "chars": 422,
    "preview": "UPDATE OF GEO DATA\n + 20947 bed file information in ChIPseeker (version >=1.9.8) <2016-09-20, Tue>\n + 19348 bed file inf"
  },
  {
    "path": "Makefile",
    "chars": 1301,
    "preview": "PKGNAME := $(shell sed -n \"s/Package: *\\([^ ]*\\)/\\1/p\" DESCRIPTION)\nPKGVERS := $(shell sed -n \"s/Version: *\\([^ ]*\\)/\\1/"
  },
  {
    "path": "NAMESPACE",
    "chars": 4458,
    "preview": "# Generated by roxygen2: do not edit by hand\n\nS3method(arrange,GRanges)\nS3method(as.data.frame,csAnno)\nS3method(filter,G"
  },
  {
    "path": "NEWS",
    "chars": 16215,
    "preview": "CHANGES IN VERSION 1.15.2\n------------------------\n o bug fixed for 'overlap = \"all\"' to consider strand information <20"
  },
  {
    "path": "NEWS.md",
    "chars": 6235,
    "preview": "# ChIPseeker 1.48.0\n\n+ Bioconductor RELEASE_3_23 (2026-04-29, Wed)\n\n# ChIPseeker 1.47.1\n\n+ fixed issue in 'test-txdb.R' "
  },
  {
    "path": "R/AllGenerics.R",
    "chars": 1204,
    "preview": "##' vennpie method generics\n##'\n##'\n##' @docType methods\n##' @name vennpie\n##' @rdname vennpie-methods\n##' @export\nsetGe"
  },
  {
    "path": "R/ChIPseeker-package.R",
    "chars": 390,
    "preview": "#' @keywords internal\n\"_PACKAGE\"\n\n\n\n##' Information Datasets\n##' \n##' ucsc genome version, precalcuated data and gsm inf"
  },
  {
    "path": "R/GEO.R",
    "chars": 10409,
    "preview": "########################################\n##                                    ##\n## data last update: Mar 03, 2015     "
  },
  {
    "path": "R/addGeneAnno.R",
    "chars": 1843,
    "preview": "##' get gene annotation, symbol, gene name etc.\n##'\n##'\n##' @title getGeneAnno\n##' @param annoDb annotation package\n##' "
  },
  {
    "path": "R/annotatePeak.R",
    "chars": 12271,
    "preview": "##' Annotate peaks\n##'\n##'\n##' @title annotatePeak\n##' @param peak peak file or GRanges object\n##' @param tssRegion Regi"
  },
  {
    "path": "R/covplot.R",
    "chars": 5949,
    "preview": "\n##' plot peak coverage\n##'\n##' \n##' @title covplot\n##' @param peak peak file or GRanges object\n##' @param weightCol wei"
  },
  {
    "path": "R/csAnno.R",
    "chars": 14415,
    "preview": "##' Class \"csAnno\"\r\n##' This class represents the output of ChIPseeker Annotation\r\n##'\r\n##'\r\n##' @name csAnno-class\r\n##'"
  },
  {
    "path": "R/dplyr-verb.R",
    "chars": 1870,
    "preview": "# extend filter to Peak (GRanges class object)\n#' @method filter GRanges\n#' @importFrom dplyr filter\n#' @export\nfilter.G"
  },
  {
    "path": "R/enrichOverlap.R",
    "chars": 9321,
    "preview": "##' calcuate overlap significant of ChIP experiments based on their nearest gene annotation\n##'\n##'\n##' @title enrichAnn"
  },
  {
    "path": "R/getFlankingGene.R",
    "chars": 2326,
    "preview": "\n##' @import IRanges\n##' @importFrom dplyr mutate\n##' @importFrom dplyr group_by\ngetAllFlankingGene <- function(peak.gr,"
  },
  {
    "path": "R/getGenomicAnnotation.R",
    "chars": 10504,
    "preview": "updateGenomicAnnotation <- function(peaks, genomicRegion, type, anno, sameStrand=FALSE) {\n    hits <- getGenomicAnnotati"
  },
  {
    "path": "R/getNearestFeatureIndicesAndDistances.R",
    "chars": 5690,
    "preview": "##' get index of features that closest to peak and calculate distance\n##'\n##'\n##' @title getNearestFeatureIndicesAndDist"
  },
  {
    "path": "R/plotAnno.R",
    "chars": 7572,
    "preview": "##' plot feature distribution based on their chromosome region\n##'\n##' plot chromosome region features\n##' @title plotAn"
  },
  {
    "path": "R/plotDistToTSS.R",
    "chars": 6987,
    "preview": "merge_two_si = function(x1, x2){\n  if (length(unique(gsub(\"^[0-9]+\",\"\",c(x1, x2)))) == 1){\n    return(paste0(gsub(\"[^0-9"
  },
  {
    "path": "R/plotTagMatrix.R",
    "chars": 95628,
    "preview": "##' plot the profile of peaks\n##'`\n##' \\code{plotPeakProf_MultiWindows()} is almost the same as \\code{plotPeakProf2()}, "
  },
  {
    "path": "R/readPeakFile.R",
    "chars": 2227,
    "preview": "##' read peak file and store in data.frame or GRanges object\n##'\n##' \n##' @title readPeakFile\n##' @param peakfile peak f"
  },
  {
    "path": "R/seq2gene.R",
    "chars": 3399,
    "preview": "##' annotate genomic regions to genes in many-to-many mapping\n##'\n##' This funciton associates genomic regions with codi"
  },
  {
    "path": "R/subset.R",
    "chars": 742,
    "preview": "##' @importFrom S4Vectors subset\r\n##' @importFrom BiocGenerics start\r\n##' @importFrom BiocGenerics end\r\n##' @method subs"
  },
  {
    "path": "R/tagMatrix.R",
    "chars": 41666,
    "preview": "##' prepare the promoter regions\n##'\n##'\n##' @title getPromoters\n##' @param TxDb TxDb\n##' @param upstream upstream from "
  },
  {
    "path": "R/upsetplot.R",
    "chars": 2257,
    "preview": "## @importFrom UpSetR upset\n## @importFrom grid viewport\n## @importFrom grid pushViewport\n## @importFrom grid popViewpor"
  },
  {
    "path": "R/utilities.R",
    "chars": 21042,
    "preview": "#' @title env function for ChIPseeker\n#' @param TxDb txdb object\n#' @param item item name\n#' @param force force to updat"
  },
  {
    "path": "R/vennpie.R",
    "chars": 2632,
    "preview": "##' @importFrom plotrix floating.pie\nvennpie.csAnno <- function(x, \n                           r = 0.2, \n               "
  },
  {
    "path": "R/vennplot.R",
    "chars": 3355,
    "preview": "##' plot the overlap of a list of object\n##'\n##'\n##' There are two ways to plot, which users can specify through `by`.\n#"
  },
  {
    "path": "R/zzz.R",
    "chars": 407,
    "preview": "##' @importFrom yulab.utils yulab_msg\n.onAttach <- function(libname, pkgname) {\n  packageStartupMessage(yulab_msg(pkgnam"
  },
  {
    "path": "README.Rmd",
    "chars": 3741,
    "preview": "---\noutput:\n  md_document:\n    variant: gfm\nhtml_preview: false\n---\n\n\n```{r echo=FALSE, results=\"hide\", message=FALSE}\n#"
  },
  {
    "path": "README.md",
    "chars": 4243,
    "preview": "# ChIPseeker: ChIP peak Annotation, Comparison, and Visualization\n\n<img src=\"https://raw.githubusercontent.com/Bioconduc"
  },
  {
    "path": "appveyor.yml",
    "chars": 1227,
    "preview": "environment:\n  matrix:\n    - R_VERSION: devel\n      R_ARCH: x64\n      USE_RTOOLS: true\n  _R_CHECK_FORCE_SUGGESTS_: false"
  },
  {
    "path": "inst/CITATION",
    "chars": 2151,
    "preview": "citHeader(\"Please cite Q. Wang (2022) or G. Yu (2015) for using ChIPseeker. In addition, please cite clusterProfiler/DOS"
  },
  {
    "path": "inst/extdata/sample_peaks.txt",
    "chars": 7602,
    "preview": "chr\tstart\tend\tlength\tsummit\ttags\tX.10.log10.pvalue.\tfold_enrichment\tFDR...\nchr10\t105137980\t105138593\t614\t174\t7\t52.8\t15.3"
  },
  {
    "path": "inst/test-plot/test-plotPeakProf.R",
    "chars": 3552,
    "preview": "library(ChIPseeker)\nlibrary(TxDb.Hsapiens.UCSC.hg19.knownGene)\n\ncontext(\"test plotPeakProf() for a list of windows\")\n\npe"
  },
  {
    "path": "inst/test-plot/test-plotTagMatrix.R",
    "chars": 3897,
    "preview": "library(ChIPseeker)\nlibrary(TxDb.Hsapiens.UCSC.hg19.knownGene)\n\ncontext(\"test plotTagMatrix() and related functions\")\n\nt"
  },
  {
    "path": "man/ChIPseeker-package.Rd",
    "chars": 1858,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ChIPseeker-package.R\n\\docType{package}\n\\na"
  },
  {
    "path": "man/ChIPseekerCache.Rd",
    "chars": 394,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ChIPseeker-package.R\n\\docType{data}\n\\name{"
  },
  {
    "path": "man/annotatePeak.Rd",
    "chars": 2792,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/annotatePeak.R\n\\name{annotatePeak}\n\\alias{"
  },
  {
    "path": "man/as.GRanges.Rd",
    "chars": 343,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\name{as.GRanges}\n\\alias{as.GRang"
  },
  {
    "path": "man/as.data.frame.csAnno.Rd",
    "chars": 534,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\name{as.data.frame.csAnno}\n\\alia"
  },
  {
    "path": "man/check_upstream_and_downstream.Rd",
    "chars": 392,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{check_upstream_and_downs"
  },
  {
    "path": "man/combine_csAnno.Rd",
    "chars": 379,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\name{combine_csAnno}\n\\alias{comb"
  },
  {
    "path": "man/covplot.Rd",
    "chars": 818,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/covplot.R\n\\name{covplot}\n\\alias{covplot}\n\\"
  },
  {
    "path": "man/csAnno-class.Rd",
    "chars": 1000,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\docType{class}\n\\name{csAnno-clas"
  },
  {
    "path": "man/dot-ChIPseekerEnv.Rd",
    "chars": 406,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{.ChIPseekerEnv}\n\\alias{."
  },
  {
    "path": "man/dotFun.Rd",
    "chars": 320,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{.}\n\\alias{.}\n\\title{.}\n\\"
  },
  {
    "path": "man/downloadGEObedFiles.Rd",
    "chars": 390,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/GEO.R\n\\name{downloadGEObedFiles}\n\\alias{do"
  },
  {
    "path": "man/downloadGSMbedFiles.Rd",
    "chars": 405,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/GEO.R\n\\name{downloadGSMbedFiles}\n\\alias{do"
  },
  {
    "path": "man/dropAnno.Rd",
    "chars": 439,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/annotatePeak.R\n\\name{dropAnno}\n\\alias{drop"
  },
  {
    "path": "man/enrichAnnoOverlap.Rd",
    "chars": 768,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/enrichOverlap.R\n\\name{enrichAnnoOverlap}\n\\"
  },
  {
    "path": "man/enrichPeakOverlap.Rd",
    "chars": 960,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/enrichOverlap.R\n\\name{enrichPeakOverlap}\n\\"
  },
  {
    "path": "man/getAnnoStat.Rd",
    "chars": 254,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\name{getAnnoStat}\n\\alias{getAnno"
  },
  {
    "path": "man/getBioRegion.Rd",
    "chars": 1446,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getBioRegion}\n\\alias{get"
  },
  {
    "path": "man/getGEOInfo.Rd",
    "chars": 380,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/GEO.R\n\\name{getGEOInfo}\n\\alias{getGEOInfo}"
  },
  {
    "path": "man/getGEOgenomeVersion.Rd",
    "chars": 315,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/GEO.R\n\\name{getGEOgenomeVersion}\n\\alias{ge"
  },
  {
    "path": "man/getGEOspecies.Rd",
    "chars": 286,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/GEO.R\n\\name{getGEOspecies}\n\\alias{getGEOsp"
  },
  {
    "path": "man/getGeneAnno.Rd",
    "chars": 468,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/addGeneAnno.R\n\\name{getGeneAnno}\n\\alias{ge"
  },
  {
    "path": "man/getGenomicAnnotation.Rd",
    "chars": 765,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/getGenomicAnnotation.R\n\\name{getGenomicAnn"
  },
  {
    "path": "man/getNearestFeatureIndicesAndDistances.Rd",
    "chars": 989,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/getNearestFeatureIndicesAndDistances.R\n\\na"
  },
  {
    "path": "man/getPromoters.Rd",
    "chars": 463,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getPromoters}\n\\alias{get"
  },
  {
    "path": "man/getSampleFiles.Rd",
    "chars": 276,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{getSampleFiles}\n\\alias{g"
  },
  {
    "path": "man/getTagMatrix.Rd",
    "chars": 3423,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix}\n\\alias{get"
  },
  {
    "path": "man/getTagMatrix.binning.internal.Rd",
    "chars": 1072,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix.binning.int"
  },
  {
    "path": "man/getTagMatrix.internal.Rd",
    "chars": 595,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix.internal}\n\\"
  },
  {
    "path": "man/getTagMatrix2.Rd",
    "chars": 1100,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix2}\n\\alias{ge"
  },
  {
    "path": "man/getTagMatrix2.binning.internal.Rd",
    "chars": 809,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix2.binning.in"
  },
  {
    "path": "man/getTagMatrix2.internal.Rd",
    "chars": 596,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{getTagMatrix2.internal}\n"
  },
  {
    "path": "man/info.Rd",
    "chars": 321,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/ChIPseeker-package.R\n\\docType{data}\n\\name{"
  },
  {
    "path": "man/makeBioRegionFromGranges.Rd",
    "chars": 1968,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/tagMatrix.R\n\\name{makeBioRegionFromGranges"
  },
  {
    "path": "man/make_label.Rd",
    "chars": 378,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{make_label}\n\\alias{make_"
  },
  {
    "path": "man/overlap.Rd",
    "chars": 318,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\name{overlap}\n\\alias{overlap}"
  },
  {
    "path": "man/peakHeatmap.Rd",
    "chars": 1288,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{peakHeatmap}\n\\alias{"
  },
  {
    "path": "man/peakHeatmap_multiple_Sets.Rd",
    "chars": 1502,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{peakHeatmap_multiple"
  },
  {
    "path": "man/peak_Profile_Heatmap.Rd",
    "chars": 1747,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{peak_Profile_Heatmap"
  },
  {
    "path": "man/plotAnnoBar-methods.Rd",
    "chars": 857,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/AllGenerics.R, R/csAnno.R\n\\docType{methods"
  },
  {
    "path": "man/plotAnnoBar.Rd",
    "chars": 768,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotAnno.R\n\\name{plotAnnoBar.data.frame}\n\\"
  },
  {
    "path": "man/plotAnnoPie-methods.Rd",
    "chars": 882,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/AllGenerics.R, R/csAnno.R\n\\docType{methods"
  },
  {
    "path": "man/plotAnnoPie.Rd",
    "chars": 1048,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotAnno.R\n\\name{plotAnnoPie.csAnno}\n\\alia"
  },
  {
    "path": "man/plotAvgProf.Rd",
    "chars": 826,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotAvgProf}\n\\alias{"
  },
  {
    "path": "man/plotAvgProf.binning.Rd",
    "chars": 1159,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotAvgProf.binning}"
  },
  {
    "path": "man/plotAvgProf2.Rd",
    "chars": 1175,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotAvgProf2}\n\\alias"
  },
  {
    "path": "man/plotDistToTSS-methods.Rd",
    "chars": 1503,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/AllGenerics.R, R/csAnno.R\n\\docType{methods"
  },
  {
    "path": "man/plotDistToTSS.data.frame.Rd",
    "chars": 1492,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotDistToTSS.R\n\\name{plotDistToTSS.data.f"
  },
  {
    "path": "man/plotMultiProf.Rd",
    "chars": 665,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotMultiProf}\n\\alia"
  },
  {
    "path": "man/plotMultiProf.binning.Rd",
    "chars": 803,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotMultiProf.binnin"
  },
  {
    "path": "man/plotMultiProf.binning.internal.Rd",
    "chars": 830,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotMultiProf.binnin"
  },
  {
    "path": "man/plotMultiProf.normal.Rd",
    "chars": 779,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotMultiProf.normal"
  },
  {
    "path": "man/plotMultiProf.normal.internal.Rd",
    "chars": 758,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotMultiProf.normal"
  },
  {
    "path": "man/plotPeakProf.Rd",
    "chars": 4058,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotPeakProf}\n\\alias"
  },
  {
    "path": "man/plotPeakProf2.Rd",
    "chars": 2934,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotPeakProf2}\n\\alia"
  },
  {
    "path": "man/plotPeakProf_MultiWindows.Rd",
    "chars": 4073,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{plotPeakProf_MultiWi"
  },
  {
    "path": "man/readPeakFile.Rd",
    "chars": 667,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/readPeakFile.R\n\\name{readPeakFile}\n\\alias{"
  },
  {
    "path": "man/reexports.Rd",
    "chars": 504,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/utilities.R\n\\docType{import}\n\\name{reexpor"
  },
  {
    "path": "man/seq2gene.Rd",
    "chars": 1152,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/seq2gene.R\n\\name{seq2gene}\n\\alias{seq2gene"
  },
  {
    "path": "man/show-methods.Rd",
    "chars": 395,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\docType{methods}\n\\name{show}\n\\al"
  },
  {
    "path": "man/shuffle.Rd",
    "chars": 324,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/enrichOverlap.R\n\\name{shuffle}\n\\alias{shuf"
  },
  {
    "path": "man/tagHeatmap.Rd",
    "chars": 670,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/plotTagMatrix.R\n\\name{tagHeatmap}\n\\alias{t"
  },
  {
    "path": "man/upsetplot-methods.Rd",
    "chars": 397,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/csAnno.R\n\\docType{methods}\n\\name{upsetplot"
  },
  {
    "path": "man/vennpie-methods.Rd",
    "chars": 530,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/AllGenerics.R, R/csAnno.R\n\\docType{methods"
  },
  {
    "path": "man/vennplot.Rd",
    "chars": 1536,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/vennplot.R\n\\name{vennplot}\n\\alias{vennplot"
  },
  {
    "path": "man/vennplot.peakfile.Rd",
    "chars": 366,
    "preview": "% Generated by roxygen2: do not edit by hand\n% Please edit documentation in R/vennplot.R\n\\name{vennplot.peakfile}\n\\alias"
  },
  {
    "path": "tests/testthat/test-bed.R",
    "chars": 207,
    "preview": "library(ChIPseeker)\n\ncontext(\"bed file\")\n\ntest_that(\"parse bed file\", {\n    files <- getSampleFiles()\n    for (i in seq_"
  },
  {
    "path": "tests/testthat/test-getTagMatrix.R",
    "chars": 6608,
    "preview": "library(ChIPseeker)\nlibrary(TxDb.Hsapiens.UCSC.hg19.knownGene)\n\ncontext(\"test getTagMatrix() and related functions\")\n\nte"
  },
  {
    "path": "tests/testthat/test-txdb.R",
    "chars": 1140,
    "preview": "library(TxDb.Hsapiens.UCSC.hg19.knownGene)\nlibrary(TxDb.Hsapiens.UCSC.hg38.knownGene)\nlibrary(ChIPseeker)\nlibrary(yulab."
  },
  {
    "path": "tests/testthat.R",
    "chars": 64,
    "preview": "library(testthat)\nlibrary(ChIPseeker)\n\ntest_check(\"ChIPseeker\")\n"
  },
  {
    "path": "vignettes/ChIPseeker.Rmd",
    "chars": 33801,
    "preview": "---\ntitle: \"ChIPseeker: an R package for ChIP peak Annotation, Comparison and Visualization\"\nauthor: \"Guangchuang Yu\\\\\n\n"
  },
  {
    "path": "vignettes/ChIPseeker.bib",
    "chars": 12441,
    "preview": "\n@article{yu_reactomepa_2016,\n\ttitle = {{ReactomePA}: an R/Bioconductor package for reactome pathway analysis and visual"
  }
]

// ... and 4 more files (download for full content)

About this extraction

This page contains the full source code of the YuLab-SMU/ChIPseeker GitHub repository, extracted and formatted as plain text for AI agents and large language models (LLMs). The extraction includes 119 files (433.4 KB), approximately 123.7k tokens. Use this with OpenClaw, Claude, ChatGPT, Cursor, Windsurf, or any other AI tool that accepts text input. You can copy the full output to your clipboard or download it as a .txt file.

Extracted by GitExtract — free GitHub repo to text converter for AI. Built by Nikandr Surkov.

Extract another repo