Usage

`build_bedpe`

build_bedpe builds pairs between elements in two bed files. Pairs can be constrained by a third bed file (usually TADs) or by user-defined minimum and maximum distances. Pairs are printed in bedpe format to standard out and can be used to query .hic files with query_bedpe. View the tutorial here.

Usage and Option Summary

build_bedpe -A path/to/bed1.bed -B path/to/bed2.bed -T path/to/TADfile.bed

(or):

build_bedpe -A path/to/bed1.bed -B path/to/bed2.bed -d 10000 -D 100000

Required

Short Option	Long Option	Description
`-A`	`--bed_A`	Path to the first bed file
`-B`	`--bed_B`	Path to the second bed file

Optional

Short Option	Long Option	Description
`-T`	`--TAD`	Path to the TAD file; pairings that fall outside a TAD are excluded
`-d`	`--min_dist`	Minimum distance between paired elements; closer pairs are dropped. Default 0 bp
`-D`	`--max_dist`	Maximum distance between paired elements; more distant pairs are dropped. Default 5 Mb
`-m`	`--preserve_meta`	If bed file meta data columns should be preserved. Default FALSE
`-i`	`--get_trans`	If pairs between different chromosomes should be made. If TRUE, will print trans pairs only. Default FALSE
`-f`	`--fraction`	Control the number of possible trans pairs to be printed. Between 0-1. Default = 1. Only applicable if `--get_trans` is TRUE.
`-h`	`--help`	Help message
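
The distance constraint can be pictured with a minimal Python sketch. This is not build_bedpe's implementation: the function name `pair_by_distance`, the use of interval midpoints to measure distance, and the inclusive bounds are all assumptions made for illustration, and the tool may measure distance differently.

```python
def pair_by_distance(bed_a, bed_b, min_dist=0, max_dist=5_000_000):
    """Pair intervals on the same chromosome whose midpoint distance
    lies within [min_dist, max_dist]. Illustrative only."""
    pairs = []
    for chrom_a, start_a, end_a in bed_a:
        for chrom_b, start_b, end_b in bed_b:
            if chrom_a != chrom_b:
                continue  # this sketch builds cis pairs only
            mid_a = (start_a + end_a) // 2
            mid_b = (start_b + end_b) // 2
            if min_dist <= abs(mid_a - mid_b) <= max_dist:
                pairs.append((chrom_a, start_a, end_a, chrom_b, start_b, end_b))
    return pairs


# Hypothetical toy intervals, not real data.
bed_a = [("chr1", 1000, 2000)]
bed_b = [("chr1", 51000, 52000), ("chr1", 900000, 901000), ("chr2", 1000, 2000)]

# Print the surviving pairs as tab-separated bedpe-like rows.
for pair in pair_by_distance(bed_a, bed_b, min_dist=10_000, max_dist=100_000):
    print("\t".join(map(str, pair)))
```

With the toy input, only the first chr1 interval in `bed_b` is close enough to be paired; the second is too far away and the chr2 interval is on another chromosome.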

`get_loops`

get_loops obtains loops with scores > 1. Scores are calculated using inherent normalization and printed in bedpe format to standard out.

Usage and Option Summary

get_loops \
  -A H3K27ac \
  -G hg38 \
  -R chr1:1000:5000 \
  -r 5000

Required

Short Option	Long Option	Description
`-A`	`--sample1`	Name of the sample you want to use as it appears on the Tinker box
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38
`-R`	`--range`	Range to obtain loops from, in chr:start:end format. For example: `-R chr1:1000:5000`
`-r`	`--resolution`	Resolution of sample in base pairs. Only 5000 and 1000 supported.

Optional

Short Option	Long Option	Description
`-T`	`--TAD`	Full path to the TAD file, the boundaries of which will be used to obtain loops
`-S`	`--score`	Minimum inherent score to get loops. Default = 1
`-d`	`--min_dist`	Minimum distance to filter obtained loops. Default 0 bp
`-h`	`--help`	Help message

`get_multisample_viewpoints`

get_multisample_viewpoints is used to extract contact values from specific genomic viewpoints for multiple samples simultaneously.

Usage and Option Summary

get_multisample_viewpoints \
  -G hg38 \
  -L LAMP_DMSO,LAMP_dCBP1 \
  -R chr1:40280000:40530000:MCC7_MYCL \
  -V chr1:40400000:anchor1

Required

Short Option	Long Option	Description
`-L`	`--list`	Comma separated list of sample names. For ex., LAMP_DMSO,LAMP_dCBP1
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38
`-R`	`--range`	The genomic range to extract the contact values of, in chr:start:end format. For example: `-R chr1:40280000:40530000:MCC7_MYCL`
`-V`	`--viewpoint`	Viewpoint in chr:start format. For example: `-V chr1:40400000:anchor1`

Optional

Short Option	Long Option	Description
`-T`	`--table`	Path to 1-col .txt file containing list of sample names, if `--list` option is not used
`-Q`	`--norm`	Which normalization to use. Strictly `none`, `cpm` or `aqua` in lower case. Non-spike-in samples default to cpm. Spike-in samples default to aqua.
`-r`	`--resolution`	Resolution of sample in base pairs, using which the contact values should be calculated. Default 5000. Accepted resolutions: 1000,5000,10000,25000,50000,100000,250000,500000,1000000,2500000
`-O`	`--output_name`	If saving to a file is desired, provide a name for the output
`-h`	`--help`	Help message
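
The `--range` and `--viewpoint` strings are colon-separated, and the examples above carry an extra trailing label (`MCC7_MYCL`, `anchor1`). The Python sketch below shows one way such strings could be split and sanity-checked before running a tool. The helper names `parse_range` and `parse_viewpoint` are hypothetical, and treating the trailing label as optional is an assumption drawn from the examples, not a documented rule.

```python
def parse_range(text):
    """Split 'chr:start:end' or 'chr:start:end:name' into its parts."""
    fields = text.split(":")
    if len(fields) not in (3, 4):
        raise ValueError(f"expected chr:start:end[:name], got {text!r}")
    chrom, start, end = fields[0], int(fields[1]), int(fields[2])
    if start >= end:
        raise ValueError("range start must be smaller than range end")
    name = fields[3] if len(fields) == 4 else None
    return chrom, start, end, name


def parse_viewpoint(text):
    """Split 'chr:position' or 'chr:position:name' into its parts."""
    fields = text.split(":")
    if len(fields) not in (2, 3):
        raise ValueError(f"expected chr:position[:name], got {text!r}")
    name = fields[2] if len(fields) == 3 else None
    return fields[0], int(fields[1]), name


chrom, start, end, label = parse_range("chr1:40280000:40530000:MCC7_MYCL")
view_chrom, view_pos, view_label = parse_viewpoint("chr1:40400000:anchor1")

# A viewpoint is only meaningful here if it lies inside the range.
inside = view_chrom == chrom and start <= view_pos < end
print(chrom, start, end, label, view_pos, view_label, inside)
```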

`intersect_bedpe`

Given a bedpe file, intersect_bedpe prints to standard out those rows of the bedpe that intersect with rows of the given bed file(s) on either foot of the pair. intersect_bedpe is useful for extracting biological subsets from the bedpe.

Usage and Option Summary

intersect_bedpe -A H3K27ac -P /path/to/bedpe

(or):

intersect_bedpe -A H3K27ac -B H3K27me3 -P /path/to/bedpe

Required

Short Option	Long Option	Description
`-A`	`--bed_A`	Path to the first bed file
`-P`	`--bedpe`	Path to the bedpe file

Optional

Short Option	Long Option	Description
`-F`	`--flank`	Genomic distance in bp within which a bed interval counts as being in the vicinity of either foot. Default is 0
`-V`	`--absence`	If specified, reports those rows of the bedpe that do not intersect with rows of given bed file. Default FALSE
`-B`	`--bed_B`	Path to the second bed file
	`--print_bed`	If specified, reports rows of bed instead of bedpe
`-h`	`--help`	Help message
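
As a rough illustration of the filtering idea, the Python sketch below keeps bedpe rows where either foot overlaps a bed interval, widening each foot by a flank and optionally inverting the selection. The function names are hypothetical, the half-open overlap test and the symmetric flank are assumptions, and the second bed file option is not modeled.

```python
def overlaps(start_a, end_a, start_b, end_b):
    """True if two half-open intervals share at least one base."""
    return start_a < end_b and start_b < end_a


def intersect_bedpe_sketch(bedpe_rows, bed_rows, flank=0, absence=False):
    """Keep bedpe rows with a foot near any bed interval.
    With absence=True, keep the rows that have no such hit instead."""
    kept = []
    for row in bedpe_rows:
        chrom1, s1, e1, chrom2, s2, e2 = row[:6]
        hit = any(
            (chrom == chrom1 and overlaps(s1 - flank, e1 + flank, s, e))
            or (chrom == chrom2 and overlaps(s2 - flank, e2 + flank, s, e))
            for chrom, s, e in bed_rows
        )
        if hit != absence:
            kept.append(row)
    return kept


# Hypothetical toy rows, not real data.
bedpe = [
    ("chr1", 1000, 2000, "chr1", 50000, 51000),
    ("chr1", 100000, 101000, "chr1", 200000, 201000),
]
bed = [("chr1", 2100, 2200)]

print(intersect_bedpe_sketch(bedpe, bed))             # no direct overlap
print(intersect_bedpe_sketch(bedpe, bed, flank=500))  # first row is now within reach
```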

`plot_APA`

plot_APA generates APA (Aggregate Peak Analysis) plots using AQuA normalized contact values from genomic pair data.

Usage and Option Summary

plot_APA \
   -P /path/to/example_pairs.bedpe \
   -A H3K27ac \
   -G hg38 \
   -O /path/to/output_directory \
   -B SampleB \
   --bin_size 10000 \
   --hard_cap_cpm 50

(or):

plot_APA \
   -P /path/to/example_pairs.bedpe \
   -A H3K27ac \
   -G hg38 \
   -O /path/to/output_directory \
   -B H3K27me3

Required

Short Option	Long Option	Description
`-P`	`--pair`	Path to the bedpe (pairs) file you want to use, without headers.
`-A`	`--sample1`	Name of the sample you want to use to create the plot, name it as it appears on the Tinker box
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38.
`-O`	`--out-dir`	Full path of the directory you want to store the output plots in.

Optional

Short Option	Long Option	Description
`-B`	`--sample2`	The name of the second sample. If triggered, plots the delta AQuA normalized values from both samples for that pair. Useful in case vs control.
	`--cpml`	No input required. If `--cpml` is specified, CPM and AQuA APA values get normalized by the number of loops in the bedpe.
	`--bin_size`	Bin size you want to use for the APA plots. Default = 5000.
	`--hard_cap_cpm`	Upper limit of the CPM plot range. If not specified, upper limit will be calculated using max bin value.
	`--hard_cap_cpm_delta`	Upper limit of the CPM delta plot range. Only for two sample analysis. If not specified, upper limit will be calculated using max delta value.
	`--hard_cap_aqua`	Upper limit of the AQuA plot range. If not specified, upper limit will be calculated using max bin value.
	`--hard_cap_aqua_delta`	Upper limit of the AQuA delta plot range. Only for two sample analysis. If not specified, upper limit will be calculated using max delta value.
`-h`	`--help`	Help message
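
For readers new to aggregate peak analysis, the Python sketch below shows the general idea: cut a small window around each pair's bin in a contact matrix and add the windows together, optionally dividing by the number of loops as `--cpml` is described to do. It is a generic illustration, not plot_APA's code; the function name, the window size, and skipping windows that run off the matrix edge are assumptions.

```python
def aggregate_peaks(matrix, centers, half_width=1, per_loop=False):
    """Sum square windows centered on each (row, col) bin.
    If per_loop is True, divide the sum by the number of windows used."""
    size = 2 * half_width + 1
    total = [[0.0] * size for _ in range(size)]
    used = 0
    for row, col in centers:
        # Skip windows that would extend past the matrix edge.
        if row - half_width < 0 or col - half_width < 0:
            continue
        if row + half_width >= len(matrix) or col + half_width >= len(matrix[0]):
            continue
        for i in range(size):
            for j in range(size):
                total[i][j] += matrix[row - half_width + i][col - half_width + j]
        used += 1
    if per_loop and used:
        total = [[value / used for value in line] for line in total]
    return total


# Hypothetical 5x5 contact matrix and two loop positions (bin indices).
matrix = [[5 * i + j for j in range(5)] for i in range(5)]
centers = [(1, 1), (3, 3)]

summed = aggregate_peaks(matrix, centers)
averaged = aggregate_peaks(matrix, centers, per_loop=True)
print(summed[1][1], averaged[1][1])
```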

`plot_contacts`

plot_contacts creates contact plots with CPM/AQuA normalized contact values. View the tutorial here.

Usage and Option Summary

plot_contacts -A H3K27ac -R chr1:40280000:40530000:MCC7_MYCL -G hg38

(or):

plot_contacts -A H3K27ac -B H3K27me3 -g MCC7_MYCL -G hg38

Required

Short Option	Long Option	Description
`-A`	`--sample_1`	Name of sample you want to use to create the contact plot, name it as it appears on the Tinkerbox
`-R`	`--range`	The genomic range that is to be plotted, in chr:start:end format. For example: -R chr1:40280000:40530000
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38

Optional

Short Option	Long Option	Description
`-O`	`--output_name`	Provide a name for the output pdf
`-Q`	`--norm`	Which normalization to use. Strictly `none`, `cpm` or `aqua` in lower case. Non-spike-in samples default to cpm. Spike-in samples default to aqua.
`-B`	`--sample_2`	For two sample delta plots, name of the second sample.
`-r`	`--resolution`	Resolution of sample in base pairs, using which the contact values should be calculated. Default 5000. Accepted resolutions: 1000,5000,10000,25000,50000,100000,250000,500000,1000000,2500000
`-p`	`--profiles`	If contact profiles should be drawn along the diagonal, x axis and y axis. Default = FALSE
`-o`	`--color_one_sample`	Color for contacts for single sample plots in RGB hexadecimal, ex: red = FF0000 (RRGGBB). Default = FF0000
`-t`	`--color_two_sample`	Color for contacts for two sample plots (delta) in RGB hexadecimal separated by `-`, ex: 1E90FF-C71585
	`--annotations_default`	Draw bed annotations; TSSs, ENCODE 3 enhancers, CpG islands. Default = TRUE
	`--annotations_custom`	Path to bed file to draw custom annotations. Only one custom bed supported
	`--quant_cut`	Between 0.00-1.00. Rather than using the max value of the matrix as the highest color, cap the values at a given percentile. Default 1.00
	`--max_cap`	Set a hard cap; all contact values greater than this will be brought down to the cap value supplied
	`--use_dump`	TRUE or FALSE. Obtain raw contact matrices along with contact plot. Default FALSE
	`--bedpe`	Supply path to a bedpe file to highlight tiles of interacting bedpe feet
	`--bedpe_color`	Color for supplied bedpe in RGB hexadecimal. ex: C71585
`-i`	`--inherent`	TRUE or FALSE. If TRUE, normalize the contact plot using inherent normalization
`-w`	`--width`	Manually set width of printed bin between 0 and 1. Default width calculated automatically.
`-g`	`--gene`	Provide a gene name to automatically select TAD coordinates for interval range (-R). -g can be used in place of -R.
`-h`	`--help`	Help message. Primer can be found at https://rb.gy/fjkwkr
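
The two capping options can be illustrated with a short Python sketch: a quantile cut picks the color ceiling from the distribution of values, while a hard cap fixes it directly. The function name `cap_matrix`, the nearest-rank percentile rule, and letting the smaller of the two ceilings win when both are given are assumptions for illustration; plot_contacts may combine them differently.

```python
import math


def cap_matrix(matrix, quant_cut=1.0, max_cap=None):
    """Clip matrix values at a quantile-based ceiling and/or a hard cap."""
    values = sorted(value for row in matrix for value in row)
    # Nearest-rank percentile: quant_cut=1.0 selects the maximum value.
    index = max(0, math.ceil(quant_cut * len(values)) - 1)
    ceiling = values[index]
    if max_cap is not None:
        ceiling = min(ceiling, max_cap)
    return [[min(value, ceiling) for value in row] for row in matrix]


# Hypothetical contact values with one extreme bin.
contacts = [[1, 2], [3, 100]]

print(cap_matrix(contacts))                  # unchanged, ceiling is the max
print(cap_matrix(contacts, quant_cut=0.75))  # the outlier is pulled down
print(cap_matrix(contacts, max_cap=2))       # everything above 2 becomes 2
```

Capping matters because a single very bright bin would otherwise stretch the color scale and wash out the rest of the plot.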

`plot_virtual_4C`

plot_virtual_4C is used for visualizing chromatin interactions, similar to what the 4C (Circular Chromosome Conformation Capture) technique does. However, instead of performing a wet-lab 4C experiment, the tool uses processed data to virtually generate a 4C profile focused on interactions of a specific genomic region (viewpoint) with the rest of the genome.

Usage and Option Summary

plot_virtual_4C \
   -A H3K27ac \
   -G hg38 \
   -R chr1:40280000:40530000:MCC7_MYCL \
   -V chr1:40400000:anchor1

(or):

plot_virtual_4C \
   -A H3K27ac \
   -B H3K27me3 \
   -G hg38 \
   -R chr1:40280000:40530000:MCC7_MYCL \
   -V chr1:40400000:anchor1

Required

Short Option	Long Option	Description
`-A`	`--sample1`	Name of the first sample you want to use as it appears on the tinker box
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38
`-R`	`--range`	The genomic range that is to be plotted, in chr:start:end format. For example: `-R chr1:40280000:40530000:MCC7_MYCL`
`-V`	`--viewpoint`	The viewpoint to be considered in chr:start format. For example: `-V chr1:40400000:anchor1`

Optional

Short Option	Long Option	Description
`-B`	`--sample2`	Name of the second sample you want to use as it appears on the tinker box
`-Q`	`--norm`	Which normalization to use. Strictly `none`, `cpm` or `aqua` in lower case. Non-spike-in samples default to cpm. Spike-in samples default to aqua.
`-r`	`--resolution`	Resolution of sample in base pairs. Default 5000. Accepted resolutions: 1000,5000,10000,25000,50000,100000,250000,500000,1000000,2500000
`-O`	`--output_name`	Optional: provide a name for the plot
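	`--quant_cut`	Between 0.00-1.00. Rather than using the max value of the matrix as the highest color, cap the values at a given percentile. Default 1.00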
	`--max_cap`	Set a hard cap, all values greater than this will be brought down to cap value supplied
	`--width`	Number of bins up and downstream of viewpoint locus to be considered for drawing profiles. Default 0
	`--height`	Numeric factor to control the height of the Virtual 4C profile in the plot. Default 0.3
`-h`	`--help`	Help message
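
Conceptually, a virtual 4C profile is one slice of a contact matrix: the contacts between the viewpoint bin and every other bin in the range. The Python sketch below extracts that slice from a small matrix. The function name, the averaging over neighboring bins when a width is given, and the toy matrix are assumptions for illustration rather than the tool's actual procedure.

```python
def virtual_4c_profile(matrix, range_start, resolution, viewpoint, width=0):
    """Return the contact profile of the bin containing `viewpoint`,
    averaged over `width` bins on either side of it."""
    view_bin = (viewpoint - range_start) // resolution
    if not 0 <= view_bin < len(matrix):
        raise ValueError("viewpoint lies outside the plotted range")
    first = max(0, view_bin - width)
    last = min(len(matrix) - 1, view_bin + width)
    rows = matrix[first:last + 1]
    # Average column-wise across the selected rows.
    return [sum(column) / len(rows) for column in zip(*rows)]


# Hypothetical symmetric 3-bin contact matrix at 5000 bp resolution.
matrix = [
    [10, 4, 1],
    [4, 8, 3],
    [1, 3, 9],
]

profile = virtual_4c_profile(matrix, range_start=0, resolution=5000, viewpoint=5000)
print(profile)
```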

`query_bedpe`

query_bedpe uses a bedpe file to calculate AQuA normalized or counts-per-million (CPM) contact values for given ranges in a sample and prints to standard out. View the tutorial here.

Usage and Option Summary

query_bedpe -A H3K27ac -P path/to/pairs.bedpe -G hg38

(or):

query_bedpe -A H3K27ac -B H3K27me3 -P path/to/pairs.bedpe -G hg38

Required

Short Option	Long Option	Description
`-P`	`--pair`	Full path to the bedpe (pairs) file you want to query, without headers!
`-A`	`--sample_1`	Name of the sample you want to use as it appears on the Tinker box
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38
`-Q`	`--norm`	Which normalization to use. Strictly `none`, `cpm` or `aqua` in lower case. Non-spike-in samples default to cpm. Spike-in samples default to aqua.

Optional

Short Option	Long Option	Description
`-B`	`--sample_2`	The name of the second sample. If triggered, calculates the delta contact values for that pair. Useful in case vs control
`-R`	`--resolution`	Resolution of sample in base pairs. Default 5000. Accepted resolutions: 1000,5000,10000,25000,50000,100000,250000,500000,1000000,2500000
`-f`	`--formula`	Arithmetic to use to report contact values. Options: center, max, average, sum. Default = center
`-F`	`--fix`	If FALSE, reports new coordinates based on arithmetic center or max. Default = TRUE
	`--shrink_wrap`	Squeezes a 2D bedpe interval until supplied value is reached. Default = FALSE
	`--split`	Splits a 2D bedpe interval into multiple sub-intervals greater than supplied value. Default = FALSE
	`--padding`	Joins sub-intervals in 2D space reported by `--split`, based on supplied value in bin units. Default = 2
	`--expand`	Expands 1D bedpe feet in both directions based on supplied value (in bin units). Default = 0
`-I`	`--inherent`	If TRUE, hic values transformed to inherent units. For one-sample tests only. Default = FALSE
`-h`	`--help`	Help message. Primer can be found at https://rb.gy/zyfjxc
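
The four `--formula` choices each reduce the block of bins spanned by a bedpe row to one number. A minimal Python sketch of that reduction follows. The function name `summarize_block` and the choice of the middle bin as the "center" are assumptions, and the sketch ignores normalization entirely.

```python
def summarize_block(block, formula="center"):
    """Reduce a 2D block of contact values to a single number."""
    values = [value for row in block for value in row]
    if formula == "center":
        # Assumed here to be the bin in the middle of the block.
        return block[len(block) // 2][len(block[0]) // 2]
    if formula == "max":
        return max(values)
    if formula == "average":
        return sum(values) / len(values)
    if formula == "sum":
        return sum(values)
    raise ValueError(f"unknown formula: {formula!r}")


# Hypothetical 3x3 block of contact values for one bedpe row.
block = [
    [0, 2, 4],
    [6, 5, 1],
    [3, 6, 9],
]

for name in ("center", "max", "average", "sum"):
    print(name, summarize_block(block, name))
```

The same block gives four different answers, so the formula should be chosen to match the question: `center` for a sharply positioned loop, `max` or `sum` when the signal may sit anywhere inside the interval.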

`summarize_interval`

summarize_interval counts both short-range and long-range 3D contacts within specified genomic intervals.

Usage and Option Summary

summarize_interval \
  -G hg38 \
  -I /path/to/input_bed_file.bed \
  -A H3K27ac

(or):

summarize_interval \
  -G hg38 \
  -I /path/to/input_bed_file.bed \
  -A H3K27ac \
  -B H3K27me3

Required

Short Option	Long Option	Description
`-I`	`--input`	Full path to the bed (intervals) file without headers
`-A`	`--sample1`	Name of the sample you want to use to calculate the contact values, as it appears on the Tinker box
`-G`	`--genome`	The genome build the sample(s) has been processed using. Strictly hg19 or hg38

Optional

Short Option	Long Option	Description
`-B`	`--sample2`	Name of the second sample you want to use as it appears on the Tinker box. Useful in case vs control.
`-Q`	`--AQuA`	Use AQuA factors: TRUE or FALSE. Non-spike-in samples default to FALSE (CPM). Spike-in samples default to TRUE (AQUA).
`-r`	`--resolution`	Resolution of sample in base pairs. Default 5000. Accepted resolutions: 1000,5000,10000,25000,50000,100000,250000,500000,1000000,2500000
`-D`	`--distance`	Distance in base pairs to classify short-range and long-range contact values. Default 15000.
`-h`	`--help`	Help message
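
The short-range versus long-range split driven by `--distance` can be sketched in a few lines of Python. The function name, the tuple layout of a contact, and counting a contact exactly at the threshold as long-range are assumptions for illustration; the tool's boundary handling is not documented here.

```python
def summarize_contacts(contacts, distance=15_000):
    """Total contact counts below and at-or-above a distance threshold.
    Each contact is (position_1, position_2, count)."""
    short_range = 0
    long_range = 0
    for pos1, pos2, count in contacts:
        if abs(pos2 - pos1) < distance:
            short_range += count
        else:
            long_range += count
    return {"short_range": short_range, "long_range": long_range}


# Hypothetical contacts within one interval, not real data.
contacts = [
    (1000, 6000, 3),     # 5 kb apart
    (1000, 16000, 2),    # exactly 15 kb apart
    (5000, 120000, 4),   # 115 kb apart
]

print(summarize_contacts(contacts))
```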
