Fetch architectural stripes

The .hdf5 file produced by stripepy call contains several kinds of information, including stripe coordinates, descriptive statistics, and persistence vectors.

In most cases, the stripe coordinates are all that is needed. They can be fetched using stripepy view.

# Fetch the first 10 stripes in BEDPE format
user@dev:/tmp$ stripepy view 4DNFI9GMP2J8.10000.hdf5 | head

chr1  910000  960000  chr1    930000  3590000
chr1  1060000 1110000 chr1    1080000 3540000
chr1  1570000 1620000 chr1    1600000 2590000
chr1  1600000 1670000 chr1    880000  1620000
chr1  1670000 1700000 chr1    1680000 2610000
chr1  1730000 1780000 chr1    1750000 2570000
chr1  1780000 1840000 chr1    1780000 2580000
chr1  1890000 1940000 chr1    1920000 3540000
chr1  1940000 2020000 chr1    1960000 3590000
chr1  2020000 2060000 chr1    2020000 3550000

# Redirect stdout to a file
user@dev:/tmp$ stripepy view 4DNFI9GMP2J8.10000.hdf5 > stripes.bedpe

# Compress stripes on the fly before writing to a file
user@dev:/tmp$ stripepy view 4DNFI9GMP2J8.10000.hdf5 | gzip -9 > stripes.bedpe.gz
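
For downstream work in Python, the BEDPE records can be parsed with the standard library alone. The following is a minimal sketch, assuming the six-column, whitespace-delimited layout shown above with no header line. The Stripe tuple and the parse_bedpe helper are illustrative names and are not part of StripePy.

```python
from typing import Iterable, Iterator, NamedTuple


class Stripe(NamedTuple):
    """One BEDPE record: two genomic intervals describing a stripe."""

    chrom1: str
    start1: int
    end1: int
    chrom2: str
    start2: int
    end2: int


def parse_bedpe(lines: Iterable[str]) -> Iterator[Stripe]:
    """Yield a Stripe for every line that has at least six fields."""
    for line in lines:
        fields = line.split()  # splits on tabs or spaces
        if len(fields) < 6:
            continue  # skip blank or truncated lines
        c1, s1, e1, c2, s2, e2 = fields[:6]
        yield Stripe(c1, int(s1), int(e1), c2, int(s2), int(e2))


# Two records copied from the example output above
example = [
    "chr1\t910000\t960000\tchr1\t930000\t3590000",
    "chr1\t1600000\t1670000\tchr1\t880000\t1620000",
]
stripes = list(parse_bedpe(example))
for s in stripes:
    # Print the length of each of the two intervals
    print(s.chrom1, s.end1 - s.start1, s.end2 - s.start2)
```

To read the files written above, pass an open file handle instead of the list, for example open("stripes.bedpe") or gzip.open("stripes.bedpe.gz", "rt").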

Output customization and filtering

When viewing the stripes, several optional parameters are available to customize the output.

The --relative-change-threshold option sets the cutoff (5.0 by default) used to filter stripes by their relative change. The relative change compares the average number of interactions inside a stripe with the average number of interactions in the neighborhood immediately outside of it. In the example output further down, the rel_change column matches the absolute difference between inner_mean and outer_mean, expressed as a percentage of outer_mean, so the default cutoff corresponds to a 5% difference.
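
To make the threshold concrete, the sketch below recomputes the relative change of one stripe from its inner_mean and outer_mean values. The formula is inferred from the example output in this section, whose rel_change values it reproduces. It is not taken from StripePy's source code, and relative_change is a hypothetical helper.

```python
def relative_change(inner_mean: float, outer_mean: float) -> float:
    """Percent difference between the signal inside and outside a stripe.

    Assumption: formula inferred from the columns printed by stripepy view.
    """
    if outer_mean == 0:
        # Assumption: treat an empty neighborhood as an unbounded change
        return float("inf")
    return abs(inner_mean - outer_mean) / outer_mean * 100.0


# inner_mean and outer_mean of the stripe at chr1:1570000-1620000
inner_mean = 0.33195798369580404
outer_mean = 0.3076786766852368

value = relative_change(inner_mean, outer_mean)
print(round(value, 3))  # compare with the rel_change column for this stripe
print(value >= 5.0)     # True when the stripe passes the default cutoff
```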

If you are interested in the biodescriptors associated with each individual stripe, you can pass --with-header and --with-biodescriptors when calling stripepy view.

The output below was generated by running stripepy view on an .hdf5 file produced by stripepy call v1.1.1. Files generated by older versions of StripePy may have different columns.

user@dev:/tmp$ stripepy view 4DNFI9GMP2J8.10000.hdf5 --with-biodescriptors --with-header | head

chrom1        start1  end1    chrom2  start2  end2    top_persistence inner_mean      inner_std       outer_lsum      outer_lsize     outer_rsum      outer_rsize     min     q1      q2      q3      max     outer_lmean     outer_rmean     outer_mean      rel_change
chr1  910000  960000  chr1    930000  3590000 0.3984904019    0.2506571890861574      0.14123131812515843     144.79589039186396      801     192.25135582429806      801     0.0     0.17139833204774585     0.22938081658911763     0.28656944403925566     0.9741568863537948      0.18076890186250183     0.24001417705904876     0.2103915394607753      19.138435760573497
chr1  1060000 1110000 chr1    1080000 3540000 0.0826359687    0.23019685453871336     0.14481608064533394     186.18030631678906      741     179.64345985134207      741     0.0     0.1539575922232785      0.21018481227951455     0.2710230083036015      0.9903418421799679      0.2512554741117261      0.24243381896267485     0.24684464653720048     6.744238626207448
chr1  1570000 1620000 chr1    1600000 2590000 0.04103011280000002     0.33195798369580404     0.10697974882795283     99.02697827900961       300     85.58022773213244       300     0.10509240613975727     0.2710230083036015      0.3152772184192718      0.3662448898065007      0.9887477925105556      0.3300899275966987      0.2852674257737748      0.3076786766852368      7.891124361343245
chr1  1600000 1670000 chr1    880000  1620000 0.10798038449999997     0.34673478460468343     0.12547401272240433     79.95811315769556       225     63.18147668278408       225     0.0     0.25904999836303577     0.33447322272887486     0.4155250840484962      0.9887477925105556      0.3553693918119803      0.2808065630345959      0.3180879774232881      9.0059383612837
chr1  1670000 1700000 chr1    1680000 2610000 0.08521339110000004     0.30510000180174507     0.11602295320194354     84.13794539599031       282     71.90225464650885       282     0.0     0.22938081658911763     0.304010183863723       0.37277167877770423     0.8753282776351561      0.29836150849641957     0.2549725342074782      0.2766670213519489      10.276967710447305
chr1  1730000 1780000 chr1    1750000 2570000 0.09549401749999997     0.34157106048803376     0.12939228310023276     66.96694495052422       249     77.44100032822071       249     0.06630592590798857     0.25245019336736707     0.32535592427102433     0.41427461878487365     0.9374989352738993      0.26894355401816955     0.3110080334466695      0.28997579373241955     17.792956471126924
chr1  1780000 1840000 chr1    1780000 2580000 0.14961356020000005     0.31446872398046843     0.14174768874612398     89.65252960337472       243     73.53776985594494       243     0.0     0.2202635181312671      0.28656944403925566     0.3761154144433587      0.9150948504497306      0.3689404510426943      0.3026245673084154      0.33578250917555486     6.347497148501883
chr1  1890000 1940000 chr1    1920000 3540000 0.13643510830000005     0.27087952940479454     0.15589512088714813     98.34422915113818       489     137.9512119037385       489     0.0     0.17139833204774585     0.2453610817780414      0.3592307814635864      0.989227567682685       0.20111294304936234     0.2821088177990563      0.24161088042420928     12.113961477726793
chr1  1940000 2020000 chr1    1960000 3590000 0.05824488140000006     0.267059000791004       0.1518633129658817      138.54936114286124      492     138.81994263073136      492     0.0     0.17139833204774585     0.2453610817780414      0.34858989163711346     0.9751278353396942      0.28160439256679115     0.2821543549405109      0.281879373753651       5.257700400455457
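
When the header is included, the table can be loaded by column name. The sketch below is one possible approach, assuming the output is tab-separated (as is conventional for BEDPE). It uses an abbreviated in-memory table with only a subset of the columns, and strong_stripes is a hypothetical helper.

```python
import csv
import io

# Abbreviated stand-in for the output of
# `stripepy view --with-biodescriptors --with-header` (subset of columns)
table = io.StringIO(
    "chrom1\tstart1\tend1\tchrom2\tstart2\tend2\ttop_persistence\trel_change\n"
    "chr1\t910000\t960000\tchr1\t930000\t3590000\t0.3984904019\t19.138435760573497\n"
    "chr1\t1940000\t2020000\tchr1\t1960000\t3590000\t0.05824488140000006\t5.257700400455457\n"
)


def strong_stripes(handle, min_rel_change: float):
    """Return the rows whose rel_change is at least min_rel_change."""
    reader = csv.DictReader(handle, delimiter="\t")
    return [row for row in reader if float(row["rel_change"]) >= min_rel_change]


selected = strong_stripes(table, min_rel_change=10.0)
for row in selected:
    print(row["chrom1"], row["start1"], row["end1"], row["rel_change"])
```

csv.DictReader returns every value as a string, so numeric columns need an explicit float() or int() conversion before they are compared.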

If you are working in Python, you might want to take a look at the classes Result and ResultFile.

Coordinate transformation

The --transform option controls how stripe coordinates are presented in the output. By default, no transformation is applied. Specify transpose_to_ut to transpose coordinates to the upper triangular part of the contact map, or transpose_to_lt to transpose them to the lower triangular part. This can be useful for specific downstream analyses or visualization preferences.
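
As an illustration of what transposing a record across the diagonal means, the sketch below swaps the two intervals of a BEDPE record. This is not StripePy's implementation. The helper names are hypothetical, and the rule used to decide which triangle a record belongs to (comparing interval midpoints on the same chromosome) is an assumption made for this example.

```python
from typing import Tuple

Record = Tuple[str, int, int, str, int, int]


def transpose(record: Record) -> Record:
    """Mirror a record across the diagonal by swapping its two intervals."""
    chrom1, start1, end1, chrom2, start2, end2 = record
    return (chrom2, start2, end2, chrom1, start1, end1)


def midpoint(start: int, end: int) -> float:
    return (start + end) / 2


def to_upper_triangle(record: Record) -> Record:
    """Swap the intervals when the first one lies after the second one.

    Assumption: both intervals are on the same chromosome and the triangle
    is decided by comparing interval midpoints.
    """
    _, start1, end1, _, start2, end2 = record
    if midpoint(start1, end1) > midpoint(start2, end2):
        return transpose(record)
    return record


# Two records copied from the example output earlier in this document
first = ("chr1", 910000, 960000, "chr1", 930000, 3590000)
second = ("chr1", 1600000, 1670000, "chr1", 880000, 1620000)

print(to_upper_triangle(first))   # left unchanged
print(to_upper_triangle(second))  # intervals swapped
```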