Python API Reference¶
- stripepy.data_structures.SparseMatrix¶
alias of
csr_matrix|csc_matrix
- class stripepy.data_structures.Stripe(
- seed: int,
- top_pers: float | None,
- horizontal_bounds: Tuple[int, int] | None = None,
- vertical_bounds: Tuple[int, int] | None = None,
- where: str | None = None,
A class used to represent architectural stripes. This class takes care of validating stripe coordinates and computing several descriptive statistics.
This is how this class should be used:
Initialize the class by providing at least the seed position.
At a later time, set the vertical and horizontal boundaries by calling set_horizontal_bounds and set_vertical_bounds.
Finally, call compute_biodescriptors to compute and store the descriptive statistics.
The stripe properties and statistics can now be accessed through the attributes listed below.
Attributes representing the descriptive statistics return negative values to signal that it was not possible to compute the statistics for the current Stripe instance.
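The negative-value convention for scalar statistics (for example outer_lmean or rel_change) can be handled with a small helper. This is a minimal sketch, not part of StripePy: stat_or_none is a hypothetical name, and the SimpleNamespace object only mimics the documented attribute names so that the snippet runs without the library installed.

```python
from types import SimpleNamespace
from typing import Optional


def stat_or_none(stripe, name: str) -> Optional[float]:
    """Return the statistic called `name`, or None when it holds a negative sentinel."""
    value = float(getattr(stripe, name))
    return value if value >= 0 else None


# Stand-in for a Stripe whose right-hand band could not be evaluated
# (hypothetical values, not produced by StripePy)
fake_stripe = SimpleNamespace(outer_lmean=2.5, outer_rmean=-1.0)

left = stat_or_none(fake_stripe, "outer_lmean")   # statistic is available
right = stat_or_none(fake_stripe, "outer_rmean")  # sentinel is mapped to None
```

Mapping the sentinel to None keeps downstream code from silently averaging a negative placeholder together with real measurements.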
- __init__(
- seed: int,
- top_pers: float | None,
- horizontal_bounds: Tuple[int, int] | None = None,
- vertical_bounds: Tuple[int, int] | None = None,
- where: str | None = None,
- Parameters:
seed – the stripe seed position
top_pers – the topological persistence of the seed
horizontal_bounds – the horizontal bounds of the stripe
vertical_bounds – the vertical bounds of the stripe
where – the location of the stripe: should be “upper_triangular” or “lower_triangular”. When provided, this is used to validate the coordinates set when calling set_horizontal_bounds() and set_vertical_bounds().
- property lower_triangular: bool¶
True when the stripe extends in the lower-triangular portion of the matrix
- property upper_triangular: bool¶
True when the stripe extends in the upper-triangular portion of the matrix
- property five_number: ndarray[tuple[Any, ...], dtype[float]]¶
A vector of five numbers corresponding to the 0, 25, 50, 75, and 100 percentiles of the number of within-stripe interactions
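To make the five-number vector concrete, the sketch below computes the same five percentiles for a plain list of values using only the standard library. The function name five_number_summary is hypothetical, and whether StripePy uses the same interpolation rule between observations is an assumption that is not stated here.

```python
import statistics
from typing import List, Sequence


def five_number_summary(values: Sequence[float]) -> List[float]:
    """Return the 0, 25, 50, 75, and 100 percentiles of `values`."""
    if len(values) < 2:
        raise ValueError("at least two values are required")
    # method="inclusive" treats the data as the full population and
    # interpolates linearly between neighboring observations
    q25, q50, q75 = statistics.quantiles(values, n=4, method="inclusive")
    return [float(min(values)), q25, q50, q75, float(max(values))]


summary = five_number_summary([1, 2, 3, 4, 5])
```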
- property outer_lmean: float¶
The average number of interactions in the band to the left of the stripe
- property outer_rmean: float¶
The average number of interactions in the band to the right of the stripe
- property outer_mean: float¶
The average number of interactions in the bands to the left and right of the stripe
- property rel_change: float¶
The ratio of the average number of interactions within the stripe and in the neighborhood outside of the stripe
- set_horizontal_bounds(left_bound: int, right_bound: int)¶
Set the horizontal bounds for the stripe. This function raises an exception when the coordinates have already been set or when the given coordinates are incompatible with the seed position.
- Parameters:
left_bound
right_bound
- set_vertical_bounds(top_bound: int, bottom_bound: int)¶
Set the vertical bounds for the stripe. This function raises an exception when the coordinates have already been set or when the given coordinates are incompatible with the seed position and/or the where location.
- Parameters:
top_bound
bottom_bound
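The two setters above share a set-once contract: a second call fails, and so do coordinates that conflict with the seed. The sketch below illustrates that contract with a hypothetical stand-in class (BoundsHolder is not part of StripePy). The exception types and the exact compatibility rule, here "the seed must lie between the left and right bounds", are assumptions made for illustration.

```python
from typing import Optional, Tuple


class BoundsHolder:
    """Hypothetical stand-in that mimics set-once, seed-checked bounds."""

    def __init__(self, seed: int):
        self.seed = seed
        self.horizontal_bounds: Optional[Tuple[int, int]] = None

    def set_horizontal_bounds(self, left_bound: int, right_bound: int) -> None:
        if self.horizontal_bounds is not None:
            raise RuntimeError("horizontal bounds have already been set")
        # Assumption: the seed must fall inside the closed interval
        if not left_bound <= self.seed <= right_bound:
            raise ValueError("bounds must enclose the seed position")
        self.horizontal_bounds = (left_bound, right_bound)


def try_set(holder: BoundsHolder, left: int, right: int) -> str:
    """Return "ok" or the name of the exception raised by the setter."""
    try:
        holder.set_horizontal_bounds(left, right)
    except (RuntimeError, ValueError) as exc:
        return type(exc).__name__
    return "ok"


first = BoundsHolder(seed=10)
outcomes = [
    try_set(first, 8, 12),                   # valid bounds
    try_set(first, 9, 11),                   # second call on the same object
    try_set(BoundsHolder(seed=10), 20, 30),  # seed outside the bounds
]
```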
- compute_biodescriptors(
- matrix: csr_matrix | csc_matrix,
- window: int = 3,
Use the sparse matrix to compute various descriptive statistics. Statistics are stored in the current Stripe instance. This function raises an exception when it is called before the stripe bounds have been set.
- Parameters:
matrix – the sparse matrix from which the stripe originated
window – window size used to compute statistics to the left and right of the stripe
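The role of window can be illustrated on a small dense matrix. The sketch below is not StripePy's implementation: the function names are hypothetical, the matrix is a plain list of lists rather than a sparse matrix, bounds are treated as inclusive, and the bands are assumed to be window columns wide, directly adjacent to the stripe, and clipped at the matrix edges.

```python
from statistics import fmean
from typing import List, Optional, Tuple

Matrix = List[List[float]]


def block_mean(m: Matrix, rows: range, cols: range) -> Optional[float]:
    """Mean of m over rows x cols, or None when the block is empty."""
    values = [m[i][j] for i in rows for j in cols]
    return fmean(values) if values else None


def band_means(
    m: Matrix, top: int, bottom: int, left: int, right: int, window: int = 3
) -> Tuple[Optional[float], Optional[float], Optional[float]]:
    """Return the (inner, outer_left, outer_right) means for a stripe."""
    n_cols = len(m[0])
    rows = range(top, bottom + 1)
    inner = block_mean(m, rows, range(left, right + 1))
    # Bands are clipped at the matrix edges (an assumption of this sketch)
    outer_left = block_mean(m, rows, range(max(0, left - window), left))
    outer_right = block_mean(m, rows, range(right + 1, min(n_cols, right + 1 + window)))
    return inner, outer_left, outer_right


# 4 x 7 toy matrix: column 3 is the stripe, everything else is background
toy = [[1.0, 1.0, 1.0, 5.0, 1.0, 1.0, 1.0] for _ in range(4)]
inner, outer_left, outer_right = band_means(toy, top=0, bottom=3, left=3, right=3, window=2)

# Ratio of the inner mean to the average of the two band means
ratio = inner / ((outer_left + outer_right) / 2)
```

The final ratio follows the wording of the rel_change description; how StripePy pools the two bands or scales the result is not specified here.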
- class stripepy.data_structures.ResultFile( )¶
A class used to read and write StripePy results to an HDF5 file.
There are 3 main use cases:
Open the file in read mode:
with ResultFile("results.hdf5") as h5: ...
Open file in write mode:
If all data will be written to the file before the file is closed:
with ResultFile.create("results.hdf5", mode="w", ...) as h5:
    h5.write_descriptors(res1)
    h5.write_descriptors(res2)
    ...
If the data will be added progressively:
with ResultFile.create("results.hdf5", mode="a", ...) as h5:
    h5.write_descriptors(res1)  # not mandatory, it is also possible to create the
                                # file and close it immediately
...
with ResultFile.append("results.hdf5") as h5:
    h5.write_descriptors(res2)
    h5.write_descriptors(res3)
...
with ResultFile.append("results.hdf5") as h5:
    h5.write_descriptors(res4)
    h5.finalize()  # IMPORTANT!
    # Without the above line you'll get an error when trying to open
    # the file in read mode
When opening or creating a ResultFile in write or append mode, a context manager (e.g. with:) must be used.
- static create(
- path: Path,
- mode: str,
- chroms: Dict[str, int],
- resolution: int,
- normalization: str | None = None,
- assembly: str = 'unknown',
- metadata: Dict[str, Any] | None = None,
- compression_lvl: int = 9,
Create a ResultFile using the provided information.
- static create_from_file(
- path: Path,
- mode: str,
- matrix_file: File,
- normalization: str | None = None,
- metadata: Dict[str, Any] | None = None,
- compression_lvl: int = 9,
Create a ResultFile using information from the given matrix file.
- static append(path: Path)¶
Append to an existing ResultFile.
IMPORTANT: the file must have been created with create or create_from_file with mode="a".
- property normalization: str | None¶
The name of the normalization used to generate the data stored in the given file
- finalize()¶
Finalize a file opened in append mode
- get_min_persistence(chrom: str) float¶
Get the minimum persistence associated with the given chromosome.
- Parameters:
chrom – chromosome name
- Returns:
the minimum persistence
- get( ) DataFrame¶
Get the data associated with the given chromosome, field, and location.
- Parameters:
chrom – chromosome name. When not provided, return data for the entire genome.
field –
name of the field to be fetched. Supported names:
pseudodistribution
all_minimum_points
persistence_of_all_minimum_points
all_maximum_points
persistence_of_all_maximum_points
geo_descriptors
bio_descriptors
stripes
location – location of the attribute to be fetched. Should be “LT” or “UT”
- Returns:
the data associated with the given chromosome, field, and location
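Because get accepts only a fixed set of field names and two locations, it can be convenient to validate arguments before touching the file. The helper below is a minimal sketch built from the names listed above; check_get_args and the two constants are hypothetical and not part of StripePy.

```python
SUPPORTED_FIELDS = frozenset(
    {
        "pseudodistribution",
        "all_minimum_points",
        "persistence_of_all_minimum_points",
        "all_maximum_points",
        "persistence_of_all_maximum_points",
        "geo_descriptors",
        "bio_descriptors",
        "stripes",
    }
)
SUPPORTED_LOCATIONS = frozenset({"LT", "UT"})


def check_get_args(field: str, location: str) -> None:
    """Raise ValueError when field or location is not one of the documented values."""
    if field not in SUPPORTED_FIELDS:
        raise ValueError(
            f"unknown field {field!r}: expected one of {sorted(SUPPORTED_FIELDS)}"
        )
    if location not in SUPPORTED_LOCATIONS:
        raise ValueError(f"unknown location {location!r}: expected 'LT' or 'UT'")


check_get_args("stripes", "LT")  # passes silently
```

Calling the helper right before get turns a misspelled field name into an error message that lists the valid choices.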
- class stripepy.data_structures.Result(chrom_name: str, chrom_size: int)¶
A class used to represent the results generated by stripepy call.
- property chrom: Tuple[str, int]¶
The name and length of the chromosome to which the Result instance belongs
- property roi: Dict[str, List[int]] | None¶
The region of interest associated with the Result instance
- get( ) List[Stripe] | ndarray[int] | ndarray[float]¶
Get the value associated with the given attribute name and location.
- Parameters:
name – name of the attribute to be fetched
location – location of the attribute to be fetched. Should be “LT” or “UT”
- Returns:
the value associated with the given name and location.
- get_stripes_descriptor( ) ndarray[int] | ndarray[float]¶
Get the stripe descriptor for the given location.
- Parameters:
descriptor – name of the descriptor to be fetched
location – location of the attribute to be fetched. Should be “LT” or “UT”
- Returns:
the value associated with the given descriptor and location.