pypairs.pairs.cyclone¶
-
pypairs.pairs.
cyclone
(data, marker_pairs=None, gene_names=None, sample_names=None, iterations=1000, min_iter=100, min_pairs=50)¶ Score samples for each category based on marker pairs.
data
can be of typeAnnData
,DataFrame
orndarray
and should contain the raw or normalized gene counts of shapen_obs
*n_vars
. Rows correspond to cells and columns to genes.- If a
AnnData
object is passed, the category scores and the final prediction will be added todata.obs
with keypypairs_{category}_score
andpypairs_max_class
.- If marker pairs contain only the cell cycle categories G1, S and G2M an additional column
pypairs_cc_prediction
will be added. Where category S is assigned to samples where G1 and G2M score are below 0.5, as described in [Scialdone15].
- If marker pairs contain only the cell cycle categories G1, S and G2M an additional column
marker_pairs
, i.e. output fromsandbag()
, must be a mapping from category to list of 2-tuple Genes: {‘category’: [(Gene_1,Gene_2), …], …}.- If no
marker_pairs
are passed the default are used fromdefault_marker()
based on [Leng15] (marker pairs for cell cycle prediction).
Parameters: - data :
Union
[AnnData
,DataFrame
,ndarray
,Collection
[Collection
[float
]]] The (annotated) data matrix of shape
n_obs
*n_vars
. Rows correspond to cells and columns to genes.- marker_pairs :
Optional
[Mapping
[str
,Collection
[Tuple
[str
,str
]]]] A dict mapping from str to a list of 2-tuple, where the key is the category and the list contains the marker pairs: {‘Category_1’: [(Gene_1, Gene_2), …], …}. If not provided default marker pairs are used
- gene_names :
Optional
[Collection
[str
]] Names for genes, must be same length as
n_vars
.- sample_names :
Optional
[Collection
[str
]] Names for samples, must be same length as
n_obs
.- iterations :
Optional
[int
] An integer specifying the number of iterations for random sampling to obtain a cycle score. Default: 1000
- min_iter :
Optional
[int
] An integer specifying the minimum number of iterations for score estimation. Default: 100
- min_pairs :
Optional
[int
] An integer specifying the minimum number of pairs for cycle estimation. Default: 50
Return type: DataFrame
Returns: - A
DataFrame
with samples as index and categories as columns with scores for each category for each - sample and a additional column with the name of the max scoring category for each sample. –
- If marker pairs contain only the cell cycle categories G1, S and G2M an additional column
pypairs_cc_prediction
will be added. Where category S is assigned to samples where G1 and G2M score are below 0.5, as described in [Scialdone15].
- If marker pairs contain only the cell cycle categories G1, S and G2M an additional column
Examples
To predict the cell cycle phase of the unsorted cell from the [Leng15] dataset run:
import pypairs import pairs, datasets adata = datasets.leng15('unsorted') marker_pairs = datasets.default_cc_marker() scores = pairs.cyclone(adata, marker_pairs) print(scores)
- If a