multimil.utils.get_sample_representations

multimil.utils.get_sample_representations(adata, sample_key, use_rep='X', aggregation='weighted', cell_attn_key='cell_attn', covs_to_keep=None, top_fraction=None)

Get sample representations from cell-level representations.

Parameters:
  • adata – Annotated data object with cell-level representations.

  • sample_key – Key in adata.obs that identifies samples.

  • use_rep (default: 'X') – Key in adata.obsm holding the cell-level representation to aggregate, or ‘X’ to use adata.X.

  • aggregation (default: 'weighted') – Method to aggregate cell-level representations to sample-level. Options are ‘weighted’ or ‘mean’.

  • cell_attn_key (default: 'cell_attn') – Key in adata.obs that contains cell-level attention weights (if aggregation is ‘weighted’).

  • covs_to_keep (default: None) – List of sample-level covariate keys to keep in the final sample representation.

  • top_fraction (default: None) – Fraction of top cells to select based on attention weights. If None, all cells are used. If provided, cells are first scored to identify the top fraction by attention weight, and only those cells are used for the sample representation.

Return type:

AnnData

Returns:

Annotated data object with sample-level representations.
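
Example:

The sketch below illustrates the two aggregation modes and the effect of top_fraction on plain NumPy arrays. It is not the multimil implementation: the helper name aggregate_by_sample is hypothetical, and it assumes that attention weights are normalized within each sample and that top_fraction keeps the highest-weight cells of each sample. Neither detail is stated on this page.

```python
import numpy as np


def aggregate_by_sample(reps, sample_ids, weights=None, top_fraction=None):
    """Hypothetical helper (not part of multimil).

    reps:         (n_cells, n_features) cell-level representations
    sample_ids:   (n_cells,) sample label for each cell
    weights:      (n_cells,) attention weights; None gives a plain mean
    top_fraction: if set, keep only this fraction of highest-weight cells
                  per sample (assumed behaviour)
    """
    reps = np.asarray(reps, dtype=float)
    sample_ids = np.asarray(sample_ids)
    if weights is None:
        # Equal weights reduce the weighted average to the mean.
        weights = np.ones(len(reps))
    weights = np.asarray(weights, dtype=float)

    out = {}
    for sample in np.unique(sample_ids):
        idx = np.flatnonzero(sample_ids == sample)
        w = weights[idx]
        if top_fraction is not None:
            # Keep at least one cell per sample.
            n_keep = max(1, int(np.ceil(top_fraction * len(idx))))
            keep = np.argsort(w)[::-1][:n_keep]
            idx, w = idx[keep], w[keep]
        # np.average divides by the sum of the weights, so the weights
        # are effectively normalized within each sample.
        out[sample.item()] = np.average(reps[idx], axis=0, weights=w)
    return out


reps = [[1.0, 0.0], [3.0, 0.0], [0.0, 2.0], [0.0, 4.0]]
sample_ids = ["s1", "s1", "s2", "s2"]
cell_attn = [0.25, 0.75, 0.5, 0.5]

weighted = aggregate_by_sample(reps, sample_ids, weights=cell_attn)
mean = aggregate_by_sample(reps, sample_ids)
top = aggregate_by_sample(reps, sample_ids, weights=cell_attn, top_fraction=0.5)
```

With these toy inputs, the weighted representation of s1 is pulled toward the cell with the larger attention weight, while the mean treats both cells equally. With top_fraction=0.5, only the single highest-weight cell of s1 contributes.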