sig_curie_tool
Compute similarity of input cell sets to cell viability datasets
Synopsis
sig_curie_tool
[--score SCORE] [--rank RANK] [--up UP] [--down DOWN]
[--min_set_size MIN_SET_SIZE] [--platform PLATFORM]
[--feature_space FEATURE_SPACE] [--metric METRIC] [--es_tail ES_TAIL]
[--sig_meta SIG_META] [--query_meta QUERY_META]
[--skip_key_as_text SKIP_KEY_AS_TEXT] [--dataset DATASET] [--use_gctx USE_GCTX]
Arguments
--score
SCORE
: Dataset of differential viability scores
--rank
RANK
: Dataset of ranks corresponding to the score dataset
--up
UP
: Set(s) of UP cell lines
--down
DOWN
: Set(s) of DOWN cell lines
--min_set_size
MIN_SET_SIZE
: Minimum query set size. Sets with fewer members will be excluded. Default is 3
--platform
PLATFORM
: Profiling platform. Default is pr500_cs5. Options are {pr500_cs5}
--feature_space
FEATURE_SPACE
: Feature identifiers used in the query cell sets. Supported options are
cell_iname = CMap cell name (MCF7), ccle_name = Broad CCLE cell line name
(MCF7_BREAST), feature_id = CMap feature id (c-438), cell_id = Arxspan
identifiers (ACH-000019). Default is cell_iname. Options are
{feature_id|cell_iname|cell_id|ccle_name}
--metric
METRIC
: Similarity metric. Default is wtcs. Options are {wtcs|cs}
--es_tail
ES_TAIL
: Specify two-tailed or one-tailed statistic for enrichment metrics. Default is
both. Options are {both|up|down}
--sig_meta
SIG_META
: Optional metadata for columns (signatures) of the score matrix. If provided, the
rows in the output datasets will be annotated using the first field as the key.
--query_meta
QUERY_META
: Optional metadata for query cell sets. If provided, the columns of the output
datasets will be annotated using the first field as the key.
--skip_key_as_text
SKIP_KEY_AS_TEXT
: If true, overrides the default behavior of outputting a text version of the key
matrices. Default is 0
--dataset
DATASET
: Pre-canned datasets that can be queried without specifying score and rank
matrices individually. Default is pasg_pr500_bydose. Options are
{custom|pasg_pr500_bydose}
--use_gctx
USE_GCTX
: If true, use the binary GCTX format, which is optimized for large datasets. Default is 0
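The --up and --down arguments take cell sets, and --min_set_size drops sets with too few members. The sketch below illustrates that filtering step in Python, assuming the sets are stored in the standard tab-delimited GMT layout (set name, description, then members). The function names read_gmt and filter_by_size, the set names, and the cell lines other than MCF7 are hypothetical, and the tool itself may count members differently (for example, only after matching them to features in the dataset).

```python
from io import StringIO


def read_gmt(handle):
    """Parse GMT text: one set per line as name<TAB>description<TAB>member..."""
    sets = {}
    for line in handle:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 2 or not fields[0]:
            continue  # skip blank or malformed lines
        members = [m for m in fields[2:] if m]
        # De-duplicate members while keeping their original order.
        sets[fields[0]] = list(dict.fromkeys(members))
    return sets


def filter_by_size(sets, min_set_size=3):
    """Keep only sets with at least min_set_size members (assumed rule)."""
    return {name: m for name, m in sets.items() if len(m) >= min_set_size}


# Illustrative in-memory GMT content; a real query would read up.gmt instead.
example = StringIO(
    "sensitive\tup set\tMCF7\tA549\tHT29\tPC3\n"
    "too_small\tup set\tMCF7\tA549\n"
)
kept = filter_by_size(read_gmt(example))
print(sorted(kept))  # ['sensitive']
```

With the default minimum of 3, the two-member set is excluded and only the four-member set remains.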
Description
The tool computes a set-based enrichment similarity between input cell-line sets (aka queries) and a perturbational cell-fitness signature dataset. While the tool is optimized for datasets generated by the PRISM platform, any cell-fitness dataset can be used.
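This page does not give the formula behind the similarity metrics. As a rough illustration of what a set-based enrichment similarity can look like, the sketch below implements a GSEA-style weighted running-sum enrichment score and combines the UP and DOWN scores in the manner of the CMap weighted connectivity score. It is an assumption that wtcs refers to that score and that the tool computes it this way; the function names, feature ids, and score values are all hypothetical.

```python
def enrichment_score(scores, members, weight=1.0):
    """Signed maximum deviation of a weighted running sum.

    scores maps feature id -> score for one signature; the list is walked
    from the highest score to the lowest.
    """
    ranked = sorted(scores, key=scores.get, reverse=True)
    hits = set(members) & set(ranked)
    n_miss = len(ranked) - len(hits)
    if not hits or n_miss == 0:
        return 0.0
    hit_total = sum(abs(scores[f]) ** weight for f in hits)
    if hit_total == 0:
        return 0.0
    running, best = 0.0, 0.0
    for feature in ranked:
        if feature in hits:
            running += abs(scores[feature]) ** weight / hit_total
        else:
            running -= 1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def sign(x):
    return (x > 0) - (x < 0)


def wtcs(scores, up, down):
    """(ES_up - ES_down) / 2 when the two scores differ in sign, else 0."""
    es_up = enrichment_score(scores, up)
    es_down = enrichment_score(scores, down)
    if sign(es_up) == sign(es_down):
        return 0.0
    return (es_up - es_down) / 2.0


# Toy signature over six placeholder features.
scores = {"c1": 2.0, "c2": 1.5, "c3": 0.5, "c4": -0.5, "c5": -1.5, "c6": -2.0}
up, down = ["c1", "c2"], ["c5", "c6"]
print(round(wtcs(scores, up, down), 3))  # 1.0
```

Here the UP set sits at the top of the ranking and the DOWN set at the bottom, so the two enrichment scores have opposite signs and the combined score reaches its maximum.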
Examples
- Run a query on a custom dataset using paired UP and DOWN cell sets with CCLE names
sig_curie_tool --score 'score.gct' --rank 'rank.gct' --up 'up.gmt' --down 'down.gmt' --feature_space ccle_name
- Run a query on a pre-canned dataset using single-sided queries with cell_inames as features
sig_curie_tool --dataset 'pasg_pr500_bydose' --up 'up.gmt' --es_tail up
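When the tool is driven from a script, the same invocations can be assembled programmatically. The following is a minimal sketch, assuming sig_curie_tool is available on the PATH; the helper build_command is hypothetical and uses only the flags listed under Arguments.

```python
import shlex

# Flags documented in the Arguments section, in the order they are listed.
KNOWN_FLAGS = (
    "score", "rank", "up", "down", "min_set_size", "platform",
    "feature_space", "metric", "es_tail", "sig_meta", "query_meta",
    "skip_key_as_text", "dataset", "use_gctx",
)


def build_command(**options):
    """Return an argv list for sig_curie_tool from keyword options."""
    unknown = set(options) - set(KNOWN_FLAGS)
    if unknown:
        raise ValueError(f"unsupported options: {sorted(unknown)}")
    argv = ["sig_curie_tool"]
    for flag in KNOWN_FLAGS:
        if flag in options:
            argv += [f"--{flag}", str(options[flag])]
    return argv


# Mirrors the pre-canned dataset example above.
argv = build_command(dataset="pasg_pr500_bydose", up="up.gmt", es_tail="up")
print(shlex.join(argv))
```

The resulting list can be passed to subprocess.run(argv, check=True) to launch the query; passing a list rather than a single string avoids shell quoting problems in file paths.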