sig_curie_tool

Compute similarity of input cell sets to cell viability datasets

Synopsis

sig_curie_tool [--score SCORE] [--rank RANK] [--up UP] [--down DOWN] [--min_set_size MIN_SET_SIZE] [--platform PLATFORM] [--feature_space FEATURE_SPACE] [--metric METRIC] [--es_tail ES_TAIL] [--sig_meta SIG_META] [--query_meta QUERY_META] [--skip_key_as_text SKIP_KEY_AS_TEXT] [--dataset DATASET] [--use_gctx USE_GCTX]

Arguments

--score SCORE : Dataset of differential viability scores

--rank RANK : Dataset of ranks corresponding to the score dataset

--up UP : Set(s) of UP cell lines

--down DOWN : Set(s) of DOWN cell lines

--min_set_size MIN_SET_SIZE : Minimum query set size. Sets with fewer members will be excluded. Default is 3

--platform PLATFORM : Profiling platform. Default is pr500_cs5. Options are {pr500_cs5}

--feature_space FEATURE_SPACE : Feature identifiers used in the query cell sets. Supported options are cell_iname = CMap cell name (MCF7), ccle_name = Broad CCLE cell line name (MCF7_BREAST), feature_id = CMap feature id (c-438), cell_id = Arxspan identifiers (ACH-000019). Default is cell_iname. Options are {feature_id|cell_iname|cell_id|ccle_name}

--metric METRIC : Similarity metric. Default is wtcs. Options are {wtcs|cs}

--es_tail ES_TAIL : Specify two-tailed or one-tailed statistic for enrichment metrics. Default is both. Options are {both|up|down}

--sig_meta SIG_META : Optional metadata for columns (signatures) of the score matrix. If provided, the rows in the output datasets will be annotated using the first field as the key.

--query_meta QUERY_META : Optional metadata for query cell sets. If provided, the columns of the output datasets will be annotated using the first field as the key.

--skip_key_as_text SKIP_KEY_AS_TEXT : If true, overrides the default behavior of outputting a text version of the key matrices. Default is 0

--dataset DATASET : Pre-canned datasets that can be queried without specifying score and rank matrices individually. Default is pasg_pr500_bydose. Options are {custom|pasg_pr500_bydose}

--use_gctx USE_GCTX : If true, use the binary GCTX format, which is optimized for large datasets. Default is 0

Description

The tool computes a set-based enrichment similarity between input cell-line sets (aka queries) and a perturbational cell-fitness signature dataset. While the tool is optimized for datasets generated by the PRISM platform, any cell-fitness dataset can be used.
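Query sets in the examples below are passed as .gmt files. As a minimal sketch, assuming the tool reads the common GMT layout (one set per line, tab-separated: set name, description, then members), a single UP set using CCLE names could be written as follows. The set name, description, and file name are hypothetical, and the cell lines are only illustrative; MCF7_BREAST is the example name from the argument list above.

```shell
# Write a one-set GMT file: name <TAB> description <TAB> member1 <TAB> member2 ...
# Assumption: standard GMT layout; the set name and description are made up.
printf '%s\t%s\t%s\t%s\t%s\n' \
  'my_up_set' 'illustrative sensitive lines' \
  'MCF7_BREAST' 'A549_LUNG' 'HT29_LARGE_INTESTINE' > up.gmt

# Print each set name with its member count (fields after the first two),
# which can be compared against --min_set_size (default 3).
awk -F'\t' '{ print $1, NF - 2 }' up.gmt
```

Because this set uses CCLE names rather than the default cell_iname identifiers, it would need to be paired with --feature_space ccle_name.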

Examples

  • Run query on a custom dataset using two-sided queries (UP and DOWN sets) with CCLE names

sig_curie_tool --score 'score.gct' --rank 'rank.gct' --up 'up.gmt' --down 'down.gmt' --feature_space ccle_name

  • Run query on a pre-canned dataset using single-sided queries with cell_iname identifiers (the default feature space) as features

sig_curie_tool --dataset 'pasg_pr500_bydose' --up 'up.gmt' --es_tail up
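The optional annotation and metric arguments can be combined with a custom dataset in the same way. The following is a sketch only: every flag comes from the Arguments list above, but all file names are hypothetical, and the layout of the metadata files (beyond the first field being the key) is not specified here. The command is assembled first and run only if the tool and the score matrix are actually present.

```shell
# Hypothetical file names; every flag below is taken from the Arguments list.
set -- --score score.gct --rank rank.gct \
  --up up.gmt --down down.gmt \
  --feature_space ccle_name --metric cs \
  --sig_meta sig_meta.txt --query_meta query_meta.txt

# Show the assembled command line.
cmd="sig_curie_tool $*"
echo "$cmd"

# Run only when the tool is installed and the score matrix exists.
if command -v sig_curie_tool >/dev/null 2>&1 && [ -f score.gct ]; then
  sig_curie_tool "$@"
fi
```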