Utils

This module contains miscellaneous methods.

compath.utils.log = <Logger compath.utils (WARNING)>[source]

General utils

compath.utils.dict_to_pandas_df(d)[source]

Transform a dict into a pandas DataFrame.

Parameters

d (dict) –

Return type

pandas.DataFrame

Returns

pandas dataframe

compath.utils.process_form_gene_set(text)[source]

Process a string containing gene symbols and return a gene set.

Parameters

text (str) – string to be processed

Return type

set[str]

Returns

gene set
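
As a rough illustration of this kind of processing, the sketch below splits free text into a set of symbols. The function name parse_gene_symbols and the choice of delimiters (commas and whitespace) are assumptions for the example, not taken from the ComPath source.

```python
import re


def parse_gene_symbols(text):
    """Split a free-text string into a set of non-empty gene symbols.

    Assumption: symbols are separated by commas and/or whitespace.
    """
    return {symbol for symbol in re.split(r"[,\s]+", text.strip()) if symbol}


print(sorted(parse_gene_symbols("TP53, BRCA1\nEGFR  TP53")))
# ['BRCA1', 'EGFR', 'TP53']
```

Returning a set means duplicated symbols in the input collapse to a single entry.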

compath.utils.calculate_relative_enrichments(results, total_pathways_by_resource)[source]

Calculate relative enrichment of pathways (enriched pathways/total pathways).

Parameters
  • results (dict) – result enrichment

  • total_pathways_by_resource (dict) – resource to number of pathways

Return type

dict
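
The ratio described above (enriched pathways / total pathways) can be sketched as follows. The helper name relative_enrichments and the assumed shape of results (resource name mapped to a collection of its enriched pathways) are illustrative guesses, not the library's confirmed data model.

```python
def relative_enrichments(results, total_pathways_by_resource):
    """Return the fraction of enriched pathways for each resource.

    Assumption: ``results`` maps a resource name to a collection of its
    enriched pathways, and ``total_pathways_by_resource`` maps the same
    names to the total number of pathways in that resource.
    """
    return {
        resource: len(enriched) / total_pathways_by_resource[resource]
        for resource, enriched in results.items()
    }


example = relative_enrichments(
    {"kegg": {"p1": {}, "p2": {}}, "reactome": {"p3": {}}},
    {"kegg": 8, "reactome": 4},
)
print(example)  # {'kegg': 0.25, 'reactome': 0.25}
```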

compath.utils.count_genes_in_pathway(pathways_gene_sets, genes)[source]

Calculate how many of the queried genes are associated with each pathway gene set.

Parameters
  • pathways_gene_sets (dict) – pathways and their gene sets

  • genes (set) – genes queried

Return type

dict
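
A minimal sketch of this count, assuming pathways_gene_sets maps a pathway name to an iterable of gene symbols. The helper name count_query_genes is hypothetical.

```python
def count_query_genes(pathways_gene_sets, genes):
    """Count, for each pathway, how many of the queried genes it contains."""
    return {
        pathway: len(set(gene_set) & genes)
        for pathway, gene_set in pathways_gene_sets.items()
    }


counts = count_query_genes(
    {"apoptosis": {"TP53", "BAX"}, "cell cycle": {"CDK1"}},
    {"TP53", "BAX", "EGFR"},
)
print(counts)  # {'apoptosis': 2, 'cell cycle': 0}
```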

compath.utils.apply_filter(results, threshold)[source]

Run one simulation with a given threshold.

Parameters
  • results (dict) – resource with pathways

  • threshold (int) – necessary genes to enrich a pathway

Return type

dict

compath.utils.simulate_pathway_enrichment(resource_gene_sets, gene_set_query, runs=200)[source]

Simulate pathway enrichment.

Parameters
  • resource_gene_sets – resource and their gene sets

  • gene_set_query – shared genes between all resources

  • runs – number of simulations

Return type

dict[list[tuple]]

compath.utils.get_genes_without_assigned_pathways(enrichment_results, genes_query)[source]

Return the genes without any known pathway assigned.

Parameters
  • enrichment_results (dict) – enrichment results

  • genes_query (set[str]) – gene set queried

Returns

compath.utils.get_enriched_pathways(manager_list, gene_set)[source]

Return the results of the queries for every registered manager.

Parameters
  • manager_list (dict[str,Manager]) – list of managers

  • gene_set (set[str]) – gene set queried

Return type

dict[str,dict[str,dict]]

compath.utils.get_gene_pathways(manager_list, gene)[source]

Return the pathways associated with a gene for every registered manager.

Parameters
  • manager_list (dict[str,Manager]) – list of managers

  • gene (str) – HGNC symbol

Return type

dict[str,dict[str,dict]]

compath.utils.get_mappings(compath_manager, only_accepted=True)[source]

Return a pandas dataframe with mappings information as an excel sheet file.

Parameters

compath.utils.get_pathway_model_by_name(manager_dict, resource, pathway_name)[source]

Return the pathway object from the resource manager.

Parameters
  • manager_dict (dict) – manager name to manager instances dictionary

  • resource (str) – name of the manager

  • pathway_name (str) – pathway name

Return type

Optional[Pathway]

Returns

pathway if exists

compath.utils.get_pathway_model_by_id(app, resource, pathway_id)[source]

Return the pathway object from the resource manager.

Parameters
  • app (flask.Flask) – current app

  • resource (str) – name of the manager

  • pathway_id (str) – pathway id

Return type

Optional[Pathway]

Returns

pathway if exists

compath.utils.get_gene_sets_from_pathway_names(app, pathways)[source]

Return the gene sets for a given pathway/resource tuple.

Parameters
  • app (flask.Flask) – current app

  • pathways (list[tuple[str,str]]) – pathway/resource tuples

Return type

tuple[dict[str,set[str]],dict[str,str]]

Returns

gene sets

compath.utils.get_pathway_info(app, pathways)[source]

Return the pathway info for the given pathway/resource tuples.

Parameters
  • app (flask.Flask) – current app

  • pathways (list[tuple[str,str]]) – pathway/resource tuples

Return type

list

Returns

pathway info

compath.utils.get_last_action_in_module(module_name, action)[source]

Return the info about the last action in the given module.

Parameters

module_name (str) –

Returns

compath.utils.perform_hypergeometric_test(gene_set, manager_pathways_dict, gene_universe, apply_threshold=False, threshold=0.05)[source]

Perform hypergeometric tests.

Parameters
  • gene_set (set[str]) – gene set to test against pathway

  • manager_pathways_dict (dict[str,dict[str,dict]]) – manager to pathways

  • gene_universe (int) – number of HGNC symbols

  • apply_threshold (Optional[bool]) – return only significant pathways

  • threshold (Optional[float]) – significance threshold (by default 0.05)

Return type

dict[str,dict[str,dict]]

Returns

manager_pathways_dict with p value info
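
For readers unfamiliar with the test, the sketch below computes a one-sided hypergeometric p-value for a single pathway using only the standard library. It is an illustration of the statistic, not ComPath's implementation: the function name, the argument layout, and the choice of the upper tail P(X >= overlap) are assumptions, and no multiple-testing correction is applied.

```python
from math import comb


def hypergeometric_p_value(overlap, pathway_size, query_size, gene_universe):
    """Return P(X >= overlap) for a hypergeometric random variable X.

    X counts how many of ``query_size`` genes drawn without replacement
    from ``gene_universe`` genes fall into a pathway of ``pathway_size``.
    """
    upper = min(pathway_size, query_size)
    total = comb(gene_universe, query_size)
    favourable = sum(
        comb(pathway_size, k) * comb(gene_universe - pathway_size, query_size - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / total


# Toy numbers: 10 genes in the universe, a pathway of 4, a query of 3,
# and 2 query genes found in the pathway.
p_value = hypergeometric_p_value(2, 4, 3, 10)
print(round(p_value, 4))  # 0.3333
```

With apply_threshold enabled, a pathway would presumably be kept only when its p-value falls below threshold. That comparison is not shown here because its exact form is not documented above.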

compath.utils.calculate_szymkiewicz_simpson_coefficient(set_1, set_2)[source]

Calculate Szymkiewicz-Simpson coefficient between two sets.

Parameters
  • set_1 (set) – set 1

  • set_2 (set) – set 2

Returns

similarity of the two sets

Return type

float
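
The Szymkiewicz-Simpson (overlap) coefficient is the size of the intersection divided by the size of the smaller set. A minimal sketch follows. The function name overlap_coefficient and the handling of empty sets (returning 0.0) are assumptions for the example.

```python
def overlap_coefficient(set_1, set_2):
    """Return |A ∩ B| / min(|A|, |B|)."""
    smaller = min(len(set_1), len(set_2))
    if smaller == 0:
        # Assumption: an empty set is treated as having no overlap.
        return 0.0
    return len(set_1 & set_2) / smaller


print(overlap_coefficient({"TP53", "BAX", "EGFR"}, {"TP53", "BAX"}))  # 1.0
```

Because the denominator is the smaller set, a set that is fully contained in another always scores 1.0, which is why this coefficient is often used to detect pathway containment rather than equivalence.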

compath.utils.calculate_similarity(name_1, name_2)[source]

Calculate the string-based similarity between two names.

Parameters
  • name_1 (str) – name 1

  • name_2 (str) – name 2

Return type

float

Returns

Levenshtein similarity
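
One common way to turn Levenshtein distance into a similarity in [0, 1] is to divide by the length of the longer string and subtract from 1. The sketch below uses that normalization, which is an assumption. ComPath may rely on a library routine with a different normalization. Both function names are hypothetical.

```python
def levenshtein_distance(a, b):
    """Return the minimum number of single-character edits turning a into b."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute (or keep on a match)
            ))
        previous = current
    return previous[-1]


def levenshtein_similarity(name_1, name_2):
    """Return 1 - distance / length of the longer name (1.0 for two empty names)."""
    longest = max(len(name_1), len(name_2))
    if longest == 0:
        return 1.0
    return 1 - levenshtein_distance(name_1, name_2) / longest


print(levenshtein_distance("kitten", "sitting"))  # 3
```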

compath.utils.get_top_matches(matches, top)[source]

Order a list of tuples by their second value and return the top values.

Parameters

compath.utils.filter_results(results, threshold)[source]

Filter a tuple-based iterator given a threshold.

Parameters

compath.utils.get_most_similar_names(reference_name, names, threshold=0.4, top=5)[source]

Return the most similar names based on string matching.

Parameters
  • reference_name (str) –

  • names (list[str]) –

  • threshold (Optional[float]) –

  • top (Optional[int]) –

Returns

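Read together, get_top_matches, filter_results and get_most_similar_names suggest a score, filter, sort, truncate pipeline. The sketch below shows one way such a pipeline could look. It uses difflib.SequenceMatcher from the standard library as a stand-in scorer, keeps scores greater than or equal to the threshold, and returns (name, score) pairs. All three choices, and the function name, are assumptions.

```python
from difflib import SequenceMatcher


def most_similar_names(reference_name, names, threshold=0.4, top=5):
    """Return up to ``top`` (name, score) pairs scoring at least ``threshold``."""
    scored = [
        (name, SequenceMatcher(None, reference_name, name).ratio())
        for name in names
    ]
    # Filter step: drop weak matches.
    kept = [(name, score) for name, score in scored if score >= threshold]
    # Ordering step: sort by the second tuple element, best first.
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept[:top]


matches = most_similar_names(
    "apoptosis", ["cell cycle", "apoptotic process", "apoptosis"]
)
print([name for name, _ in matches])  # ['apoptosis', 'apoptotic process']
```
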
compath.utils.to_csv(triplets, file=None, sep='\t')[source]

Write triplets as a tab-separated file.

Parameters
  • triplets (iterable[tuple[str,str,str]]) – iterable of triplets

  • file (file) – A writable file or file-like. Defaults to stdout.

  • sep (str) – The separator. Defaults to tab.
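
A minimal sketch of a writer with the same signature shape, assuming each triplet is printed on its own line with its three fields joined by sep. The name write_triplets and the sample triplet are made up for illustration.

```python
import io
import sys


def write_triplets(triplets, file=None, sep="\t"):
    """Write each triplet as one separator-delimited line."""
    out = sys.stdout if file is None else file
    for first, second, third in triplets:
        print(first, second, third, sep=sep, file=out)


# Writing into an in-memory buffer instead of stdout.
buffer = io.StringIO()
write_triplets([("pathway_a", "relation", "pathway_b")], file=buffer)
print(repr(buffer.getvalue()))  # 'pathway_a\trelation\tpathway_b\n'
```

Resolving the default inside the function, rather than using sys.stdout as the default argument value, means a stdout redirected after import is still honoured.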