Tutorial: Custom GMT Libraries

Learn how GMT files are structured, which common pitfalls to avoid, and how to add your own custom libraries to pathXcite.

Step 1: Learn about Enrichr

GMT (Gene Matrix Transposed) in a nutshell

GMT is a simple, tab-delimited text format for storing collections of gene sets. Each line is one gene set.

Canonical layout

<SET_NAME>\t<DESCRIPTION_OR_URL>\t<GENE_1>\t<GENE_2>\t...\t<GENE_N>
  • SET_NAME: short identifier for the set (no tabs). Prefer A-Z, 0-9, _, -.
  • DESCRIPTION_OR_URL: free text, a reference, or a link. May be a placeholder (-) or left empty, but the column itself must be present so that genes always start in the third column.
  • GENE_i: one identifier per column (e.g., HGNC symbols). All sets in a file should use the same ID system and species.
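
To make the layout above concrete, here is a minimal sketch of how a single GMT line could be split into its three parts. The names GeneSet and parse_gmt_line are hypothetical helpers chosen for illustration and are not part of pathXcite. Dropping empty gene columns is an assumption about how to treat stray trailing tabs.

```python
from typing import List, NamedTuple


class GeneSet(NamedTuple):
    name: str
    description: str
    genes: List[str]


def parse_gmt_line(line: str) -> GeneSet:
    """Split one GMT line into set name, description, and gene list."""
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 3:
        raise ValueError(
            f"expected at least 3 tab-separated fields, got {len(fields)}"
        )
    name, description = fields[0].strip(), fields[1].strip()
    if not name:
        raise ValueError("SET_NAME is empty")
    # Assumption: empty columns (e.g. from trailing tabs) are ignored.
    genes = [g.strip() for g in fields[2:] if g.strip()]
    if not genes:
        raise ValueError(f"gene set {name!r} has no genes")
    return GeneSet(name, description, genes)
```

A line with an empty description still parses, because the second column is present even though it holds no text.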

Minimal working example

TNF_SIGNALING\tKEGG hsa04668\tTNF\tTNFRSF1A\tTRADD\tTRAF2\tMAP3K7\tIKBKB\tNFKBIA\tRELA
OXPHOS_CORE\tReactome R-HSA-1428517\tNDUFS1\tNDUFA9\tUQCRC1\tCOX4I1\tATP5F1A
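
In the example above, \t stands for a real tab character, not the two characters backslash and t. One way to avoid typing tabs by hand is to generate the file from a script. The sketch below writes the same two gene sets with real tabs, UTF-8 encoding, and Unix line endings. The function name write_gmt is a hypothetical helper, not a pathXcite API.

```python
# The two gene sets from the minimal working example.
gene_sets = [
    ("TNF_SIGNALING", "KEGG hsa04668",
     ["TNF", "TNFRSF1A", "TRADD", "TRAF2", "MAP3K7", "IKBKB", "NFKBIA", "RELA"]),
    ("OXPHOS_CORE", "Reactome R-HSA-1428517",
     ["NDUFS1", "NDUFA9", "UQCRC1", "COX4I1", "ATP5F1A"]),
]


def write_gmt(path, sets):
    """Write (name, description, genes) tuples as one GMT line each."""
    # newline="\n" keeps Unix line endings even when run on Windows.
    with open(path, "w", encoding="utf-8", newline="\n") as handle:
        for name, description, genes in sets:
            handle.write("\t".join([name, description, *genes]) + "\n")
```

Calling write_gmt with a path of your choice and gene_sets produces a two-line file, one line per gene set.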

Encoding & line endings

  • Use UTF-8 text, Unix line endings (\n).
  • Do not wrap lines; one line = one gene set.
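
The two rules above can be checked before loading a file. The following is a rough sketch, assuming the file is read as raw bytes. The function name check_gmt_bytes is hypothetical, and treating a line with fewer than three columns as a possible wrapped line is a heuristic, not a rule from the GMT format.

```python
def check_gmt_bytes(raw: bytes) -> list:
    """Return a list of human-readable problems found in raw GMT bytes."""
    problems = []
    if b"\r" in raw:
        problems.append("contains carriage returns (non-Unix line endings)")
    try:
        text = raw.decode("utf-8")
    except UnicodeDecodeError as err:
        problems.append(f"not valid UTF-8 at byte offset {err.start}")
        return problems
    # Normalize line endings only for the per-line check below.
    lines = text.replace("\r\n", "\n").replace("\r", "\n").split("\n")
    for number, line in enumerate(lines, start=1):
        # Heuristic: a non-empty line with fewer than 3 columns may be
        # the continuation of a gene set that was wrapped by an editor.
        if line and len(line.split("\t")) < 3:
            problems.append(
                f"line {number} has fewer than 3 columns (possibly wrapped)"
            )
    return problems
```

An empty list means none of these checks fired; it does not prove the file is otherwise valid.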