MPRAlib documentation
MPRAlib is a Python library and CLI for processing MPRA (Massively Parallel Reporter Assay) data.
Citation
If you use MPRAlib in your work, please cite our recent preprint:
Uniform processing and analysis of IGVF massively parallel reporter assay data with MPRAsnakeflow.
Jonathan D. Rosen, Arjun Devadas Vasanthakumari, Kilian Salomon, Nikola de Lange, Pyaree Mohan Dash, Pia Keukeleire, Ali Hassan, Alejandro Barrera, Martin Kircher, Michael I. Love, Max Schubach.
bioRxiv (2025). 2025.09.25.678548
Installation
PyPI
pip install mpralib
Conda
From the bioconda channel:
conda install -c bioconda mpralib
Usage
Command Line Interface
Use the mpralib command to access the subcommands described below: validate-file, functional, plot, and combine.
Validate a file
MPRAlib provides a CLI tool for validating MPRA data files against supported schemas.
mpralib validate-file <schema> --input <input_file>
<schema>: One of reporter-sequence-design, reporter-barcode-to-element-mapping, reporter-experiment-barcode, reporter-experiment, reporter-element, reporter-variant, reporter-genomic-element, reporter-genomic-variant
<input_file>: Path to your data file (e.g., .tsv.gz, .bed.gz)
Example:
mpralib validate-file reporter-sequence-design --input data/reporter_sequence_design.example.tsv.gz
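When validation is one step in a larger Python workflow, the command above can be assembled and launched with the standard library. The following is a minimal sketch, not part of MPRAlib: the helper names `build_validate_command` and `run_validation` are hypothetical, and only the command form and the schema names are taken from this documentation.

```python
import subprocess

# Schema names exactly as listed in this documentation.
SCHEMAS = (
    "reporter-sequence-design",
    "reporter-barcode-to-element-mapping",
    "reporter-experiment-barcode",
    "reporter-experiment",
    "reporter-element",
    "reporter-variant",
    "reporter-genomic-element",
    "reporter-genomic-variant",
)


def build_validate_command(schema: str, input_file: str) -> list[str]:
    """Return the argument list for `mpralib validate-file` (hypothetical helper)."""
    if schema not in SCHEMAS:
        raise ValueError(f"unknown schema: {schema!r}")
    return ["mpralib", "validate-file", schema, "--input", input_file]


def run_validation(schema: str, input_file: str) -> int:
    """Run the CLI and return its exit code (requires mpralib to be installed)."""
    # Passing a list avoids shell quoting issues with unusual file names.
    completed = subprocess.run(build_validate_command(schema, input_file), check=False)
    return completed.returncode


# Only build and show the command here; nothing is executed.
print(build_validate_command(
    "reporter-sequence-design",
    "data/reporter_sequence_design.example.tsv.gz",
))
```

Checking the schema name before launching the process catches typos early. How the CLI itself reports validation problems is not covered by this sketch.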
Functional analysis
Filter barcodes using multiple filters, such as minimum/maximum count limits or barcode outlier detection.
mpralib functional <schema> <inputs>
<schema>: One of activities, compute-correlation, filter
<inputs>: Please use --help for more details on the schema.
Example:
mpralib functional filter --input data/reporter_experiment_barcode.example.tsv.gz --method max_count --method-values '{"rna_max_count": 500, "dna_max_count": 300}' --output-barcode data/reporter_experiment_barcode.filtered.tsv.gz
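The --method-values argument in the example above is a JSON object, which is easy to misquote by hand. As a minimal sketch (the helper `build_filter_command` is hypothetical and not part of MPRAlib; the flags and values are copied from the example), the argument can be generated from a Python dictionary and the full command rendered with shell-safe quoting:

```python
import json
import shlex


def build_filter_command(input_file, method, method_values, output_barcode):
    """Assemble the argument list for `mpralib functional filter` (hypothetical helper)."""
    return [
        "mpralib", "functional", "filter",
        "--input", input_file,
        "--method", method,
        # json.dumps guarantees valid JSON with double-quoted keys.
        "--method-values", json.dumps(method_values),
        "--output-barcode", output_barcode,
    ]


cmd = build_filter_command(
    "data/reporter_experiment_barcode.example.tsv.gz",
    "max_count",
    {"rna_max_count": 500, "dna_max_count": 300},
    "data/reporter_experiment_barcode.filtered.tsv.gz",
)

# shlex.join quotes the JSON argument so the line can be pasted into a POSIX shell.
print(shlex.join(cmd))
```

The list form of `cmd` can be passed directly to `subprocess.run`, in which case no shell quoting is needed at all.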
Plotting
Generate plots of your data.
mpralib plot <schema> --input <input_file> --bc-threshold <bc_threshold> --output <output_file> <other_inputs>
<schema>: One of barcodes-per-oligo, correlation, dna-vs-rna, outlier
<input_file>: MPRA experiment or experiment barcode.
<bc_threshold>: Minimum number of barcodes per oligo to include in the plot.
<output_file>: Path to save the plot (e.g., .png, .pdf)
<other_inputs>: Please use --help for more details on the schema.
Example:
mpralib plot correlation --input data/reporter_experiment_barcode.example.tsv.gz --oligos --bc-threshold 10 --modality activity --output data/test.png
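To make the meaning of a barcode threshold concrete, the sketch below counts barcodes per oligo and keeps only oligos that reach the threshold. This is an illustration of the concept only, not MPRAlib's implementation: the function name, the toy barcode-to-oligo mapping, and the assumption that the threshold is inclusive are all hypothetical.

```python
from collections import Counter


def oligos_passing_threshold(barcode_to_oligo, bc_threshold):
    """Return the oligos supported by at least `bc_threshold` barcodes.

    Hypothetical helper; assumes an inclusive threshold.
    """
    barcodes_per_oligo = Counter(barcode_to_oligo.values())
    return {oligo for oligo, n in barcodes_per_oligo.items() if n >= bc_threshold}


# Toy data for illustration only: three barcodes map to oligo_1, one to oligo_2.
toy_mapping = {
    "AAAC": "oligo_1",
    "AAAG": "oligo_1",
    "AAAT": "oligo_1",
    "CCCA": "oligo_2",
}

print(sorted(oligos_passing_threshold(toy_mapping, bc_threshold=2)))
```

With a threshold of 2, only oligo_1 remains in this toy example, because oligo_2 is supported by a single barcode.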
Combine counts with other outputs
Combines count data with the MPRA sequence design and with quantifications from other tools, such as BCalm, to create additional outputs, for example BED file tracks. It can also create a variant map from the MPRA design file, which can be used as input for mpralm and BCalm.
mpralib combine <schema> <inputs>
<schema>: One of get-counts, get-reporter-elements, get-reporter-genomic-elements, get-reporter-genomic-variants, get-reporter-variants, get-variant-counts, get-variant-map
<inputs>: Please use --help for more details on the schema.
Example:
mpralib combine get-variant-map --input data/reporter_experiment_barcode.example.tsv.gz --sequence-design data/mpra_sequence_design.example.tsv.gz --output data/variant_map_of_oligo.tsv.gz
Python API
MPRAlib is primarily intended to be used as a library. Please see our notebook mpralib.ipynb for a more detailed example.
License
MIT License