MGni.py#

MGni.py (pronounced MAG-nee-pie) is a Python wrapper for the MGnify API. It provides a high-level, Pythonic interface for querying metagenomics data and metadata from the MGnify database.

The underlying Python client libraries were auto-generated with openapi-python-client. They provide data models and methods for the API resources, built on httpx and attrs.

Features#

  • Simple, Pythonic API: Query studies, samples, analyses, and genomes with intuitive syntax

  • Async-ready: Built on httpx with async/await support for efficient I/O

  • Data export: Multiple output formats including pandas DataFrames and AnnData objects

  • Caching: Automatic caching to reduce redundant API calls

  • Filtering & search: Powerful filtering with support for custom parameters

  • Biome hierarchy: Navigate the GOLD ecosystem classification system
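
To make the biome hierarchy concrete: a lineage such as root:Host-associated:Plants:Rhizosphere (the value used in the Quick Start) is a colon-separated path from the root of the classification down to a specific biome. The sketch below uses plain string handling only. `lineage_ancestors` is a hypothetical helper written for illustration and is not part of mgnipy.

```python
def lineage_ancestors(lineage: str) -> list[str]:
    """Return every prefix of a colon-separated biome lineage, root first."""
    parts = lineage.split(":")
    return [":".join(parts[: i + 1]) for i in range(len(parts))]


lineage = "root:Host-associated:Plants:Rhizosphere"

# Print the path as an indented tree, one level per line.
for depth, node in enumerate(lineage_ancestors(lineage)):
    print("  " * depth + node.rsplit(":", 1)[-1])
```

Each returned prefix is itself a valid-looking lineage string, so a broader query could in principle be built by passing a shorter prefix as the filter value.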

Installation#

From PyPI (stable)#

pip install mgnipy

From TestPyPI (development)#

pip install mgnipy \
--index-url https://test.pypi.org/simple/ \
--extra-index-url https://pypi.org/simple

Development installation#

git clone https://github.com/EBI-Metagenomics/mgnipy.git
cd mgnipy
uv sync --all-groups  # or: pip install -e ".[dev,docs]"
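
After any of the installs above, one way to confirm which version ended up in the active environment is to ask the standard library's importlib.metadata. This is a small sketch, not an official mgnipy command; `installed_version` is a hypothetical helper, and the distribution name "mgnipy" is taken from the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(dist_name: str):
    """Return the installed version string, or None if the package is absent."""
    try:
        return version(dist_name)
    except PackageNotFoundError:
        return None


found = installed_version("mgnipy")
print(f"mgnipy {found}" if found else "mgnipy is not installed in this environment")
```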

Quick Start#

Initialize and explore#

from mgnipy import MGnipy

# Create the main client
mg = MGnipy()

# See available endpoints
print(mg.list_resources())

Query studies with filtering#

# Search for studies by biome and keyword
studies = mg.studies(
    biomes_lineage="root:Host-associated:Plants:Rhizosphere",
    search="tomato"
)

# Preview the requests before fetching
print(studies.explain())
# Or preview the first page as a DataFrame
df = studies.preview()

# Fetch all results (async here; a synchronous option is also available)
import asyncio
asyncio.run(studies.aget())
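
A note on the asyncio.run() call above: it starts a new event loop, so it works from a plain script but raises RuntimeError when a loop is already running (for example inside a Jupyter notebook, where you would write `await studies.aget()` instead). The sketch below illustrates that check with a stand-in coroutine. `fake_aget` and `run_from_script` are hypothetical names used only for illustration; they are not part of mgnipy and make no network requests.

```python
import asyncio


async def fake_aget(n_pages: int = 3) -> list[int]:
    """Hypothetical stand-in for studies.aget(); collects fake page numbers."""
    pages = []
    for page in range(1, n_pages + 1):
        await asyncio.sleep(0)  # yield to the event loop, as real I/O would
        pages.append(page)
    return pages


def run_from_script(coro_factory):
    """Run a coroutine from synchronous code, refusing if a loop is running."""
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        # No loop is running, so asyncio.run() is safe to call here.
        return asyncio.run(coro_factory())
    raise RuntimeError("an event loop is already running; use 'await' instead")


print(run_from_script(fake_aget))
```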

Multiple output formats#

# As pandas DataFrame
pd_df = studies.to_df()

# As polars DataFrame
pl_df = studies.to_polars()

# As JSON
results_json = studies.to_json()

Available Endpoints#

  • Studies: Browse and filter metagenomic studies

  • Samples: Query sample metadata

  • Runs: Access sequencing run information

  • Assemblies: Genome assembly data

  • Genomes: Genome-level information

  • Analyses: Analysis results and annotations

  • And more… Use mg.list_resources() to see all available endpoints

Documentation#

Development#

Code quality#

# Format and sort imports
black mgnipy
isort mgnipy

# Lint
ruff check mgnipy

# Run tests
pytest mgnipy tests

Contributing#

See Contributing.md

License#

TODO

Citation#

TODO

Contributing code#

Install the code with development and docs dependencies:

uv sync --all-groups

Prior to PR:#

Format code and sort imports#

black mgnipy
isort mgnipy

Lint code#

ruff check mgnipy

Run tests#

pytest mgnipy tests

Tests can live in either of two places:

  1. Test modules in the tests folder.

  2. Simple doctests in the Examples section of a function docstring, e.g.

    ...docstring text...
    
    Examples
    --------
    >>> prints_hello_world()
    hello world
    
    ...docstring text continued...
    

    Note: to include a docstring example without running it as a test, append # doctest: +SKIP to that line of code.
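
As a fuller illustration of option 2, here is a minimal sketch of a function whose docstring contains one example that runs as a test and one that is skipped. It reuses the `prints_hello_world` name from the snippet above. `run_docstring_examples` is a hypothetical helper that drives the standard library's doctest module directly, so the sketch runs without pytest; it is not part of mgnipy's test setup.

```python
import doctest


def prints_hello_world():
    """Print a fixed greeting.

    Examples
    --------
    >>> prints_hello_world()
    hello world
    >>> prints_hello_world("unsupported argument")  # doctest: +SKIP
    """
    print("hello world")


def run_docstring_examples(func):
    """Run the doctest examples in func's docstring and return the results."""
    parser = doctest.DocTestParser()
    test = parser.get_doctest(
        func.__doc__, {func.__name__: func}, func.__name__, None, 0
    )
    runner = doctest.DocTestRunner(verbose=False)
    runner.run(test)
    return runner.summarize(verbose=False)


results = run_docstring_examples(prints_hello_world)
print(f"failed examples: {results.failed}")
```

The second example would raise a TypeError if executed, but the +SKIP directive keeps it out of the run, so it serves purely as documentation.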

Update docs#

See docs/README.md

Thank you#

A list of people who have contributed to this repository. You may add your name and GitHub handle or email if you’d like.

From the Microbiome Informatics team @ The European Bioinformatics Institute (EMBL-EBI):#

  • Angel L. P. (angelphanth) - visiting PhD student

  • Mahfouz Shehu (MGS-sails) - MGnify Website Developer

  • Christian Atallah (chrisAta) - MGnify Bioinformatician

  • Sandy Rogers (SandyRogers) - MGnify Web and Platform Project Leader

  • Martin Beracochea (mberacochea) - MGnify Production Project Leader

  • Robert Finn (rdf [at] ebi.ac.uk ) - Section Head, Team Leader and Senior Scientist

From the Multiomics Network Analytics team @ Danmarks Tekniske Universitet (DTU):#

  • Angel L. P. (angelphanth) - PhD student

  • Sebastián Ayala Ruano (sayalaruano) - Previous MSc student and Research Assistant

  • Alberto Santos Delgado (albsantosdel) - Senior Researcher and BRIGHT Informatics Platform Director

Extra special thanks to:#