mgnipy.MGnifiers as API Resource proxies

The main idea 🗝️:
mgnipy.MGnipy().studies is exactly the same as mgnipy.V2.proxies.Studies(), which is just a mgnipy.MGnifier(resource="studies") with added studies-specific functions.

The same holds for all of the resource proxies (analyses, analysis, study, samples, etc.), not just “studies” as in the example above.
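
To make the relationship concrete, here is a minimal, self-contained sketch of the proxy pattern described above. It does not import mgnipy: ToyMGnifier, ToyStudies, ToyClient, and their methods are hypothetical stand-ins invented for illustration, and the real classes may be organised differently.

```python
class ToyMGnifier:
    """Hypothetical stand-in for a generic, resource-parameterised query builder."""

    def __init__(self, resource, **params):
        self.resource = resource
        self.params = params

    def describe(self):
        # A stable text summary of what would be queried.
        return f"{self.resource} {sorted(self.params.items())}"


class ToyStudies(ToyMGnifier):
    """Hypothetical proxy: the generic class with resource fixed to 'studies'."""

    def __init__(self, **params):
        super().__init__(resource="studies", **params)

    def accessions(self, records):
        # Example of a resource-specific helper layered on the base class.
        return [record["accession"] for record in records]


class ToyClient:
    """Hypothetical client that hands out proxies."""

    def studies(self, **params):
        return ToyStudies(**params)


generic = ToyMGnifier(resource="studies", search="tomato")
proxy = ToyStudies(search="tomato")
via_client = ToyClient().studies(search="tomato")
```

All three objects describe the same query; only the proxy and the client-provided object carry the extra studies-specific helper.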

A `MGnifier` glass

Just as a magnifying glass 🔍 is often associated with searching and querying, the mgnipy.MGnifier class is the interface for building, executing, and then caching MGnify API queries.

✅ Builds query sets

Using MGnifier, users specify a resource endpoint and parameters, which are translated (built) into a QuerySet: a request URL, or a series of request URLs (e.g., one per page when results are paginated).
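
The translation step can be pictured with a small sketch. This is not mgnipy's implementation: build_query_set is a hypothetical helper, the base URL is taken from the API reference address with the trailing slash dropped, and the `page` query parameter name is an assumption about how pagination is expressed.

```python
from urllib.parse import urlencode

# Assumption: API root derived from the reference URL in this page.
BASE_URL = "https://www.ebi.ac.uk/metagenomics/api/v2"


def build_query_set(resource, pages, **params):
    """Return one request URL per page for a resource and its filters."""
    urls = []
    for page in range(1, pages + 1):
        # Assumption: the page number is passed as a `page` query parameter.
        query = urlencode({**params, "page": page})
        urls.append(f"{BASE_URL}/{resource}?{query}")
    return urls


query_set = build_query_set("studies", pages=3, search="tomato")
```

Here query_set holds three URLs that differ only in their page number, which is the sense in which one logical query becomes a set of requests.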

✅ Query planning and inspection

Before executing the queries, MGnifier offers several built-in methods to estimate and preview the number of requests (pages) to be made, such as .preview(), .dry_run(), and .explain().
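
The arithmetic behind such an estimate is simple, and the sketch below shows it in isolation. plan_requests and its parameters are hypothetical; this page does not describe what .preview(), .dry_run(), or .explain() actually return, so treat this only as an illustration of the page-count calculation.

```python
import math


def plan_requests(total_items, page_size, max_pages=None):
    """Estimate how many page requests are needed to retrieve total_items."""
    requests_needed = math.ceil(total_items / page_size)
    if max_pages is not None:
        # Optionally cap the plan, e.g. when only a sample of pages is wanted.
        requests_needed = min(requests_needed, max_pages)
    return {
        "total_items": total_items,
        "page_size": page_size,
        "requests": requests_needed,
    }


plan = plan_requests(total_items=260, page_size=25)
```

With 260 matching items and 25 items per page, the plan calls for 11 requests, because the last page is only partly filled.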

✅ Execute the queries

MGnifier adopts a QueryExecutor, which handles executing the query sets and caching their results (via the DiskCheckpointer mixin). There is support for:

Single-page access, e.g. .page(n), .get()
Bulk retrieval, e.g. .bulk_fetch()
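
The idea of caching each executed request on disk can be sketched as follows. ToyDiskCache and fake_fetch are hypothetical and are not the QueryExecutor or DiskCheckpointer classes; the example URLs are placeholders and no network request is made.

```python
import hashlib
import json
import tempfile
from pathlib import Path


class ToyDiskCache:
    """Hypothetical disk cache: one JSON file per request URL."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, url):
        # Hash the URL so it can safely be used as a file name.
        digest = hashlib.sha256(url.encode("utf-8")).hexdigest()
        return self.directory / f"{digest}.json"

    def get_or_fetch(self, url, fetch):
        path = self._path(url)
        if path.exists():
            return json.loads(path.read_text(encoding="utf-8"))
        payload = fetch(url)
        path.write_text(json.dumps(payload), encoding="utf-8")
        return payload


calls = []


def fake_fetch(url):
    # Stand-in for an HTTP GET; records each call so cache hits are visible.
    calls.append(url)
    return {"url": url, "items": []}


cache = ToyDiskCache(tempfile.mkdtemp())
urls = ["https://example.org/studies?page=1", "https://example.org/studies?page=2"]
first = [cache.get_or_fetch(u, fake_fetch) for u in urls]   # fetches both pages
second = [cache.get_or_fetch(u, fake_fetch) for u in urls]  # served from disk
```

The second pass returns the same payloads without calling fake_fetch again, which is the benefit a checkpointing executor gives when a long bulk retrieval is interrupted and restarted.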

✅ Parse responses into structured data

MGnifier also uses mixins.ResultsHandler, which helps transform the API list and detail responses into usable metadata in familiar data structures, such as dataframes (to_df()), lists, and dictionaries.
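
As a rough picture of that transformation, the sketch below flattens paginated list responses into records and then into a column-oriented dictionary, the shape a dataframe constructor typically accepts. to_records and to_columns are hypothetical helpers, the `items` key is an assumption about the response layout, and the accession values are placeholders rather than real studies.

```python
def to_records(pages, items_key="items"):
    """Concatenate the item lists of several page responses."""
    records = []
    for page in pages:
        # Assumption: each page keeps its results under `items_key`.
        records.extend(page.get(items_key, []))
    return records


def to_columns(records):
    """Turn a list of dicts into a dict of equal-length columns."""
    keys = []
    for record in records:
        for key in record:
            if key not in keys:
                keys.append(key)
    # Missing fields become None so every column has the same length.
    return {key: [record.get(key) for record in records] for key in keys}


# Placeholder responses, not real API output.
pages = [
    {"items": [{"accession": "PLACEHOLDER_1", "title": "first"}]},
    {"items": [{"accession": "PLACEHOLDER_2"}]},
]
records = to_records(pages)
columns = to_columns(records)
```

The resulting columns dictionary has one entry per field, with None filling in where a record lacks that field.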

What is the `proxies` module?

Each resource/endpoint proxy is basically an API endpoint-specific MGnifier instance.

For example, mgnipy.MGnipy().studies is the same as mgnipy.V2.proxies.Studies(), which is mgnipy.MGnifier(resource="studies") plus added functionality that is specific to the studies endpoint.

Available API Endpoints and Proxies

mgnipy exposes a set of “proxy” classes that map directly to MGnify API resources. Each resource typically has two proxy types:

List proxies (e.g. Studies, Samples, Analyses), which represent collection/list endpoints (e.g. /studies, /samples).
Detail proxies (e.g. StudyDetail, SampleDetail, AnalysisDetail), which fetch metadata for a single resource (by accession or ID).

These proxies live in the mgnipy.V2.proxies subpackage and mirror the API surface documented at https://www.ebi.ac.uk/metagenomics/api/v2/.

Brief mapping (proxy → API):

Studies → GET /studies (list). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Studies/get_mgnify_studies
StudyDetail → GET /studies/{accession} (detail). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Studies/get_mgnify_study
Samples → GET /samples (list). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Samples/get_mgnify_samples
SampleDetail → GET /samples/{accession} (detail). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Samples/get_mgnify_sample
Runs → GET /runs (list); RunDetail → GET /runs/{accession} (detail)
Assemblies → GET /assemblies (list); AssemblyDetail → GET /assemblies/{accession} (detail)
Analyses → GET /analyses (list); AnalysisDetail → GET /analyses/{accession} (detail)
Publications → GET /publications (list); PublicationDetail → GET /publications/{pubmed_id} (detail)
Genomes / Catalogues → catalogue and genome endpoints (catalogues list, genomes within catalogues)
Biomes → GET /biomes (list); BiomeDetail → GET /biomes/{biome_lineage} (detail)
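
The mapping above can also be written down as data, which makes the list/detail pairing easy to see. The ENDPOINTS dictionary and path_for helper below are hypothetical and only restate the paths listed on this page; the genome and catalogue entries are left out because their exact paths are not given here.

```python
# Proxy name -> path template, copied from the mapping on this page.
ENDPOINTS = {
    "Studies": "/studies",
    "StudyDetail": "/studies/{accession}",
    "Samples": "/samples",
    "SampleDetail": "/samples/{accession}",
    "Runs": "/runs",
    "RunDetail": "/runs/{accession}",
    "Assemblies": "/assemblies",
    "AssemblyDetail": "/assemblies/{accession}",
    "Analyses": "/analyses",
    "AnalysisDetail": "/analyses/{accession}",
    "Publications": "/publications",
    "PublicationDetail": "/publications/{pubmed_id}",
    "Biomes": "/biomes",
    "BiomeDetail": "/biomes/{biome_lineage}",
}


def path_for(proxy_name, **identifiers):
    """Fill in the path template for a proxy (hypothetical helper)."""
    return ENDPOINTS[proxy_name].format(**identifiers)


list_path = path_for("Studies")
detail_path = path_for("StudyDetail", accession="MGYS00001234")
```

List proxies need no identifier, while each detail proxy needs exactly the one named in its template.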

Examples

Using the high-level MGnipy client:

```python
from mgnipy import MGnipy

mg = MGnipy()

# list studies matching a query
studies = mg.studies(search="tomato")

# get a detail for a specific study accession
study = mg.study("MGYS00001234")
```

Using proxies directly:

```python
from mgnipy.V2.proxies import Studies, Study

# MGnifyList
studies = Studies(search="tomato")

# MGnifyDetail
study = Study("MGYS00001234")
```

Where to read more

Upstream API reference: https://www.ebi.ac.uk/metagenomics/api/v2/
Proxy source code: mgnipy/V2/proxies (see studies.py, samples.py, etc.)