mgnipy.MGnifiers as API Resource proxies#
The main idea 🗝️ :
mgnipy.MGnipy().studies is exactly the same as mgnipy.V2.proxies.Studies(), which is just a mgnipy.MGnifier(resource="studies") with added studies-specific functions.
The same holds for all of the resource proxies (analyses, analysis, study, samples, etc.), not just "studies" as in the example above.
A MGnifier glass#
Like how a magnifying glass 🔍 is often associated with searching/querying, the mgnipy.MGnifier class is the interface for building, executing and then caching MGnify API queries.
✅ Builds query sets#
Using MGnifier, users specify a resource endpoint and parameters, which are translated (built) into a request URL or, when the results are paginated, a series of request URLs. Together these are called a QuerySet.
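As a rough illustration of what "building" a query set means, the sketch below expands a resource name and parameters into one URL per page. It is not mgnipy's implementation: the function build_query_set, the page parameter name and the example.org base URL are all assumptions made for this example.

```python
from urllib.parse import urlencode

# Hypothetical base URL, used only for illustration (not the real MGnify host).
BASE_URL = "https://example.org/api/v2"


def build_query_set(resource, params, num_pages):
    """Return one request URL per page for a resource and its parameters."""
    urls = []
    for page in range(1, num_pages + 1):
        # The "page" parameter name is an assumption for this sketch.
        query = urlencode({**params, "page": page})
        urls.append(f"{BASE_URL}/{resource}?{query}")
    return urls


query_set = build_query_set("studies", {"search": "tomato"}, num_pages=3)
```

Here query_set holds three URLs that differ only in their page value, which is the sense in which a single query can become a series of requests.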
✅ Query planning and inspection#
Before executing any queries, you can use MGnifier's built-in methods .preview(), .dry_run() and .explain() to estimate and preview the number of requests (pages) that will be made.
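The arithmetic behind such an estimate is simple: the number of requests is the total item count divided by the page size, rounded up. The sketch below shows that calculation only. The helper names estimate_requests and describe_plan are hypothetical and are not part of mgnipy.

```python
import math


def estimate_requests(total_items, page_size):
    """Number of page requests needed to retrieve total_items items."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    return math.ceil(total_items / page_size)


def describe_plan(resource, total_items, page_size):
    """Summarise a planned query without sending any request."""
    pages = estimate_requests(total_items, page_size)
    return f"{resource}: {total_items} items in {pages} request(s)"


plan = describe_plan("studies", total_items=101, page_size=25)
```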
✅ Execute the queries#
MGnifier uses a QueryExecutor, which handles executing the query sets and caching their results (via the DiskCheckpointer mixin).
There is support for:
Single-page access, e.g. .page(n), .get()
Bulk retrieval, e.g. .bulk_fetch()
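To make the idea of executing with an on-disk cache concrete, here is a minimal sketch of the general pattern: each response is stored in a file keyed by its URL, so a repeated query is read from disk instead of being requested again. This is an assumption about the pattern, not a description of how QueryExecutor or DiskCheckpointer work internally. The names cached_fetch, fetch_all and fake_fetch are invented for the example.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def cached_fetch(url, fetch, cache_dir):
    """Return the payload for url, reading it from disk if already stored."""
    key = hashlib.sha256(url.encode("utf-8")).hexdigest()
    path = Path(cache_dir) / f"{key}.json"
    if path.exists():
        return json.loads(path.read_text(encoding="utf-8"))
    payload = fetch(url)
    path.write_text(json.dumps(payload), encoding="utf-8")
    return payload


def fetch_all(urls, fetch, cache_dir):
    """Fetch every URL in order, reusing cached pages where possible."""
    return [cached_fetch(url, fetch, cache_dir) for url in urls]


# A fake fetcher stands in for the network so the sketch runs offline.
calls = []


def fake_fetch(url):
    calls.append(url)
    return {"url": url, "items": []}


with tempfile.TemporaryDirectory() as tmp:
    urls = ["https://example.org/studies?page=1",
            "https://example.org/studies?page=2"]
    first = fetch_all(urls, fake_fetch, tmp)
    second = fetch_all(urls, fake_fetch, tmp)  # served from the disk cache
```

After both runs, fake_fetch has been called only twice: the second pass finds every page already on disk.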
✅ Parse responses into structured data#
MGnifier also uses mixins.ResultsHandler, which transforms the API's list and detail responses into usable metadata in familiar data structures: dataframes (via to_df()), lists and dictionaries.
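The sketch below illustrates the kind of transformation involved: several paginated list responses are flattened into a single list of records. The response shape (an "items" list next to a "count") and the accessions are assumptions made up for this example, and pages_to_records is a hypothetical helper, not part of ResultsHandler.

```python
def pages_to_records(pages, items_key="items"):
    """Flatten a sequence of paginated responses into one list of records."""
    records = []
    for page in pages:
        records.extend(page.get(items_key, []))
    return records


# Invented example payloads; the field names are assumptions.
pages = [
    {"count": 3, "items": [{"accession": "MGYS00000001", "title": "Study A"},
                           {"accession": "MGYS00000002", "title": "Study B"}]},
    {"count": 3, "items": [{"accession": "MGYS00000003", "title": "Study C"}]},
]

records = pages_to_records(pages)
accessions = [record["accession"] for record in records]
```

A list of dictionaries like records is also a shape that pandas.DataFrame accepts directly, should a dataframe be preferred.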
What is the proxies module#
Each resource (endpoint) proxy is an MGnifier instance tied to one specific API endpoint.
For example, mgnipy.MGnipy().studies is the same as mgnipy.V2.proxies.Studies(), which is mgnipy.MGnifier(resource="studies") plus functionality specific to the studies endpoint.
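The relationship between a generic query builder and a resource proxy can be sketched with plain subclassing. The classes below are stand-ins written for this illustration only. They are not the mgnipy classes, and the methods describe and accession_prefix are hypothetical.

```python
class SketchMGnifier:
    """Stand-in for a generic query builder bound to one resource."""

    def __init__(self, resource, **params):
        self.resource = resource
        self.params = params

    def describe(self):
        return f"GET /{self.resource} with {self.params}"


class SketchStudies(SketchMGnifier):
    """Stand-in for a proxy: the resource is fixed and helpers are added."""

    def __init__(self, **params):
        super().__init__(resource="studies", **params)

    def accession_prefix(self):
        # Hypothetical studies-specific helper, added for illustration.
        return "MGYS"


studies = SketchStudies(search="tomato")
summary = studies.describe()
```

The proxy inherits everything from the generic class, so callers only supply the parameters and never the resource name.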
Available API Endpoints and Proxies#
mgnipy exposes a set of “proxy” classes that map directly to MGnify API resources. Each resource typically has two proxy types:
List proxies (e.g. Studies, Samples, Analyses), which represent collection/list endpoints (e.g. /studies, /samples).
Detail proxies (e.g. StudyDetail, SampleDetail, AnalysisDetail), which fetch metadata for a single resource (by accession or id).
These proxies live in the mgnipy.V2.proxies subpackage and mirror the API surface documented at https://www.ebi.ac.uk/metagenomics/api/v2/.
Brief mapping (proxy → API):#
Studies → GET /studies (list). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Studies/get_mgnify_studies
StudyDetail → GET /studies/{accession} (detail). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Studies/get_mgnify_study
Samples → GET /samples (list). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Samples/get_mgnify_samples
SampleDetail → GET /samples/{accession} (detail). See API: https://www.ebi.ac.uk/metagenomics/api/v2/#/Samples/get_mgnify_sample
Runs → GET /runs and RunDetail → /runs/{accession}
Assemblies → GET /assemblies and AssemblyDetail → /assemblies/{accession}
Analyses → GET /analyses and AnalysisDetail → /analyses/{accession}
Publications → GET /publications and PublicationDetail → /publications/{pubmed_id}
Genomes/Catalogues → catalogue and genome endpoints (catalogues list, genomes within catalogues)
Biomes → GET /biomes and BiomeDetail → /biomes/{biome_lineage}
Examples#
Using the high-level MGnipy client:
from mgnipy import MGnipy
mg = MGnipy()
# list studies matching a query
studies = mg.studies(search="tomato")
# get a detail for a specific study accession
study = mg.study("MGYS00001234")
Using proxies directly:
from mgnipy.V2.proxies import Studies, Study
# MGnifyList
studies = Studies(search="tomato")
# MGnifyDetail
study = Study("MGYS00001234")
Where to read more#
Upstream API reference: https://www.ebi.ac.uk/metagenomics/api/v2/
Proxy source code: mgnipy/V2/proxies (see studies.py, samples.py, etc.)