Accessing MGnify API Resources#

The MGnify API provides access to multiple types of resources (or endpoints), such as studies, samples, analyses, and runs. This notebook shows you how to:

  1. Discover what resources are available

  2. Inspect what query parameters each resource accepts


# uncomment the line below if running in Google Colab
#!pip install mgnipy

Starting up a mgnipy.MGnipy client#

For more details on configuring mgnipy and its default configuration, see the “configuration” page.

from mgnipy import MGnipy
# init
MG = MGnipy(
    # add a configuration
    cache_dir=None,
)
# print the MGnipy instance to see its configuration (credentials are not printed)
print(MG)
MGnipy(config=api_version=<SupportedApiVersions.V2: 'v2'> base_url=HttpUrl('https://www.ebi.ac.uk/') cache_dir=None)

Exploring the available resources#

We can learn more about the MGnify API and its available resources via the MGnipy client.

# list all available resources
MG.list_resources()
['analyses',
 'analysis',
 'assemblies',
 'assembly',
 'genomes',
 'genome',
 'publications',
 'publication',
 'samples',
 'sample',
 'studies',
 'study',
 'runs',
 'run',
 'biomes',
 'biome',
 'catalogues',
 'catalogue',
 'private_studies']

For more detail, we can describe a resource with the helper methods .describe_resource() or .describe_resources().

print("studies list endpoint:")
MG.describe_resource("studies")

print("\n----------\n")

print("analysis detail endpoint:")
MG.describe_resource("analysis")
studies list endpoint:
List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: ListMgnifyStudiesOrderType0 | None | Unset
- biome_lineage: None | str | Unset The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: None | PipelineVersions | Unset If set, will only show studies with analyses from the specified MGnify pipeline version
- search: None | str | Unset Search within study titles and accessions
- page: int | Unset Default: 1.
- page_size: int | None | Unset

----------

analysis detail endpoint:
Get MGnify analysis by accession

MGnify analyses are accessioned with an MYGA-prefixed identifier and correspond to an individual Run
or Assembly analysed by a Pipeline.

Supported parameters:
- accession: str

Each resource proxy listed above is a MGnifier that maps directly to a MGnify API endpoint. More information on the proxies can be found on the proxies page, but at a glance:

  • the plural resources (e.g. analyses, studies) represent collection/list endpoints of the API

    e.g.

    • Studies: Lists of MGnify studies

    • Analyses: Lists of MGnify pipeline analyses on runs or assemblies

    MGnifyList endpoints are usually used to search or filter for a list of a given resource.

  • the singular resources (e.g. analysis, study) represent detail endpoints (i.e., getting the details of a single study, analysis, etc.)

    e.g.

    • Study: Details/metadata for a study given its study accession id

    • Analysis: Details/metadata for a MGnify Analysis given its MGnify analysis accession id

    MGnifyDetail endpoints are used to get the metadata for a given item.

  • typically one would:

    1. first acquire a MGnifyList of Studies

    2. and then, for each item (study) in Studies, get its MGnifyDetail (StudyDetail)
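The plural/singular convention described above can be checked against the names returned by MG.list_resources(). The following is a minimal sketch that does not call mgnipy at all: it copies the resource names from the earlier output and pairs each list resource with its detail counterpart using a few naive English pluralisation rules. The helper names `singular_candidates` and `pair_list_and_detail` are hypothetical and are not part of mgnipy.

```python
# Resource names copied from the MG.list_resources() output shown earlier.
resources = [
    "analyses", "analysis", "assemblies", "assembly", "genomes", "genome",
    "publications", "publication", "samples", "sample", "studies", "study",
    "runs", "run", "biomes", "biome", "catalogues", "catalogue",
    "private_studies",
]


def singular_candidates(plural):
    """Return possible singular forms of a plural name (naive rules only)."""
    candidates = []
    if plural.endswith("ies"):
        candidates.append(plural[:-3] + "y")   # studies -> study
    if plural.endswith("ses"):
        candidates.append(plural[:-2] + "is")  # analyses -> analysis
    if plural.endswith("s"):
        candidates.append(plural[:-1])         # runs -> run
    return candidates


def pair_list_and_detail(names):
    """Map each list (plural) resource to its detail (singular) resource."""
    known = set(names)
    pairs = {}
    for name in names:
        for candidate in singular_candidates(name):
            if candidate in known:
                pairs[name] = candidate
                break
    return pairs


pairs = pair_list_and_detail(resources)
# Names that are neither a paired plural nor a paired singular.
unpaired = [n for n in resources if n not in pairs and n not in pairs.values()]
print(pairs)
print("no detail counterpart:", unpaired)
```

In this listing, private_studies is the only resource without a singular counterpart.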

Accessing a Resource#

To use a given endpoint you can access it as an attribute of your mgnipy.MGnipy instance.

Accessing MGnipy().<chosen_resource> automatically configures the resource proxy (i.e. the endpoint-specific MGnifier).

# accessing Studies proxy as an attribute of MGnipy instance
studies = MG.studies

Again, each resource proxy has helper methods, such as .list_supported_params() and .describe_endpoint().

# print for more info 
print(studies, "\n----------\n")
# or helper to list supported query params for the endpoint
print(studies.list_supported_params(), "\n----------\n")
# or a helper to describe corresponding API endpoint 
studies.describe_endpoint()
MGnifier instance for resource: studies
I.e., mgnipy.V2.proxies.studies.Studies
----------------------------------------
Base URL: https://www.ebi.ac.uk/
Parameters: {}
Example request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?page=1
Endpoint module: mgnipy.emgapi_v2_client.api.studies.list_mgnify_studies
Is list endpoint (returns paginated results): True
Cache directory: None
 
----------

['order', 'biome_lineage', 'has_analyses_from_pipeline', 'search', 'page', 'page_size'] 
----------

List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: ListMgnifyStudiesOrderType0 | None | Unset
- biome_lineage: None | str | Unset The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: None | PipelineVersions | Unset If set, will only show studies with analyses from the specified MGnify pipeline version
- search: None | str | Unset Search within study titles and accessions
- page: int | Unset Default: 1.
- page_size: int | None | Unset
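The list of supported parameters is handy for catching misspelled keyword names before building a query. The sketch below is plain Python and does not call mgnipy: it copies the list printed by studies.list_supported_params() above and reports any keys that are not in it. The helper `unsupported_params` is hypothetical, and whether mgnipy performs a similar check itself is not stated on this page.

```python
# Copied from the studies.list_supported_params() output above.
SUPPORTED_STUDY_PARAMS = [
    "order",
    "biome_lineage",
    "has_analyses_from_pipeline",
    "search",
    "page",
    "page_size",
]


def unsupported_params(params, supported=SUPPORTED_STUDY_PARAMS):
    """Return the sorted parameter names that the endpoint does not list."""
    return sorted(set(params) - set(supported))


# "biome" is a deliberate mistake here: the listed name is "biome_lineage".
query = {"search": "chicken", "biome": "root:Host-associated"}
bad = unsupported_params(query)
if bad:
    print("Unsupported parameters:", bad)
```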

Searching a Resource#

Using the supported params we can filter our MGnifyLists.

For example, for the Studies list we can .filter() by search and has_analyses_from_pipeline.

We can pass our search params either:

  1. using .filter() after accessing the resource from MGnipy (as we did a few cells earlier), or

  2. while accessing the resource from MGnipy (see cell below)

Note: .explain() provides a preview of the URLs that will be called to fulfil our search and populate the Studies list.

# MGnifyList example with studies endpoint
# 1. filter method 
filtered_studies = studies.filter(search="chicken")
filtered_studies.explain()
# or
print("\n----------\n")

# 2. directly with query params in the MGnipy instance attribute
filtered_studies = MG.studies(search="chicken")
filtered_studies.explain()
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=1
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=2

----------
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=1
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=2
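To make the structure of these URLs explicit, here is a minimal standard-library sketch that rebuilds them from a base URL, a resource name, and the query parameters. It is not how mgnipy is implemented; `preview_list_urls` and its `n_pages` argument are hypothetical, and a real client would presumably learn the number of pages from the API response rather than take it as input.

```python
from urllib.parse import urlencode, urljoin

# Base URL and API path as they appear in the output above.
BASE_URL = "https://www.ebi.ac.uk/"
API_PATH = "metagenomics/api/v2/"


def preview_list_urls(resource, n_pages, **params):
    """Build one URL per page for a list endpoint (illustrative only)."""
    endpoint = urljoin(BASE_URL, API_PATH + resource)
    urls = []
    for page in range(1, n_pages + 1):
        # Query parameters first, then the page number, as in the preview.
        query = urlencode({**params, "page": page})
        urls.append(f"{endpoint}?{query}")
    return urls


list_urls = preview_list_urls("studies", n_pages=2, search="chicken")
for url in list_urls:
    print(url)
```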

Note: MGnifyDetails can also be “filtered”, but essentially only by accession/ID. For example:

# MGnifyDetail example with .study
# 1. filter method 
study = MG.study()
study = study.filter(accession="MGYS00000653")
study.explain()

# or
print("\n----------\n")

# 2. directly with query params in the MGnipy instance attribute
study = MG.study(accession="MGYS00000653")
study.explain()
https://www.ebi.ac.uk/metagenomics/api/v2/studies/MGYS00000653

----------

https://www.ebi.ac.uk/metagenomics/api/v2/studies/MGYS00000653
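Note that the detail resource study resolves to a path under the plural collection (studies/<accession>) rather than using a query string. A small standard-library sketch of that URL shape follows; `preview_detail_url` is a hypothetical helper, not an mgnipy function.

```python
from urllib.parse import quote, urljoin

# Base URL and API path as they appear in the output above.
BASE_URL = "https://www.ebi.ac.uk/"
API_PATH = "metagenomics/api/v2/"


def preview_detail_url(collection, accession):
    """Build the URL of a detail endpoint: <collection>/<accession>."""
    # quote() escapes characters that are not safe in a URL path segment.
    segment = quote(accession, safe="")
    return urljoin(BASE_URL, f"{API_PATH}{collection}/{segment}")


detail_url = preview_detail_url("studies", "MGYS00000653")
print(detail_url)
```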

Wrap Up:#

This page was a quick-start demonstration of how to:

  1. ✅ Start up a mgnipy.MGnipy client with your desired configuration

  2. ✅ Search in MGnify resources using a MGnifier glass

  3. ❌ ~~Receive a MGazine of MGnify datasets~~ (go to next page)