🕵 Exploring MGnify Resources 🗂


The MGnify API provides access to multiple types of resources (or endpoints) such as studies, samples, analyses, runs, and more. This notebook shows you how to:

  1. Discover what resources are available

  2. Inspect what parameters each resource accepts

  3. Query resources using two different approaches

✌ Two Ways to Query Resources

MGnipy provides two main interfaces:

  • MGnipy client: High-level interface with built-in helper functions for exploration

  • Resource proxies (mgnipy.V2.proxies): Direct access to individual resource types

Let's start by exploring the available resources using the MGnipy client.


# Uncomment the line below if running in Colab
#!pip install --index-url https://test.pypi.org/simple/ --extra-index-url https://pypi.org/simple mgnipy
# asyncio is part of the Python standard library, so it needs no separate install

1. MGnipy Client

from mgnipy import MGnipy

# Initialise the client
MG = MGnipy(
    # optional configuration goes here
)

# List the resources that are available
MG.list_resources()
['analyses',
 'analysis',
 'assemblies',
 'assembly',
 'genomes',
 'genome',
 'publications',
 'publication',
 'samples',
 'sample',
 'studies',
 'study',
 'runs',
 'run',
 'biomes',
 'biome',
 'catalogues',
 'catalogue',
 'annotations',
 'private_studies']
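
The list mixes plural and singular names. In the two descriptions shown later in this notebook, the plural "studies" is a list endpoint and the singular "analysis" is a detail endpoint that takes an accession. Assuming that pattern holds for every plural/singular pair (an assumption, not something the output states), the sketch below groups the names using only the standard library. The helper `singular_candidates` is hypothetical and not part of MGnipy.

```python
# Resource names copied from the MG.list_resources() output above.
resources = [
    "analyses", "analysis", "assemblies", "assembly", "genomes", "genome",
    "publications", "publication", "samples", "sample", "studies", "study",
    "runs", "run", "biomes", "biome", "catalogues", "catalogue",
    "annotations", "private_studies",
]


def singular_candidates(plural):
    """Return plausible singular spellings of an English plural (hypothetical helper)."""
    candidates = []
    if plural.endswith("ies"):
        candidates.append(plural[:-3] + "y")   # studies -> study
    if plural.endswith("ses"):
        candidates.append(plural[:-2] + "is")  # analyses -> analysis
    if plural.endswith("s"):
        candidates.append(plural[:-1])         # runs -> run
    return candidates


names = set(resources)

# Assumption: plural name = list endpoint, matching singular name = detail endpoint.
pairs = {}
for name in resources:
    match = next((s for s in singular_candidates(name) if s in names), None)
    if match:
        pairs[name] = match

# Names that have no plural/singular partner in the list.
unpaired = [n for n in resources if n not in pairs and n not in pairs.values()]

for plural, singular in pairs.items():
    print(f"{plural:<14} (list)  <->  {singular} (detail)")
print("no partner:", unpaired)
```

Two names, 'annotations' and 'private_studies', have no singular counterpart in the list, so it is worth describing them individually before relying on this naming pattern.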

For more detail, describe a resource by name:

print("studies list endpoint:")
MG.describe_resources("studies")

print("\n----------\n")

print("analysis detail endpoint:")
MG.describe_resources("analysis")
studies list endpoint:
List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)

----------

analysis detail endpoint:
Get MGnify analysis by accession

MGnify analyses are accessioned with an MYGA-prefixed identifier and correspond to an individual Run
or Assembly analysed by a Pipeline.

Supported parameters:
- accession: (str)
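
The printed descriptions follow a regular "- name: (type) note" layout, so they can be turned into a dictionary when you want to inspect parameters programmatically. The sketch below is a minimal, standard-library-only illustration that parses the "studies" text shown above. `parse_supported_params` is a hypothetical helper, not part of MGnipy, and it assumes the layout of the description does not change.

```python
import re

# Text copied from the "studies" description printed above.
DESCRIPTION = """\
Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
"""

# "- name: (type hint) optional note"
PARAM_LINE = re.compile(r"^- (\w+): \(([^)]*)\)\s*(.*)$")


def parse_supported_params(text):
    """Map each parameter name to its type hint and note (hypothetical helper)."""
    params = {}
    for line in text.splitlines():
        match = PARAM_LINE.match(line)
        if match:
            name, type_hint, note = match.groups()
            params[name] = {"type": type_hint, "note": note}
    return params


params = parse_supported_params(DESCRIPTION)
for name, info in params.items():
    print(f"{name}: {info['type']}")
```

Parsing printed text is fragile; it is shown here only to make the structure of the description explicit.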

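To use a given endpoint, access it as an attribute of the client. You can then filter it with .filter(), or call it directly with the same keyword arguments: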

studies = MG.studies
filtered_studies = studies.filter(search="chicken")
# .explain() prints the plan itself and returns None, so no print() is needed
filtered_studies.explain()
# or
print("\n----------\n")
filtered_studies = MG.studies(search="chicken")
filtered_studies.explain()
Planning the API call with params:
{'search': 'chicken'}
Total pages to retrieve: 2
Total records to retrieve: 36
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=1
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=2

----------

Planning the API call with params:
{'search': 'chicken'}
Total pages to retrieve: 2
Total records to retrieve: 36
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=1
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=2
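
To make the plan printed by .explain() concrete, the following sketch rebuilds the same page URLs with the standard library. It is not MGnipy's implementation: `plan_page_urls` is a hypothetical helper, and the page size of 25 is an assumption chosen only because it is consistent with 36 records spanning 2 pages (the output does not state the actual page size).

```python
import math
from urllib.parse import urlencode

# Base URL taken from the .explain() output above.
BASE_URL = "https://www.ebi.ac.uk/metagenomics/api/v2/studies"


def plan_page_urls(params, total_records, page_size):
    """Return one URL per page needed to retrieve total_records (hypothetical helper)."""
    total_pages = math.ceil(total_records / page_size)
    return [
        f"{BASE_URL}?{urlencode({**params, 'page': page})}"
        for page in range(1, total_pages + 1)
    ]


# page_size=25 is an assumption; 36 records then need ceil(36 / 25) = 2 pages.
urls = plan_page_urls({"search": "chicken"}, total_records=36, page_size=25)
for url in urls:
    print(url)
```

The filter parameters are repeated on every request and only the page number changes, which is why the two planned URLs differ only in their final query parameter.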

Each resource proxy also provides helper functions, such as .list_supported_params() and .describe_endpoint():

print(studies.list_supported_params())
# or
print("\n----------\n")
# .describe_endpoint() prints its description and returns None, so no print() is needed
studies.describe_endpoint()
['order', 'biome_lineage', 'has_analyses_from_pipeline', 'search', 'page', 'page_size']

----------

List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
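
One practical use of the supported-parameter list is to catch misspelled filter names before a query is built. The sketch below shows the idea with plain Python. `check_params` is a hypothetical helper, and whether MGnipy performs a similar check internally is not something this notebook shows.

```python
# Parameter names copied from the studies.list_supported_params() output above.
SUPPORTED = [
    "order",
    "biome_lineage",
    "has_analyses_from_pipeline",
    "search",
    "page",
    "page_size",
]


def check_params(params, supported=SUPPORTED):
    """Return a copy of params, or raise ValueError if any name is unsupported."""
    unknown = sorted(set(params) - set(supported))
    if unknown:
        raise ValueError(
            f"Unsupported parameter(s): {', '.join(unknown)}. "
            f"Supported: {', '.join(supported)}"
        )
    return dict(params)


# A valid filter passes through unchanged.
print(check_params({"search": "chicken"}))

# A misspelled name is reported before any request is planned.
try:
    check_params({"serach": "chicken"})
except ValueError as err:
    print(err)
```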

2. Resource Proxies (mgnipy.V2.proxies)

Alternatively, much of the same functionality is available directly from the resource proxies:

from mgnipy.V2.proxies import Studies

chicken_studies = Studies(search="chicken")
# Both helpers print their output and return None, so no print() is needed
chicken_studies.describe_endpoint()
print("\n----------\n")
chicken_studies.explain()
List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)

----------

Planning the API call with params:
{'search': 'chicken'}
Total pages to retrieve: 2
Total records to retrieve: 36
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=1
https://www.ebi.ac.uk/metagenomics/api/v2/studies?search=chicken&page=2