MGnipy – Capabilities Demo#
This notebook demonstrates everything the library can do right now, and explicitly marks what is broken or not yet implemented. It is organized as a live tour ahead of the draft PyPI publish.
Sections:
Setup
MGnifier – direct low-level API (works)
Output formatters: to_df, to_polars, to_json, to_list (works)
Query planning: dry_run, preview, explain (works)
Immutable filter cloning (works)
Pagination: page(n), get(limit) (works)
MGnipy facade + proxy classes (partially works)
Async: aget, apage (works)
⚠️ What is currently broken
🗺️ What comes next
1. Setup#
# Verify installation
import mgnipy
print(f"mgnipy version: {mgnipy.__version__}")
mgnipy version: 0.0.1.dev0+g73d819d64.d20260407
from mgnipy.V2.core import MGnifier
from mgnipy.V2.query_set import QuerySet
from mgnipy import MGnipy
print("All imports OK")
All imports OK
2. MGnifier – direct low-level API#
MGnifier is the core query object. It wraps QuerySet (which does query building) and delegates HTTP calls to QueryExecutor. You can use it directly or through the higher-level MGnipy facade.
# Create a query for studies – no API call yet
mg = MGnifier(resource="studies", params={"page_size": 5})
print(mg)
MGnifier instance for resource: studies
I.e., mgnipy.V2.core.MGnifier
----------------------------------------
Base URL: https://www.ebi.ac.uk/
Parameters: {'page_size': 5}
Endpoint module: mgnipy.emgapi_v2_client.api.studies.list_mgnify_studies
Example request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?page=1&page_size=5
Returns paginated results: True
# Fetch just the first page – one API call
mg.first()
print(f"Pages fetched so far: {list(mg._results.keys())}")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 1110
Total records to retrieve: 5546
Pages fetched so far: [1]
3. Output formatters#
All four formatters work once data is in _results.
# pandas DataFrame
df = mg.to_df()
print(f"to_df() → {type(df).__name__}, shape: {df.shape}")
df.head()
to_df() → DataFrame, shape: (5, 5)
| accession | ena_accessions | title | biome | updated_at | |
|---|---|---|---|---|---|
| 0 | MGYS00000653 | [DRP000157, PRJDA46243] | Metatranscriptomic Analysis for Eukaryotic Fun... | {'biome_name': 'Soil', 'lineage': 'root:Enviro... | 2025-01-27T15:22:29.059000+00:00 |
| 1 | MGYS00001632 | [DRP000423, PRJDA68519] | The Usefulness and Reproducibility of Pyrosequ... | {'biome_name': 'Bioreactor', 'lineage': 'root:... | 2026-04-27T12:08:49.939000+00:00 |
| 2 | MGYS00001846 | [DRP000450, PRJDA72133] | food metagenome Metagenome | {'biome_name': 'Fermented seafood', 'lineage':... | 2025-01-27T15:22:38.826000+00:00 |
| 3 | MGYS00001633 | [DRP000451, PRJDA67149] | microbial community of traditional Korean alco... | {'biome_name': 'Fermented beverages', 'lineage... | 2025-01-27T15:22:37.053000+00:00 |
| 4 | MGYS00000624 | [DRP000487, PRJDA73169] | Metagenomic analysis of soil microorganisms | {'biome_name': 'Soil', 'lineage': 'root:Enviro... | 2025-01-27T15:22:28.785000+00:00 |
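The biome column comes back as nested dicts. If flat columns are more convenient, pandas' json_normalize can expand them; a minimal sketch using hand-written sample records shaped like the rows above (not fetched from the API):

```python
import pandas as pd

# Hand-written sample records mimicking the API shape (illustrative, not fetched)
rows = [
    {"accession": "MGYS00000653",
     "biome": {"biome_name": "Soil", "lineage": "root:Environmental:Terrestrial:Soil"}},
    {"accession": "MGYS00001632",
     "biome": {"biome_name": "Bioreactor", "lineage": "root:Engineered:Bioreactor"}},
]

# json_normalize expands nested dicts into dot-separated flat columns
flat = pd.json_normalize(rows)
print(list(flat.columns))  # ['accession', 'biome.biome_name', 'biome.lineage']
```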
# Polars DataFrame
pl_df = mg.to_polars()
print(f"to_polars() → {type(pl_df).__name__}, shape: {pl_df.shape}")
pl_df.head()
to_polars() → DataFrame, shape: (5, 5)
| accession | ena_accessions | title | biome | updated_at |
|---|---|---|---|---|
| str | list[str] | str | struct[2] | str |
| "MGYS00000653" | ["DRP000157", "PRJDA46243"] | "Metatranscriptomic Analysis fo… | {"Soil","root:Environmental:Terrestrial:Soil"} | "2025-01-27T15:22:29.059000+00:… |
| "MGYS00001632" | ["DRP000423", "PRJDA68519"] | "The Usefulness and Reproducibi… | {"Bioreactor","root:Engineered:Bioreactor"} | "2026-04-27T12:08:49.939000+00:… |
| "MGYS00001846" | ["DRP000450", "PRJDA72133"] | "food metagenome Metagenome" | {"Fermented seafood","root:Engineered:Food production:Fermented seafood"} | "2025-01-27T15:22:38.826000+00:… |
| "MGYS00001633" | ["DRP000451", "PRJDA67149"] | "microbial community of traditi… | {"Fermented beverages","root:Engineered:Food production:Fermented beverages"} | "2025-01-27T15:22:37.053000+00:… |
| "MGYS00000624" | ["DRP000487", "PRJDA73169"] | "Metagenomic analysis of soil m… | {"Soil","root:Environmental:Terrestrial:Soil"} | "2025-01-27T15:22:28.785000+00:… |
# List of dicts
records = mg.to_list()
print(f"to_list() → {type(records).__name__}, length: {len(records)}")
print(f"First record keys: {list(records[0].keys()) if records else 'none'}")
to_list() → list, length: 5
First record keys: ['accession', 'ena_accessions', 'title', 'biome', 'updated_at']
# JSON string (newline-delimited by default)
json_str = mg.to_json()
print(f"to_json() → {type(json_str).__name__}, {len(json_str)} chars")
print(json_str[:200], "...")
to_json() → str, 1434 chars
{"accession":"MGYS00000653","ena_accessions":["DRP000157","PRJDA46243"],"title":"Metatranscriptomic Analysis for Eukaryotic Functional Genes in Forest Soil","biome":{"biome_name":"Soil","lineage":"roo ...
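Because the output is newline-delimited, it round-trips with the stdlib json module, one object per line; a small sketch with illustrative records:

```python
import json

# Two illustrative NDJSON lines, shaped like the to_json() output above
ndjson = (
    '{"accession":"MGYS00000653","title":"Soil study"}\n'
    '{"accession":"MGYS00001632","title":"Bioreactor study"}'
)

# One json.loads() per non-empty line turns the string back into dicts
records = [json.loads(line) for line in ndjson.splitlines() if line.strip()]
print(len(records), records[0]["accession"])  # 2 MGYS00000653
```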
4. Query planning: dry_run, preview, explain#
Before fetching everything, you can inspect the plan: how many records, how many pages, and which URLs.
# dry_run: makes one small API call (page_size=1) to learn total count, then prints the plan
planner = MGnifier(resource="analyses", params={"page_size": 10})
planner.dry_run()
Planning the API call with params:
{'page_size': 10}
Total pages to retrieve: 1420
Total records to retrieve: 14198
# After dry_run, count and total_pages are populated
print(f"Total records: {planner.count}")
print(f"Total pages (at page_size=10): {planner.total_pages}")
Total records: 14198
Total pages (at page_size=10): 1420
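The relationship between the two numbers is plain ceiling division: total_pages = ceil(count / page_size). For the counts shown above:

```python
import math

count, page_size = 14198, 10
total_pages = math.ceil(count / page_size)
print(total_pages)  # 1420, matching the plan printed by dry_run()
```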
# explain: print the first N request URLs without making the requests
planner.explain(head=3)
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=1&page_size=10
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=2&page_size=10
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=3&page_size=10
# list_urls: returns the full URL list
urls = planner.list_urls()
print(f"{len(urls)} URLs total. First: {urls[0]}")
1420 URLs total. First: https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=1&page_size=10
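The URL list is just page/page_size query strings, so roughly equivalent URLs can be rebuilt with the stdlib (a sketch, not the library's own code):

```python
from urllib.parse import urlencode

# Rebuild the first three page URLs shown by explain(head=3)
base = "https://www.ebi.ac.uk/metagenomics/api/v2/analyses"
urls = [f"{base}?{urlencode({'page': n, 'page_size': 10})}" for n in range(1, 4)]
print(urls[0])  # https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=1&page_size=10
```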
# preview: fetches page 1 and returns a DataFrame immediately
preview_df = MGnifier(resource="samples", params={"page_size": 5}).preview()
print(f"preview() β {type(preview_df).__name__}, shape: {preview_df.shape}")
preview_df.head()
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7060
Total records to retrieve: 35300
preview() β DataFrame, shape: (5, 5)
| accession | ena_accessions | sample_title | biome | updated_at | |
|---|---|---|---|---|---|
| 0 | SAMEA113539431 | [SAMEA113539431, ERS15535852] | Study_1322_RNA | {'biome_name': 'Fecal', 'lineage': 'root:Host-... | 2026-04-24T16:01:41.365000+00:00 |
| 1 | SAMEA113539645 | [ERS15536066, SAMEA113539645] | Study_1665_DNA | {'biome_name': 'Fecal', 'lineage': 'root:Host-... | 2026-04-24T16:02:06.759000+00:00 |
| 2 | SAMEA113539284 | [ERS15535705, SAMEA113539284] | Study_963_DNA | {'biome_name': 'Fecal', 'lineage': 'root:Host-... | 2026-04-24T16:01:23.909000+00:00 |
| 3 | SAMEA113540517 | [ERS15536938, SAMEA113540517] | Study_5298_DNA | {'biome_name': 'Fecal', 'lineage': 'root:Host-... | 2026-04-24T16:03:53.546000+00:00 |
| 4 | SAMEA115284684 | [ERS18228651, SAMEA115284684] | ATZ_IGR_046_V1 | None | 2026-04-24T16:07:38.087000+00:00 |
5. Immutable filter cloning#
filter() returns a new QuerySet with the updated params – the original is untouched.
This makes it safe to build queries incrementally.
base = QuerySet(resource="studies")
filtered = base.filter(biome_lineage="root:Environmental:Aquatic", page_size=5)
print(f"base params: {base.params}")
print(f"filtered params: {filtered.params}")
print(f"Same object? {base is filtered}")
base params: {}
filtered params: {'biome_lineage': 'root:Environmental:Aquatic', 'page_size': 5}
Same object? False
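The clone-on-filter behaviour can be sketched in a few lines (QS here is a toy stand-in, not QuerySet's real implementation):

```python
class QS:
    """Toy stand-in for QuerySet: filter() clones instead of mutating."""

    def __init__(self, params=None):
        self.params = dict(params or {})

    def filter(self, **kwargs):
        clone = QS(self.params)   # copy current params into a new object
        clone.params.update(kwargs)
        return clone              # the original QS is untouched

base = QS()
filtered = base.filter(biome_lineage="root:Environmental:Aquatic", page_size=5)
print(base.params)       # {}
print(filtered.params)   # {'biome_lineage': 'root:Environmental:Aquatic', 'page_size': 5}
print(base is filtered)  # False
```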
# Chain filters – each step is a new clone
qs = (
    QuerySet(resource="studies")
    .filter(biome_lineage="root:Environmental")
    .page_size(3)
)
print(f"Chained params: {qs.params}")
print(f"Request URL: {qs.request_url}")
Chained params: {'biome_lineage': 'root:Environmental', 'page_size': 3}
Request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?biome_lineage=root%3AEnvironmental&page=1&page_size=3
# Fetch the filtered results
qs.first()
df = qs.to_df()
print(f"Rows returned: {len(df)}")
df.head()
Planning the API call with params:
{'biome_lineage': 'root:Environmental', 'page_size': 3}
Total pages to retrieve: 889
Total records to retrieve: 2665
Rows returned: 3
| accession | ena_accessions | title | biome | updated_at | |
|---|---|---|---|---|---|
| 0 | MGYS00000274 | [SRP000664] | Windshield splatter | {'biome_name': 'Air', 'lineage': 'root:Environ... | 2025-01-27T15:22:25.672000+00:00 |
| 1 | MGYS00010288 | [PRJEB93890] | Metagenome assembly of PRJNA270248 data set (M... | {'biome_name': 'Sediment', 'lineage': 'root:En... | 2025-07-14T20:41:25.062000+00:00 |
| 2 | MGYS00002009 | [ERP104175, PRJEB22494] | EMG produced TPA metagenomics assembly of the ... | {'biome_name': 'Salt marsh', 'lineage': 'root:... | 2025-05-16T10:58:55.790000+00:00 |
6. Pagination: page(n) and get(limit)#
Pages are fetched individually and cached. Already-fetched pages are not re-requested.
pg = MGnifier(resource="biomes", params={"page_size": 5})
pg.dry_run()
print(f"Total pages: {pg.total_pages}")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 99
Total records to retrieve: 492
Total pages: 99
# Fetch page 1
pg.page(1)
print(f"Page 1 in results: {pg._is_in_results(1)}")
print(f"Page 2 in results: {pg._is_in_results(2)}")
Page 1 in results: True
Page 2 in results: False
# Fetch page 3 (skipping page 2 – non-contiguous fetch is fine)
pg.page(3)
print(f"Pages fetched: {sorted(pg._results.keys())}")
print(f"DataFrame has {len(pg.to_df())} rows (2 pages × 5 per page)")
Pages fetched: [1, 3]
DataFrame has 10 rows (2 pages × 5 per page)
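The caching behaviour boils down to a dict keyed by page number; a sketch with a fake fetcher (PageCache and fake_fetch are illustrative, not the library's internals):

```python
class PageCache:
    """Toy illustration of page-level caching: each page is fetched at most once."""

    def __init__(self, fetch_page):
        self._fetch_page = fetch_page  # callable: page number -> list of records
        self._results = {}             # page number -> records

    def page(self, n):
        if n not in self._results:     # cached pages are never re-requested
            self._results[n] = self._fetch_page(n)
        return self._results[n]

calls = []
def fake_fetch(n):
    calls.append(n)                    # record every "network" hit
    return [f"record-{n}-{i}" for i in range(5)]

cache = PageCache(fake_fetch)
cache.page(1)
cache.page(3)   # non-contiguous pages are fine
cache.page(1)   # cache hit: fake_fetch is not called again
print(sorted(cache._results), calls)  # [1, 3] [1, 3]
```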
# get(limit=N) fetches however many pages are needed to satisfy limit records
# Requires dry_run() first (or pass safety=False to skip)
limited = MGnifier(resource="samples", params={"page_size": 5})
limited.dry_run()
limited.get(limit=12) # will fetch 3 pages (5+5+5 = 15, enough for 12)
print(f"Records retrieved: {len(limited.to_df())} (asked for 12, got nearest page boundary)")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7060
Total records to retrieve: 35300
Records retrieved: 15 (asked for 12, got nearest page boundary)
Retrieving pages: 100%|██████████| 3/3 [00:00<00:00, 21.45it/s]
7. MGnipy facade + proxy classes#
MGnipy is the top-level entry point. It uses __getattr__ to dispatch attribute access to typed proxy classes.
⚠️ Known limitation: the config passed to MGnipy() is not forwarded to the proxy. Custom base URLs and auth tokens are silently ignored until M2 is fixed.
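The dispatch mechanism can be sketched generically (Client, Studies, and Samples here are stand-ins, not mgnipy's real classes):

```python
class Studies:
    resource = "studies"

class Samples:
    resource = "samples"

class Client:
    """Toy facade: unknown attribute names are dispatched to proxy classes."""

    _proxies = {"studies": Studies, "samples": Samples}

    def __getattr__(self, name):
        # __getattr__ only runs when normal attribute lookup fails,
        # so real attributes and methods are unaffected
        try:
            return self._proxies[name]()
        except KeyError:
            raise AttributeError(name) from None

    def list_resources(self):
        return sorted(self._proxies)

client = Client()
print(type(client.studies).__name__)  # Studies
print(client.list_resources())        # ['samples', 'studies']
```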
client = MGnipy()
print(f"Available resources: {client.list_resources()}")
Available resources: ['analyses', 'analysis', 'assemblies', 'assembly', 'genomes', 'genome', 'publications', 'publication', 'samples', 'sample', 'studies', 'study', 'runs', 'run', 'biomes', 'biome', 'catalogues', 'catalogue']
# Accessing a resource returns a typed proxy (list-type)
studies_proxy = client.studies
print(type(studies_proxy))
print(studies_proxy)
<class 'mgnipy.V2.proxies.Studies'>
MGnifier instance for resource: studies
I.e., mgnipy.V2.proxies.Studies
----------------------------------------
Base URL: https://www.ebi.ac.uk/
Parameters: {}
Endpoint module: mgnipy.emgapi_v2_client.api.studies.list_mgnify_studies
Example request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?page=1
Returns paginated results: True
# Proxies expose the same query-building API as MGnifier
filtered_proxy = studies_proxy.filter(biome_lineage="root:Environmental", page_size=3)
print(f"Filter returned new object: {filtered_proxy is not studies_proxy}")
print(f"Filtered params: {filtered_proxy.params}")
Filter returned new object: True
Filtered params: {'biome_lineage': 'root:Environmental', 'page_size': 3}
# Fetch first page through the proxy
filtered_proxy.first()
df = filtered_proxy.to_df()
print(f"Rows: {len(df)}")
df.head()
Planning the API call with params:
{'biome_lineage': 'root:Environmental', 'page_size': 3}
Total pages to retrieve: 889
Total records to retrieve: 2665
Rows: 3
| accession | ena_accessions | title | biome | updated_at | |
|---|---|---|---|---|---|
| 0 | MGYS00000274 | [SRP000664] | Windshield splatter | {'biome_name': 'Air', 'lineage': 'root:Environ... | 2025-01-27T15:22:25.672000+00:00 |
| 1 | MGYS00010288 | [PRJEB93890] | Metagenome assembly of PRJNA270248 data set (M... | {'biome_name': 'Sediment', 'lineage': 'root:En... | 2025-07-14T20:41:25.062000+00:00 |
| 2 | MGYS00002009 | [ERP104175, PRJEB22494] | EMG produced TPA metagenomics assembly of the ... | {'biome_name': 'Salt marsh', 'lineage': 'root:... | 2025-05-16T10:58:55.790000+00:00 |
# Biomes has a special tree visualisation (after fetching)
biomes = client.biomes
biomes.first()
print(f"Biome lineages: {biomes.lineages[:5]}")
Planning the API call with params:
{}
Total pages to retrieve: 20
Total records to retrieve: 492
Biome lineages: ['root', 'root:Control', 'root:Engineered', 'root:Engineered:Biogas plant', 'root:Engineered:Biogas plant:Wet fermentation']
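Since lineages are colon-separated paths, a tree can be rebuilt with nested dicts; a sketch using the lineages printed above (not the library's actual tree visualisation code):

```python
# The lineages printed above, reused as input
lineages = [
    "root",
    "root:Control",
    "root:Engineered",
    "root:Engineered:Biogas plant",
    "root:Engineered:Biogas plant:Wet fermentation",
]

# Each colon-separated segment becomes one level of a nested dict
tree = {}
for lineage in lineages:
    node = tree
    for part in lineage.split(":"):
        node = node.setdefault(part, {})

print(tree)
# {'root': {'Control': {}, 'Engineered': {'Biogas plant': {'Wet fermentation': {}}}}}
```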
# Config bug demonstration – checks whether a custom base_url actually reaches the proxy
from mgnipy._models.config import MgnipyConfig
default_url = str(MgnipyConfig().base_url)
custom_client = MGnipy(base_url="https://custom.example.com")
proxy = custom_client.studies
print("Custom URL given to MGnipy: https://custom.example.com")
print(f"URL actually used by proxy: {proxy._base_url}")
print(f"Bug present: {str(proxy._base_url) == default_url} – should be False after M2 fix")
Custom URL given to MGnipy: https://custom.example.com
URL actually used by proxy: https://custom.example.com/
Bug present: False – should be False after M2 fix
8. Async: aget, apage, afirst#
Every sync method has an async counterpart. Use these when you need to concurrently fetch many resources.
import asyncio
async def demo_async():
    mg = MGnifier(resource="runs", params={"page_size": 5})
    await mg.afirst()
    df = mg.to_df()
    print(f"Async fetch → {len(df)} rows")
    return df
# In Jupyter, use await directly (event loop already running)
df_async = await demo_async()
df_async.head()
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7703
Total records to retrieve: 38514
Async fetch → 5 rows
| experiment_type | accession | instrument_model | instrument_platform | sample_accession | study_accession | |
|---|---|---|---|---|---|---|
| 0 | Amplicon | DRR019176 | None | None | SAMD00004051 | MGYS00001632 |
| 1 | Amplicon | DRR001168 | None | None | SAMD00009393 | MGYS00001632 |
| 2 | Amplicon | DRR001169 | None | None | SAMD00009394 | MGYS00001632 |
| 3 | Amplicon | DRR001167 | None | None | SAMD00009395 | MGYS00001632 |
| 4 | Amplicon | DRR001170 | None | None | SAMD00009396 | MGYS00001632 |
async def demo_concurrent_pages():
    mg = MGnifier(resource="samples", params={"page_size": 10})
    mg.dry_run()
    # aget fetches all pages concurrently (with a semaphore to protect the server)
    await mg.aget(limit=30, safety=False)
    return mg.to_df()
df_concurrent = await demo_concurrent_pages()
print(f"Concurrent fetch → {len(df_concurrent)} rows")
Planning the API call with params:
{'page_size': 10}
Total pages to retrieve: 3530
Total records to retrieve: 35300
Concurrent fetch → 30 rows
Retrieving pages: 100%|██████████| 3/3 [00:00<00:00, 28.81it/s]
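The concurrent page fetching behind aget can be sketched with asyncio.gather plus a semaphore (fetch_page here is a toy coroutine standing in for the real HTTP call):

```python
import asyncio

async def fetch_page(sem, n):
    async with sem:              # the semaphore caps in-flight requests
        await asyncio.sleep(0)   # stands in for the real HTTP call
        return [f"record-{n}-{i}" for i in range(10)]

async def aget_sketch(pages, max_concurrency=5):
    sem = asyncio.Semaphore(max_concurrency)
    pages_of_records = await asyncio.gather(*(fetch_page(sem, n) for n in pages))
    return [rec for page in pages_of_records for rec in page]

records = asyncio.run(aget_sketch(range(1, 4)))  # pages 1-3, 10 records each
print(len(records))  # 30
```

In a notebook the event loop is already running, so you would `await aget_sketch(...)` directly instead of calling asyncio.run().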
9. ⚠️ What is currently broken#
These cells document bugs and missing features. They are expected to fail until the corresponding milestone is fixed.
❌ M1 – cli.py is missing#
# This should succeed after M1 is fixed
try:
    import mgnipy.cli
    print(f"✅ mgnipy.cli imported, main={mgnipy.cli.main}")
except ModuleNotFoundError as e:
    print(f"❌ M1 not fixed: {e}")
    print("   Fix: create mgnipy/cli.py with a main() function")
✅ mgnipy.cli imported, main=<function main at 0x111b67ce0>
❌ M2 – Config not passed to proxies#
# After M2 is fixed, proxy._base_url should match the custom URL
from mgnipy import MGnipy
from mgnipy._models.config import MgnipyConfig
custom = MGnipy(base_url="https://staging.example.com/")
proxy = custom.studies
expected = "https://staging.example.com/"
actual = str(proxy._base_url)
if actual == expected:
    print("✅ M2 fixed: config flows through")
else:
    print(f"❌ M2 not fixed: proxy uses '{actual}' instead of '{expected}'")
    print("   Fix: mgnipy/mgnipy.py:52 – pass base_url to proxy constructor")
✅ M2 fixed: config flows through
❌ Not-yet-implemented: SingleResource (accession lookup)#
Instead of SingleResource I created MGnifyDetail, but I think it's a similar idea.
from mgnipy.V2.proxies import StudyDetail, Studies
Studies(search="MGYS00001422").preview()
Planning the API call with params:
{'search': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
| accession | ena_accessions | title | biome | updated_at | |
|---|---|---|---|---|---|
| 0 | MGYS00001422 | [ERP014234, PRJEB12735] | Amplicon sequencing of four biogas plants | {'biome_name': 'Biogas plant', 'lineage': 'roo... | 2026-04-27T12:05:32.095000+00:00 |
# The plan (Day 2) calls for: mgnipy.studies["MGYS00001422"] → lazy SingleResource
# Currently __getitem__ on the proxy requires data to already be fetched
from mgnipy.V2.proxies import StudyDetail, Studies
detail = StudyDetail("MGYS00001422")
s = Studies(search="MGYS00001422")
s.get()
try:
    item = s["MGYS00001422"]
    print(f"✅ Accession lookup returned: {type(item).__name__}")
    # or
    print(f"✅ Accession lookup returned: {type(detail).__name__}")
except (AttributeError, KeyError, TypeError) as e:
    print(f"❌ SingleResource not implemented: {type(e).__name__}: {e}")
    print("   Fix: implement SingleResource class and update proxy __getitem__")
Planning the API call with params:
{'search': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
Planning the API call with params:
{'accession': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
✅ Accession lookup returned: StudyDetail
✅ Accession lookup returned: StudyDetail
Retrieving pages: 100%|██████████| 1/1 [00:00<00:00, 18236.10it/s]
❌ Not-yet-implemented: .order_by() and .exists()#
from mgnipy.V2.query_set import QuerySet
qs = QuerySet(resource="studies")
for method in ("order_by", "exists"):
    if hasattr(qs, method):
        print(f"✅ {method}() exists")
    else:
        print(f"❌ {method}() not implemented yet")
❌ order_by() not implemented yet
❌ exists() not implemented yet
❌ Not-yet-implemented: describe_resources()#
from mgnipy import MGnipy
client = MGnipy()
result = client.describe_resources()
if result is not None:
    print(f"✅ describe_resources() returned: {result}")
else:
    print("❌ describe_resources() is a stub – returns None")
    print("   Fix: implement in mgnipy/mgnipy.py")
List all analyses (MGYAs) available from MGnify
Each analysis is the result of a Pipeline execution on a reads dataset (either a raw read-run, or an
assembly).
Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get MGnify analysis by accession
MGnify analyses are accessioned with an MYGA-prefixed identifier and correspond to an individual Run
or Assembly analysed by a Pipeline.
Supported parameters:
- accession: (str)
List all assemblies available in MGnify
Each assembly represents a collection of contigs generated by assembling sequencing reads from an
MGnify or run
Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get assembly by accession
Get detailed information about a specific assembly.
Supported parameters:
- accession: (str)
List all genomes across MGnify Genome catalogues
MGnify Genomes are either isolates, or MAGs derived from binned metagenomes.
Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single MGnify Genome
MGnify Genomes are either isolates, or MAGs derived from binned metagenomes.
Supported parameters:
- accession: (str)
List all publications
List all publications in the MGnify database.
Supported parameters:
- order: (ListMgnifyPublicationsOrderType0 | None | Unset)
- published_after: (int | None | Unset) Filter by minimum publication year
- published_before: (int | None | Unset) Filter by maximum publication year
- title: (None | str | Unset) Search within publication titles
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single publication
Get detailed information about a publication, including associated studies.
Supported parameters:
- pubmed_id: (int)
List all samples analysed by MGnify
MGnify samples inherit directly from samples (or BioSamples) in ENA.
Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- search: (None | str | Unset) Search within sample titles and accessions
- order: (ListMgnifySamplesOrderType0 | None | Unset)
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single sample analysed by MGnify
MGnify samples inherit directly from samples (or BioSamples) in ENA.
Supported parameters:
- accession: (str)
List all studies analysed by MGnify
MGnify studies inherit directly from studies (or projects) in ENA.
Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single study analysed by MGnify
MGnify studies inherit directly from studies (or projects) in ENA.
Supported parameters:
- accession: (str)
List all analysed runs
List all analysed runs in the MGnify database.
Supported parameters:
- has_experiment_type: (ExperimentTypes | None | Unset) If set, will only show runs with the specified experiment type
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single analysed run
Get the detail of a single analysed run in the MGnify database.
Supported parameters:
- accession: (str)
List all biomes
List all biomes in the MGnify database.
Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- max_depth: (int | None | Unset) Maximum depth of the biome lineage to include, e.g. `root` is 1 and `root:Host-Associated:Human` is level 3
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
List all biomes
List all biomes in the MGnify database.
Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- max_depth: (int | None | Unset) Maximum depth of the biome lineage to include, e.g. `root` is 1 and `root:Host-Associated:Human` is level 3
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
List all genome catalogues
MGnify Genomes Catalogues are biome-specific collections of isolate and MAG genomes.
Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get genome catalogue by ID
Supported parameters:
- catalogue_id: (str)
✅ describe_resources() returned: {}
10. 🗺️ What comes next#
Milestones for the 2-hour PyPI publish session#
| # | Fix | File | Time |
|---|---|---|---|
| M1 | Create cli.py | mgnipy/cli.py | 15 min |
| M2 | Pass config to proxy constructors | mgnipy/mgnipy.py | 10 min |
| M3 | Narrow | | 5 min |
| M4 | Build wheel and test-install | – | 15 min |
| M5 | Rewrite README with accurate examples | | 20 min |
| M6 | Add CHANGELOG | | 5 min |
| M7 | Tag version and push | git | 10 min |
To run the milestone tests#
# All milestones (offline, no API calls)
uv run pytest tests/milestones/test_milestones.py -v
# Include live-API regression tests
uv run pytest tests/milestones/test_milestones.py -v -m live_api
# After fixing M1, remove @pytest.mark.xfail from TestM1_CLI.test_after_fix_*
# and re-run – those tests should now be green
Deferred post-publish#
SingleResource – lazy accession-keyed objects (studies["MGYS00001422"])
.order_by(), .exists() methods
describe_resources() implementation
85% test coverage with mocked API calls
Full docstring pass