MGnipy – Capabilities Demo#

This notebook demonstrates everything the library can do right now and explicitly marks what is broken or not yet implemented. It is organized as a live tour ahead of the draft PyPI release.

Sections:

  1. Setup

  2. MGnifier – direct low-level API (works)

  3. Output formatters: to_df, to_polars, to_json, to_list (works)

  4. Query planning: dry_run, preview, explain (works)

  5. Immutable filter cloning (works)

  6. Pagination: page(n), get(limit) (works)

  7. MGnipy facade + proxy classes (partially works)

  8. Async: aget, apage, afirst (works)

  9. ⚠️ What is currently broken

  10. 🗺️ What comes next


1. Setup#

# Verify installation
import mgnipy
print(f"mgnipy version: {mgnipy.__version__}")
mgnipy version: 0.0.1.dev0+g73d819d64.d20260407
from mgnipy.V2.core import MGnifier
from mgnipy.V2.query_set import QuerySet
from mgnipy import MGnipy
print("All imports OK")
All imports OK

2. MGnifier – direct low-level API#

MGnifier is the core query object. It wraps QuerySet (which does query building) and delegates HTTP calls to QueryExecutor. You can use it directly or through the higher-level MGnipy facade.

# Create a query for studies (no API call yet)
mg = MGnifier(resource="studies", params={"page_size": 5})
print(mg)
MGnifier instance for resource: studies
I.e., mgnipy.V2.core.MGnifier
----------------------------------------
Base URL: https://www.ebi.ac.uk/
Parameters: {'page_size': 5}
Endpoint module: mgnipy.emgapi_v2_client.api.studies.list_mgnify_studies
Example request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?page=1&page_size=5
Returns paginated results: True
# Fetch just the first page (one API call)
mg.first()
print(f"Pages fetched so far: {list(mg._results.keys())}")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 1110
Total records to retrieve: 5546
Pages fetched so far: [1]

3. Output formatters#

All four formatters work once data is in _results.

# pandas DataFrame
df = mg.to_df()
print(f"to_df() β†’ {type(df).__name__}, shape: {df.shape}")
df.head()
to_df() β†’ DataFrame, shape: (5, 5)
accession ena_accessions title biome updated_at
0 MGYS00000653 [DRP000157, PRJDA46243] Metatranscriptomic Analysis for Eukaryotic Fun... {'biome_name': 'Soil', 'lineage': 'root:Enviro... 2025-01-27T15:22:29.059000+00:00
1 MGYS00001632 [DRP000423, PRJDA68519] The Usefulness and Reproducibility of Pyrosequ... {'biome_name': 'Bioreactor', 'lineage': 'root:... 2026-04-27T12:08:49.939000+00:00
2 MGYS00001846 [DRP000450, PRJDA72133] food metagenome Metagenome {'biome_name': 'Fermented seafood', 'lineage':... 2025-01-27T15:22:38.826000+00:00
3 MGYS00001633 [DRP000451, PRJDA67149] microbial community of traditional Korean alco... {'biome_name': 'Fermented beverages', 'lineage... 2025-01-27T15:22:37.053000+00:00
4 MGYS00000624 [DRP000487, PRJDA73169] Metagenomic analysis of soil microorganisms {'biome_name': 'Soil', 'lineage': 'root:Enviro... 2025-01-27T15:22:28.785000+00:00
# Polars DataFrame
pl_df = mg.to_polars()
print(f"to_polars() β†’ {type(pl_df).__name__}, shape: {pl_df.shape}")
pl_df.head()
to_polars() β†’ DataFrame, shape: (5, 5)
shape: (5, 5)
accession | ena_accessions | title | biome | updated_at
str | list[str] | str | struct[2] | str
"MGYS00000653" | ["DRP000157", "PRJDA46243"] | "Metatranscriptomic Analysis fo… | {"Soil","root:Environmental:Terrestrial:Soil"} | "2025-01-27T15:22:29.059000+00:…
"MGYS00001632" | ["DRP000423", "PRJDA68519"] | "The Usefulness and Reproducibi… | {"Bioreactor","root:Engineered:Bioreactor"} | "2026-04-27T12:08:49.939000+00:…
"MGYS00001846" | ["DRP000450", "PRJDA72133"] | "food metagenome Metagenome" | {"Fermented seafood","root:Engineered:Food production:Fermented seafood"} | "2025-01-27T15:22:38.826000+00:…
"MGYS00001633" | ["DRP000451", "PRJDA67149"] | "microbial community of traditi… | {"Fermented beverages","root:Engineered:Food production:Fermented beverages"} | "2025-01-27T15:22:37.053000+00:…
"MGYS00000624" | ["DRP000487", "PRJDA73169"] | "Metagenomic analysis of soil m… | {"Soil","root:Environmental:Terrestrial:Soil"} | "2025-01-27T15:22:28.785000+00:…
# List of dicts
records = mg.to_list()
print(f"to_list() β†’ {type(records).__name__}, length: {len(records)}")
print(f"First record keys: {list(records[0].keys()) if records else 'none'}")
to_list() → list, length: 5
First record keys: ['accession', 'ena_accessions', 'title', 'biome', 'updated_at']
# JSON string (newline-delimited by default)
json_str = mg.to_json()
print(f"to_json() β†’ {type(json_str).__name__}, {len(json_str)} chars")
print(json_str[:200], "...")
to_json() β†’ str, 1434 chars
{"accession":"MGYS00000653","ena_accessions":["DRP000157","PRJDA46243"],"title":"Metatranscriptomic Analysis for Eukaryotic Functional Genes in Forest Soil","biome":{"biome_name":"Soil","lineage":"roo ...

4. Query planning: dry_run, preview, explain#

Before fetching everything, you can inspect the plan: how many records, how many pages, which URLs.

# dry_run: makes one small API call (page_size=1) to learn total count, then prints the plan
planner = MGnifier(resource="analyses", params={"page_size": 10})
planner.dry_run()
Planning the API call with params:
{'page_size': 10}
Total pages to retrieve: 1420
Total records to retrieve: 14198
# After dry_run, count and total_pages are populated
print(f"Total records: {planner.count}")
print(f"Total pages (at page_size=10): {planner.total_pages}")
Total records: 14198
Total pages (at page_size=10): 1420
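
The page count follows directly from the record count: total_pages is the count rounded up to the page size. A one-line check against the plan above (illustrative arithmetic only):

# ceil(14198 / 10) == 1420, matching the dry_run plan
import math
assert planner.total_pages == math.ceil(planner.count / 10)
print("total_pages == ceil(count / page_size)")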
# explain: print the first N request URLs without making them
planner.explain(head=3)
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=1&page_size=10
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=2&page_size=10
https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=3&page_size=10
# list_urls: returns the full URL list
urls = planner.list_urls()
print(f"{len(urls)} URLs total. First: {urls[0]}")
1420 URLs total. First: https://www.ebi.ac.uk/metagenomics/api/v2/analyses?page=1&page_size=10
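
Because list_urls() returns plain strings, the plan can be handed straight to external tooling, e.g. saved one URL per line for a shell-based batch fetch (a small convenience sketch; the file name is arbitrary):

# Persist the planned request URLs, one per line
from pathlib import Path
Path("analyses_urls.txt").write_text("\n".join(urls))
print(f"Wrote {len(urls)} URLs to analyses_urls.txt")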
# preview: fetches page 1 and returns a DataFrame immediately
preview_df = MGnifier(resource="samples", params={"page_size": 5}).preview()
print(f"preview() β†’ {type(preview_df).__name__}, shape: {preview_df.shape}")
preview_df.head()
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7060
Total records to retrieve: 35300
preview() → DataFrame, shape: (5, 5)
accession ena_accessions sample_title biome updated_at
0 SAMEA113539431 [SAMEA113539431, ERS15535852] Study_1322_RNA {'biome_name': 'Fecal', 'lineage': 'root:Host-... 2026-04-24T16:01:41.365000+00:00
1 SAMEA113539645 [ERS15536066, SAMEA113539645] Study_1665_DNA {'biome_name': 'Fecal', 'lineage': 'root:Host-... 2026-04-24T16:02:06.759000+00:00
2 SAMEA113539284 [ERS15535705, SAMEA113539284] Study_963_DNA {'biome_name': 'Fecal', 'lineage': 'root:Host-... 2026-04-24T16:01:23.909000+00:00
3 SAMEA113540517 [ERS15536938, SAMEA113540517] Study_5298_DNA {'biome_name': 'Fecal', 'lineage': 'root:Host-... 2026-04-24T16:03:53.546000+00:00
4 SAMEA115284684 [ERS18228651, SAMEA115284684] ATZ_IGR_046_V1 None 2026-04-24T16:07:38.087000+00:00

5. Immutable filter cloning#

filter() returns a new QuerySet with the updated params; the original is untouched. This makes it safe to build queries incrementally.
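
For intuition, the pattern behind this is clone-on-write: filter() merges the new params into a copy and returns a fresh instance. A minimal sketch of the idea (illustrative only, not the library's source):

# Toy illustration of clone-on-filter; TinyQuery is hypothetical
class TinyQuery:
    def __init__(self, resource, params=None):
        self.resource = resource
        self.params = dict(params or {})
    def filter(self, **kwargs):
        # Merge into a copy; the original instance is never mutated
        return TinyQuery(self.resource, {**self.params, **kwargs})

a = TinyQuery("studies")
b = a.filter(page_size=5)
print(a.params, b.params, a is b)  # {} {'page_size': 5} False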

base = QuerySet(resource="studies")
filtered = base.filter(biome_lineage="root:Environmental:Aquatic", page_size=5)

print(f"base params:     {base.params}")
print(f"filtered params: {filtered.params}")
print(f"Same object?     {base is filtered}")
base params:     {}
filtered params: {'biome_lineage': 'root:Environmental:Aquatic', 'page_size': 5}
Same object?     False
# Chain filters: each step is a new clone
qs = (
    QuerySet(resource="studies")
    .filter(biome_lineage="root:Environmental")
    .page_size(3)
)
print(f"Chained params: {qs.params}")
print(f"Request URL: {qs.request_url}")
Chained params: {'biome_lineage': 'root:Environmental', 'page_size': 3}
Request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?biome_lineage=root%3AEnvironmental&page=1&page_size=3
# Fetch the filtered results
qs.first()
df = qs.to_df()
print(f"Rows returned: {len(df)}")
df.head()
Planning the API call with params:
{'biome_lineage': 'root:Environmental', 'page_size': 3}
Total pages to retrieve: 889
Total records to retrieve: 2665
Rows returned: 3
accession ena_accessions title biome updated_at
0 MGYS00000274 [SRP000664] Windshield splatter {'biome_name': 'Air', 'lineage': 'root:Environ... 2025-01-27T15:22:25.672000+00:00
1 MGYS00010288 [PRJEB93890] Metagenome assembly of PRJNA270248 data set (M... {'biome_name': 'Sediment', 'lineage': 'root:En... 2025-07-14T20:41:25.062000+00:00
2 MGYS00002009 [ERP104175, PRJEB22494] EMG produced TPA metagenomics assembly of the ... {'biome_name': 'Salt marsh', 'lineage': 'root:... 2025-05-16T10:58:55.790000+00:00

6. Pagination: page(n) and get(limit)#

Pages are fetched individually and cached. Already-fetched pages are not re-requested.

pg = MGnifier(resource="biomes", params={"page_size": 5})
pg.dry_run()
print(f"Total pages: {pg.total_pages}")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 99
Total records to retrieve: 492
Total pages: 99
# Fetch page 1
pg.page(1)
print(f"Page 1 in results: {pg._is_in_results(1)}")
print(f"Page 2 in results: {pg._is_in_results(2)}")
Page 1 in results: True
Page 2 in results: False
# Fetch page 3 (skipping page 2; non-contiguous fetch is fine)
pg.page(3)
print(f"Pages fetched: {sorted(pg._results.keys())}")
print(f"DataFrame has {len(pg.to_df())} rows (2 pages Γ— 5 per page)")
Pages fetched: [1, 3]
DataFrame has 10 rows (2 pages Γ— 5 per page)
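
Because already-fetched pages are cached, re-requesting one should be a no-op. A quick check against the internal _results cache (relies on the private attribute shown above):

# page(1) is already cached, so this should not trigger a new request
before = sorted(pg._results.keys())
pg.page(1)
print(f"Cache unchanged after re-requesting page 1: {sorted(pg._results.keys()) == before}")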
# get(limit=N) fetches however many pages are needed to satisfy limit records
# Requires dry_run() first (or pass safety=False to skip)
limited = MGnifier(resource="samples", params={"page_size": 5})
limited.dry_run()
limited.get(limit=12)  # will fetch 3 pages (5+5+5 = 15, enough for 12)
print(f"Records retrieved: {len(limited.to_df())} (asked for 12, got nearest page boundary)")
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7060
Total records to retrieve: 35300
Records retrieved: 15 (asked for 12, got nearest page boundary)
Retrieving pages: 100%|██████████| 3/3 [00:00<00:00, 21.45it/s]
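
Since get(limit=N) rounds up to the nearest page boundary, trim client-side when you need exactly N rows:

# Keep exactly the 12 requested records; the 3 surplus rows are dropped
exact = limited.to_df().head(12)
print(f"Trimmed to {len(exact)} rows")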

7. MGnipy facade + proxy classes#

MGnipy is the top-level entry point. It uses __getattr__ to dispatch attribute access to typed proxy classes.

⚠️ Known limitation (since fixed by M2): the config passed to MGnipy() was not forwarded to the proxy, so custom base URLs and auth tokens were silently ignored. The demonstration below confirms the fix. ✅
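
For intuition, the __getattr__ dispatch boils down to looking the attribute name up in a registry of proxy classes. A minimal sketch of the pattern (hypothetical names, not the library's actual code):

# Toy dispatch-by-attribute facade; TinyFacade and its registry are illustrative
class TinyFacade:
    _proxies = {"studies": dict, "samples": dict}  # stand-in proxy classes

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails
        try:
            return self._proxies[name]()  # instantiate the matching proxy
        except KeyError:
            raise AttributeError(name) from None

print(type(TinyFacade().studies).__name__)  # dict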

client = MGnipy()
print(f"Available resources: {client.list_resources()}")
Available resources: ['analyses', 'analysis', 'assemblies', 'assembly', 'genomes', 'genome', 'publications', 'publication', 'samples', 'sample', 'studies', 'study', 'runs', 'run', 'biomes', 'biome', 'catalogues', 'catalogue']
# Accessing a resource returns a typed proxy (list-type)
studies_proxy = client.studies
print(type(studies_proxy))
print(studies_proxy)
<class 'mgnipy.V2.proxies.Studies'>
MGnifier instance for resource: studies
I.e., mgnipy.V2.proxies.Studies
----------------------------------------
Base URL: https://www.ebi.ac.uk/
Parameters: {}
Endpoint module: mgnipy.emgapi_v2_client.api.studies.list_mgnify_studies
Example request URL: https://www.ebi.ac.uk/metagenomics/api/v2/studies?page=1
Returns paginated results: True
# Proxies expose the same query-building API as MGnifier
filtered_proxy = studies_proxy.filter(biome_lineage="root:Environmental", page_size=3)
print(f"Filter returned new object: {filtered_proxy is not studies_proxy}")
print(f"Filtered params: {filtered_proxy.params}")
Filter returned new object: True
Filtered params: {'biome_lineage': 'root:Environmental', 'page_size': 3}
# Fetch first page through the proxy
filtered_proxy.first()
df = filtered_proxy.to_df()
print(f"Rows: {len(df)}")
df.head()
Planning the API call with params:
{'biome_lineage': 'root:Environmental', 'page_size': 3}
Total pages to retrieve: 889
Total records to retrieve: 2665
Rows: 3
accession ena_accessions title biome updated_at
0 MGYS00000274 [SRP000664] Windshield splatter {'biome_name': 'Air', 'lineage': 'root:Environ... 2025-01-27T15:22:25.672000+00:00
1 MGYS00010288 [PRJEB93890] Metagenome assembly of PRJNA270248 data set (M... {'biome_name': 'Sediment', 'lineage': 'root:En... 2025-07-14T20:41:25.062000+00:00
2 MGYS00002009 [ERP104175, PRJEB22494] EMG produced TPA metagenomics assembly of the ... {'biome_name': 'Salt marsh', 'lineage': 'root:... 2025-05-16T10:58:55.790000+00:00
# Biomes has a special tree visualisation (after fetching); here we just show the lineages
biomes = client.biomes
biomes.first()
print(f"Biome lineages: {biomes.lineages[:5]}")
Planning the API call with params:
{}
Total pages to retrieve: 20
Total records to retrieve: 492
Biome lineages: ['root', 'root:Control', 'root:Engineered', 'root:Engineered:Biogas plant', 'root:Engineered:Biogas plant:Wet fermentation']
# Config handling check: custom base_url used to be silently ignored (fixed in M2)
from mgnipy._models.config import MgnipyConfig
default_url = str(MgnipyConfig().base_url)

custom_client = MGnipy(base_url="https://custom.example.com")
proxy = custom_client.studies

print(f"Custom URL given to MGnipy: https://custom.example.com")
print(f"URL actually used by proxy: {proxy._base_url}")
print(f"Bug present: {str(proxy._base_url) == default_url}  ← should be False after M2 fix")
Custom URL given to MGnipy: https://custom.example.com
URL actually used by proxy: https://custom.example.com/
Bug present: False  ← should be False after M2 fix

8. Async: aget, apage, afirst#

Every sync method has an async counterpart. Use these when you need to concurrently fetch many resources.

import asyncio

async def demo_async():
    mg = MGnifier(resource="runs", params={"page_size": 5})
    await mg.afirst()
    df = mg.to_df()
    print(f"Async fetch β†’ {len(df)} rows")
    return df

# In Jupyter, use await directly (event loop already running)
df_async = await demo_async()
df_async.head()
Planning the API call with params:
{'page_size': 5}
Total pages to retrieve: 7703
Total records to retrieve: 38514
Async fetch → 5 rows
experiment_type accession instrument_model instrument_platform sample_accession study_accession
0 Amplicon DRR019176 None None SAMD00004051 MGYS00001632
1 Amplicon DRR001168 None None SAMD00009393 MGYS00001632
2 Amplicon DRR001169 None None SAMD00009394 MGYS00001632
3 Amplicon DRR001167 None None SAMD00009395 MGYS00001632
4 Amplicon DRR001170 None None SAMD00009396 MGYS00001632
async def demo_concurrent_pages():
    mg = MGnifier(resource="samples", params={"page_size": 10})
    mg.dry_run()
    # aget fetches all pages concurrently (with semaphore to protect the server)
    await mg.aget(limit=30, safety=False)
    return mg.to_df()

df_concurrent = await demo_concurrent_pages()
print(f"Concurrent fetch β†’ {len(df_concurrent)} rows")
Planning the API call with params:
{'page_size': 10}
Total pages to retrieve: 3530
Total records to retrieve: 35300
Concurrent fetch β†’ 30 rows
Retrieving pages: 100%|β–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆβ–ˆ| 3/3 [00:00<00:00, 28.81it/s]
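
The async methods also compose across resources. A sketch fetching the first page of several resources concurrently with asyncio.gather (uses only afirst() as demonstrated above):

# Fire three first-page requests concurrently, then read the cached results
async def demo_multi_resource():
    resources = ("studies", "samples", "biomes")
    queries = [MGnifier(resource=r, params={"page_size": 3}) for r in resources]
    await asyncio.gather(*(q.afirst() for q in queries))
    for name, q in zip(resources, queries):
        print(f"{name}: {len(q.to_df())} rows")

await demo_multi_resource()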

9. ⚠️ What is currently broken#

These cells document bugs and missing features. They are expected to fail until the corresponding milestone is fixed.

✅ M1 – cli.py was missing (now fixed)#

# This should succeed after M1 is fixed
try:
    import mgnipy.cli
    print(f"βœ… mgnipy.cli imported, main={mgnipy.cli.main}")
except ModuleNotFoundError as e:
    print(f"❌ M1 not fixed: {e}")
    print("   Fix: create mgnipy/cli.py with a main() function")
βœ… mgnipy.cli imported, main=<function main at 0x111b67ce0>

✅ M2 – Config not passed to proxies (now fixed)#

# After M2 is fixed, proxy._base_url should match the custom URL
from mgnipy import MGnipy
from mgnipy._models.config import MgnipyConfig

custom = MGnipy(base_url="https://staging.example.com/")
proxy = custom.studies

expected = "https://staging.example.com/"
actual = str(proxy._base_url)

if actual == expected:
    print("✅ M2 fixed: config flows through")
else:
    print(f"❌ M2 not fixed: proxy uses '{actual}' instead of '{expected}'")
    print("   Fix: mgnipy/mgnipy.py:52 – pass base_url to proxy constructor")
✅ M2 fixed: config flows through

✅ Previously not implemented: SingleResource (accession lookup)#

✅ Instead of SingleResource, MGnifyDetail was implemented; it covers the same idea.

from mgnipy.V2.proxies import StudyDetail, Studies

Studies(search="MGYS00001422").preview()
Planning the API call with params:
{'search': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
accession ena_accessions title biome updated_at
0 MGYS00001422 [ERP014234, PRJEB12735] Amplicon sequencing of four biogas plants {'biome_name': 'Biogas plant', 'lineage': 'roo... 2026-04-27T12:05:32.095000+00:00
# The plan (Day 2) calls for: mgnipy.studies["MGYS00001422"] β†’ lazy SingleResource
# Currently __getitem__ on the proxy requires data to already be fetched
from mgnipy.V2.proxies import StudyDetail, Studies

detail = StudyDetail("MGYS00001422")

s = Studies(search="MGYS00001422")
s.get()

try:
    item = s["MGYS00001422"]
    print(f"✅ Accession lookup returned: {type(item).__name__}")
    # or, equivalently, via the detail object created above:
    print(f"✅ Accession lookup returned: {type(detail).__name__}")
except (AttributeError, KeyError, TypeError) as e:
    print(f"❌ SingleResource not implemented: {type(e).__name__}: {e}")
    print("   Fix: implement SingleResource class and update proxy __getitem__")
Planning the API call with params:
{'search': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
Planning the API call with params:
{'accession': 'MGYS00001422'}
Total pages to retrieve: 1
Total records to retrieve: 1
✅ Accession lookup returned: StudyDetail
✅ Accession lookup returned: StudyDetail
Retrieving pages: 100%|██████████| 1/1 [00:00<00:00, 18236.10it/s]

❌ Not-yet-implemented: .order_by() and .exists()#

from mgnipy.V2.query_set import QuerySet

qs = QuerySet(resource="studies")

for method in ("order_by", "exists"):
    if hasattr(qs, method):
        print(f"✅ {method}() exists")
    else:
        print(f"❌ {method}() not implemented yet")
❌ order_by() not implemented yet
❌ exists() not implemented yet
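
Until .exists() lands, the dry-run record count is a cheap stand-in: one small request answers "does anything match?". A workaround sketch using only the planning API from section 4:

# Does at least one study match this search? dry_run() populates .count.
probe = MGnifier(resource="studies", params={"search": "MGYS00001422"})
probe.dry_run()
print(f"exists: {probe.count > 0}")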

✅ Previously a stub: describe_resources()#

from mgnipy import MGnipy

client = MGnipy()
result = client.describe_resources()

if result is not None:
    print(f"✅ describe_resources() returned: {result}")
else:
    print("❌ describe_resources() is a stub – returns None")
    print("   Fix: implement in mgnipy/mgnipy.py")
List all analyses (MGYAs) available from MGnify

Each analysis is the result of a Pipeline execution on a reads dataset (either a raw read-run, or an
assembly).

Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get MGnify analysis by accession

MGnify analyses are accessioned with an MGYA-prefixed identifier and correspond to an individual Run
or Assembly analysed by a Pipeline.

Supported parameters:
- accession: (str)
List all assemblies available in MGnify

Each assembly represents a collection of contigs generated by assembling sequencing reads from an
MGnify run

Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get assembly by accession

Get detailed information about a specific assembly.

Supported parameters:
- accession: (str)
List all genomes across MGnify Genome catalogues

MGnify Genomes are either isolates, or MAGs derived from binned metagenomes.

Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single MGnify Genome

MGnify Genomes are either isolates, or MAGs derived from binned metagenomes.

Supported parameters:
- accession: (str)
List all publications

List all publications in the MGnify database.

Supported parameters:
- order: (ListMgnifyPublicationsOrderType0 | None | Unset)
- published_after: (int | None | Unset) Filter by minimum publication year
- published_before: (int | None | Unset) Filter by maximum publication year
- title: (None | str | Unset) Search within publication titles
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single publication

Get detailed information about a publication, including associated studies.

Supported parameters:
- pubmed_id: (int)
List all samples analysed by MGnify

MGnify samples inherit directly from samples (or BioSamples) in ENA.

Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- search: (None | str | Unset) Search within sample titles and accessions
- order: (ListMgnifySamplesOrderType0 | None | Unset)
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single sample analysed by MGnify

MGnify samples inherit directly from samples (or BioSamples) in ENA.

Supported parameters:
- accession: (str)
List all studies analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- order: (ListMgnifyStudiesOrderType0 | None | Unset)
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- has_analyses_from_pipeline: (None | PipelineVersions | Unset) If set, will only show studies with analyses from the specified MGnify pipeline version
- search: (None | str | Unset) Search within study titles and accessions
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single study analysed by MGnify

MGnify studies inherit directly from studies (or projects) in ENA.

Supported parameters:
- accession: (str)
List all analysed runs

List all analysed runs in the MGnify database.

Supported parameters:
- has_experiment_type: (ExperimentTypes | None | Unset) If set, will only show runs with the specified experiment type
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get the detail of a single analysed run

Get the detail of a single analysed run in the MGnify database.

Supported parameters:
- accession: (str)
List all biomes

List all biomes in the MGnify database.

Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- max_depth: (int | None | Unset) Maximum depth of the biome lineage to include, e.g. `root` is 1 and `root:Host-Associated:Human` is level 3
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
List all biomes

List all biomes in the MGnify database.

Supported parameters:
- biome_lineage: (None | str | Unset) The lineage to match, including all descendant biomes
- max_depth: (int | None | Unset) Maximum depth of the biome lineage to include, e.g. `root` is 1 and `root:Host-Associated:Human` is level 3
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
List all genome catalogues

MGnify Genomes Catalogues are biome-specific collections of isolate and MAG genomes.

Supported parameters:
- page: (int | Unset) Default: 1.
- page_size: (int | None | Unset)
Get genome catalogue by ID

Supported parameters:
- catalogue_id: (str)
✅ describe_resources() returned: {}

10. 🗺️ What comes next#

Milestones for the 2-hour PyPI publish session#

| #  | Fix                                   | File                | Time   |
|----|---------------------------------------|---------------------|--------|
| M1 | Create cli.py with main()             | mgnipy/cli.py (new) | 15 min |
| M2 | Pass config to proxy constructors     | mgnipy/mgnipy.py:52 | 10 min |
| M3 | Narrow testpaths in pytest config     | pyproject.toml      | 5 min  |
| M4 | Build wheel and test-install          | –                   | 15 min |
| M5 | Rewrite README with accurate examples | README.md           | 20 min |
| M6 | Add CHANGELOG                         | CHANGELOG.md (new)  | 5 min  |
| M7 | Tag version and push                  | git                 | 10 min |

To run the milestone tests#

# All milestones (offline, no API calls)
uv run pytest tests/milestones/test_milestones.py -v

# Include live-API regression tests
uv run pytest tests/milestones/test_milestones.py -v -m live_api

# After fixing M1, remove @pytest.mark.xfail from TestM1_CLI.test_after_fix_*
# and re-run β€” those tests should now be green

Deferred post-publish#

  • SingleResource – lazy accession-keyed objects (studies["MGYS00001422"])

  • .order_by(), .exists() methods

  • describe_resources() implementation

  • 85% test coverage with mocked API calls

  • Full docstring pass