mgnipy.V2.proxies.biomes module#
- class mgnipy.V2.proxies.biomes.BiomeDetail(id=None, *, biome_lineage=None, config=None, **kwargs)[source]#
Bases: BiomesTreeMixin, MGnifyDetail
- async abulk_fetch(*args, **kwargs)#
Asynchronously fetch a large collection of results efficiently.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
All fetched results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = await query.abulk_fetch(limit=100)
- async aget()#
Asynchronously fetch all pages of results.
- Returns:
All result data.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = await query.aget()
- async aget_list(resource, *, fetch=True, explain=False)#
Get a list proxy for a related resource of a specific accession/pubmed_id/catalogue_id detail.
- Parameters:
- Returns:
A proxy for the next resource.
- Return type:
Examples
samples = await study.aget_list("samples", fetch=False)
- async apage(*args, **kwargs)#
Asynchronously fetch a specific page or range of pages.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
The requested page(s) of results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> page_data = await query.apage(1)
- bulk_fetch(*args, **kwargs)#
Fetch a large collection of results efficiently.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
All fetched results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = query.bulk_fetch(limit=100)
- clear_cache()#
Clear the cached results for the current resource and parameters. This will delete any cached files associated with the current query parameters.
- config: MGnipyConfig#
- continue_iterator(*args, **kwargs)#
Continue iteration from a specific page. This is a facade over the underlying QueryExecutor.continue_iterator, allowing users to resume iteration after an interruption or to jump to a specific page.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.continue_iterator(start_page=5)
- describe_endpoint(**kwargs)#
Retrieve documentation about the endpoint.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> docs = query.describe_endpoint()
- describe_relationships()#
Describe the related resources and their relationships.
- Return type:
None
Note
This method is not yet implemented.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.describe_relationships()
- property downloads: list [dict [str , Any ]]#
A list of download information dicts for the detail, extracted from the details results.
Each dict is updated with the identifier of the detail. The identifier key is determined by the id_param_key of the detail class, e.g. “accession” for studies, samples, runs, analyses, genomes, assemblies; “biome_lineage” for biomes; “pubmed_id” for publications; “catalogue_id” for catalogues.
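The tagging step described above can be sketched in plain Python. This is only an illustration of the idea, not the library's implementation: the helper name `tag_downloads` and the `alias` field are hypothetical.

```python
from typing import Any


def tag_downloads(
    downloads: list[dict[str, Any]],
    id_param_key: str,
    identifier: str,
) -> list[dict[str, Any]]:
    """Return copies of the download dicts, each carrying the parent identifier.

    Hypothetical helper: it mirrors the documented behaviour of adding the
    identifier under the key named by id_param_key.
    """
    return [{**download, id_param_key: identifier} for download in downloads]


# "alias" is an assumed field name used only for this example.
downloads = [{"alias": "a.fasta.gz"}, {"alias": "b.tsv"}]
tagged = tag_downloads(downloads, "biome_lineage", "root:Environmental")
```

Copying each dict rather than mutating it keeps the original download records unchanged, which is one possible design choice among several.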
- dry_run()#
Plan the API call by validating parameters and estimating the number of pages and records available. Prints the plan details for the user to review before executing the full data retrieval. This method can be called before get() to ensure that the parameters are valid and to understand the scope of the data retrieval.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies", params={"search": "gut"})
>>> query.dry_run()
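As a rough idea of what estimating the number of pages involves, the sketch below derives a page count from a record count and a page size. It assumes the API reports a total record count; the function name `estimate_pages` is hypothetical and not part of mgnipy.

```python
import math


def estimate_pages(total_records: int, page_size: int) -> int:
    """Estimate how many pages are needed to retrieve total_records."""
    if page_size <= 0:
        raise ValueError("page_size must be a positive integer")
    # Round up so that a partially filled last page is still counted.
    return math.ceil(total_records / page_size)


pages_needed = estimate_pages(101, 25)
```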
- emgapi_handler: DescribeEmgapiModule#
- explain(head=None)#
Print example API URLs that would be called.
- Parameters:
head (int, optional) – Maximum number of URLs to print. If None, prints all.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.explain(head=3)
- filter(**filters)#
Update the parameters for the API call to filter results.
- Parameters:
**filters – Keyword arguments corresponding to the supported parameters for the current resource. These will be used to filter the results returned by the API.
- Returns:
A new QuerySet instance with updated parameters for filtering results.
- Return type:
- first()#
Get the first record from the query results.
Executes the query and returns the first metadata record.
- Returns:
The first record as a dictionary, or None if unavailable.
- Return type:
dict or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> first_record = query.first()
- get()#
Fetch all pages of results.
- Returns:
All result data.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = query.get()
- get_list(resource, *, fetch=True, explain=False)#
Get a list proxy for a related resource of a specific accession/pubmed_id/catalogue_id detail.
- Parameters:
- Returns:
A proxy for the next resource.
- Return type:
Examples
samples = study.get_list("samples", fetch=False)
- property id_param_key: str #
Get the parameter name used to identify this resource.
- Returns:
The identifier parameter (e.g., “accession”, “biome_lineage”).
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> key = query.id_param_key
- property identifier: str | None #
Get the identifier value from the query parameters.
Used for constructing URLs to related resources.
- Returns:
The identifier value, or None if not set.
- Return type:
str or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies", accession="MGYS000000001", config={})
>>> query.identifier
- property last_successful_page: int | None #
Get the last successfully retrieved page number.
- Returns:
The last successful page number, or None if no pages have been retrieved yet.
- Return type:
int or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.get()
>>> print(query.last_successful_page)
- list_relationships()#
Get the names of related resources available from this resource.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> relationships = query.list_relationships()
- list_supported_params()#
Get the valid query filter parameters for this resource.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> params = query.list_supported_params()
- list_urls()#
Generate and return a list of URLs for all the API requests that would be made to retrieve the data based on the current parameters. This allows the user to see exactly which endpoints and query parameters will be used in the API calls before executing them.
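To make the idea of one URL per request concrete, here is a minimal sketch that builds paginated URLs with the standard library. The base URL, the `page` query parameter, and the helper name `build_page_urls` are assumptions made for illustration; the URLs that mgnipy actually generates may differ.

```python
from urllib.parse import urlencode


def build_page_urls(base_url: str, params: dict, total_pages: int) -> list[str]:
    """Build one URL per page by appending a page number to the query params."""
    return [
        f"{base_url}?{urlencode({**params, 'page': page})}"
        for page in range(1, total_pages + 1)
    ]


# Placeholder endpoint, not the real MGnify API address.
urls = build_page_urls("https://example.org/api/studies", {"search": "gut"}, 3)
```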
- page(*args, **kwargs)#
Fetch a specific page or range of pages.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
The requested page(s) of results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> page_data = query.page(1)
- preview()#
Get a DataFrame preview of the first page of results.
Quickly check the structure and content of the data without retrieving all pages.
- Returns:
DataFrame containing the first page of metadata.
- Return type:
pd.DataFrame
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> df = query.preview()
- property progress#
Get the progress of the current query execution as a percentage.
- Returns:
Progress percentage and counts (e.g., “75.00% (150/200 pages)”).
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> print(query.progress)
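The progress string format shown above can be reproduced with a small helper. This is a sketch of the formatting only; `format_progress` is a hypothetical name, and how the library handles the zero-page case is an assumption.

```python
def format_progress(pages_done: int, total_pages: int) -> str:
    """Format progress as a percentage followed by page counts."""
    if total_pages <= 0:
        # Assumed fallback for a query with no pages; avoids division by zero.
        return "0.00% (0/0 pages)"
    percent = 100 * pages_done / total_pages
    return f"{percent:.2f}% ({pages_done}/{total_pages} pages)"


progress_text = format_progress(150, 200)
```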
- queries(**httpx_kwargs)#
Generate a list of query parameter dictionaries for each API request that would be made based on the current parameters. This allows the user to see the specific query parameters for each request before executing them.
- property records: chain | None #
Get an iterator of individual metadata records from the retrieved results, if available. This property provides a convenient way to access the metadata records without needing to handle pagination.
- Returns:
An iterator that yields individual metadata records if results are available, otherwise None.
- Return type:
chain or None
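The `chain` return type suggests that records from several pages are flattened lazily. A minimal sketch of that pattern follows, assuming pages are held as a mapping from page number to a list of record dicts; that storage layout and the name `iter_records` are assumptions, not the library's internals.

```python
from itertools import chain
from typing import Optional


def iter_records(pages: dict[int, list[dict]]) -> Optional[chain]:
    """Flatten paginated results into one iterator of records, or None if empty."""
    if not pages:
        return None
    # Visit pages in page-number order and yield their records one by one.
    return chain.from_iterable(pages[number] for number in sorted(pages))


pages = {2: [{"id": "C"}], 1: [{"id": "A"}, {"id": "B"}]}
record_ids = [record["id"] for record in iter_records(pages)]
```

Because a `chain` object is a one-shot iterator, it is exhausted after a single pass; callers that need to iterate twice would have to request it again or materialize it into a list.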
- property request_url: str #
Get the URL for the API request based on the current resource and parameters. This is a single URL that represents the request for the current page of results.
- Returns:
The constructed URL for the API request.
- Return type:
- reset_iterator()#
Reset the pagination state to the beginning.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.reset_iterator()
- property resource: SupportedEndpoints#
- property results_ids: list [str ] | None #
Get the list of identifiers from the current results.
- Returns:
List of identifiers (accessions, etc.), or None if no results.
- Return type:
list of str or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.get()
>>> ids = query.results_ids
- resume()#
A facade over QueryExecutor.resume, allowing users to easily continue fetching results after an interruption.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.resume()
- show_tree(method='compact')#
- Parameters:
method (Literal ['compact', 'show', 'print', 'horizontal', 'hshow', 'h', 'hprint', 'vertical', 'vshow', 'v', 'vprint'])
- to_df(data=None, expand_nested_dicts=False, rename_columns=None, **kwargs)#
Convert the current or provided metadata to a pandas DataFrame.
- Parameters:
data (list of dict, optional) – List of records to convert. If None, uses the data attribute.
expand_nested_dicts (list of str or bool, optional) – List of keys to expand into separate columns, or True to expand defaults.
rename_columns (dict of str to str, optional) – A dictionary mapping old column names to new column names.
**kwargs – Additional keyword arguments passed to pd.DataFrame.
- Returns:
DataFrame containing the metadata, or None when no data is available.
- Return type:
pd.DataFrame or None
Examples
>>> handler = ResultsHandler(data=[{"a": 1, "b": 2}])
>>> df = handler.to_df()
>>> list(df.columns)
['a', 'b']
>>> df.iloc[0]['a']
np.int64(1)
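To give a feel for what expanding nested dicts into separate columns can mean, the sketch below flattens selected keys of a record before it would be handed to a DataFrame constructor. The dotted column naming and the helper `expand_nested` are assumptions for illustration; the column names that `to_df` actually produces may differ.

```python
from typing import Any


def expand_nested(record: dict[str, Any], keys: list[str], sep: str = ".") -> dict[str, Any]:
    """Flatten the nested dicts found under the given keys into top-level entries."""
    flat: dict[str, Any] = {}
    for key, value in record.items():
        if key in keys and isinstance(value, dict):
            # Promote each inner item to its own column, prefixed by the outer key.
            for inner_key, inner_value in value.items():
                flat[f"{key}{sep}{inner_key}"] = inner_value
        else:
            flat[key] = value
    return flat


record = {"accession": "X1", "biome": {"lineage": "root:Environmental", "name": "Environmental"}}
flat_record = expand_nested(record, ["biome"])
```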
- to_json(data=None, orient='records', lines=True, **json_kwargs)#
Convert the current metadata to a JSON string or save it to a file.
- Parameters:
- Returns:
The JSON string representation of the metadata, or None if no data is available.
- Return type:
str or None
- Raises:
RuntimeError – If no data is available to convert.
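The defaults `orient='records'` and `lines=True` resemble the pandas options that emit one JSON object per line (JSON Lines). Assuming that is the intended output shape, it can be sketched with the standard library alone; `to_json_lines` is a hypothetical helper, not the method documented here.

```python
import json


def to_json_lines(records: list[dict]) -> str:
    """Serialize records as JSON Lines: one JSON object per line."""
    if not records:
        raise RuntimeError("no data available to convert")
    return "\n".join(json.dumps(record) for record in records)


json_text = to_json_lines([{"id": "A", "n": 1}, {"id": "B", "n": 2}])
parsed = [json.loads(line) for line in json_text.splitlines()]
```

One advantage of this layout is that a large file can be read back line by line without loading the whole document into memory.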
- to_list(data=None)#
Convert the current or provided metadata to a list of dictionaries.
- Parameters:
data (optional) – The paginated data to convert. If None, uses the data attribute.
- Returns:
A list of metadata records as dictionaries, or None if no data is available.
- Return type:
Examples
>>> handler = ResultsHandler(data=[{"x": 10}])
>>> handler.to_list()
[{'x': 10}]
- to_polars(data=None, expand_nested_dicts=False, rename_columns=None, **polars_kwargs)#
Convert the current metadata to a Polars DataFrame.
- Parameters:
- Returns:
A Polars DataFrame containing the metadata.
- Return type:
pl.DataFrame
- Raises:
RuntimeError – If no data is available to convert.
- property tree: Tree#
Convert the biomes metadata to a tree structure for visualization or analysis.
- Returns:
A tree representation of the biomes and their relationships.
- Return type:
Tree
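As an illustration of how biome metadata can form a tree, the sketch below nests colon-separated lineage strings into dictionaries and renders them with indentation. It assumes lineages look like "root:Environmental:Aquatic"; the helpers `build_tree` and `render` are hypothetical and unrelated to the Tree type returned by this property.

```python
def build_tree(lineages: list[str], sep: str = ":") -> dict:
    """Nest each lineage path into a dict of dicts keyed by biome name."""
    root: dict = {}
    for lineage in lineages:
        node = root
        for part in lineage.split(sep):
            # Create the child on first sight, then descend into it.
            node = node.setdefault(part, {})
    return root


def render(tree: dict, indent: int = 0) -> list[str]:
    """Return the tree as indented lines, children sorted by name."""
    lines: list[str] = []
    for name in sorted(tree):
        lines.append("  " * indent + name)
        lines.extend(render(tree[name], indent + 1))
    return lines


tree = build_tree(["root:Environmental:Aquatic", "root:Host-associated"])
tree_lines = render(tree)
```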
- class mgnipy.V2.proxies.biomes.Biomes(*, params=None, config=None, **kwargs)[source]#
Bases: BiomesTreeMixin, MGnifyList
- async abulk_fetch(*args, **kwargs)#
Asynchronously fetch a large collection of results efficiently.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
All fetched results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = await query.abulk_fetch(limit=100)
- async aenrich_details(limit=200, hide_progress=False)#
Async version of enrich_details that retrieves details for each item in the MGnifyList asynchronously.
- Parameters:
limit (Optional[int ], default=200) – An optional integer to limit the number of items to enrich. If not provided, it defaults to 200. If set to None, there will be no limit on the number of items enriched.
hide_progress (bool , default=False) – A boolean flag to control the display of the progress bar. If set to True, the progress bar will be hidden.
- Returns:
This method does not return anything. It updates the internal state of the MGnifyList instance by populating .details, .details_df, and .details_results with the details of each item.
- Return type:
None
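One common way to fetch details for many items asynchronously is to cap the number of items and bound the concurrency with a semaphore. The sketch below shows that general pattern under stated assumptions: `fetch_detail` is a stand-in for a real request, and `max_concurrency` is a hypothetical parameter that aenrich_details does not document.

```python
import asyncio


async def fetch_detail(identifier: str) -> dict:
    """Stand-in for a network call that returns one detail record."""
    await asyncio.sleep(0)
    return {"id": identifier}


async def enrich(ids: list[str], limit=200, max_concurrency: int = 5) -> list[dict]:
    """Fetch details for at most `limit` ids, a few at a time."""
    selected = ids if limit is None else ids[:limit]
    semaphore = asyncio.Semaphore(max_concurrency)

    async def bounded(identifier: str) -> dict:
        # The semaphore keeps at most max_concurrency fetches in flight.
        async with semaphore:
            return await fetch_detail(identifier)

    # gather preserves the order of the input ids in its result list.
    return await asyncio.gather(*(bounded(i) for i in selected))


details = asyncio.run(enrich(["a", "b", "c"], limit=2))
```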
- async aget()#
Asynchronously fetch all pages of results.
- Returns:
All result data.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = await query.aget()
- async aget_detail()#
Async variant of get_detail.
- Returns:
The next detail proxy, or None if no more details to iterate.
- Return type:
MGnifyDetail or None
- property aiter_details: AsyncIterator [dict ]#
Async version of iter_details.
- Returns:
An async iterator that yields MGnifyDetail results one by one, fetched on demand.
- Return type:
AsyncIterator[dict ]
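Fetching results on demand through an async iterator can be sketched with an async generator. This is a generic pattern, not the library's code: `fetch_detail` is a stand-in for a real request, and the function below is a standalone illustration rather than the property itself.

```python
import asyncio
from typing import AsyncIterator


async def fetch_detail(identifier: str) -> dict:
    """Stand-in for a network call that returns one detail record."""
    await asyncio.sleep(0)
    return {"id": identifier}


async def aiter_details(ids: list[str]) -> AsyncIterator[dict]:
    """Yield one detail at a time; each fetch happens only when requested."""
    for identifier in ids:
        yield await fetch_detail(identifier)


async def collect() -> list[dict]:
    # Consume the async iterator with an async comprehension.
    return [detail async for detail in aiter_details(["x", "y"])]


collected = asyncio.run(collect())
```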
- async apage(*args, **kwargs)#
Asynchronously fetch a specific page or range of pages.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
The requested page(s) of results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> page_data = await query.apage(1)
- bulk_fetch(*args, **kwargs)#
Fetch a large collection of results efficiently.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
All fetched results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = query.bulk_fetch(limit=100)
- clear_cache()#
Clear the cached results for the current resource and parameters. This will delete any cached files associated with the current query parameters.
- config: MGnipyConfig#
- continue_detail_iterator(start_index=None)#
Continue iterating for MGnifyDetails from start_index or the next index after the last successful detail.
- Parameters:
start_index (int , optional) – The index to continue from. If None, will continue from the next index after the last successful detail, or 0 if no successful detail yet.
- Returns:
The current instance with the detail iterator reset to the specified index.
- Return type:
- continue_iterator(*args, **kwargs)#
Continue iteration from a specific page. This is a facade over the underlying QueryExecutor.continue_iterator, allowing users to resume iteration after an interruption or to jump to a specific page.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.continue_iterator(start_page=5)
- describe_endpoint(**kwargs)#
Retrieve documentation about the endpoint.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> docs = query.describe_endpoint()
- describe_relationships()#
Describe the related resources and their relationships.
- Return type:
None
Note
This method is not yet implemented.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.describe_relationships()
- property details: list [MGnifyDetail]#
- details_df(*args, **kwargs)#
Convert the current or provided metadata to a pandas DataFrame.
- Parameters:
data (list of dict, optional) – List of records to convert. If None, uses the data attribute.
expand_nested_dicts (list of str or bool, optional) – List of keys to expand into separate columns, or True to expand defaults.
rename_columns (dict of str to str, optional) – A dictionary mapping old column names to new column names.
**kwargs – Additional keyword arguments passed to pd.DataFrame.
- Returns:
DataFrame containing the metadata, or None when no data is available.
- Return type:
pd.DataFrame or None
- property details_ids: list [str ]#
A list of detail identifiers (e.g. accessions) extracted from the details results.
- property details_results: list [dict [str , Any ]]#
A list of result dicts, one for each detail retrieved, extracted from the details results.
- dry_run()#
Plan the API call by validating parameters and estimating the number of pages and records available. Prints the plan details for the user to review before executing the full data retrieval. This method can be called before get() to ensure that the parameters are valid and to understand the scope of the data retrieval.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies", params={"search": "gut"})
>>> query.dry_run()
- emgapi_handler: DescribeEmgapiModule#
- enrich_details(limit=200, hide_progress=False)#
Gets the details for each MGnify list item. Iterates through the accessions/ids (.results_ids) and retrieves their details using the corresponding detail proxy (e.g., RunDetail for Runs).
- Parameters:
limit (Optional[int], default=200) – An optional integer to limit the number of items to enrich. If not provided, it defaults to 200. If set to None, there will be no limit on the number of items enriched.
hide_progress (bool , default=False) – A boolean flag to control the display of the progress bar. If set to True, the progress bar will be hidden.
- Returns:
This method does not return anything. It updates the internal state of the MGnifyList instance by populating .details, .details_df, and .details_results with the details of each item.
- Return type:
None
- explain(head=None)#
Print example API URLs that would be called.
- Parameters:
head (int, optional) – Maximum number of URLs to print. If None, prints all.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.explain(head=3)
- filter(**filters)#
Update the parameters for the API call to filter results.
- Parameters:
**filters – Keyword arguments corresponding to the supported parameters for the current resource. These will be used to filter the results returned by the API.
- Returns:
A new QuerySet instance with updated parameters for filtering results.
- Return type:
- first()#
Get the first record from the query results.
Executes the query and returns the first metadata record.
- Returns:
The first record as a dictionary, or None if unavailable.
- Return type:
dict or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> first_record = query.first()
- get()#
Fetch all pages of results.
- Returns:
All result data.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> results = query.get()
- get_detail()#
Get the next MGnifyDetail based on current _detail_index. Updates _last_successful_detail on success.
- Returns:
The next detail proxy, or None if no more details to iterate.
- Return type:
MGnifyDetail or None
Example
>>> from mgnipy.V2.proxies import Studies
>>> studies = Studies(search="tomato")
>>> studies.bulk_fetch()
>>> first_detail = studies.get_detail()
>>> second_detail = studies.get_detail()
- property id_param_key: str #
Get the parameter name used to identify this resource.
- Returns:
The identifier parameter (e.g., “accession”, “biome_lineage”).
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> key = query.id_param_key
- property iter_details: Iterator [dict ]#
Yield MGnifyDetail results one by one.
- Returns:
An iterator that yields MGnifyDetail results one by one, fetched on demand.
- Return type:
Iterator[dict ]
Examples
>>> from mgnipy.V2.proxies import Studies
>>> studies = Studies()
>>> result_dict = next(studies.iter_details)
- property last_successful_page: int | None #
Get the last successfully retrieved page number.
- Returns:
The last successful page number, or None if no pages have been retrieved yet.
- Return type:
int or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.get()
>>> print(query.last_successful_page)
- list_relationships()#
Get the names of related resources available from this resource.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> relationships = query.list_relationships()
- list_supported_params()#
Get the valid query filter parameters for this resource.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> params = query.list_supported_params()
- list_urls()#
Generate and return a list of URLs for all the API requests that would be made to retrieve the data based on the current parameters. This allows the user to see exactly which endpoints and query parameters will be used in the API calls before executing them.
- page(*args, **kwargs)#
Fetch a specific page or range of pages.
- Parameters:
*args – Positional arguments forwarded to executor.
**kwargs – Keyword arguments forwarded to executor.
- Returns:
The requested page(s) of results.
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> page_data = query.page(1)
- page_size(n)#
Set the page size for paginated API calls.
- preview()#
Get a DataFrame preview of the first page of results.
Quickly check the structure and content of the data without retrieving all pages.
- Returns:
DataFrame containing the first page of metadata.
- Return type:
pd.DataFrame
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> df = query.preview()
- property progress#
Get the progress of the current query execution as a percentage.
- Returns:
Progress percentage and counts (e.g., “75.00% (150/200 pages)”).
- Return type:
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> print(query.progress)
- queries(**httpx_kwargs)#
Generate a list of query parameter dictionaries for each API request that would be made based on the current parameters. This allows the user to see the specific query parameters for each request before executing them.
- property records: chain | None #
Get an iterator of individual metadata records from the retrieved results, if available. This property provides a convenient way to access the metadata records without needing to handle pagination.
- Returns:
An iterator that yields individual metadata records if results are available, otherwise None.
- Return type:
chain or None
- property request_url: str #
Get the URL for the API request based on the current resource and parameters. This is a single URL that represents the request for the current page of results.
- Returns:
The constructed URL for the API request.
- Return type:
- reset_iterator()#
Reset the pagination state to the beginning.
- Return type:
None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.reset_iterator()
- property resource: SupportedEndpoints#
- property results_ids: list [str ] | None #
Get the list of identifiers from the current results.
- Returns:
List of identifiers (accessions, etc.), or None if no results.
- Return type:
list of str or None
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.get()
>>> ids = query.results_ids
- resume()#
A facade over QueryExecutor.resume, allowing users to easily continue fetching results after an interruption.
Examples
>>> from mgnipy.V2.core import MGnifier
>>> query = MGnifier("studies")
>>> query.resume()
- resume_detail_iterator()#
Resume from the element after the last successful MGnifyDetail fetch.
- Returns:
The current instance with the detail iterator reset to the specified index.
- Return type:
- Raises:
RuntimeError – If there is no last successful detail to resume from.
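The continue and resume semantics described for the detail iterator can be modelled with a small cursor class. This is a sketch under stated assumptions, not the library's implementation: `DetailCursor` and its attribute names are hypothetical, and each call to `next_id` is treated as a successful fetch.

```python
class DetailCursor:
    """Track a position in a list of ids so iteration can be continued or resumed."""

    def __init__(self, ids):
        self.ids = list(ids)
        self.index = 0
        self.last_successful = None  # index of the last id handed out

    def next_id(self):
        """Return the next id, or None when the list is exhausted."""
        if self.index >= len(self.ids):
            return None
        value = self.ids[self.index]
        self.last_successful = self.index
        self.index += 1
        return value

    def continue_from(self, start_index=None):
        """Move to start_index, or to the element after the last success (0 if none)."""
        if start_index is None:
            start_index = 0 if self.last_successful is None else self.last_successful + 1
        self.index = start_index
        return self

    def resume(self):
        """Move to the element after the last success; fail if there is none."""
        if self.last_successful is None:
            raise RuntimeError("no successful detail to resume from")
        return self.continue_from(self.last_successful + 1)


cursor = DetailCursor(["a", "b", "c"])
first = cursor.next_id()
jumped = cursor.continue_from(2).next_id()
after_end = cursor.resume().next_id()
```

Returning `self` from both methods mirrors the documented behaviour of returning the current instance, which allows calls to be chained.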
- show_tree(method='compact')#
- Parameters:
method (Literal ['compact', 'show', 'print', 'horizontal', 'hshow', 'h', 'hprint', 'vertical', 'vshow', 'v', 'vprint'])
- to_df(data=None, expand_nested_dicts=False, rename_columns=None, **kwargs)#
Convert the current or provided metadata to a pandas DataFrame.
- Parameters:
data (list of dict, optional) – List of records to convert. If None, uses the data attribute.
expand_nested_dicts (list of str or bool, optional) – List of keys to expand into separate columns, or True to expand defaults.
rename_columns (dict of str to str, optional) – A dictionary mapping old column names to new column names.
**kwargs – Additional keyword arguments passed to pd.DataFrame.
- Returns:
DataFrame containing the metadata, or None when no data is available.
- Return type:
pd.DataFrame or None
Examples
>>> handler = ResultsHandler(data=[{"a": 1, "b": 2}])
>>> df = handler.to_df()
>>> list(df.columns)
['a', 'b']
>>> df.iloc[0]['a']
np.int64(1)
- to_json(data=None, orient='records', lines=True, **json_kwargs)#
Convert the current metadata to a JSON string or save it to a file.
- Parameters:
- Returns:
The JSON string representation of the metadata, or None if no data is available.
- Return type:
str or None
- Raises:
RuntimeError – If no data is available to convert.
- to_list(data=None)#
Convert the current or provided metadata to a list of dictionaries.
- Parameters:
data (optional) – The paginated data to convert. If None, uses the data attribute.
- Returns:
A list of metadata records as dictionaries, or None if no data is available.
- Return type:
Examples
>>> handler = ResultsHandler(data=[{"x": 10}])
>>> handler.to_list()
[{'x': 10}]
- to_polars(data=None, expand_nested_dicts=False, rename_columns=None, **polars_kwargs)#
Convert the current metadata to a Polars DataFrame.
- Parameters:
- Returns:
A Polars DataFrame containing the metadata.
- Return type:
pl.DataFrame
- Raises:
RuntimeError – If no data is available to convert.
- property tree: Tree#
Convert the biomes metadata to a tree structure for visualization or analysis.
- Returns:
A tree representation of the biomes and their relationships.
- Return type:
Tree