torch_staintools.cache package

Submodules

torch_staintools.cache.base module

class torch_staintools.cache.base.Cache(size_limit: int)

Bases: ABC, Generic[C, V]

A simple abstraction of a cache.

abstract classmethod build(*args, **kwargs)
property data_cache
dump(path: str, force_overwrite: bool = False)

Dump the cached data to the local file system.

Parameters:
  • path – output filename

  • force_overwrite – whether to force overwriting the existing file on path

Returns:

get(key: Hashable, func: Callable | None, *func_args, **func_kwargs)

Get the data cached under key.

If the data corresponding to key is not yet cached, it will be computed by calling func(*func_args, **func_kwargs), and the result will be cached if the remaining capacity is sufficient.

Parameters:
  • key – the address of the data in cache

  • func – callable to evaluate the new data to cache if not yet cached under key

  • *func_args – positional arguments of func

  • **func_kwargs – keyword arguments of func

Returns:
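The get-or-compute contract described above can be sketched in plain Python. This is a simulation of the documented behavior, not the library implementation: the dict-backed store and the entry-count interpretation of size_limit are assumptions.

```python
from typing import Callable, Dict, Hashable


def get_or_compute(cache: Dict[Hashable, object], key: Hashable,
                   func: Callable, *func_args, size_limit: int = -1,
                   **func_kwargs):
    """Return cache[key]; on a miss, compute the value with func and cache
    the result if there is room (size_limit <= 0 means no limit)."""
    if key in cache:
        return cache[key]
    value = func(*func_args, **func_kwargs)
    if size_limit <= 0 or len(cache) < size_limit:
        cache[key] = value
    return value


cache: Dict[Hashable, int] = {}
assert get_or_compute(cache, "a", lambda: 1) == 1   # miss: computed, cached
assert get_or_compute(cache, "a", lambda: 99) == 1  # hit: func is not called
```

Note that on a hit the callable is never evaluated, which is what makes caching expensive stain-matrix computations worthwhile.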

abstract get_batch_hit(keys: List[Hashable]) → V | List[V]
abstract get_batch_miss(keys: List[Hashable], func: Callable[[...], V], *args, **kwargs) → V | List[V]
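How a batch getter might split keys into hits and misses can be sketched as follows. This is a pure-Python sketch over a dict; the per-key compute function and the exact miss-handling policy are assumptions based on the signatures above.

```python
from typing import Callable, Dict, Hashable, List


def get_batch(cache: Dict[Hashable, int], keys: List[Hashable],
              func: Callable[[Hashable], int]) -> List[int]:
    """Fetch cached values for hit keys; compute and write back the misses."""
    out = []
    for key in keys:
        if key not in cache:       # miss: evaluate func and cache the result
            cache[key] = func(key)
        out.append(cache[key])     # hit path: read straight from the store
    return out


store = {"x": 10}
result = get_batch(store, ["x", "y"], func=lambda k: 0)
assert result == [10, 0]  # "x" was a hit, "y" a miss (computed and cached)
```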
is_cache_valid()
abstract is_cached(key: Hashable)

whether the key already stores a value.

Parameters:

key – key to query

Returns:

bool: whether a value is already stored under the key.

abstract load(path: str)

Load cache from the local file system.

Parameters:

path

Returns:

abstract query(key: Hashable)

Defines how the data stored under key is read from the cache. Used in get and in the batch get methods.

Parameters:

key

Returns:

static size_in_bound(current_size, in_data_size, limit_size)

Check whether the cache size stays within the limit after new data is added.

Parameters:
  • current_size – current size of cache

  • in_data_size – size of new data

  • limit_size – current size limit (the total size must be no greater than this). Zero or a negative value means no size limit is enforced.

Returns:

bool: True if the size is still in bound after the new data is added to the cache.
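The in-bound check reduces to a short predicate. A sketch consistent with the description above (a zero or negative limit disables the check; whether the bound is inclusive is an assumption):

```python
def size_in_bound(current_size: int, in_data_size: int, limit_size: int) -> bool:
    """True if adding in_data_size to current_size stays within limit_size;
    a zero or negative limit_size means no limit is enforced."""
    if limit_size <= 0:
        return True
    return current_size + in_data_size <= limit_size


assert size_in_bound(3, 2, 5)        # 3 + 2 == 5: still in bound
assert not size_in_bound(3, 3, 5)    # 3 + 3 > 5: out of bound
assert size_in_bound(100, 100, 0)    # non-positive limit: unlimited
```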

property size_limit
abstract write_batch(keys: List[Hashable], batch: V | List[V])

Write a batch of data to the cache.

Parameters:
  • keys – list of keys corresponding to individual data points in the batch.

  • batch – batch data to cache.

Returns:

write_to_cache(key: Hashable, value: V)

Write the data (value) to the given address (key) in the cache

Parameters:
  • key – any hashable that points the data to the address in the cache

  • value – value of the data to cache

Returns:

torch_staintools.cache.tensor_cache module

class torch_staintools.cache.tensor_cache.TensorCache(size_limit: int, device: device | None = None)

Bases: Cache[Dict[Hashable, Tensor], Tensor]

An implementation of Cache specifically for tensor using a built-in dict.

For now, it is used to store stain matrices directly on CPU or GPU memory since stain matrices are typically small (e.g., 2x3 for mapping between H&E and RGB).

The size of concentration maps, however, is proportional to num_pixels x num_stains, so they may be better cached on the local file system.

classmethod build(*, size_limit: int = -1, device: device | None = None, path: str | None = None)

Factory builder.

Parameters:
  • size_limit – limit on the cache size, measured in number of entries (the number of keys will not exceed this). Zero or a negative value means no limit is enforced.

  • device – which device (CPU or GPUs) to store the tensor. If None then by default it will be set as torch.device('cpu').

  • path – If specified, previously dumped cache file will be loaded from the path.

Returns:

classmethod collect(tensor_batch_list: List[Tensor]) → Tensor
device: device
get_batch_hit(keys: List[Hashable]) → Tensor
get_batch_miss(keys: List[Hashable], func: Callable[[...], Tensor], *args, **kwargs) → Tensor
is_cached(key)

whether the key already stores a value.

Parameters:

key – key to query

Returns:

bool: whether a value is already stored under the key.

load(path: str)

Load cache from the local file system.

Keys will be updated: cached data already in memory will be overwritten if the same key exists in the dumped cache file being loaded. Cached entries whose keys do not appear in the dumped file are unaffected.

Parameters:

path – file path to the local cache file to load.

Returns:
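These merge-on-load semantics behave like dict.update: loaded keys overwrite in-memory entries with the same key, and everything else survives. A stdlib-only sketch, using pickle in place of the library's actual serialization format (an assumption):

```python
import pickle
import tempfile

# In-memory cache and a previously "dumped" cache that share the key "a".
memory = {"a": 1, "b": 2}
dumped = {"a": 10, "c": 30}

with tempfile.NamedTemporaryFile(suffix=".pkl", delete=False) as f:
    pickle.dump(dumped, f)
    path = f.name

with open(path, "rb") as f:
    loaded = pickle.load(f)

memory.update(loaded)  # same key overwritten, other entries untouched
assert memory == {"a": 10, "b": 2, "c": 30}
```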

query(key) → Tensor

Implementation of abstract method: query

Read from dict directly

Parameters:

key

Returns:

queried output

Raises:

KeyError.

to(device: device)

Move the cache to the specified device. Mirrors the semantics of torch.nn.Module.to and torch.Tensor.to.

The dict itself will be reused but the corresponding tensors stored in the dict might be copied to the target device if they are not already on the target device.

Parameters:

device – Target device

Returns:

self.

static validate_value_type(value: ndarray | Tensor)

Helper function to validate the input.

The value must be a torch.Tensor; if it is a numpy ndarray, it will be converted to a tensor.

Parameters:

value – value to validate

Returns:

torch.Tensor.

Raises:

AssertionError – if the resulting value is not a torch.Tensor.

write_batch(keys: List[Hashable], batch: Tensor)

Write a batch of data to the cache.

Parameters:
  • keys – list of keys corresponding to individual data points in the batch.

  • batch – batch data to cache.

Returns:

Module contents