📦 facets module
Hyperspace indexing and operations.
- class pupyl.indexer.facets.Index(size, data_dir=None, trees=0.01, volatile=False)
Procedures over multidimensional spaces.
- __enter__()
Context opening for an index.
- Returns:
Context initialization.
- Return type:
self
- __exit__(exc_type, exc_val, exc_tb)
Context closing for an index.
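The context protocol described above can be illustrated with a small stand-in class. This is a minimal sketch, not pupyl's implementation: `SketchIndex` and its `pending`/`committed` attributes are hypothetical, and the idea that `__exit__` commits outstanding work through `flush()` is an assumption that the text above does not confirm.

```python
class SketchIndex:
    """Hypothetical stand-in that mimics the documented context protocol."""

    def __init__(self):
        self.pending = []     # tensors appended but not yet committed
        self.committed = []   # tensors considered persisted

    def append(self, tensor):
        self.pending.append(list(tensor))

    def flush(self):
        # Assumption: committing moves pending work into the persisted state.
        self.committed.extend(self.pending)
        self.pending.clear()

    def __enter__(self):
        # Documented behaviour: the context hands back the index itself.
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        # Assumption: leaving the context commits outstanding work.
        self.flush()
        return False  # never swallow exceptions raised inside the block


with SketchIndex() as index:
    index.append([0.1, 0.2])
    index.append([0.3, 0.4])
```

Because `__enter__` returns `self`, the name bound by `as` is the same object that receives the `__exit__` call when the block ends.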
- __getitem__(position)
Returns the item at position, supporting negative indexing.
- Parameters:
position (int) – The id of the desired item to be returned.
- Returns:
With indexed tensors.
- Return type:
list
Example
index[10] # Returns the item at position 10.
index[-1] # Returns the last item.
- __init__(size, data_dir=None, trees=0.01, volatile=False)
Indexing tensors operations and approximate nearest neighbours search.
- Parameters:
size (int) – Shape of the unidimensional vectors which will be indexed.
data_dir (str) – Location where to load or save the index.
trees (float) – Defines the factor over the number of trees to be created based on the dataset size. Should be a number between 0 and 1.
volatile (bool) – Whether the index will be temporary or not.
- Raises:
OSError – When the data_dir parameter is not a directory.
NoDataDirForPermanentIndex – When no data_dir was passed for a permanent index.
DataDirDefinedForVolatileIndex – If a data_dir was defined for a volatile index.
FileIsNotAnIndex – When loading an index was attempted but the file wasn't a valid index.
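The interplay between data_dir and volatile listed above can be sketched as a small validation function. This is an illustrative sketch under stated assumptions, not pupyl's code: `resolve_data_dir` is a hypothetical helper, the two exception classes are local stand-ins that only share their names with the documented ones, and using a temporary directory for a volatile index is an assumption.

```python
import os
import tempfile


class NoDataDirForPermanentIndex(Exception):
    """Local stand-in for the exception named in the Raises list."""


class DataDirDefinedForVolatileIndex(Exception):
    """Local stand-in for the exception named in the Raises list."""


def resolve_data_dir(data_dir=None, volatile=False):
    """Return the directory an index would use, or raise as documented."""
    if volatile:
        if data_dir is not None:
            raise DataDirDefinedForVolatileIndex
        # Assumption: a volatile index lives in a throwaway temp directory.
        return tempfile.mkdtemp()
    if data_dir is None:
        raise NoDataDirForPermanentIndex
    if os.path.exists(data_dir) and not os.path.isdir(data_dir):
        raise OSError(f'{data_dir} is not a directory')
    return data_dir
```

A permanent index needs an explicit location, a volatile one must not receive any, and a path that points at a regular file is rejected.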
- __iter__()
Returns an iterable for the index.
- Yields:
list – With indexed tensors.
- __len__()
Returns how many items are indexed.
- Returns:
Describing how many items are indexed.
- Return type:
int
Example
len(index) # Will return 10 for an index with 10 elements indexed
- __next__()
Iterates over the iterable.
- Returns:
With an indexed tensor.
- Return type:
list
- Raises:
StopIteration – When the iterable is exhausted.
- __weakref__
list of weak references to the object (if defined)
- append(tensor, check_unique=False)
Inserts a new tensor at the end of the index.
Attention
Be advised that this operation is linear on the index size (\(O(n)\)).
- Parameters:
tensor (numpy.ndarray or list) – The tensor to insert into the index.
check_unique (bool) – Defines whether the append method should verify that a very similar tensor already exists in the current index. In other words, it checks the uniqueness of the value.
Warning
Be advised that the uniqueness check (check_unique=True) adds overhead to the append process.
- Raises:
NullTensorError – If a null (empty) tensor is passed through.
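To make the cost of the uniqueness check concrete, here is a minimal sketch over a plain Python list. It is not pupyl's implementation: `append_tensor`, the `tolerance` parameter, the Euclidean comparison and the choice to silently skip a duplicate are all assumptions made for illustration, and `NullTensorError` is a local stand-in that only shares its name with the documented exception.

```python
import math


class NullTensorError(Exception):
    """Local stand-in for the exception named in the Raises list."""


def append_tensor(store, tensor, check_unique=False, tolerance=1e-6):
    """Append `tensor` to the list `store`; return True if it was stored."""
    tensor = list(tensor)
    if not tensor:
        raise NullTensorError
    if check_unique:
        # Linear scan over every stored tensor: this is the extra overhead.
        for existing in store:
            if math.dist(existing, tensor) <= tolerance:
                return False
    store.append(tensor)
    return True


store = []
append_tensor(store, [1.0, 0.0])
append_tensor(store, [1.0, 0.0])                     # duplicate is accepted
append_tensor(store, [1.0, 0.0], check_unique=True)  # duplicate is skipped
append_tensor(store, [0.0, 1.0], check_unique=True)  # new value is stored
```

Without the check a duplicate is stored again; with it, every append compares the candidate against the tensors already held.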
- export_by_group_by(path, top=10, **kwargs)
Export images, grouping them by similarity.
- Parameters:
path (str) – Place to create the directories and export the images.
top (int) – How many similar internal images should be filtered.
position (int) – Returns the groups based on a specified position.
- export_results(path, similars, keep_ids=False, keep_names=False)
Export internal image at position by copying it to path.
- Parameters:
path (str) – Place where to export the images.
similars (iterable) – Containing image ids to export to path.
keep_ids (bool) – Whether the original ids must be preserved or not.
keep_names (bool) – Whether the original names must be preserved or not.
- flush()
Commits the indexer's work.
- group_by(top=10, **kwargs)
Returns the items in the index that are similar to each other, either for every item or for a specified position.
- Parameters:
top (int) – How many similar internal images should be returned.
position (int) – Returns the groups based on a specified position.
- Yields:
list – If position is defined.
dict – A dictionary containing internal ids as keys and lists of similar images as values.
- Raises:
EmptyIndexError – If the underlying index is null.
TopNegativeOrZero – If the top parameter is zero or below.
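The two shapes yielded by group_by can be illustrated with a brute-force sketch over a list of vectors. This is not pupyl's implementation, which relies on approximate nearest neighbours: the standalone function below, the Euclidean distance, the exclusion of an item from its own group and the use of `ValueError` in place of the documented exceptions are all assumptions.

```python
import math


def group_by(tensors, top=10, position=None):
    """Yield similar ids per item, or a single list when `position` is given."""
    if not tensors:
        raise ValueError('empty index')           # stands in for EmptyIndexError
    if top <= 0:
        raise ValueError('top must be positive')  # stands in for TopNegativeOrZero

    def similar_to(item):
        # Rank every other item by its distance to `item`, closest first.
        others = (i for i in range(len(tensors)) if i != item)
        ranked = sorted(others, key=lambda i: math.dist(tensors[i], tensors[item]))
        return ranked[:top]

    if position is not None:
        yield similar_to(position)
    else:
        for item in range(len(tensors)):
            yield {item: similar_to(item)}


tensors = [[0.0, 0.0], [0.1, 0.0], [5.0, 5.0], [5.0, 5.2]]
groups = list(group_by(tensors, top=1))
# groups == [{0: [1]}, {1: [0]}, {2: [3]}, {3: [2]}]
```

Since the function is a generator, the argument checks only run once iteration starts.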
- index(tensor)
Searches for the first and most similar image compared to the query image.
- Parameters:
tensor (numpy.ndarray or list) – A vector to search for the most similar.
- Returns:
Describing the most similar resulting index.
- Return type:
int
- property index_name
Getter for property index_name.
- Returns:
With current index name.
- Return type:
str
- item(position, top=10, distances=False)
Searches the index using an internal position.
- Parameters:
position (int) – The item id within the index.
top (int) – How many similar items should be returned.
distances (bool) – Whether the distances between items should also be returned.
- Returns:
list of tuples – If distances is True, a list containing pairs of items and distances.
list – If distances is False, a list containing similar items.
- items()
Returns indexed items.
- Yields:
int – With item identification.
- items_values()
Returns all items and values.
- Yields:
tuple – With an int representing its id and a list with the actual tensor.
- property path
Getter for property path.
- Returns:
With the path set.
- Return type:
str
- pop(position=None)
Pops out the item at position, returning it.
Attention
Be advised that this operation is linear on the index size (\(O(n)\)).
- Parameters:
position (int) – Removes and returns the value at position.
- Returns:
With the popped-out item.
- Return type:
int
- refresh()
Updates all information regarding the index file, first unloading it, followed by reloading back the index.
- remove(position)
Removes the tensor at position from the database.
Attention
Be advised that this operation is linear on the index size (\(O(n)\)).
- Parameters:
position (int) – The index that must be removed.
- Raises:
IndexNotBuildYet – If removing a tensor was attempted on an index file that has not been built yet.
IndexError – If position is bigger than the current index size.
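One way to see why removal is linear in the index size is to assume the backing structure cannot delete in place, so every surviving tensor has to be copied into a rebuilt store. The sketch below shows that idea on a plain list. It is an assumption about the general approach, not a description of pupyl's internals: `remove_at` is a hypothetical helper, and rejecting negative positions is a choice made here.

```python
def remove_at(tensors, position):
    """Return a new list of tensors without the one at `position`."""
    # Assumption: negative positions are rejected as well as too-large ones.
    if not 0 <= position < len(tensors):
        raise IndexError(f'position {position} is out of range')
    # Every surviving tensor is copied into the rebuilt store, hence O(n).
    return [t for i, t in enumerate(tensors) if i != position]


tensors = [[0.0], [1.0], [2.0]]
tensors = remove_at(tensors, 1)
# tensors == [[0.0], [2.0]]
```

In this sketch the items that followed the removed position shift down by one, so ids held from before the removal may no longer point at the same tensor.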
- remove_feature_cache(index)
Removes a feature cache used during an indexing process.
- Parameters:
index (int) – The index associated with a cache marked for removal.
- search(tensor, results=16, return_distances=False)
Searches for the most similar images compared to the query image, in order of increasing distance.
- Parameters:
tensor (numpy.ndarray or list) – A vector to search for the most similar ones.
results (int, optional, default: 16) – How many results to return. If there are fewer similar images than results, the search is exhausted and all results found are returned.
return_distances (bool, optional, default: False) – Whether the distances between tensors should be returned or not.
- Yields:
int or tuple – An int with the index of the most similar item, then the second most similar, and so on. When return_distances=True, a tuple whose first element is that int and whose second element is a float with the distance between tensor and the indexed tensor.
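The two result shapes of search can be shown with an exact, brute-force stand-in. This sketch is not the library's approximate nearest neighbour search: the standalone `search` function over a list and the Euclidean distance are assumptions chosen to keep the example self-contained.

```python
import math


def search(tensors, query, results=16, return_distances=False):
    """Yield ids of stored tensors ordered by increasing distance to `query`."""
    ranked = sorted(
        (math.dist(tensor, query), item) for item, tensor in enumerate(tensors)
    )
    # Slicing caps the output at `results` and exhausts early on small stores.
    for distance, item in ranked[:results]:
        yield (item, distance) if return_distances else item


tensors = [[0.0, 0.0], [3.0, 4.0], [1.0, 0.0]]
nearest = list(search(tensors, [0.0, 0.0], results=2))
# nearest == [0, 2]
with_distances = list(search(tensors, [0.0, 0.0], return_distances=True))
# with_distances == [(0, 0.0), (2, 1.0), (1, 5.0)]
```

With only three stored tensors, the default of 16 results yields just three entries, matching the documented behaviour when fewer items than results exist. Conceptually, the first value yielded corresponds to what index(tensor) is documented to return.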
- property size
Getter for property size.
- Returns:
Describing the size of an ANN tree.
- Return type:
int
- property trees
Getter for property trees.
- Returns:
With the factor over the index size to make trees.
- Return type:
float
- values()
Returns indexed values.
- Yields:
list – With indexed tensors.
- property volatile
Getter for property volatile.
- Returns:
If the index is volatile or not.
- Return type:
bool