💽 database module
Operations and storage for images.
- class pupyl.storage.database.ImageDatabase(import_images, data_dir=None, **kwargs)
Handling storage and database operations for images.
- __getitem__(position)
Returns the item at index.
- Parameters:
position (int) – The position inside the database to return.
- Returns:
With some metadata related to the item.
- Return type:
dict
Example
from pupyl.storage.database import ImageDatabase

img_db = ImageDatabase(import_images=True, data_dir='pupyl')
img_db[10]
# May return:
{
    'original_file_name': '2610447919_1b91946bd1.jpg',
    'original_path': '/tmp/tmpekd0cuie',
    'original_file_size': '52K',
    'original_access_time': '2021-06-14T19:07:27',
    'id': 10
}
- __init__(import_images, data_dir=None, **kwargs)
Image storage and operations.
- Parameters:
import_images (bool) – Whether images must be imported (copied) into the internal database.
data_dir (str) – Location to save the image storage files and assets. If this parameter is omitted, a new temporary folder is created in the operating system's default temporary directory.
bucket_size (int) – Defines the number of files per bucket inside the database. Since each file and its associated assets are saved together, splitting them across directories helps avoid issues such as "Too many files" and also allows parallel reads of assets, among other features. In other words, this parameter describes how many image files are saved in one internal database directory before a new directory is started.
image_size (tuple) – Defines the dimensions (in pixels, width x height) of images saved in the database. Only takes effect if import_images is True. If a resize happens, the aspect ratio of the original image is preserved, hence image_size is an approximation. In other words, the image is resized to dimensions close to 800x600, using a pair of dimensions that does not distort the image's aspect ratio.
Caution
If no value is passed to data_dir, all database assets are created in the default temporary directory. Be advised that all your image search data will (probably) vanish after a system reboot. If you don't want this to happen, define a non-volatile data_dir.
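The aspect-ratio behaviour described for image_size can be sketched as follows. This is a minimal illustration, not pupyl's actual resize code: the helper name fit_within is hypothetical, and the "fit inside the target box" strategy is an assumption about how "close to 800x600" is reached.

```python
def fit_within(original, target=(800, 600)):
    """Return a (width, height) pair that fits inside `target`
    while keeping the aspect ratio of `original`.

    Hypothetical helper: the fitting strategy is an assumption,
    not taken from the pupyl source.
    """
    width, height = original
    max_width, max_height = target
    # The smaller of the two ratios is the binding constraint.
    scale = min(max_width / width, max_height / height)
    return round(width * scale), round(height * scale)


print(fit_within((1600, 900)))  # (800, 450)
print(fit_within((600, 1200)))  # (300, 600)
```

Neither result is exactly 800x600, which is why image_size is described as an approximation.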
- __len__()
Return how many items are indexed in the database.
- Returns:
Describing how many images are indexed in the current database.
- Return type:
int
Example
from pupyl.storage.database import ImageDatabase

img_db = ImageDatabase(import_images=True, data_dir='pupyl')
len(img_db)  # May return 709
- property bucket_size
Getter for bucket_size property.
- Returns:
With how many files per bucket will be stored.
- Return type:
int
- property image_size
Getter for image_size property.
- Returns:
Describing the internal (approximate) dimensions of each image. If _import_images is undefined, returns (800x600) by default.
- Return type:
tuple
- property import_images
Getter for import_images property.
- Returns:
If images should be imported into the current database or not.
- Return type:
bool
- insert(index, uri)
Inserts an image into the database.
- Parameters:
index (int) – The index number attributed to the image.
uri (str) – Where the original file is located.
- list_images(return_ids=False, top=None)
Returns the images in the current database.
- Parameters:
return_ids (bool) – Whether the method should also return the file ids inside the database.
top (int) – How many pictures from the image database should be listed. Leaving this parameter unset, or setting it to zero or below, returns all images in the database.
- Yields:
tuple or str – If return_ids=True, a tuple of (int, str) is yielded, representing respectively the index and the path inside the database. Otherwise, if return_ids=False, just a str with the full path is yielded.
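The return_ids and top semantics described above can be illustrated with a small stand-in generator. This is a sketch over a plain list of paths, not the library's implementation; list_images_sketch and the sample paths are hypothetical.

```python
from itertools import islice


def list_images_sketch(paths, return_ids=False, top=None):
    """Yield stored paths, optionally paired with their index.

    Hypothetical stand-in that mirrors the documented parameters.
    """
    # A missing, zero, or negative `top` means "no limit".
    limit = top if top is not None and top > 0 else None
    for index, path in islice(enumerate(paths), limit):
        yield (index, path) if return_ids else path


paths = ["pupyl/0/0.jpg", "pupyl/0/1.jpg", "pupyl/0/2.jpg"]
print(list(list_images_sketch(paths, top=2)))
print(list(list_images_sketch(paths, return_ids=True)))
```

Because the method yields rather than returns a list, callers that need random access would wrap the call in list(), as the sketch does.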
- load_image(index, as_tensor=False)
Returns the image data at a specified index.
- Parameters:
index (int) – The location of the image inside the database.
as_tensor (bool) – How to return the image from the database: as bytes (as_tensor=False) or as a numpy.ndarray tensor (as_tensor=True).
- Returns:
The image bytes (as_tensor=False) or a numpy.ndarray (as_tensor=True) containing the image converted to its tensor representation.
- Return type:
bytes or numpy.ndarray
- load_image_metadata(index, **kwargs)
Loads the metadata for an image inside the database.
- Parameters:
index (int) – The position of the image inside the database.
filtered (iterable) – Describing which fields to filter (or select) for return.
distance (float) – The distance between the tensor represented by index and the query image.
- Returns:
Containing the parsed json file.
- Return type:
dict
- Raises:
IndexError – When index is not found.
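As a rough illustration of the filtered and distance keyword arguments, the sketch below parses a JSON record and keeps only the requested fields. Everything here is an assumption made for illustration: load_metadata_sketch is a hypothetical name, the in-memory store stands in for the JSON files on disk, and attaching distance to the returned dict is a guess at how that argument is used.

```python
import json


def load_metadata_sketch(store, index, filtered=None, distance=None):
    """Parse the JSON metadata stored under `index`.

    `store` is a hypothetical dict mapping index -> JSON string.
    """
    if index not in store:
        raise IndexError(f"index {index} not found")
    metadata = json.loads(store[index])
    if filtered is not None:
        # Keep only the requested fields.
        wanted = set(filtered)
        metadata = {k: v for k, v in metadata.items() if k in wanted}
    if distance is not None:
        # Assumption: the distance is attached to the result.
        metadata["distance"] = distance
    return metadata


store = {10: json.dumps({"original_file_name": "2610447919_1b91946bd1.jpg", "id": 10})}
print(load_metadata_sketch(store, 10, filtered=["id"]))  # {'id': 10}
```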
- mount_file_name(index, **kwargs)
Creates the full path under which the file will be saved inside the database.
- Parameters:
index (int) – The indexer id associated with the file.
extension (str) – Describing the file extension.
- Returns:
With the full path inside the database.
- Return type:
str
- remove(index)
Removes the image at index.
- Parameters:
index (int) – The image index to remove from the database.
Danger
Use this method with caution. The deleted image cannot be restored. No prompt is shown before deletion.
Attention
Be advised that this operation is linear in the index size (\(O(n)\)). It changes the current image index: for instance, if the image at index 54 is deleted, every image with an index greater than 54 has its id decreased by one.
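The renumbering described above, and why it costs \(O(n)\), can be shown with a list-based stand-in. This sketch does not touch files on disk and is not pupyl's implementation; remove_sketch and the record layout are hypothetical.

```python
def remove_sketch(records, index):
    """Delete the record at `index` and renumber the ones after it.

    `records` is a hypothetical list of metadata dicts whose "id"
    equals their position in the list.
    """
    if not 0 <= index < len(records):
        raise IndexError(f"index {index} not found")
    del records[index]
    # Every record after the removed one shifts down by one,
    # so the loop touches up to n - 1 entries.
    for new_id in range(index, len(records)):
        records[new_id]["id"] = new_id
    return records


records = [{"id": i, "name": f"img{i}.jpg"} for i in range(5)]
remove_sketch(records, 2)
print(records)
```

After the call, img3.jpg carries id 2 and img4.jpg carries id 3, so any id cached before the removal may now point at a different image.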
- save_image_metadata(index, uri)
Stores image metadata information retrieved from the file.
- Parameters:
index (int) – The index related to the image.
uri (str) – Location where the image is stored.
- what_bucket(index)
Discovers in what bucket the file should be saved.
- Parameters:
index (int) – The index that references an image in the database.
- Returns:
With the bucket number where the image is saved.
- Return type:
int
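One plausible way to relate bucket_size, what_bucket and mount_file_name is integer division of the index by the bucket size, with the bucket number used as a directory name. The sketch below assumes exactly that. The function names, the directory layout, and the bucket size of 1000 are all illustrative assumptions, not taken from the pupyl source.

```python
import os


def what_bucket_sketch(index, bucket_size):
    """Assumed rule: consecutive runs of `bucket_size` indexes share a bucket."""
    return index // bucket_size


def mount_file_name_sketch(data_dir, index, bucket_size, extension="jpg"):
    """Assumed layout: <data_dir>/<bucket>/<index>.<extension>."""
    bucket = what_bucket_sketch(index, bucket_size)
    return os.path.join(data_dir, str(bucket), f"{index}.{extension}")


# With an illustrative bucket size of 1000, indexes 0..999 land in bucket 0.
print(what_bucket_sketch(999, 1000))   # 0
print(what_bucket_sketch(1000, 1000))  # 1
print(mount_file_name_sketch("pupyl", 2345, 1000))
```

Under this assumption no directory ever holds more than bucket_size images, which is the property the bucket_size parameter is documented to provide.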