ae.files
generic file object helpers
this namespace portion is pure Python code, providing helpers for file object and content managing. it only depends
on the ae.base namespace portion.
Hint
more helper functions to manage directory/folder structures are provided by the ae.paths portion.
the helper function copy_bytes() provides recoverable copies of binary files and file streams, with progress
callbacks for every copied chunk/buffer.
file_lines() and read_file_text() are helpers to read/load text file contents.
the function write_file_text() stores a string to a text file.
the helper function file_transfer_progress() puts the number of transferred bytes in a short and user-readable
format, to be displayed as progress string in a file transfer.
RegisteredFile and CachedFile encapsulate and optionally cache the contents of a file within a file
object. instances of these classes are compatible with the file objects provided by Python’s pathlib module. pure
path strings can also be used as file objects (see also the FileObject type).
all these types of file objects are supported by the class FilesRegister from the
ae.paths portion.
registered file
a registered file object represents a single file on your file system and can be instantiated from one of the classes
RegisteredFile or CachedFile provided by this module/portion:
from ae.files import RegisteredFile
rf = RegisteredFile('path/to/the/file_name.extension')
assert str(rf) == 'path/to/the/file_name.extension'
assert rf.path == 'path/to/the/file_name.extension'
assert rf.stem == 'file_name'
assert rf.ext == '.extension'
assert rf.properties == {}
file properties will be automatically attached to each file object instance with the instance attribute
properties. in the last example it results in an empty dictionary because the
path of this file object does not contain folder names with an underscore character.
file properties
file property names and values are automatically determined from the names of their subfolders, specified in the
path attribute. every subfolder name containing an underscore character in the format
<property-name>_<value> will be interpreted as a file property:
rf = RegisteredFile('property1_69/property2_3.69/property3_whatever/file_name.ext')
assert rf.properties['property1'] == 69
assert rf.properties['property2'] == 3.69
assert rf.properties['property3'] == 'whatever'
the property types int, float and string are recognized and converted into a property value. boolean values can be
specified as 1 and 0 integers.
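the folder-name convention described above can be sketched with the standard library only. this is a minimal illustration, not the code of this portion: the helper names parse_value() and properties_from_path() are hypothetical, and splitting each folder name at its first underscore is an assumption.

```python
from pathlib import PurePath
from typing import Dict, Union

PropertyValue = Union[int, float, str]


def parse_value(text: str) -> PropertyValue:
    """Convert text to int, then to float, falling back to the unchanged string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            continue
    return text


def properties_from_path(file_path: str) -> Dict[str, PropertyValue]:
    """Collect <property-name>_<value> pairs from the folder names of file_path."""
    properties: Dict[str, PropertyValue] = {}
    for folder in PurePath(file_path).parts[:-1]:       # skip the file name itself
        # assumption: the first underscore separates the name from the value
        name, separator, value = folder.partition("_")
        if separator and name and value:
            properties[name] = parse_value(value)
    return properties


props = properties_from_path("property1_69/property2_3.69/property3_whatever/file_name.ext")
print(props)
```

the underscore in file_name.ext is ignored because only folder names are inspected, which mirrors the empty properties dictionary of the first registered file example.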
cached file
a cached file created from the CachedFile class behaves like a registered file and additionally provides the
possibility to cache parts or the whole file content as well as the file pointer of the opened file:
from ae.files import CachedFile
cf = CachedFile('integer_69/float_3.69/string_whatever/file_name.ext')
assert str(cf) == 'integer_69/float_3.69/string_whatever/file_name.ext'
assert cf.path == 'integer_69/float_3.69/string_whatever/file_name.ext'
assert cf.stem == 'file_name'
assert cf.ext == '.ext'
assert cf.properties['integer'] == 69
assert cf.properties['float'] == 3.69
assert cf.properties['string'] == 'whatever'
on instantiation of the CachedFile file object the default file object loader function
_default_object_loader() will be used, which opens a file stream via Python’s open() built-in.
alternatively, you can specify a specific file object loader with the object_loader parameter
or by assigning a callable directly to the object_loader attribute:
cf = CachedFile('integer_69/float_3.69/string_whatever/file_name.ext',
                object_loader=lambda cached_file_obj: my_open_method(cached_file_obj.path))
the cached file object is accessible via the loaded_object attribute of the cached file object
instance:
from io import TextIOWrapper
assert isinstance(cf.loaded_object, TextIOWrapper)
cf.loaded_object.seek(...)
cf.loaded_object.read(...)
cf.loaded_object.close()
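the interplay of a loader callable and late loading can be illustrated with a small stand-alone class. this is only a sketch of the general caching pattern: the class LazyLoadedFile and the function fake_loader() are hypothetical names, and the real CachedFile may differ in its details.

```python
from typing import Any, Callable, List


class LazyLoadedFile:
    """Hypothetical stand-in for a file object that caches what its loader returns."""

    def __init__(self, path: str, object_loader: Callable[["LazyLoadedFile"], Any],
                 late_loading: bool = True):
        self.path = path
        self.object_loader = object_loader
        self._loaded_object = None
        if not late_loading:
            self._loaded_object = object_loader(self)   # load early, at instantiation

    @property
    def loaded_object(self) -> Any:
        """Call the loader on first access and reuse its result afterwards."""
        if self._loaded_object is None:
            self._loaded_object = self.object_loader(self)
        return self._loaded_object


calls: List[str] = []


def fake_loader(file_obj: LazyLoadedFile) -> str:
    """Record the call and return a placeholder instead of opening a real file."""
    calls.append(file_obj.path)
    return "content of " + file_obj.path


lazy = LazyLoadedFile("some/file.txt", object_loader=fake_loader)
calls_before_access = list(calls)       # still empty: nothing got loaded yet
first = lazy.loaded_object              # the first access triggers the loader
second = lazy.loaded_object             # the second access reuses the cached result

eager = LazyLoadedFile("other/file.txt", object_loader=fake_loader, late_loading=False)
```

with late loading the loader runs only when loaded_object is first read, so files that are registered but never used cost nothing to load.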
Module Attributes
- FileObject: file object type, e.g. a file path str or any class or callable where the returned instance/value is either a string or an object with a stem attribute (holding the file name w/o extension), such as CachedFile, RegisteredFile, pathlib.Path or pathlib.PurePath.
- types of file property values
- dict of file properties
- file name or file stream pointer
Functions
- copy_bytes(): recoverable copy of a file or stream (file-like object), optionally with progress callbacks.
- file_lines(): returning lines of the text file specified by file_path argument as tuple.
- file_transfer_progress(): return string to display the transfer progress of transferred bytes in short and user-readable format.
- read_file_text(): returning content of the text file specified by file_path argument as string, while suppressing exceptions.
- write_file_text(): write the passed text string or list of line strings into the text file specified by file_path argument.
Classes
- CachedFile: represents a cacheable registered file object - see also cached file examples.
- RegisteredFile: represents a single file - see also registered file examples.
- FileObject
file object type, e.g. a file path str or any class or callable where the returned instance/value is either a string or an object with a stem attribute (holding the file name w/o extension), such as CachedFile, RegisteredFile, pathlib.Path or pathlib.PurePath.
alias of str | RegisteredFile | CachedFile | Path | PurePath | Any
- copy_bytes(src_file, dst_file, *, transferred_bytes=0, total_bytes=0, buf_size=16384, overwrite=False, move_file=False, recoverable=False, errors=None, progress_func=<function dummy_function>, **progress_kwargs)
recoverable copy of a file or stream (file-like object), optionally with progress callbacks.
- Parameters:
src_file (Union[str, BinaryIO]) – source file name or opened stream (file-like) object. if passing a non-seekable stream together with a non-zero value in transferred_bytes, then the source stream has to be set to the correct position before you call this function. if passing any source stream, then the total file/stream size also has to be passed into the total_bytes parameter. source file streams also do not support a True value in the move_file argument.
dst_file (Union[str, BinaryIO]) – destination file name or opened stream (file-like) object. for a destination stream, recoverable copies and copies with a True value in the overwrite argument are not allowed; always use a destination file name if you need a recoverable/overwriting copy.
transferred_bytes (int) – file offset at which the copy process starts. if not passed for recoverable copies, then copy_bytes will determine this value from the file length of the destination file.
total_bytes (int) – source file size in bytes (needed only if src_file is a stream).
buf_size (int) – size of the copy buffer/chunk in bytes (copied before each progress callback).
overwrite (bool) – pass True to allow overwriting of the destination file. if this argument is not specified or evaluates as False and the destination file already exists, then this function will return an error.
move_file (bool) – pass True to delete the source file on complete copying (only works if the source file is specified as a file name, not as a stream; see src_file).
recoverable (bool) – pass True to allow recoverable file copy (only working if the destination file is specified as a file name, not as a stream; see dst_file).
errors (Optional[List[str]]) – pass an empty list to get a list of detailed error messages.
progress_func (Callable) – optional callback to dispatch or break/cancel the copy progress for large files. if the callback returns a non-empty value, it will be interpreted as the cancel reason, the copy process will be stopped, and an error will be returned.
progress_kwargs – optional additional kwargs passed to the progress function. the kwargs total_bytes and transferred_bytes will be updated before the callback.
- Return type: str
- Returns:
destination file name/stream as string or empty string on error.
Hint
this function extends the compatible Python functions
shutil.copyfileobj(), shutil.copyfile(), shutil.copy(), shutil.copy2() and http.server.SimpleHTTPRequestHandler.copyfile() with recoverability and a progress callback. it can also be used as an argument for the copy_function parameter of e.g. shutil.copytree() and shutil.move().
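the idea of a chunked, resumable copy with a cancelling progress callback can be sketched on in-memory streams. this is a simplified illustration and not the implementation of copy_bytes(): the function copy_in_chunks() and the callback stop_after_first() are hypothetical, and the sketch returns the cancel reason instead of a destination name.

```python
import io
from typing import BinaryIO, Callable, List, Optional


def copy_in_chunks(src: BinaryIO, dst: BinaryIO, *, transferred_bytes: int = 0,
                   total_bytes: int = 0, buf_size: int = 16384,
                   progress_func: Optional[Callable[..., str]] = None) -> str:
    """Copy src to dst chunk by chunk; return "" on completion or the cancel reason."""
    if transferred_bytes:
        src.seek(transferred_bytes)     # resume: skip what was copied before
        dst.seek(transferred_bytes)
    while True:
        chunk = src.read(buf_size)
        if not chunk:
            return ""                   # source exhausted: copy is complete
        dst.write(chunk)
        transferred_bytes += len(chunk)
        if progress_func:
            reason = progress_func(transferred_bytes=transferred_bytes, total_bytes=total_bytes)
            if reason:
                return reason           # a non-empty value cancels the copy


data = bytes(range(256)) * 100          # 25600 bytes of test content
source = io.BytesIO(data)
target = io.BytesIO()
seen: List[int] = []


def stop_after_first(transferred_bytes: int, total_bytes: int) -> str:
    """Cancel on the first callback only, to simulate an interrupted transfer."""
    seen.append(transferred_bytes)
    return "paused" if len(seen) == 1 else ""


# the first attempt is cancelled by the callback after one chunk
reason = copy_in_chunks(source, target, total_bytes=len(data), buf_size=10000,
                        progress_func=stop_after_first)
# the second attempt resumes at the number of bytes already in the target
resumed_from = len(target.getvalue())
copy_in_chunks(source, target, transferred_bytes=resumed_from, total_bytes=len(data),
               buf_size=10000, progress_func=stop_after_first)
```

determining the resume offset from the length of the destination is the same idea the transferred_bytes parameter describes for recoverable copies.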
- file_lines(file_path, encoding=None)
returning lines of the text file specified by file_path argument as tuple.
- file_transfer_progress(transferred_bytes, total_bytes=0, decimal_places=3)
return string to display the transfer progress of transferred bytes in short and user-readable format.
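one possible way to build such a short progress string is sketched below. the exact output format of file_transfer_progress() is not documented here, so the unit names, the 1024 divisor and the "transferred / total" layout of the hypothetical function transfer_progress_text() are assumptions.

```python
def short_size(num_bytes: int, decimal_places: int = 3) -> str:
    """Format a byte count with the largest unit that keeps the number below 1024."""
    units = ("Bytes", "KB", "MB", "GB")
    value = float(num_bytes)
    unit = units[0]
    for unit in units:
        if value < 1024 or unit == units[-1]:
            break
        value /= 1024
    text = f"{value:.{decimal_places}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")     # drop trailing zeros after the decimal point
    return f"{text} {unit}"


def transfer_progress_text(transferred_bytes: int, total_bytes: int = 0,
                           decimal_places: int = 3) -> str:
    """Return 'transferred / total', or only the transferred part if total is unknown."""
    progress = short_size(transferred_bytes, decimal_places)
    if total_bytes:
        progress += " / " + short_size(total_bytes, decimal_places)
    return progress


with_total = transfer_progress_text(1536, 3 * 1024 * 1024)
without_total = transfer_progress_text(512)
print(with_total)
print(without_total)
```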
- read_file_text(file_path, encoding=None, error_handling='ignore')
returning content of the text file specified by file_path argument as string, while suppressing exceptions.
- Return type: Optional[str]
- Returns:
the file contents as a string. if the file could not be decoded, found or opened, returns an empty string (if
error_handling is unspecified or set to ‘ignore’), otherwise None. this function catches and suppresses exceptions such as FileNotFoundError, OSError, PermissionError, and ValueError.
- write_file_text(text_or_lines, file_path, encoding=None)
write the passed text string or list of line strings into the text file specified by file_path argument.
- Parameters:
text_or_lines (Union[str, List[str], Tuple[str]]) – new file content either passed as string or list of line strings (will be concatenated with the line separator of the current OS: os.linesep).
file_path (str) – file path/name to write the passed content into (overwriting any previous content!).
encoding (Optional[str]) – encoding used to write/convert/interpret the file content to write.
- Return type: bool
- Returns:
True if the content got written to the file, False on error/exception. this function catches and suppresses exceptions such as FileExistsError, FileNotFoundError, OSError, PermissionError, and ValueError.
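the described behaviour (joining lines with os.linesep and reporting failure as False instead of raising) can be sketched as follows. the function write_text_safely() is a hypothetical stand-in, not the implementation of write_file_text(); opening the file with newline="" is an assumption made so that os.linesep is written unchanged.

```python
import os
import tempfile
from typing import List, Optional, Tuple, Union


def write_text_safely(text_or_lines: Union[str, List[str], Tuple[str, ...]],
                      file_path: str, encoding: Optional[str] = None) -> bool:
    """Write a string or a sequence of lines; return False instead of raising on errors."""
    if not isinstance(text_or_lines, str):
        text_or_lines = os.linesep.join(text_or_lines)
    try:
        # newline="" prevents a second newline translation of the joined text
        with open(file_path, "w", encoding=encoding, newline="") as file_handle:
            file_handle.write(text_or_lines)
    except (OSError, ValueError):       # OSError covers FileNotFoundError, PermissionError, ...
        return False
    return True


with tempfile.TemporaryDirectory() as folder:
    target = os.path.join(folder, "demo.txt")
    written = write_text_safely(["first line", "second line"], target, encoding="utf-8")
    with open(target, "rb") as file_handle:
        stored = file_handle.read()
    # the folder of this path does not exist, so the write fails without an exception
    failed = write_text_safely("text", os.path.join(folder, "missing_dir", "demo.txt"))
```

returning a boolean keeps the call site free of try/except blocks, at the price of hiding the reason for a failure.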
- class RegisteredFile(file_path, **kwargs)
Bases: object
represents a single file - see also registered file examples.
- __init__(file_path, **kwargs)
initialize the registered file instance.
- Parameters:
kwargs – not supported, only there for compatibility with CachedFile, to detect invalid kwargs.
- __eq__(other)
allow equality checks.
- Parameters:
other (Union[str, RegisteredFile, CachedFile, Path, PurePath, Any]) – another file object to compare this instance with.
- Return type: bool
- Returns:
True if both objects are of this type and contain a file with the same path, else False.
- __repr__()
for config var storage and eval recovery.
- Returns:
evaluable/recoverable representation of this object.
- __str__()
return the file path of the registered file.
- Returns:
file path string of this file object.
- __hash__ = None
- _default_object_loader(file_obj)
file object loader that opens the file and keeps the handle of the opened file.
- Parameters:
file_obj (Union[str, RegisteredFile, CachedFile, Path, PurePath, Any]) – file object (path string or object with a path attribute holding the complete file path).
- Returns:
file handle to the opened file.
- class CachedFile(file_path, object_loader=<function _default_object_loader>, late_loading=True)
Bases: RegisteredFile
represents a cacheable registered file object - see also cached file examples.
- __init__(file_path, object_loader=<function _default_object_loader>, late_loading=True)
create a cached file object instance.
- Parameters:
object_loader (Callable[[CachedFile], Any]) – callable converting the file_obj into a cached object (available via loaded_object).
late_loading (bool) – pass False to convert/load file_obj cache early, directly at instantiation.