ae.files
generic file object helpers
this namespace portion is pure Python and provides helpers for managing file objects and their content. it only depends on the ae.base namespace portion.
Hint
more helper functions to manage directory/folder structures are provided by the ae.paths portion.
the helper function copy_bytes() provides recoverable copies of binary files and file streams, with a progress callback for each copied chunk/buffer of bytes.
file_lines() and read_file_text() are helpers to read/load the content of text files. the function write_file_text() stores a string to a text file.
the helper function file_transfer_progress() formats the number of transferred bytes as a short, user-readable string, to be displayed as the progress of a file transfer.
RegisteredFile and CachedFile encapsulate and optionally cache the contents of a file within a file object. instances of these classes are compatible with the file objects provided by Python’s pathlib module. plain path strings can also be used as file objects (see also the FileObject type).
all these types of file objects are supported by the files register class FilesRegister from the ae.paths portion.
registered file
a registered file object represents a single file on your file system and can be instantiated from one of the classes RegisteredFile or CachedFile provided by this module/portion:
from ae.files import RegisteredFile
rf = RegisteredFile('path/to/the/file_name.extension')
assert str(rf) == 'path/to/the/file_name.extension'
assert rf.path == 'path/to/the/file_name.extension'
assert rf.stem == 'file_name'
assert rf.ext == '.extension'
assert rf.properties == {}
file properties are automatically attached to each file object instance via the instance attribute properties. in the last example this results in an empty dictionary, because the path of this file object does not contain any folder names with an underscore character.
file properties
file property names and values are automatically determined from the names of the sub-folders specified in the path attribute. every sub-folder name containing an underscore character, in the format <property-name>_<value>, will be interpreted as a file property:
rf = RegisteredFile('property1_69/property2_3.69/property3_whatever/file_name.ext')
assert rf.properties['property1'] == 69
assert rf.properties['property2'] == 3.69
assert rf.properties['property3'] == 'whatever'
the property value types int, float and string are recognized and converted automatically. boolean values can be specified as the integers 1 and 0.
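the conversion described above can be illustrated with a minimal, standalone sketch. it is not the implementation used by this portion: the helper names parse_value and folder_properties are hypothetical, and splitting each folder name at its first underscore is an assumption.

```python
from pathlib import PurePosixPath
from typing import Dict, Union

PropertyValue = Union[int, float, str]


def parse_value(text: str) -> PropertyValue:
    """try int first, then float, and fall back to the unchanged string."""
    for convert in (int, float):
        try:
            return convert(text)
        except ValueError:
            pass
    return text


def folder_properties(file_path: str) -> Dict[str, PropertyValue]:
    """collect <property-name>_<value> pairs from the folder names of file_path."""
    properties: Dict[str, PropertyValue] = {}
    # only the folder parts are inspected, the file name itself is skipped
    for folder in PurePosixPath(file_path).parent.parts:
        name, separator, value = folder.partition('_')   # assumption: split at first underscore
        if separator and name and value:
            properties[name] = parse_value(value)
    return properties


props = folder_properties('property1_69/property2_3.69/property3_whatever/file_name.ext')
```

in this sketch an underscore in the file name does not create a property, which matches the empty properties dictionary of the first example.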
cached file
a cached file, created from the CachedFile class, behaves like a registered file and additionally provides the possibility to cache parts or all of the file content as well as the file pointer of the opened file:
from ae.files import CachedFile
cf = CachedFile('integer_69/float_3.69/string_whatever/file_name.ext')
assert str(cf) == 'integer_69/float_3.69/string_whatever/file_name.ext'
assert cf.path == 'integer_69/float_3.69/string_whatever/file_name.ext'
assert cf.stem == 'file_name'
assert cf.ext == '.ext'
assert cf.properties['integer'] == 69
assert cf.properties['float'] == 3.69
assert cf.properties['string'] == 'whatever'
on instantiation of the CachedFile file object, the default file object loader function _default_object_loader() will be used, which opens a file stream via Python’s open() built-in. alternatively you can specify a custom file object loader with the object_loader parameter or by assigning a callable directly to the object_loader attribute:
# my_open_method stands for your own function that opens the file at the given path
cf = CachedFile('integer_69/float_3.69/string_whatever/file_name.ext',
                object_loader=lambda cached_file_obj: my_open_method(cached_file_obj.path))
the loaded (cached) object is accessible via the loaded_object attribute of the cached file object instance:
from io import TextIOWrapper
assert isinstance(cf.loaded_object, TextIOWrapper)
cf.loaded_object.seek(...)
cf.loaded_object.read(...)
cf.loaded_object.close()
Module Attributes
- file object type (see FileObject)
- types of file property values
- dict of file properties
- file name or file stream pointer
Functions
- copy_bytes(): recoverable copy of a file or stream (file-like object), optionally with progress callbacks.
- file_lines(): returning lines of the text file specified by file_path argument as tuple.
- file_transfer_progress(): return string to display the transfer progress of transferred bytes in short and user readable format.
- read_file_text(): returning content of the text file specified by file_path argument as string.
- write_file_text(): write the passed text string or list of line strings into the text file specified by file_path argument.
Classes
- CachedFile: represents a cacheable registered file object - see also cached file examples.
- RegisteredFile: represents a single file - see also registered file examples.
- FileObject
file object type, e.g. a file path str or any class or callable where the returned instance/value is either a string or an object with a stem attribute (holding the file name without extension), e.g. CachedFile, RegisteredFile, pathlib.Path or pathlib.PurePath.
alias of Union[str, RegisteredFile, CachedFile, Path, PurePath, Any]
- copy_bytes(src_file, dst_file, *, transferred_bytes=0, total_bytes=0, buf_size=16384, overwrite=False, move_file=False, recoverable=False, errors=None, progress_func=<function dummy_function>, **progress_kwargs)[source]
recoverable copy of a file or stream (file-like object), optionally with progress callbacks.
- Parameters:
src_file (Union[str, BinaryIO]) – source file name or opened stream (file-like) object. if you pass a non-seekable stream together with a non-zero value in transferred_bytes, then the source stream has to be set to the correct position before you call this function. if you pass any source stream, then the total file/stream size also has to be passed in the total_bytes parameter. source streams also do not support a True value in the move_file argument.
dst_file (Union[str, BinaryIO]) – destination file name or opened stream (file-like) object. for a destination stream, recoverable copies and copies with a True value in the overwrite argument are not allowed; always use a destination file name if you need a recoverable/overwriting copy.
transferred_bytes (int) – file offset at which the copy process starts. if not passed for recoverable copies, then copy_bytes will determine this value from the file length of the destination file.
total_bytes (int) – source file size in bytes (needed only if src_file is a stream).
buf_size (int) – size of the copy buffer/chunk in bytes (the amount that gets copied before each progress callback).
overwrite (bool) – pass True to allow overwriting the destination file. if this argument is not passed or is False and the destination file already exists, then this function will return an error.
move_file (bool) – pass True to delete the source file after the copy is complete (not supported if the source is a stream).
recoverable (bool) – pass True to allow a recoverable file copy (not supported if the destination is a stream).
errors (Optional[List[str]]) – pass an empty list to get a list of detailed error messages.
progress_func (Callable) – optional callback to dispatch the copy progress or to break/cancel the copy of large files. if the callback returns a non-empty value, it will be interpreted as the cancel reason; the copy process will be stopped and an error will be returned.
progress_kwargs – optional additional kwargs passed to the progress function. the kwargs total_bytes and transferred_bytes will be updated before each callback.
- Return type:
str
- Returns:
destination file name/stream as string or empty string on error.
Hint
this function extends the compatible Python functions shutil.copyfileobj(), shutil.copyfile(), shutil.copy(), shutil.copy2() and http.server.SimpleHTTPRequestHandler.copyfile() with recoverability and a progress callback. it can also be used as the argument for the copy_function parameter of e.g. shutil.copytree() and shutil.move().
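the chunked copy with progress callbacks and cancellation can be sketched with plain binary streams. this is a simplified illustration under stated assumptions, not the code of copy_bytes(): the function name copy_stream_sketch is hypothetical, it only handles already opened streams, and it returns a bool instead of the destination name.

```python
import io
from typing import BinaryIO, Callable, List, Optional


def copy_stream_sketch(src: BinaryIO, dst: BinaryIO, *, transferred_bytes: int = 0,
                       total_bytes: int = 0, buf_size: int = 16384,
                       errors: Optional[List[str]] = None,
                       progress_func: Callable[..., object] = lambda **_kwargs: None,
                       **progress_kwargs) -> bool:
    """copy src to dst in chunks; return True on success, False if cancelled."""
    if transferred_bytes and src.seekable():
        src.seek(transferred_bytes)              # resume: skip the bytes already copied
    while True:
        chunk = src.read(buf_size)
        if not chunk:
            return True                          # end of the source stream reached
        dst.write(chunk)
        transferred_bytes += len(chunk)
        # update the counters before each callback
        progress_kwargs.update(transferred_bytes=transferred_bytes, total_bytes=total_bytes)
        cancel_reason = progress_func(**progress_kwargs)
        if cancel_reason:                        # a non-empty return value cancels the copy
            if errors is not None:
                errors.append(f"copy cancelled: {cancel_reason}")
            return False


source = io.BytesIO(b"x" * 50000)
target = io.BytesIO()
reported = []
copied = copy_stream_sketch(source, target, total_bytes=50000,
                            progress_func=lambda **kw: reported.append(kw['transferred_bytes']))
```

the callback here only records the running byte count; returning a non-empty string from it would stop the loop and add a message to the errors list, if one got passed.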
- file_lines(file_path, encoding=None)[source]
returning lines of the text file specified by file_path argument as tuple.
- file_transfer_progress(transferred_bytes, total_bytes=0, decimal_places=3)[source]
return string to display the transfer progress of transferred bytes in short and user readable format.
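the exact output format of file_transfer_progress() is not shown here, so the following is only a sketch of how a short, user-readable progress string could be built. the function names short_size and progress_sketch, the unit names and the " / " separator are assumptions.

```python
def short_size(num_bytes: int, decimal_places: int = 3) -> str:
    """format a byte count with a binary unit, dropping trailing zeros."""
    units = ("Bytes", "KBytes", "MBytes", "GBytes", "TBytes")   # assumed unit names
    value = float(num_bytes)
    index = 0
    while value >= 1024 and index < len(units) - 1:
        value /= 1024
        index += 1
    text = f"{value:.{decimal_places}f}"
    if "." in text:
        text = text.rstrip("0").rstrip(".")      # 4.000 -> 4, 1.500 -> 1.5
    return f"{text} {units[index]}"


def progress_sketch(transferred_bytes: int, total_bytes: int = 0, decimal_places: int = 3) -> str:
    """return the transferred amount, followed by the total if it is known."""
    text = short_size(transferred_bytes, decimal_places)
    if total_bytes:
        text += " / " + short_size(total_bytes, decimal_places)
    return text


label = progress_sketch(1536, 4096)
```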
- read_file_text(file_path, encoding=None, error_handling='ignore')[source]
returning content of the text file specified by file_path argument as string.
- Parameters:
encoding (Optional[str]) – encoding used to load and convert/interpret the file content.
error_handling (str) – pass ‘strict’ or None to return None (instead of an empty string) for the cases where either a decoding ValueError exception or any OSError, FileNotFoundError or PermissionError exception got raised. the default value ‘ignore’ will ignore any decoding errors (missing some characters) and will return an empty string on any file/os exception.
- Return type:
Optional[str]
- Returns:
file content string. if the file could not be decoded, found or opened, then return an empty string or None (None only if ‘strict’ or None got passed to the error_handling parameter).
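the described error handling can be reproduced with the open() built-in alone. the following is a minimal sketch of that behaviour, not the source of read_file_text(); the function name read_text_sketch is hypothetical.

```python
from typing import Optional


def read_text_sketch(file_path: str, encoding: Optional[str] = None,
                     error_handling: Optional[str] = 'ignore') -> Optional[str]:
    """return the file content, or '' / None if it cannot be read or decoded."""
    try:
        # errors='ignore' drops undecodable bytes, errors=None behaves like 'strict'
        with open(file_path, encoding=encoding, errors=error_handling) as handle:
            return handle.read()
    except (OSError, ValueError):                # UnicodeDecodeError is a ValueError
        return "" if error_handling == 'ignore' else None
```

FileNotFoundError and PermissionError are subclasses of OSError, so a single except clause covers all the exceptions named above.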
- write_file_text(text_or_lines, file_path, encoding=None)[source]
write the passed text string or list of line strings into the text file specified by file_path argument.
- Parameters:
text_or_lines (Union[str, List[str], Tuple[str]]) – new file content either passed as string or list of line strings (will be concatenated with the line separator of the current OS: os.linesep).
file_path (str) – file path/name to write the passed content into (overwriting any previous content!).
encoding (Optional[str]) – encoding used to write/convert/interpret the file content to write.
- Return type:
bool
- Returns:
True if the content got written to the file, False on any file/OS error.
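a standalone sketch of the same idea, joining a list of lines with os.linesep and reporting file errors as False, could look like this. the function name write_text_sketch is hypothetical, and passing newline='' to open() is an assumption of this sketch to avoid a second newline translation.

```python
import os
from typing import List, Optional, Tuple, Union


def write_text_sketch(text_or_lines: Union[str, List[str], Tuple[str, ...]],
                      file_path: str, encoding: Optional[str] = None) -> bool:
    """write a string or a sequence of lines; return False on any file/OS error."""
    if not isinstance(text_or_lines, str):
        text_or_lines = os.linesep.join(text_or_lines)
    try:
        # newline='' disables newline translation, so os.linesep is written unchanged
        with open(file_path, 'w', encoding=encoding, newline='') as handle:
            handle.write(text_or_lines)
    except OSError:
        return False
    return True
```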
- class RegisteredFile(file_path, **kwargs)[source]
Bases:
object
represents a single file - see also registered file examples.
- __init__(file_path, **kwargs)[source]
initialize registered file_obj instance.
- Parameters:
kwargs – not supported; only accepted for compatibility with CachedFile, to detect invalid kwargs.
- __eq__(other)[source]
allow equality checks.
- Parameters:
other (Union[str, RegisteredFile, CachedFile, Path, PurePath, Any]) – other file object to compare this instance with.
- Return type:
bool
- Returns:
True if both objects are of this type and contain a file with the same path, else False.
- __repr__()[source]
for config var storage and eval recovery.
- Returns:
evaluable/recoverable representation of this object.
- __hash__ = None
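the combination of these three members follows standard Python behaviour: a class that defines __eq__() without defining __hash__() gets its __hash__ set to None and its instances become unhashable. the small class PathFile below is a hypothetical stand-in used to show this, together with an evaluable __repr__(); it is not the RegisteredFile class.

```python
class PathFile:
    """hypothetical stand-in holding only a file path."""

    def __init__(self, path: str):
        self.path = path

    def __eq__(self, other):
        # equal only if the other object has the same type and the same path
        return isinstance(other, PathFile) and other.path == self.path

    def __repr__(self):
        # evaluable representation: eval(repr(obj)) recreates an equal instance
        return f"{self.__class__.__name__}({self.path!r})"


first = PathFile('folder/file_name.ext')
restored = eval(repr(first))     # recover the instance from its representation
```

because instances of such a class are unhashable, they cannot be used as dictionary keys or set members; use the path string for that purpose.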
- _default_object_loader(file_obj)[source]
file object loader that is opening the file and keeping the handle of the opened file.
- Parameters:
file_obj (Union[str, RegisteredFile, CachedFile, Path, PurePath, Any]) – file object (path string or object with a path attribute holding the complete file path).
- Returns:
file handle to the opened file.
- class CachedFile(file_path, object_loader=<function _default_object_loader>, late_loading=True)[source]
Bases:
RegisteredFile
represents a cacheable registered file object - see also cached file examples.
- __init__(file_path, object_loader=<function _default_object_loader>, late_loading=True)[source]
create cached file object instance.
- Parameters:
object_loader (Callable[[CachedFile], Any]) – callable converting the file_obj into a cached object (available via loaded_object).
late_loading (bool) – pass False to convert/load the file_obj cache early, directly at instantiation.
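the difference between late and early loading can be shown with a small stand-in class. this is a sketch of the general lazy-loading pattern, assuming the loader is called at most once per instance; the class LazyLoadedFile and the helper upper_case_loader are hypothetical and not part of this portion.

```python
from typing import Any, Callable, List

_NOT_LOADED = object()           # sentinel, so that a loader may also return None


class LazyLoadedFile:
    """hypothetical stand-in, not the CachedFile implementation."""

    def __init__(self, path: str, object_loader: Callable[["LazyLoadedFile"], Any],
                 late_loading: bool = True):
        self.path = path
        self.object_loader = object_loader
        self._loaded_object: Any = _NOT_LOADED
        if not late_loading:
            self._loaded_object = object_loader(self)        # load early, at instantiation

    @property
    def loaded_object(self) -> Any:
        if self._loaded_object is _NOT_LOADED:
            self._loaded_object = self.object_loader(self)   # load on first access
        return self._loaded_object


loader_calls: List[str] = []


def upper_case_loader(file_obj: LazyLoadedFile) -> str:
    """example loader that records its calls and returns the upper-cased path."""
    loader_calls.append(file_obj.path)
    return file_obj.path.upper()


late = LazyLoadedFile('folder/late.txt', upper_case_loader)
calls_before_access = list(loader_calls)      # still empty: nothing loaded yet
late_value = late.loaded_object               # first access triggers the loader
early = LazyLoadedFile('folder/early.txt', upper_case_loader, late_loading=False)
```

late loading avoids opening files that are registered but never used, while early loading surfaces loader errors directly at instantiation.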