datasetgen.utils
Module Contents
-
datasetgen.utils.str2bool(v: str) → bool
Converts a string that represents a boolean into the corresponding bool value.
- Parameters
v (str) – input string
- Raises
ArgumentTypeError – if the string does not represent a boolean value
- Returns
the boolean value represented by the string
- Return type
bool
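The signature and the ArgumentTypeError suggest that this function is meant to be used as an argparse type converter. The following is a minimal sketch of that pattern, not the library's implementation: the name str2bool_sketch, the accepted spellings, and the --verbose flag are all assumptions made for illustration.

```python
import argparse

# Hypothetical stand-in for datasetgen.utils.str2bool. The accepted
# spellings below are an assumption; the source does not list them.
_TRUE_WORDS = {"yes", "true", "t", "y", "1"}
_FALSE_WORDS = {"no", "false", "f", "n", "0"}


def str2bool_sketch(v: str) -> bool:
    """Return the bool that `v` spells out, or raise ArgumentTypeError."""
    lowered = v.strip().lower()
    if lowered in _TRUE_WORDS:
        return True
    if lowered in _FALSE_WORDS:
        return False
    raise argparse.ArgumentTypeError(f"Boolean value expected, got {v!r}")


# Usage as an argparse type converter (the --verbose flag is hypothetical).
parser = argparse.ArgumentParser()
parser.add_argument("--verbose", type=str2bool_sketch, default=False)
args = parser.parse_args(["--verbose", "yes"])
print(args.verbose)  # True
```

Passing the converter through `type=` lets argparse report unrecognised spellings as a normal usage error instead of an unhandled traceback.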
-
datasetgen.utils.gen_random_sizes(num_files: int, min_file_size: int, max_file_size: int) → list
Generates a list of file sizes, one per file, using a random distribution.
- Parameters
num_files (int) – total number of files
min_file_size (int) – minimum file size
max_file_size (int) – maximum file size
- Returns
list of file sizes
- Return type
list
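The source does not say which random distribution is used or what unit the sizes are in. As a rough sketch only, the version below assumes a uniform distribution with inclusive bounds; the name gen_random_sizes_sketch is hypothetical and the real function may behave differently.

```python
import random


def gen_random_sizes_sketch(num_files: int, min_file_size: int, max_file_size: int) -> list:
    """Return `num_files` sizes between the two bounds.

    Assumption: sizes are drawn uniformly and both bounds are inclusive.
    """
    return [random.randint(min_file_size, max_file_size) for _ in range(num_files)]


random.seed(0)  # fixed seed so repeated runs give the same list
sizes = gen_random_sizes_sketch(5, 1024, 4096)
print(len(sizes), min(sizes) >= 1024, max(sizes) <= 4096)
```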
-
datasetgen.utils.gen_in_range_random_sizes(num_files: int, min_file_size: int, max_file_size: int) → list
Generates a list of sizes that follows the use case distribution.
- Parameters
num_files (int) – total number of files
min_file_size (int) – minimum file size
max_file_size (int) – maximum file size
- Returns
list of file sizes
- Return type
list
-
datasetgen.utils.gen_random_files(num_files: int, min_file_size: int, max_file_size: int, size_generator_function: str = 'gen_in_range_random_sizes', start_from: int = 0) → dict
Generates a dict of random files, each with a random size.
- Parameters
num_files (int) – total number of files
min_file_size (int) – minimum file size
max_file_size (int) – maximum file size
size_generator_function (str, optional) – function to use to generate file sizes, defaults to ‘gen_in_range_random_sizes’
start_from (int, optional) – filename reference index, defaults to 0
- Raises
Exception – if the size generator function does not exist
- Returns
dictionary with filenames and their sizes
- Return type
dict
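The parameters above describe a function that looks up a size generator by name, numbers the files starting at start_from, and maps each filename to a size. The sketch below shows one way this could work. It rests on several assumptions: the names gen_random_files_sketch, uniform_sizes, and SIZE_GENERATORS are hypothetical, the `file_<index>` filename format is invented for illustration, and the lookup mechanism in the real library is not documented here.

```python
import random


def uniform_sizes(num_files: int, min_file_size: int, max_file_size: int) -> list:
    """Hypothetical size generator: uniform sizes, inclusive bounds."""
    return [random.randint(min_file_size, max_file_size) for _ in range(num_files)]


# Assumption: generators are resolved by name through a registry.
SIZE_GENERATORS = {"uniform_sizes": uniform_sizes}


def gen_random_files_sketch(
    num_files: int,
    min_file_size: int,
    max_file_size: int,
    size_generator_function: str = "uniform_sizes",
    start_from: int = 0,
) -> dict:
    """Map generated filenames to sizes produced by the named generator."""
    generator = SIZE_GENERATORS.get(size_generator_function)
    if generator is None:
        raise Exception(
            f"Size generator function {size_generator_function!r} does not exist"
        )
    sizes = generator(num_files, min_file_size, max_file_size)
    # The "file_<index>" naming scheme is an assumption for illustration.
    return {f"file_{start_from + i}": size for i, size in enumerate(sizes)}


files = gen_random_files_sketch(3, 10, 20, start_from=5)
print(list(files))  # ['file_5', 'file_6', 'file_7']
```

In this sketch, a non-zero start_from lets a second call continue the numbering of an earlier batch, so the two resulting dicts can be merged without filename collisions.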