dgenerate submodules

dgenerate.arguments module

Argument parsing for the dgenerate command line tool.

exception dgenerate.arguments.DgenerateHelpException[source]

Bases: Exception

Raised by parse_args() and parse_known_args() when --help is encountered and help_raises=True

exception dgenerate.arguments.DgenerateUsageError[source]

Bases: Exception

Raised by parse_args() and parse_known_args() on argument usage errors.

class dgenerate.arguments.DgenerateArguments[source]

Bases: RenderLoopConfig

Represents dgenerate’s parsed command line arguments. It can be used as a configuration object for dgenerate.renderloop.RenderLoop.

__init__()[source]

global_config: str | None = None: global config file path.

guidance_scales: _types.Floats: List of floating point guidance scales, this corresponds to the --guidance-scales argument of the dgenerate command line tool.

inference_steps: _types.Integers: List of inference steps values, this corresponds to the --inference-steps argument of the dgenerate command line tool.

plugin_module_paths: Sequence[str]: Plugin module paths --plugin-modules

prompts: _prompt.Prompts: List of prompt objects, this corresponds to the --prompts argument of the dgenerate command line tool.

seeds: _types.Integers: List of integer seeds, this corresponds to the --seeds argument of the dgenerate command line tool.

verbose: bool = False: Enable debug output? -v/--verbose

dgenerate.arguments.config_attribute_name_to_option(name)[source]

Convert an attribute name of DgenerateArguments into its command line option name.

Parameters:: name – the attribute name
Returns:: the command line argument name as a string

dgenerate.arguments.get_raw_help_text(option: str) → str[source]

Get the raw help text for a given command line option.

This text will not be formatted in any way, and may be indented as is defined in source code.

You should utilize inspect.cleandoc() and dgenerate.textprocessing.wrap_paragraphs() to format the text if displaying it to the user is intended.

Parameters:: option – The command line option name, short or long opt.
Returns:: The help text for the option.
Raises:: ValueError – If the option is not valid.
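A minimal sketch of the formatting step described above, using only the standard library. It assumes paragraphs are separated by blank lines, and it uses textwrap.fill() as a stand-in for dgenerate.textprocessing.wrap_paragraphs(), whose signature is not shown here. The raw string is a made-up sample, not real dgenerate help text.

```python
import inspect
import textwrap


def format_raw_help(raw: str, width: int = 60) -> str:
    """Dedent raw help text, then wrap each paragraph to `width` columns."""
    # cleandoc strips the common indentation and surrounding blank lines.
    cleaned = inspect.cleandoc(raw)
    # Assumption: paragraphs are separated by one blank line.
    paragraphs = cleaned.split("\n\n")
    # Collapse internal whitespace so leftover indentation does not survive wrapping.
    return "\n\n".join(
        textwrap.fill(" ".join(p.split()), width=width) for p in paragraphs
    )


# Hypothetical stand-in for a string returned by get_raw_help_text().
raw = """
    List of integer seeds.

        Indented continuation text
        defined in source code.
    """
print(format_raw_help(raw))
```

In real use, the raw string would come from get_raw_help_text() called with a valid short or long option name.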

dgenerate.arguments.is_valid_option(option: str)[source]

Check if an option string is a valid option name in the parser.

Parameters:: option – The option name, short or long opt.
Returns:: True or False

dgenerate.arguments.parse_args(args: Sequence[str] | None = None, overrides: dict[str, str] | None = None, throw: bool = True, log_error: bool = True, help_raises: bool = False) → DgenerateArguments | None[source]

Parse dgenerate’s command line arguments and return a configuration object.

Parameters:

args – arguments list, as in args taken from sys.argv, or in that format
overrides – Optional dictionary of overrides to apply to the DgenerateArguments object after parsing but before validation, this should consist of attribute names with values.
throw – throw DgenerateUsageError on error? defaults to True
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – --help raises dgenerate.arguments.DgenerateHelpException ? When True, this will occur even if throw=False

Raises:

DgenerateUsageError –
DgenerateHelpException –

Returns:

DgenerateArguments. If throw=False then None will be returned on errors.
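The throw=False behavior can be sketched as a small wrapper that turns the None result into an exit code. The parser is passed in as a parameter so the sketch does not depend on dgenerate being importable; run_cli and fake_parse are hypothetical names used only for illustration, and in real use the parse argument would be dgenerate.arguments.parse_args.

```python
from typing import Callable, Optional, Sequence


def run_cli(argv: Sequence[str], parse: Callable[..., Optional[object]]) -> int:
    """Return a process exit code: 0 if parsing succeeded, 1 otherwise."""
    # throw=False asks the parser to return None on usage errors
    # instead of raising DgenerateUsageError.
    config = parse(argv, throw=False, log_error=False)
    if config is None:
        return 1
    return 0


def fake_parse(args, throw=True, log_error=True):
    """Fake parser used only to exercise the sketch; it rejects empty input."""
    return None if not args else {"args": list(args)}


print(run_cli(["model", "--prompts", "a cat"], fake_parse))
print(run_cli([], fake_parse))
```

Note that, per the parameter description, help_raises=True would still raise DgenerateHelpException even with throw=False, so a caller enabling it needs a separate except clause.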

dgenerate.arguments.parse_device(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[str | None, list[str]][source]

Retrieve the -d/--device argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Raises:

DgenerateUsageError – If no argument value was provided.

Returns:

(value | None, unknown_args_list)
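The (value | None, unknown_args_list) return shape resembles what argparse's parse_known_args() produces. The following sketch reproduces that shape with the standard library only; it is an illustration of the calling convention, not dgenerate's implementation, and extract_device is a hypothetical name.

```python
import argparse
from typing import Optional, Sequence


def extract_device(args: Sequence[str]) -> tuple[Optional[str], list[str]]:
    """Pull -d/--device out of args; return (value or None, remaining args)."""
    parser = argparse.ArgumentParser(add_help=False, allow_abbrev=False)
    parser.add_argument("-d", "--device", default=None)
    # parse_known_args leaves anything it does not recognize in `unknown`.
    known, unknown = parser.parse_known_args(list(args))
    return known.device, unknown


value, rest = extract_device(["model", "--device", "cuda:1", "--seeds", "1", "2"])
print(value, rest)
```

The unknown argument list can then be handed to the next parser, which is the usual reason for returning it alongside the value.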

dgenerate.arguments.parse_directives_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --directives-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_functions_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --functions-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_image_processor_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --image-processor-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_known_args(args: Sequence[str] | None = None, throw: bool = True, log_error: bool = True, no_model: bool = True, no_help: bool = True, help_raises: bool = False) → tuple[DgenerateArguments, list[str]] | None[source]

Parse only known arguments off the command line.

Ignores dgenerate’s only required argument model_path by default.

No logical validation is performed, DgenerateArguments.check() is not called by this function, only argument parsing and simple type validation is performed by this function.

Parameters:

args – arguments list, as in args taken from sys.argv, or in that format
throw – throw DgenerateUsageError on error? defaults to True
log_error – Write ERROR diagnostics with dgenerate.messages?
no_model – Remove the model_path argument from the parser.
no_help – Remove the --help argument from the parser.
help_raises – --help raises dgenerate.arguments.DgenerateHelpException ? When True, this will occur even if throw=False

Raises:

DgenerateUsageError – on argument error (simple type validation only)
DgenerateHelpException –

Returns:

(DgenerateArguments, unknown_args_list). If throw=False then None will be returned on errors.

dgenerate.arguments.parse_latents_processor_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --latents-processor-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_offline_mode(args: Sequence[str] | None = None) → tuple[bool, list[str]][source]

Parse out -ofm/--offline-mode

Parameters:: args – command line arguments
Raises:: DgenerateUsageError – If no argument value was provided.
Returns:: (value, unknown_args_list)

dgenerate.arguments.parse_plugin_modules(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --plugin-modules argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Raises:

DgenerateUsageError – If no argument values were provided.

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_prompt_upscaler_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --prompt-upscaler-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_prompt_weighter_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --prompt-weighter-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_quantizer_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --quantizer-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_sub_command(args: Sequence[str] | None = None) → tuple[str | None, list[str]][source]

Retrieve the --sub-command argument value

Parameters:: args – command line arguments
Raises:: DgenerateUsageError – If no argument value was provided.
Returns:: (value | None, unknown_args_list)

dgenerate.arguments.parse_sub_command_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --sub-command-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_templates_help(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[list[str] | None, list[str]][source]

Retrieve the --templates-help argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(values | None, unknown_args_list)

dgenerate.arguments.parse_verbose(args: Sequence[str] | None = None, throw_unknown: bool = False, log_error: bool = False) → tuple[bool, list[str]][source]

Retrieve the -v/--verbose argument value

Parameters:

args – command line arguments
throw_unknown – Raise DgenerateUsageError if any other specified argument is not a valid dgenerate argument? This treats the primary model argument as optional, and only goes into effect if the specific argument is detected.
log_error – Write ERROR diagnostics with dgenerate.messages?

Returns:

(value, unknown_args_list)

dgenerate.auto1111_metadata module

This module provides functionality to embed Automatic1111 metadata into images. This metadata can be converted from dgenerate configs produced by --output-configs, or images with dgenerate metadata attached via --output-metadata.

exception dgenerate.auto1111_metadata.Auto1111MetadataCreationError[source]

Bases: Exception

Exception raised when there is an error creating Automatic1111 metadata.

dgenerate.auto1111_metadata.convert_and_insert_metadata(image_path: str, output_path: str | None = None, dgenerate_config: str | None = None, local_files_only: bool = False)[source]

Convert a dgenerate config to Automatic1111 metadata and add it to an image.

This function reads the dgenerate config file or existing dgenerate image metadata and converts it into Automatic1111 metadata format, then sets it to the image’s EXIF data, or to a copy of that image at output_path.

This operation will destroy existing EXIF data on JPEGs, and PNGs will have their parameters metadata field set to the Automatic1111 metadata format, which will overwrite anything there. The DgenerateConfig field in PNGs will be preserved.

Parameters:

image_path – input image path, this can be a JPEG or PNG file.
output_path – output image path, this can be a JPEG or PNG file, if not provided the input image will be modified.
dgenerate_config – dgenerate config text produced by --output-configs, in the case that the image does not contain metadata produced by --output-metadata. This is not a file path, it should be the config text itself as a string.
local_files_only – if True, do not download any files, only use local files and cache.

Raises:

Auto1111MetadataCreationError – if there is an error creating or writing the metadata.

dgenerate.auto1111_metadata.get_checkpoint_hash_cache() → str[source]

Get the default model hash cache directory.

Checkpoint hashes are used for Automatic1111 metadata to provide information about the models involved in a generation, this information is cached for performance.

If the environmental variable DGENERATE_CACHE is set, this is instead its value joined with auto1111_metadata/cache.db.

Returns:: string (directory path)
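The DGENERATE_CACHE case can be sketched as a plain path join. This is an illustration only: cache_path_from_env is a hypothetical helper, and because the default location used when the variable is unset is not stated here, the sketch returns None in that case rather than guessing.

```python
import os
from typing import Mapping, Optional


def cache_path_from_env(env: Mapping[str, str]) -> Optional[str]:
    """Join DGENERATE_CACHE with auto1111_metadata/cache.db, if it is set."""
    base = env.get("DGENERATE_CACHE")
    if not base:
        # The default location is not documented in this section,
        # so the sketch reports "unknown" as None.
        return None
    return os.path.join(base, "auto1111_metadata", "cache.db")


print(cache_path_from_env({"DGENERATE_CACHE": "/tmp/dgenerate"}))
```

Passing os.environ as the mapping would check the real process environment.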

dgenerate.batchprocess module

Batch processing / dgenerate config scripts.

exception dgenerate.batchprocess.BatchProcessError[source]

Bases: Exception

Thrown by BatchProcessor.run_file() and BatchProcessor.run_string() when an error in a batch processing script is encountered.

exception dgenerate.batchprocess.ConfigRunnerPluginArgumentError[source]

Bases: PluginArgumentError

Thrown when a dgenerate.batchprocess.ConfigRunnerPlugin plugin is not instantiated with correct arguments.

exception dgenerate.batchprocess.ConfigRunnerPluginNotFoundError[source]

Bases: PluginNotFoundError

Thrown when ConfigRunnerPluginLoader cannot find any dgenerate.batchprocess.ConfigRunnerPlugin implementation for a specified name.

class dgenerate.batchprocess.Auto1111MetadataDirective(**kwargs)[source]

Bases: ConfigRunnerPlugin

__init__(**kwargs)[source]

Parameters:: kwargs – plugin base class arguments

class dgenerate.batchprocess.BatchProcessor(invoker: Callable[[str, Sequence[str]], int] | Callable[[Sequence[str]], int], name: str, version: tuple[int, int, int] | str, template_variables: dict[str, Any] | None = None, reserved_template_variables: set[str] | None = None, template_functions: dict[str, Callable[[Any], Any]] | None = None, directives: dict[str, Callable[[list], None]] | None = None, builtins: dict[str, Callable[[Any], Any]] | None = None, injected_args: Sequence[str] | None = None, disable_directives: bool = False)[source]

Bases: object

Implements dgenerate’s batch processing scripts in a generified manner.

This is the bare-bones implementation of the shell with nothing implemented for you except for:

\env

\set

\sete

\setp

\unset

\unset_env

\print

\echo

\reset_lineno

If you wish to create this object to run a dgenerate configuration, use dgenerate.batchprocess.ConfigRunner

static default_builtins() → dict[str, Callable[[Any], Any]][source]: Return the default builtins available as template functions.

__init__(invoker: Callable[[str, Sequence[str]], int] | Callable[[Sequence[str]], int], name: str, version: tuple[int, int, int] | str, template_variables: dict[str, Any] | None = None, reserved_template_variables: set[str] | None = None, template_functions: dict[str, Callable[[Any], Any]] | None = None, directives: dict[str, Callable[[list], None]] | None = None, builtins: dict[str, Callable[[Any], Any]] | None = None, injected_args: Sequence[str] | None = None, disable_directives: bool = False)[source]

Parameters:

invoker – A function for invoking lines recognized as shell commands; it should return a return code. This can be a function that accepts only a sequence of arguments, or a function that accepts the raw command line as a string followed by the parsed sequence of arguments.
name – The name of this batch processor, currently used in the version check directive and messages
version – Version for version check hash bang directive.
template_variables – Live template variables, the initial environment, this dictionary will be modified during runtime.
reserved_template_variables – These template variable names cannot be set with the \set or \setp directive, or un-defined with the \unset directive.
template_functions – Functions available to Jinja2
directives – batch processing directive handlers, for: \directives. This is a dictionary of names to functions which accept a single parameter, a list of directive arguments, and return a return code.
builtins – builtin functions available as template functions and \setp functions. A safe default collection of functions is used if this is not specified. Builtins may be overridden by functions defined in template_functions
injected_args – Arguments to be injected at the end of user specified arguments for every shell invocation. If -v/--verbose is present in injected_args debugging output will be enabled globally while the config runs, and not just for invocations. Passing -v/--verbose also enables printing stack traces for all unhandled directive exceptions to stderr.
disable_directives – If True, disables the use of all directives, including the built-in ones. This also disables template continuations (lines starting with “{”), which are a form of directive.
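The two accepted invoker shapes can be sketched as plain functions. The function names, the "demo" name, and the version value are placeholders, and treating 0 as success for invokers is an assumption made by analogy with directive return codes.

```python
from typing import Sequence

# Records each invocation so the behavior is easy to inspect.
calls: list[tuple[str, list[str]]] = []


def args_only_invoker(args: Sequence[str]) -> int:
    """Invoker form 1: receives only the parsed arguments."""
    calls.append(("", list(args)))
    return 0  # return code; 0 is assumed to mean success


def raw_and_args_invoker(command_line: str, args: Sequence[str]) -> int:
    """Invoker form 2: receives the raw command line, then the parsed arguments."""
    calls.append((command_line, list(args)))
    return 0


# Keyword arguments mirroring the constructor signature (placeholder values).
processor_kwargs = dict(
    invoker=raw_and_args_invoker,
    name="demo",
    version=(1, 0, 0),
)

raw_and_args_invoker("model --seeds 1", ["model", "--seeds", "1"])
print(calls[-1])
```

With dgenerate installed, these keyword arguments would presumably be passed as BatchProcessor(**processor_kwargs); the remaining constructor parameters all have defaults.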

render_template(string: str, stream: bool = False) → str | Iterator[str][source]

Render a template from a string

Parameters:

string – the string containing the Jinja2 template.
stream – Stream the results of generating this template line by line?

Returns:

rendered string

run_file(stream: Iterator[str])[source]

Process a batch processing script from a file stream.

Technically, from an iterator over lines of text.

Raises:: BatchProcessError –
Parameters:: stream – A filestream in text read mode
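Because run_file() only needs an iterator over lines of text, an in-memory script can be adapted without touching the filesystem. A minimal sketch using io.StringIO follows; lines_from_text is a hypothetical helper, and the sample script lines are illustrative rather than verified directive syntax.

```python
import io
from typing import Iterator


def lines_from_text(text: str) -> Iterator[str]:
    """Yield lines (line endings included) the way a text-mode file object does."""
    return iter(io.StringIO(text))


script = "\\print hello\n\\echo world\n"
for line in lines_from_text(script):
    print(repr(line))
```

An open file object in text read mode behaves the same way, which is why either can be supplied as the stream argument.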

run_string(string: str)[source]

Process a batch processing script from a string

Raises:: BatchProcessError –
Parameters:: string – a string containing the script

user_define(name: str, value: Any)[source]

Define a template variable as if you were the user.

Raises:

BatchProcessError – if the specified variable name cannot be defined by the user due to not being a valid identifier string, being the name of a template function, being the name of a reserved template variable, or being the name of a builtin function.

Parameters:

name – Variable name
value – Assigned value

user_define_check(name: str)[source]

Check if a template variable can be defined by the user, raise if not.

Raises:: BatchProcessError – if the specified variable name cannot be defined by the user due to not being a valid identifier string, being the name of a template function, being the name of a reserved template variable, or being the name of a builtin function.
Parameters:: name – Variable name
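The four conditions listed above can be sketched as an ordered series of checks. This is an illustrative re-implementation, not dgenerate's code: DemoBatchProcessError stands in for BatchProcessError, the use of str.isidentifier() for "valid identifier string" is an assumption, and the sample names are hypothetical.

```python
from typing import Collection


class DemoBatchProcessError(Exception):
    """Stand-in for dgenerate.batchprocess.BatchProcessError."""


def check_user_definable(
    name: str,
    template_functions: Collection[str],
    reserved_variables: Collection[str],
    builtins: Collection[str],
) -> None:
    """Raise if `name` fails any of the four documented conditions."""
    if not name.isidentifier():
        raise DemoBatchProcessError(f"{name!r} is not a valid identifier")
    if name in template_functions:
        raise DemoBatchProcessError(f"{name!r} is a template function")
    if name in reserved_variables:
        raise DemoBatchProcessError(f"{name!r} is a reserved template variable")
    if name in builtins:
        raise DemoBatchProcessError(f"{name!r} is a builtin function")


# Hypothetical name sets; this call passes silently.
check_user_definable("my_var", {"quote"}, {"last_images"}, {"len"})
```

Separating the check from the assignment, as user_define_check() and user_define() do, lets a caller validate a name before committing any change.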

user_set(name: str, value: str)[source]

Set a template variable as if you were using the set directive.

This applies template expansion and environmental variable expansion to both the name and value, then sets the template variable.

Raises:

BatchProcessError – if the specified variable name cannot be defined by the user due to validation errors.

Parameters:

name – Variable name (can contain template expressions)
value – Variable value (can contain template expressions and env vars)

user_setp(name: str, expression: str)[source]

Set a template variable to the result of evaluating a Python expression as if you were using the setp directive.

This applies template expansion and environmental variable expansion to the name, then evaluates the expression as Python and sets the template variable to the result.

Raises:

BatchProcessError – if the specified variable name cannot be defined by the user due to validation errors, or if the expression cannot be evaluated.

Parameters:

name – Variable name (can contain template expressions)
expression – Python expression to evaluate (can contain template expressions and env vars)

user_undefine(name: str)[source]

Undefine a template variable as if you were the user.

Raises:: BatchProcessError – if the specified variable name cannot be undefined by the user due to not being a valid identifier string, being the name of a template function, being the name of a reserved template variable, being the name of a builtin function, or not naming an existing template variable.
Parameters:: name – Variable name

user_undefine_check(name: str)[source]

Check if a template variable can be undefined by the user, raise if not.

Raises:: BatchProcessError – if the specified variable name cannot be undefined by the user due to not being a valid identifier string, being the name of a template function, being the name of a reserved template variable, being the name of a builtin function, or not naming an existing template variable.
Parameters:: name – Variable name

builtins: dict[str, Callable[[Any], Any]]

Safe python builtins that are always available as template functions and also usable with \setp

They may be overridden by functions defined in dgenerate.batchprocess.BatchProcessor.template_functions

property current_line: int: The current line number in the file being processed.

directives: dict[str, Callable[[Sequence[str]], int]] | None

Batch process directives, shell commands starting with a backslash.

Dictionary of callable(list) -> int.

The function should return a return code, 0 for success, anything else for failure.

property directives_builtins_help: dict[str, str]: Returns a dictionary of help strings for directives that are built into the BatchProcessor base class.

disable_directives: bool = False

If True, disables the use of all directives, including the built-in ones.

This also disables template continuations (lines starting with “{”), which are a form of directive.

property executing_text: None | str: The text / command line that is currently being executed, or that was last executed.

expand_vars: Callable[[str], str]: A function for expanding environmental variables, defaults to dgenerate.textprocessing.shell_expandvars()

injected_args: Sequence[str]: Shell arguments to inject at the end of every invocation.

invoker: Callable[[str, Sequence[str]], int] | Callable[[Sequence[str]], int]

Invoker function, responsible for executing lines recognized as shell commands.

This can be a function that just accepts a sequence of arguments, or a function that accepts the raw command line as a string followed by the parsed sequence of arguments.

name: str: Name of this batch processor, currently used in the hash bang version check directive and messages.

reserved_template_variables: set[str]: These template variables cannot be set with the \set, \sete, or \setp directive, or un-defined with the \unset directive.

property running_template_continuation: bool: Is code that exists inside a template continuation being processed? Returns True or False.

property template_continuation_end_line: int

End line of the template continuation being processed.

Value is only valid if BatchProcessor.running_template_continuation is True.

Returns:: line number

property template_continuation_start_line: int

Start line of the template continuation being processed.

Value is only valid if BatchProcessor.running_template_continuation is True.

Returns:: line number

template_functions: dict[str, Callable[[Any], Any]]: Functions available when templating is occurring.

template_variables: dict[str, Any]: Live template variables.

version: tuple[int, int, int]: Version tuple for the version check hash bang directive.

class dgenerate.batchprocess.CivitAILinksDirective(**kwargs)[source]

Bases: ConfigRunnerPlugin

__init__(**kwargs)[source]

Parameters:: kwargs – plugin base class arguments

class dgenerate.batchprocess.ConfigRunner(injected_args: Sequence[str] | None = None, render_loop: RenderLoop | None = None, plugin_loader: ConfigRunnerPluginLoader = None, version: tuple[int, int, int] | str = '5.0.0', throw: bool = False)[source]

Bases: BatchProcessor

A BatchProcessor that can run dgenerate batch processing configs from a string or file.

__init__(injected_args: Sequence[str] | None = None, render_loop: RenderLoop | None = None, plugin_loader: ConfigRunnerPluginLoader = None, version: tuple[int, int, int] | str = '5.0.0', throw: bool = False)[source]

Raises:

dgenerate.plugin.ModuleFileNotFoundError – If a module path parsed from --plugin-modules in injected_args could not be found on disk.

Parameters:

injected_args – dgenerate command line arguments in the form of a list, see: shlex module, or sys.argv. These arguments will be injected at the end of every dgenerate invocation in the config. --plugin-modules are parsed from injected_args and added to plugin_loader. If -v/--verbose is present in injected_args debugging output will be enabled globally while the config runs, and not just for invocations. Passing -v/--verbose also enables printing stack traces for all unhandled directive exceptions to stderr. If -ofm/--offline-mode is present in injected_args, plugins will be instructed to only look for resources such as models in cache / on disk and never attempt to download from the internet.
render_loop – RenderLoop instance, if None is provided one will be created.
plugin_loader – Batch processor plugin loader, if one is not provided one will be created.
version – Config version for #! dgenerate x.x.x version checks, defaults to dgenerate.__version__
throw – Whether to throw exceptions from dgenerate.invoker.invoke_dgenerate() or handle them. If you set this to True exceptions will propagate out of dgenerate invocations instead of a dgenerate.batchprocess.BatchProcessError being raised by the created dgenerate.batchprocess.BatchProcessor. A line number where the error occurred can be obtained using dgenerate.batchprocess.BatchProcessor.current_line.

generate_directives_help(directive_names: Collection[str] | None = None, help_wrap_width: int | None = None)[source]

Generate the help string for --directives-help

Parameters:

directive_names – Display help for specific directives, if None or [] is specified, display all.
help_wrap_width – Wrap documentation strings by this amount, if None use dgenerate.textprocessing.long_text_wrap_width()

Raises:

ValueError – if given directive names could not be found

Returns:

help string

generate_functions_help(function_names: Collection[str] | None = None, help_wrap_width: int | None = None)[source]

Generate the help string for --functions-help

Parameters:

function_names – Display help for specific functions, if None or [] is specified, display all.
help_wrap_width – Wrap documentation strings by this amount, if None use dgenerate.textprocessing.long_text_wrap_width()

Raises:

ValueError – if given function names could not be found

Returns:

help string

generate_template_variables_help(variable_names: Collection[str] | None = None, show_values: bool = True)[source]

Generate a help string describing available template variables, their types, and values for use in batch processing.

This is used for --templates-help

Parameters:

variable_names – Display help for specific variables, if None or [] is specified, display all.
show_values – Show the value of the template variable or just the name?

Raises:

ValueError – if given variable names could not be found

Returns:

a human-readable description of all template variables

property local_files_only: bool

Is this config runner only going to look for resources such as models in cache / on disk?

This will be True if -ofm/--offline-mode was parsed from injected_args

property plugin_module_paths: frozenset[str]

Set of plugin module paths if they were injected into the config runner by --plugin-modules or used in a \import_plugins statement in a config.

Returns:: a set of paths, may be empty but not None

class dgenerate.batchprocess.ConfigRunnerPlugin(loaded_by_name: str, config_runner: ConfigRunner | None = None, render_loop: RenderLoop | None = None, local_files_only: bool = False, **kwargs)[source]

Bases: Plugin

Abstract base class for config runner plugin implementations.

__init__(loaded_by_name: str, config_runner: ConfigRunner | None = None, render_loop: RenderLoop | None = None, local_files_only: bool = False, **kwargs)[source]

Parameters:

loaded_by_name – The name the plugin was loaded by, will be passed by the loader. If None is passed, the first name mentioned by the plugin implementation will be used. This can simplify using some plugin classes directly without loading them through a loader implementation.
argument_error_type – This exception type will be raised upon argument errors (invalid arguments) when loading a plugin using a PluginLoader implementation. It should match the argument_error_type given to the PluginLoader implementation being used to load the inheritor of this class.
kwargs – Additional arguments that may arise when using an ARGS static signature definition with multiple NAMES in your implementation.

register_directive(name, implementation: Callable[[Sequence[str]], int])[source]

Register a config directive implementation on the dgenerate.batchprocess.ConfigRunner instance.

Your directive should return a return code, 0 for success and anything else for failure.

Returning non zero will cause BatchProcessError to be raised from the runner, halting execution of the config.

Any non-exiting exception will be eaten and rethrown as BatchProcessError, also halting execution of the config.

Raises:

RuntimeError – if a config directive with the same name already exists

Parameters:

name – directive name
implementation – implementation callable

register_template_function(name, implementation: Callable)[source]

Register a config template function implementation on the dgenerate.batchprocess.ConfigRunner instance.

Raises:

RuntimeError – if a template function with the same name already exists

Parameters:

name – function name
implementation – implementation callable

set_template_variable(name, value)[source]

Set a template variable on the dgenerate.batchprocess.ConfigRunner instance.

Parameters:

name – variable name
value – variable value

update_template_variables(values)[source]

Update multiple template variable values on the dgenerate.batchprocess.ConfigRunner instance.

Parameters:: values – variable values, dictionary of names to values

property config_runner: ConfigRunner | None: Provides access to the currently instantiated dgenerate.batchprocess.ConfigRunner object running the config file that this directive is being invoked in.

property injected_args: Sequence[str]

Return any arguments injected into the config from the command line.

If none were injected an empty sequence will be returned.

Returns:: command line arguments

property local_files_only: bool: Is this plugin only going to look for resources such as models in cache / on disk?

property plugin_module_paths: frozenset[str]

Set of plugin module paths if they were injected into the config runner by --plugin-modules or used in a \import_plugins statement in a config.

Returns:: a set of paths, may be empty but not None

property render_loop: RenderLoop | None

Provides access to the currently instantiated dgenerate.renderloop.RenderLoop object.

This object will have been used for any previous invocation of dgenerate in a config file.

class dgenerate.batchprocess.ConfigRunnerPluginLoader[source]

Bases: PluginLoader

Loads dgenerate.batchprocess.ConfigRunnerPlugin plugins.

__init__()[source]

load(uri: str, **kwargs) → ConfigRunnerPlugin[source]

Load a plugin using a URI string containing its name and arguments.

Parameters:

uri – The URI string
kwargs – default argument values, will be overridden by arguments specified in the URI

Raises:

ValueError – If uri is None
RuntimeError – If a plugin is discovered to be using a reserved argument name upon loading it.
dgenerate.plugin.PluginArgumentError – If there is an error in the loading arguments for the plugin.
dgenerate.plugin.PluginNotFoundError – If the plugin name mentioned in the URI could not be found.

Returns:

plugin instance

class dgenerate.batchprocess.ImageProcessDirective(**kwargs)[source]

Bases: ConfigRunnerPlugin

__init__(**kwargs)[source]

Parameters:: kwargs – plugin base class arguments

class dgenerate.batchprocess.PromptUpscaleDirective(**kwargs)[source]

Bases: ConfigRunnerPlugin

__init__(**kwargs)[source]

Parameters:: kwargs – plugin base class arguments

class dgenerate.batchprocess.ToDiffusersDirective(**kwargs)[source]

Bases: ConfigRunnerPlugin

__init__(**kwargs)[source]

Parameters:: kwargs – plugin base class arguments

dgenerate.devicecache module

High-level utilities for interacting with objects cached by dgenerate in GPU side memory.

These objects may be cached in small quantity by various dgenerate sub-modules.

dgenerate.devicecache.clear_device_cache(device: device | str | None = None)[source]

Clear every object cached by dgenerate in its GPU side cache for a specific device.

Parameters:: device – The device the objects are cached on, specifying None clears all devices.

dgenerate.devicecache.register_eviction_method(method: Callable[[device | None], None])[source]

Register a method for evicting an object cached on the GPU.

This will be called upon calling clear_device_cache().

Parameters:: method – Eviction method. Its first argument is the torch.device being considered; if this value is None, any torch device with cached objects should be considered.

dgenerate.eval module

Safe expression parsing with asteval.

dgenerate.eval.safe_builtins() → dict[source]

Return a dictionary / symbol table of basic python builtins that are considered safe and useful with asteval.

Returns:: symbol table

dgenerate.eval.standard_interpreter(symtable: dict | None = None, with_listcomp: bool = True, with_dictcomp: bool = True, with_setcomp: bool = True, with_ifexpr: bool = True, use_numpy: bool = False) → Interpreter[source]

Return a default safe interpreter from asteval.

Only names that exist in symtable will be usable. If you provide no symtable, no functions or variables will be present.

All forms of assignment, import, etc. are disabled.

Parameters:

symtable – Symbol table
with_listcomp – Allow list comprehension?
with_dictcomp – Allow dict comprehension?
with_setcomp – Allow set comprehension?
with_ifexpr – Allow ternary statements?
use_numpy – Import numpy functions directly, without namespace?

Returns:

The interpreter

dgenerate.filecache module

On disk file cache implementation and primitives.

exception dgenerate.filecache.WebFileCacheOfflineModeException[source]

Bases: Exception

Exception raised when the web cache is in offline mode and a file is not found in the cache.

class dgenerate.filecache.CachedFile(data_dict)[source]

Bases: object

Represents the path of a file in a FileCache

__init__(data_dict)[source]

Parameters:: data_dict – file data dict parsed from the cache database.

metadata: dict[str, str]: Optional metadata for the file stored in the database.

path: str: The path to the file on disk.

class dgenerate.filecache.FileCache(db_path: str, cache_dir: str)[source]

Bases: object

A cache system that stores files and their metadata.

__init__(db_path: str, cache_dir: str)[source]

Initializes the FileCache object with a key-value store located at db_path and a cache directory at cache_dir. If the cache directory doesn’t exist, it creates it.

Parameters:

db_path – The path to the key-value store database.
cache_dir – The directory where the cache files are stored.

add(key: str, file_data: bytes | Iterable[bytes], metadata: Dict[str, str] = None, ext: str | None = None) → CachedFile[source]

Adds a file to the cache. If a file with the same key already exists, it overwrites the existing file. Otherwise, it creates a new file with a unique filename.

Parameters:

key – The key associated with the file.
file_data – The data of the file in bytes, or an iterable of binary chunks.
metadata – The metadata of the file.
ext – The extension of the file.

Returns:

A CachedFile object representing the added file.

delete_older_than(timedelta: timedelta) → Iterator[CachedFile][source]: Deletes items from the key-value store that are older than the specified timedelta, yielding each key and its corresponding CachedFile object.

get(key) → CachedFile | None[source]

Retrieves the CachedFile object for the specified key from the key-value store, or returns None if the key does not exist.

Parameters:: key – The key associated with the file.
Returns:: A CachedFile object representing the file, or None if the key does not exist.

items() → Iterator[CachedFile][source]: Yields all items in the key-value store as CachedFile objects.

keys() → Iterator[str][source]: Yields all keys in the key-value store.

class dgenerate.filecache.KeyValueStore(db_path: str)[source]

Bases: object

A key-value store using SQLite3 for storage.

__init__(db_path: str)[source]

Initialize the key-value store.

Parameters:: db_path – The path to the SQLite3 database file.

delete_older_than(timedelta: timedelta) → list[tuple[str, str]][source]

Delete all keys and their associated values that were created more than a certain time ago.

Parameters:: timedelta – The age of the keys to delete.
Returns:: The keys and values that were deleted.

get(key: str, default=None)[source]

Get the value associated with a key.

Parameters:

key – The key to get the value for.
default – The default value to return if the key is not found.

Returns:

The value associated with the key, or the default value if the key is not found.

items() → Iterator[str][source]

Get all values in the store.

Returns:: An iterator over the values in the store.

keys() → Iterator[str][source]

Get all keys in the store.

Returns:: An iterator over the keys in the store.
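To make the age-based deletion concrete, here is a minimal sketch of an SQLite-backed key-value store. The class name TinyKeyValueStore, the table schema, and the set() helper are assumptions made for illustration; only the general shape of get(), keys(), and delete_older_than() follows the descriptions above.

```python
import sqlite3
import time
from datetime import timedelta
from typing import Optional


class TinyKeyValueStore:
    """Illustrative SQLite-backed store; the schema is an assumption."""

    def __init__(self, db_path: str):
        self._conn = sqlite3.connect(db_path)
        self._conn.execute(
            'CREATE TABLE IF NOT EXISTS kv '
            '(key TEXT PRIMARY KEY, value TEXT, created REAL)')

    def set(self, key: str, value: str, created: Optional[float] = None):
        """Hypothetical helper: store a value with a creation timestamp."""
        created = time.time() if created is None else created
        self._conn.execute(
            'INSERT OR REPLACE INTO kv VALUES (?, ?, ?)',
            (key, value, created))
        self._conn.commit()

    def get(self, key: str, default=None):
        row = self._conn.execute(
            'SELECT value FROM kv WHERE key = ?', (key,)).fetchone()
        return row[0] if row else default

    def keys(self):
        for (key,) in self._conn.execute('SELECT key FROM kv'):
            yield key

    def delete_older_than(self, delta: timedelta) -> list[tuple[str, str]]:
        """Delete entries created before now minus delta, return them."""
        cutoff = time.time() - delta.total_seconds()
        rows = self._conn.execute(
            'SELECT key, value FROM kv WHERE created < ?',
            (cutoff,)).fetchall()
        self._conn.execute('DELETE FROM kv WHERE created < ?', (cutoff,))
        self._conn.commit()
        return rows


store = TinyKeyValueStore(':memory:')
store.set('old', 'a.bin', created=time.time() - 3600)  # one hour old
store.set('new', 'b.bin')
deleted = store.delete_older_than(timedelta(minutes=30))
```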

class dgenerate.filecache.WebFileCache(db_path: str, cache_dir: str, expiry_delta: timedelta = datetime.timedelta(seconds=43200))[source]

Bases: FileCache

A cache system that stores files and their metadata downloaded from the web.

static is_downloadable_url(string) → bool[source]

Does a string represent a URL that can be downloaded by this web cache implementation?

Parameters:: string – the string
Returns:: True or False

__init__(db_path: str, cache_dir: str, expiry_delta: timedelta = datetime.timedelta(seconds=43200))[source]

Initializes the WebFileCache object with a key-value store located at db_path, a cache directory at cache_dir, and an expiry delta. If the cache directory doesn’t exist, it creates it. It also attempts to clear old files.

Parameters:

db_path – The path to the key-value store database.
cache_dir – The directory where the cache files are stored.
expiry_delta – The time delta for file expiry.

download(url, mime_acceptable_desc: str | None = None, mimetype_is_supported: ~typing.Callable[[str], bool] | None = None, unknown_mimetype_exception=<class 'ValueError'>, overwrite: bool = False, tqdm_pbar=<class 'tqdm.std.tqdm'>, local_files_only: bool = False) → CachedFile[source]

Downloads a file and/or returns a file path from the cache. If the mimetype of the file is not supported, it raises an exception.

Raises:

requests.RequestException – Can raise any exception raised by requests.get for request related errors.
WebFileCacheOfflineModeException – If local_files_only mode is enabled and the file is not found in the cache.

Parameters:

url – The URL of the file.
mime_acceptable_desc – A description of acceptable mimetypes for use in exceptions.
mimetype_is_supported – A function that determines if a mimetype is supported for downloading.
unknown_mimetype_exception – The exception type to raise when an unknown mimetype is encountered.
overwrite – Always overwrite any previously cached file?
tqdm_pbar – tqdm progress bar type, if set to None no progress bar will be used. Defaults to tqdm.tqdm
local_files_only – If True, do not attempt to download files, only check cache.

Returns:

A CachedFile object representing the downloaded or previously cached file.

request_mimetype(url, local_files_only: bool = False) → str[source]

Requests the mimetype of a file at a URL. If the file exists in the cache, a known mimetype is returned without connecting to the internet. Otherwise, it connects to the internet to retrieve the mimetype. This action does not update the cache.

Raises:

HTTPError – On http status errors.
WebFileCacheOfflineModeException – If local_files_only mode is enabled and the file is not found in the cache.

Parameters:

url – The URL of the file.
local_files_only – If True, do not make a request, only check the cache.

Returns:

The mimetype of the file.

property local_files_only: bool

Get the local_files_only mode status.

Returns:: True if local_files_only mode is enabled, False otherwise.

dgenerate.filelock module

Thread / Multiprocess safe file locking utilities.

dgenerate.filelock.suffix_path_maker(filenames: str | Iterable[str], suffix: str) → Callable[[str | None, int | None], str | Iterable[str]][source]

To be used with touch_avoid_duplicate(), a pathmaker implementation that appends a suffix and a number to a filename or list of files when a duplicate is detected for any of them in the directory.

Parameters:

filenames – Original filename, or a list of filenames
suffix – Suffix to append if needed, a trailing number will be appended

Returns:

A path maker callback suitable for passing to touch_avoid_duplicate()

dgenerate.filelock.temp_file_lock(path)[source]

Multiprocess synchronization utility.

Get a lock on an empty file as a context manager, delete the lock file if possible when done.

Parameters:: path – Path where the lock file will be created.
Returns:: Lock as a context manager

dgenerate.filelock.touch_avoid_duplicate(directory: str, path_maker: Callable[[str | None, int | None], str | Iterable[str]], lock_name: str = '.lock', return_list=False)[source]

Generate a filename in a directory and avoid duplicates using a file lock in that directory with a known name. Use this to ensure that duplicate checking in a directory is multiprocess safe, at least for processes using this function to write to the same directory.

Parameters:

return_list – Always return a list even if generated paths is only of length 1, defaults to False, which means that a single string will be returned if only one path was generated by the pathmaker
directory – The directory to create the lockfile in
path_maker – Callback that generates paths until a non-existent path is found, first argument is the base filename and the second is attempt number. On the first attempt to create the files both arguments will be None, in which case the callback should return a single filename or iterable of filenames to touch with duplicate avoidance. Calls to the callback thereafter will have non-None values for both arguments and the callback should take the passed base filename and apply a suffix using the attempt number.
lock_name – Name of the lock file to be used as a mutex

Returns:

Unique path that has been touched (created but empty), or a tuple of paths if the path maker requested duplicate checks on multiple files
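The path_maker calling convention described above can be sketched as follows. make_path_maker and first_free_name are hypothetical names. first_free_name only simulates the retry loop against a set of existing names, without touching disk or taking a lock. The attempt numbering starting at 1 and the suffix placement before the extension are assumptions.

```python
import os
from typing import Optional


def make_path_maker(filename: str, suffix: str = '_duplicate_'):
    """Return a hypothetical path maker for a single file."""

    def path_maker(base: Optional[str] = None,
                   attempt: Optional[int] = None) -> str:
        if base is None:
            # first call: propose the original filename
            return filename
        # later calls: add the suffix and attempt number before the extension
        stem, ext = os.path.splitext(base)
        return f'{stem}{suffix}{attempt}{ext}'

    return path_maker


def first_free_name(existing: set, path_maker) -> str:
    """Simulate the retry loop (no file lock, no disk access)."""
    base = path_maker(None, None)
    candidate, attempt = base, 0
    while candidate in existing:
        attempt += 1
        candidate = path_maker(base, attempt)
    return candidate


maker = make_path_maker('out.png')
name = first_free_name({'out.png', 'out_duplicate_1.png'}, maker)
```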

dgenerate.files module

Utilities for file like objects.

class dgenerate.files.GCFile(file)[source]

Bases: object

File object wrapper, close file on garbage collection

__init__(file)[source]

class dgenerate.files.PeekReader(iterator: Iterator[str])[source]

Bases: object

Read from a file like iterator object while peeking at the next line.

This is an iterable reader wrapper that yields the tuple (current_line, next_line)

next_line will be None if the next line is the end of iterator / file.

__init__(iterator: Iterator[str])[source]

Parameters:: iterator – The typing.Iterator capable reader to wrap.
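The (current_line, next_line) behavior can be illustrated with a small generator. peek_lines is a hypothetical function written for this example and is not the PeekReader class itself.

```python
from collections.abc import Iterable, Iterator
from typing import Optional


def peek_lines(lines: Iterable[str]) -> Iterator[tuple[str, Optional[str]]]:
    """Yield (current, next) pairs; next is None for the last item."""
    iterator = iter(lines)
    try:
        current = next(iterator)
    except StopIteration:
        return  # empty input yields nothing
    for upcoming in iterator:
        yield current, upcoming
        current = upcoming
    yield current, None


pairs = list(peek_lines(['a', 'b', 'c']))
```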

class dgenerate.files.TerminalLineReader(file: BinaryIO | Callable[[], IO])[source]

Bases: object

Reads lines from a binary stream, typically stdout or stderr of a subprocess.

Breaks on newlines and carriage returns, and preserves them in the output as is.

__init__(file: BinaryIO | Callable[[], IO])[source]

Parameters:: file – Binary IO object, or a function that returns one.

readline()[source]

property file: BinaryIO: The current file object being read.

pushback_byte: bytes | None

Byte on the stack which will be prepended to the next line if needed.

Should be set to None if file was provided as a callable and the underlying reader has changed to a new instance.
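The role of a pushback byte when breaking on both newlines and carriage returns can be sketched like this. LineSplitter is a hypothetical class for illustration. Treating "\r\n" as a single line break is an assumption of this sketch and may not match TerminalLineReader exactly.

```python
import io
from typing import BinaryIO, Optional


class LineSplitter:
    """Illustrative reader that breaks on LF and CR, keeping both."""

    def __init__(self, file: BinaryIO):
        self.file = file
        self.pushback_byte: Optional[bytes] = None

    def readline(self) -> bytes:
        line = b''
        while True:
            if self.pushback_byte is not None:
                # consume the byte left over from the previous call
                byte, self.pushback_byte = self.pushback_byte, None
            else:
                byte = self.file.read(1)
            if not byte:
                return line  # end of stream, may be empty
            line += byte
            if byte == b'\n':
                return line
            if byte == b'\r':
                following = self.file.read(1)
                if following == b'\n':
                    return line + following  # CR LF counts as one break
                if following:
                    # this byte belongs to the next line
                    self.pushback_byte = following
                return line


reader = LineSplitter(io.BytesIO(b'10%\r20%\rdone\r\nnext\n'))
lines = [reader.readline() for _ in range(5)]
```

Progress bars written by a subprocess typically end each update with a carriage return, which is why splitting on it is useful when relaying such output.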

class dgenerate.files.Unbuffered(stream)[source]

Bases: object

File wrapper which auto flushes a stream on write

__init__(stream)[source]

write(data)[source]

writelines(datas)[source]

dgenerate.files.stdin_is_tty()[source]

Safely checks if stdin is a tty

Returns:: True or False

dgenerate.globalconfig module

Configure dgenerate’s global constants.

dgenerate.globalconfig.get_config_dict()[source]

Return a dictionary representation of the global configuration.

Returns:: config dictionary

dgenerate.globalconfig.load_config(content_or_stream, mode: str = 'json')[source]

Load global config from a string or file like object.

Parameters:

content_or_stream – string content or file like object.
mode – json, yaml, or toml

dgenerate.globalconfig.pop_config()[source]

Pop the last saved configuration off the stack and restore it.

Raises:: IndexError – if the stack is empty.

dgenerate.globalconfig.push_config()[source]: Save the current configuration to the stack.

dgenerate.globalconfig.register_all()[source]: Register all public non-module type global objects inside the current module as config variables.

dgenerate.globalconfig.register_config_variable(module: str | ModuleType, variable_name: str, config_variable_name: str | None = None)[source]

Register a global config variable that exists inside an arbitrary module.

Parameters:

module – The module name or object reference.
variable_name – Name of the variable inside the module.
config_variable_name – Name to represent the variable in the global config file, if left None this will be variable_name in lowercase.

dgenerate.globalconfig.restore_config_context()[source]: Context manager which pushes the current global configuration to the stack and pops it when the with context ends.

dgenerate.globalconfig.serialize_current_config(stream=None, mode: str = 'json') → str | None[source]

Serialize the current global config.

Parameters:

stream – File like object, if not provided this function will return a string.
mode – json, yaml, or toml

Returns:

the serialized config

dgenerate.globalconfig.set_from_config_dict(config_dict: dict)[source]

Set the current global config from a dictionary object.

This dictionary may be partial, i.e. an incomplete set of settings as long as the key names mentioned are correct.

Parameters:: config_dict – The config dictionary
Raises:: KeyError – If a configuration key name is not valid.

dgenerate.hfhub module

Hugging Face Hub utilities for supporting Hugging Face downloads.

exception dgenerate.hfhub.NonHFConfigDownloadError[source]

Bases: NonHFDownloadError

Raised when a non-Hugging Face config download fails.

exception dgenerate.hfhub.NonHFDownloadError[source]

Bases: Exception

Raised when a non-Hugging Face download fails.

exception dgenerate.hfhub.NonHFModelDownloadError[source]

Bases: NonHFDownloadError

Raised when a non-Hugging Face model download fails.

class dgenerate.hfhub.HFBlobLink(repo_id: str, revision: str, subfolder: str, weight_name: str)[source]

Bases: object

Represents the constituents of a huggingface blob link.

static parse(blob_link)[source]

Attempt to parse a huggingface blob link out of a string.

If the string does not contain a blob link, return None.

Parameters:: blob_link – supposed blob link string
Returns:: HFBlobLink or None

__init__(repo_id: str, revision: str, subfolder: str, weight_name: str)[source]

repo_id: str

revision: str

subfolder: str

weight_name: str
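As a rough illustration of the four constituents, the sketch below splits a URL of the form https://huggingface.co/<user>/<repo>/blob/<revision>/<path>. That URL layout, the function name parse_blob_link, the BlobLinkParts type, and the use of an empty string for a missing subfolder are all assumptions of this example. HFBlobLink.parse() may accept other forms or represent missing parts differently.

```python
import posixpath
import re
from typing import NamedTuple, Optional
from urllib.parse import urlparse


class BlobLinkParts(NamedTuple):
    repo_id: str
    revision: str
    subfolder: str
    weight_name: str


def parse_blob_link(link: str) -> Optional[BlobLinkParts]:
    """Return the parts of an assumed blob link layout, or None."""
    parsed = urlparse(link)
    if parsed.netloc != 'huggingface.co':
        return None
    match = re.fullmatch(r'/([^/]+/[^/]+)/blob/([^/]+)/(.+)', parsed.path)
    if match is None:
        return None
    repo_id, revision, file_path = match.groups()
    # directory part becomes the subfolder, final component the weight name
    subfolder, weight_name = posixpath.split(file_path)
    return BlobLinkParts(repo_id, revision, subfolder, weight_name)


parts = parse_blob_link(
    'https://huggingface.co/some-user/some-model/blob/main/unet/weights.safetensors')
```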

dgenerate.hfhub.disable_offline_mode()[source]

Disable global offline mode for huggingface_hub.

This will allow network requests to the hub to be made again.

dgenerate.hfhub.download_non_hf_slug_config(path: str, local_files_only: bool = False)[source]

Check for a non hugging face slug or reference to a config file that is possibly downloadable as a single file.

If this is applicable, download it to the web cache and return its path. If the file already exists in the web cache simply return a path to it.

Hugging Face blob links are also supported, in which case the file will be downloaded to the huggingface cache.

If this is not applicable, return the path unchanged.

TQDM progress bar is used for any download that occurs.

Raises:

NonHFConfigDownloadError – If the download mimetype is not text/* or application/*.
dgenerate.webcache.WebFileCacheOfflineModeException – If local_files_only is True and a download is required for a non Hugging Face blob link. This will occur if the file in question is not found in the dgenerate web cache. This can also occur if the dgenerate.webcache module is in global offline mode.
huggingface_hub.errors.HFValidationError – If the Hugging Face blob link is invalid.
huggingface_hub.errors.HfHubHTTPError – If the Hugging Face blob link is valid but the file could not be downloaded. This can also occur if local_files_only is True and the file is not found in the cache.
huggingface_hub.errors.OfflineModeIsEnabled – If global offline mode is enabled for huggingface_hub and the file is not found in the cache.

Parameters:

path – proposed model path
local_files_only – If True, do not attempt to download files, only check cache.

Returns:

path to downloaded file or unchanged model path.

dgenerate.hfhub.download_non_hf_slug_model(model_path: str, local_files_only: bool = False)[source]

Check for a non hugging face slug or reference to a model that is possibly downloadable as a single file.

If this is applicable, download it to the web cache and return its path. If the file already exists in the web cache simply return a path to it.

Hugging Face blob links are also supported, in which case the file will be downloaded to the huggingface cache.

If this is not applicable, return the path unchanged.

TQDM progress bar is used for any download that occurs.

Raises:

NonHFModelDownloadError – If the download mimetype is None or text/*.
dgenerate.webcache.WebFileCacheOfflineModeException – If local_files_only is True and a download is required for a non Hugging Face blob link. This will occur if the file in question is not found in the dgenerate web cache. This can also occur if the dgenerate.webcache module is in global offline mode.
huggingface_hub.errors.HfHubHTTPError – If the Hugging Face blob link is valid but the file could not be downloaded. This can also occur if local_files_only is True and the file is not found in the cache.
huggingface_hub.errors.OfflineModeIsEnabled – If global offline mode is enabled for huggingface_hub and the file is not found in the cache.

Parameters:

model_path – proposed model path
local_files_only – If True, do not attempt to download files, only check cache.

Returns:

path to downloaded file or unchanged model path.

dgenerate.hfhub.enable_offline_mode()[source]

Enable global offline mode for huggingface_hub.

This will prevent any network requests from being made, and will only use files that are already in the hub cache.

dgenerate.hfhub.is_offline_mode() → bool[source]

Check if the global offline mode for huggingface_hub is enabled.

Returns:: True if offline mode is enabled, False otherwise.

dgenerate.hfhub.is_single_file_model_load(path)[source]

Should we use from_single_file() on this path?

Parameters:: path – The path
Returns:: True or False

dgenerate.hfhub.offline_mode_context(enabled=True)[source]

Context manager to temporarily enable or disable global offline mode for huggingface_hub.

Parameters:: enabled – If True, enables offline mode. If False, disables it.

dgenerate.hfhub.webcache_or_hf_blob_download(url: str, mime_acceptable_desc: str | None = None, mimetype_is_supported: ~typing.Callable[[str], bool] | None = None, unknown_mimetype_exception: type[Exception] = <class 'dgenerate.hfhub.NonHFDownloadError'>, local_files_only: bool = False) → str[source]

Download to dgenerate web cache or Hugging Face cache, depending on the model path.

If model path is a Hugging Face blob link, it will be downloaded to the Hugging Face cache.

If not, it will be downloaded to the dgenerate web cache.

TQDM progress bar is used for any download that occurs, TQDM progress bars will differ somewhat in appearance depending on whether the file is downloaded to the web cache or Hugging Face cache.

Parameters:

url – The url
mime_acceptable_desc – A description of acceptable mimetypes for use in exceptions. (dgenerate webcache)
mimetype_is_supported – A function that determines if a mimetype is supported for downloading. (dgenerate webcache)
unknown_mimetype_exception – The exception type to raise when an unknown mimetype is encountered. (dgenerate webcache)
local_files_only – If True, do not attempt to download files, only check cache.

Raises:

NonHFDownloadError – If the download mimetype is unsupported.
dgenerate.webcache.WebFileCacheOfflineModeException – If local_files_only is True and a download is required for a non Hugging Face blob link. This will occur if the file in question is not found in the dgenerate web cache. This can also occur if the dgenerate.webcache module is in global offline mode.
huggingface_hub.errors.HFValidationError – If the Hugging Face blob link is invalid.
huggingface_hub.errors.HfHubHTTPError – If the Hugging Face blob link is valid but the file could not be downloaded. This can also occur if local_files_only is True and the file is not found in the cache.
huggingface_hub.errors.OfflineModeIsEnabled – If global offline mode is enabled for huggingface_hub and the file is not found in the cache.

Returns:

filepath

dgenerate.hfhub.with_hf_errors_as_config_not_found(catch_all: Callable[[Exception], None] = None)[source]

Context manager that catches Hugging Face hub errors associated with missing models or invalid model name specification and raises a dgenerate.exceptions.ConfigNotFoundError exception.

Parameters:: catch_all – Optional callable to catch and handle all other exceptions.
Raises:: dgenerate.exceptions.ConfigNotFoundError – If a Hugging Face hub error occurs

dgenerate.hfhub.with_hf_errors_as_model_not_found(catch_all: Callable[[Exception], None] = None)[source]

Context manager that catches Hugging Face hub errors associated with missing models or invalid model name specification and raises a dgenerate.exceptions.ModelNotFoundError exception.

Parameters:: catch_all – Optional callable to catch and handle all other exceptions.
Raises:: dgenerate.exceptions.ModelNotFoundError – If a Hugging Face hub error occurs
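The error translation pattern used by these two context managers can be sketched with stand-in exception types. UpstreamNotFound, ModelNotFound, and errors_as_model_not_found are hypothetical names. The assumption that catch_all suppresses the exception once it has handled it is also specific to this sketch.

```python
import contextlib


class UpstreamNotFound(Exception):
    """Stand-in for a hub error about a missing repository or file."""


class ModelNotFound(Exception):
    """Stand-in for dgenerate.exceptions.ModelNotFoundError."""


@contextlib.contextmanager
def errors_as_model_not_found(catch_all=None):
    try:
        yield
    except UpstreamNotFound as e:
        # translate the upstream error into the library level error
        raise ModelNotFound(str(e)) from e
    except Exception as e:
        if catch_all is None:
            raise  # no handler given, propagate unchanged
        catch_all(e)  # handler decides what to do with anything else


seen = []
with errors_as_model_not_found(catch_all=seen.append):
    raise ValueError('unrelated problem')
```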

dgenerate.image module

Image operations commonly used by dgenerate.

dgenerate.image.align_by(iterable: Iterable[int], align: int) → tuple[source]

Align all elements by a value and return a tuple

Parameters:

iterable – Elements to align
align – The alignment value, None indicates no alignment.

Returns:

tuple(…)

dgenerate.image.best_cv2_resampling(old_size: tuple[int, int], new_size: tuple[int, int]) → int[source]

Auto-select the best OpenCV resampling setting for a resize operation.

Parameters:

old_size – (tuple) Source image shape (height, width, channels).
new_size – (tuple) Destination image shape (height, width).

Returns:

(int) Best OpenCV interpolation method.

dgenerate.image.best_pil_resampling(old_size: tuple[int, int], new_size: tuple[int, int]) → Resampling[source]

Auto-select the best PIL resampling setting for a resize operation.

Parameters:

old_size – (tuple) Source image size (width, height).
new_size – (tuple) Destination image size (width, height).

Returns:

(PIL.Image.Resampling) Best resampling method.

dgenerate.image.copy_img(img: Image)[source]

Copy a PIL.Image.Image while preserving its filename attribute.

Parameters:: img – the image
Returns:: a copy of the image

dgenerate.image.create_jpeg_exif_with_user_comment(comment: str) → bytes[source]

Return JPEG EXIF data with a user comment field, this can be used with PIL.Image.save(img, exif=...).

This function is specifically for saving JPEG files only.

Returns:: EXIF data (bytes)

dgenerate.image.cv2_resize_image(img: ndarray, size: tuple[int, int] | None, aspect_correct: bool = False, align: int | None = None, algo: int | None = None)[source]

Resize a numpy.ndarray image and return a copy.

This function always returns a copy even if the image's size did not change.

The new image dimension will always be forcefully aligned by align, specifying None or 1 indicates no alignment.

The filename attribute is preserved.

Parameters:

img – the image to resize
size – requested new size for the image, may be None.
aspect_correct – preserve aspect ratio?
align – Force alignment by this amount of pixels.
algo – cv2 resampling algorithm

Returns:

the resized image

dgenerate.image.find_mask_bounds(img: Image, padding: str | int | tuple[int, int] | tuple[int, int, int, int]) → tuple[int, int, int, int] | None[source]

Find the bounding box of white pixels in the mask. If no bounding box can be found, return None.

Raises:

ValueError – If the padding value is specified incorrectly.

Parameters:

img – The mask image (PIL Image)
padding – Bounding box padding value, see: normalize_padding_value() for accepted values.

Returns:

Tuple of (left, top, right, bottom) bounds, or None if no white pixels found.

dgenerate.image.get_filename(img: Image)[source]

Get the PIL.Image.Image.filename attribute or “NO_FILENAME” if it does not exist.

Parameters:: img – PIL.Image.Image
Returns:: filename string or “NO_FILENAME”

dgenerate.image.is_aligned(iterable: Iterable[int], align: int) → bool[source]

Check if all elements are aligned by a specific value.

Parameters:

iterable – Elements to test
align – The alignment value, None indicates no alignment.

Returns:

bool

dgenerate.image.is_image(obj) → bool[source]

Check if an object is a PIL Image.

Parameters:: obj – object to check
Returns:: True if the object is a PIL.Image.Image

dgenerate.image.is_power_of_two(iterable: Iterable[int]) → bool[source]

Check if all elements are a power of 2.

Parameters:: iterable – Elements to test
Returns:: bool
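The two checks above can be approximated in a few lines. all_aligned and all_powers_of_two are hypothetical names. Treating align=None as "always aligned" and rejecting non-positive values as powers of two are assumptions of this sketch.

```python
from collections.abc import Iterable
from typing import Optional


def all_aligned(values: Iterable[int], align: Optional[int]) -> bool:
    """True if every value is a multiple of align (None: no constraint)."""
    if align is None:
        return True
    return all(v % align == 0 for v in values)


def all_powers_of_two(values: Iterable[int]) -> bool:
    """True if every value is a positive power of two."""
    # a positive power of two has exactly one bit set
    return all(v > 0 and (v & (v - 1)) == 0 for v in values)


size = (512, 768)
aligned_to_8 = all_aligned(size, 8)
power_of_two = all_powers_of_two(size)
```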

dgenerate.image.letterbox_image(img: PIL.Image, box_size: int | tuple[int, int] | tuple[int, int, int, int], box_is_padding: bool = False, box_color: str | int | float | tuple[int, int, int] | tuple[float, float, float] | None = None, inner_size: tuple[int, int] = None, aspect_correct: bool = True)[source]

Letterbox an image on to a colored background.

Parameters:

img – The image to letterbox
box_size – Size of the outer letterbox, or padding values.
- If box_is_padding=False:
  - (int) both width and height equal to this integer
  - (width, height) tuple for final letterbox size
- If box_is_padding=True, can be either:
  - (padding) for uniform padding
  - (horizontal_padding, vertical_padding) for separate horizontal and vertical padding
  - (left, top, right, bottom) for individual padding on each side
box_is_padding – The box_size argument should be interpreted as padding?
box_color – What color to use for the letter box background, the default is black. This should be specified as a HEX color code, or as a 3 tuple of integer or floating point RGB values, or as a single integer or float representing all color channels.
inner_size – The size of the inner image
aspect_correct – Should the size of the inner image be aspect correct?

Returns:

The letterboxed image

dgenerate.image.nearest_power_of_two(iterable: Iterable[int]) → tuple[source]

Round all elements to the nearest power of two and return a tuple.

Parameters:: iterable – Elements to round
Returns:: tuple(…)

dgenerate.image.normalize_padding_value(padding: str | int | tuple[int, int] | tuple[int, int, int, int]) → tuple[int, int, int, int][source]

Normalize a padding value.

This value can be a string, e.g. "10", or "10x10", or "10x10x10x10"

It can also be specified as a python int or tuple

Multidimensional padding values are laid out as: LEFTxTOPxRIGHTxBOTTOM, or WIDTHxHEIGHT

This is the same all across dgenerate.

Raises:: ValueError – If the padding value is specified incorrectly.
Parameters:: padding – Padding value
Returns:: Normalized padding (4 tuple of int)
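The accepted forms listed above can be sketched as a small parser. normalize_padding is a hypothetical stand-in, and the mapping of a WIDTHxHEIGHT value to (width, height, width, height) is an assumption about how the two element form expands to LEFTxTOPxRIGHTxBOTTOM.

```python
def normalize_padding(padding) -> tuple:
    """Expand an int, tuple, or 'AxB...' string into a 4 tuple of int."""
    if isinstance(padding, str):
        try:
            parts = tuple(int(p) for p in padding.lower().split('x'))
        except ValueError:
            raise ValueError(f'invalid padding: {padding!r}') from None
    elif isinstance(padding, int):
        parts = (padding,)
    else:
        parts = tuple(padding)

    if len(parts) == 1:
        return parts * 4  # uniform padding on every side
    if len(parts) == 2:
        width, height = parts
        # assumed expansion: left/right use width, top/bottom use height
        return (width, height, width, height)
    if len(parts) == 4:
        return parts  # already LEFT, TOP, RIGHT, BOTTOM
    raise ValueError(f'padding must have 1, 2, or 4 values: {padding!r}')


uniform = normalize_padding('10')
per_axis = normalize_padding('4x8')
per_side = normalize_padding('1x2x3x4')
```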

dgenerate.image.paste_with_feather(background: Image, foreground: Image, location: tuple[int, int] | tuple[int, int, int, int] | list[int], feather: int = 30, shape: str = 'rectangle') → Image[source]

Composite an image onto a background with feathered (soft) edges.

Creates smooth, blended transitions between foreground and background images by applying Gaussian blur to a mask, eliminating hard edges. The feathering effect is achieved by shrinking the mask and then blurring it.

Parameters:

background – The background image to paste onto. Will be converted to RGBA mode.
foreground – The foreground image to paste. Will be resized to fit the specified location.
location – Specifies where to place the image. 2 elements (x, y) for top-left corner offset using the foreground image's original size, 4 elements (x0, y0, x1, y1) for bounding box coordinates, or None to center with margin based on feather width.
feather – The desired width of the feathered edge in pixels.
shape – The shape of the mask. r / rect / rectangle for rectangular mask, c / circle / ellipse for circular.

Returns:

The composite image with feathered edges in the mode (channels) of the background image.

Raises:

ValueError – If location is provided but doesn’t contain 2 or 4 elements. If shape is not recognized.

dgenerate.image.read_jpeg_exif_user_comment(img: Image) → str | None[source]

Read the user comment field from JPEG EXIF data, this can be used with an image returned by PIL.Image.open().

This function is specifically for JPEG files only.

Parameters:: img – PIL.Image.Image
Returns:: user comment string or empty string if not found

dgenerate.image.resize_image(img: Image, size: tuple[int, int] | None, aspect_correct: bool = False, align: int | None = None, algo: Resampling | None = None)[source]

Resize a PIL.Image.Image and return a copy.

This function always returns a copy even if the image's size did not change.

The new image dimension will always be forcefully aligned by align, specifying None or 1 indicates no alignment.

The filename attribute is preserved.

Parameters:

img – the image to resize
size – requested new size for the image, may be None.
aspect_correct – preserve aspect ratio?
align – Force alignment by this amount of pixels.
algo – Resampling algorithm

Returns:

the resized image

dgenerate.image.resize_image_calc(old_size: tuple[int, int], new_size: tuple[int, int] | None, aspect_correct: bool = False, align: int | None = None)[source]

Calculate the new dimensions for a requested resize of an image.

Parameters:

old_size – The old image size
new_size – The new image size
aspect_correct – preserve aspect ratio?
align – Ensure returned size is aligned to this value.

Returns:

calculated new size
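
How aspect_correct and align might interact can be sketched in plain Python. The function below is a hypothetical stand-in, not dgenerate's code. It assumes that an aspect-correct resize keeps the requested width and derives the height, and that alignment rounds each dimension down to a multiple of align. Both are assumptions made for illustration.

```python
def sketch_resize_calc(old_size, new_size, aspect_correct=False, align=None):
    """Hypothetical stand-in for the documented calculation."""
    if new_size is None:
        width, height = old_size
    elif aspect_correct:
        # Assumption: keep the requested width and derive the height
        # from the original aspect ratio.
        width = new_size[0]
        height = round(old_size[1] * (new_size[0] / old_size[0]))
    else:
        width, height = new_size

    if align is not None and align > 1:
        # Assumption: alignment rounds each dimension down to a multiple of align.
        width -= width % align
        height -= height % align

    return width, height


aspect_result = sketch_resize_calc((1000, 500), (512, 512), aspect_correct=True, align=8)
aligned_only = sketch_resize_calc((1000, 500), None, align=64)
```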

dgenerate.image.to_rgb(img: Image)[source]

Convert a PIL.Image.Image to RGB format while preserving its filename attribute.

Parameters:: img – the image
Returns:: a converted copy of the image

dgenerate.image_process module

Implements the behaviors of dgenerate’s image-process sub-command and \image_process config directive.

exception dgenerate.image_process.ImageProcessHelpException[source]

Bases: Exception

Raised by parse_args() when --help is encountered and help_raises=True

exception dgenerate.image_process.ImageProcessRenderLoopConfigError[source]

Bases: Exception

Raised by ImageProcessRenderLoopConfig.check() on validation errors.

exception dgenerate.image_process.ImageProcessUsageError[source]

Bases: Exception

Thrown by parse_args() on usage errors.

class dgenerate.image_process.AnimationETAEvent(origin, frame_index: int, total_frames: int, eta: timedelta)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when there is an update about the estimated finish time of an animation being generated.

__init__(origin, frame_index: int, total_frames: int, eta: timedelta)[source]

eta: timedelta: Current estimated time to complete the animation.

frame_index: int: Frame index at which the ETA was calculated.

total_frames: int: Total frames needed for the animation to complete.

class dgenerate.image_process.AnimationFileFinishedEvent(origin: ImageProcessRenderLoop, path: str, starting_event: StartingAnimationFileEvent)[source]

Bases: Event

Generated in the event stream of ImageProcessRenderLoop.events()

Occurs when an animation (video or animated image) has finished being written to disk.

__init__(origin: ImageProcessRenderLoop, path: str, starting_event: StartingAnimationFileEvent)[source]

path: str: Path on disk where the video/animated image was saved.

starting_event: StartingAnimationFileEvent: Animation StartingAnimationFileEvent related to this file finished event.

class dgenerate.image_process.AnimationFinishedEvent(origin, starting_event: StartingAnimationEvent)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are done generating.

This occurs whether an animation was written to disk or not.

__init__(origin, starting_event: StartingAnimationEvent)[source]

starting_event: StartingAnimationEvent: Animation StartingAnimationEvent related to this file finished event.

class dgenerate.image_process.ImageFileSavedEvent(origin: ImageProcessRenderLoop, generated_event, path)[source]

Bases: Event

Generated in the event stream of ImageProcessRenderLoop.events()

Occurs when an image file is written to disk.

__init__(origin: ImageProcessRenderLoop, generated_event, path)[source]

generated_event: ImageGeneratedEvent: The ImageGeneratedEvent for the image that was saved.

path: str: Path to the saved image.

class dgenerate.image_process.ImageGeneratedEvent(origin: ImageProcessRenderLoop, image: Image, generation_step: int, suggested_directory: str, suggested_filename: str, is_animation_frame=False, frame_index: int | None = None)[source]

Bases: Event

Generated in the event stream of ImageProcessRenderLoop.events()

Occurs when an image is generated (but not saved yet).

__init__(origin: ImageProcessRenderLoop, image: Image, generation_step: int, suggested_directory: str, suggested_filename: str, is_animation_frame=False, frame_index: int | None = None)[source]

frame_index: int | None: The frame index if this is an animation frame.

generation_step: int: The current generation step. (zero indexed)

image: Image: The generated image.

is_animation_frame: bool: Is this image a frame in an animation?

suggested_directory: str

A suggested directory path for saving this image in.

A value of '.' may be present, this indicates the current working directory.

suggested_filename: str: A suggested filename for saving this image as. This filename will be unique to the render loop run / configuration. This is just the filename, it will not contain a directory name.

class dgenerate.image_process.ImageProcessArgs[source]

Bases: ImageProcessRenderLoopConfig

Configuration object for ImageProcessRenderLoop

__init__()[source]

plugin_module_paths: Sequence[str]

class dgenerate.image_process.ImageProcessRenderLoop(config: ImageProcessRenderLoopConfig = None, image_processor_loader: ImageProcessorLoader | None = None, message_header: str = 'image-process', disable_writes: bool = False)[source]

Bases: object

Implements the behavior of the image-process sub-command as well as \image_process directive.

__init__(config: ImageProcessRenderLoopConfig = None, image_processor_loader: ImageProcessorLoader | None = None, message_header: str = 'image-process', disable_writes: bool = False)[source]

Parameters:

config – ImageProcessRenderLoopConfig. If None is provided, an ImageProcessRenderLoopConfig instance will be created and assigned to ImageProcessRenderLoop.config.
image_processor_loader – dgenerate.imageprocessors.ImageProcessorLoader. If None is provided, an instance will be created and assigned to ImageProcessRenderLoop.image_processor_loader.
message_header – Used as the header for messages written via dgenerate.messages
disable_writes – Disable or enable all writes to disk, if you intend to only ever use the event stream of the render loop when using dgenerate as a library, this is a useful option. ImageProcessRenderLoop.written_images and ImageProcessRenderLoop.written_animations will not be available if writes to disk are disabled.

events()[source]

Run the render loop, and iterate over a stream of event objects produced by the render loop.

This calls ImageProcessRenderLoopConfig.check() on a copy of your configuration prior to running.

Event objects are of the union type RenderLoopEvent

The exceptions mentioned here are those you may encounter upon iterating, they will not occur upon simple acquisition of the event stream iterator.

Raises:

dgenerate.OutOfMemoryError – if the execution device runs out of memory
ImageProcessRenderLoopConfigError – on config errors

Returns:

RenderLoopEventStream

run()[source]

Run the render loop, this calls ImageProcessRenderLoopConfig.check() on a copy of your config prior to running.

Raises:

dgenerate.OutOfMemoryError – if the execution device runs out of memory
ImageProcessRenderLoopConfigError – on config errors

config: ImageProcessRenderLoopConfig = None: Render loop configuration.

disable_writes: bool = False

Disable or enable all writes to disk, if you intend to only ever use the event stream of the render loop when using dgenerate as a library, this is a useful option.

ImageProcessRenderLoop.written_images and ImageProcessRenderLoop.written_animations will not be available if writes to disk are disabled.

image_processor_loader: ImageProcessorLoader: The loader responsible for loading user specified image processors

message_header: str = 'image-process': Used as the header for messages written via dgenerate.messages

property written_animations: Iterable[str]: Iterable over animation filenames written by the last run

property written_images: Iterable[str]: Iterable over image filenames written by the last run

class dgenerate.image_process.ImageProcessRenderLoopConfig[source]

Bases: SetFromMixin

__init__()[source]

check(attribute_namer: Callable[[str], str] = None)[source]

Performs logical validation on the configuration.

This may modify the configuration.

copy() → ImageProcessRenderLoopConfig[source]

Create a deep copy of this ImageProcessRenderLoopConfig instance.

Returns:: ImageProcessRenderLoopConfig instance that is a deep copy of this instance.

align: int = 1: Forced image alignment, corresponds to -al/--align

device: str = 'cpu': Rendering device, corresponds to -d/--device

frame_end: int | None = None: Optional zero indexed inclusive frame slice end, corresponds to -fe/--frame-end

frame_format: str = 'png': Animation frame format, corresponds to -ff/--frame-format

frame_start: int = 0: Zero indexed inclusive frame slice start, corresponds to -fs/--frame-start

input: Sequence[str]: Input file paths.

no_animation_file: bool = False: Disable animated file output when rendering an animation? mutually exclusive with no_frames. Corresponds to -naf/--no-animation-file

no_aspect: bool = False: Disable aspect correction? corresponds to -na/--no-aspect

no_frames: bool = False: Disable frame output when rendering an animation? mutually exclusive with no_animation_file. Corresponds to -nf/--no-frames

offline_mode: bool = False: Setting to true prevents dgenerate from downloading Hugging Face hub models that do not exist in the disk cache or a folder on disk. Referencing a model on Hugging Face hub that has not been cached because it was not previously downloaded will result in a failure when using this option.

output: Sequence[str] | None = None: Output file paths, corresponds to -o/--output

output_overwrite: bool = False: Should existing files be overwritten? corresponds to -ox/--output-overwrite

processors: Sequence[str] | None = None: Image processor URIs, corresponds to -p/--processors

resize: tuple[int, int] | None = None: Naive resizing value, corresponds to -r/--resize
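
The frame_start and frame_end attributes describe a zero indexed, inclusive slice. The sketch below shows how such a slice maps onto Python list slicing. The helper slice_frames is hypothetical and not part of dgenerate.

```python
def slice_frames(frames, frame_start=0, frame_end=None):
    """Select frames using a zero indexed, inclusive start and end."""
    if frame_end is None:
        return frames[frame_start:]
    # frame_end is inclusive, Python slices are exclusive, hence the + 1
    return frames[frame_start:frame_end + 1]


frames = ["f0", "f1", "f2", "f3", "f4"]
selected = slice_frames(frames, frame_start=1, frame_end=3)
```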

class dgenerate.image_process.StartingAnimationEvent(origin, total_frames: int, fps: float, frame_duration: float)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are about to start being generated.

This occurs whether an animation is going to be written to disk or not.

__init__(origin, total_frames: int, fps: float, frame_duration: float)[source]

fps: float: FPS of the generated file.

frame_duration: float: Frame duration of the generated file, (the time a frame is visible in milliseconds)

total_frames: int: Number of frames written.

class dgenerate.image_process.StartingAnimationFileEvent(origin, path: str, total_frames: int, fps: float, frame_duration: float)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are about to start being written to a file.

__init__(origin, path: str, total_frames: int, fps: float, frame_duration: float)[source]

fps: float: FPS of the generated file.

frame_duration: float: Frame duration of the generated file, (the time a frame is visible in milliseconds)

path: str: File path where the animation will reside.

total_frames: int: Number of frames written.

class dgenerate.image_process.StartingGenerationStepEvent(origin, generation_step: int, total_steps: int)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a generation step is starting, a generation step may produce multiple images and/or an animation.

__init__(origin, generation_step: int, total_steps: int)[source]

generation_step: int: The generation step number.

total_steps: int: The total number of steps that are needed to complete the render loop.

dgenerate.image_process.invoke_image_process(args: Sequence[str], render_loop: ImageProcessRenderLoop | None = None, config_overrides: dict[str, Any] | None = None, throw: bool = False, log_error: bool = True, help_raises: bool = False, help_name: str = 'image-process', help_desc: str | None = None) → int[source]

Invoke image-process using its command line arguments and return a return code.

image-process is invoked in the current process, this method does not spawn a subprocess.

Parameters:

args – image-process command line arguments in the form of a list, see: shlex module, or sys.argv
render_loop – ImageProcessRenderLoop instance, if None is provided one will be created. Note that the config object generated by argument parsing will completely overwrite the render loop config.
config_overrides – Optional dictionary of configuration overrides to apply to the render loop config object after argument parsing, this should consist of attribute names with values, the config object generated by argument parsing is of type dgenerate.image_process.arguments.ImageProcessArgs.
throw – Whether to throw known exceptions or handle them.
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – --help raises ImageProcessHelpException ? When True, this will occur even if throw=False
help_name – name used in the --help output
help_desc – description used in the --help output, if None is provided a default value will be used.

Raises:

ImageProcessUsageError –
ImageProcessHelpException –
dgenerate.ImageProcessorArgumentError –
dgenerate.ImageProcessorNotFoundError –
dgenerate.FrameStartOutOfBounds –
dgenerate.MediaIdentificationError –
dgenerate.OutOfMemoryError –
EnvironmentError –

Returns:

integer return-code, anything other than 0 is failure
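
Because args is a list in the style of sys.argv, a command line string can be split with the standard shlex module before being passed in. The sketch below only builds the list. It assumes input paths are given positionally and uses the -o, -p and --align options named in the ImageProcessRenderLoopConfig attribute descriptions. The file names are placeholders.

```python
import shlex

# Assumption: input paths are positional; the file names are placeholders.
command_line = 'input.png -o output.png -p "anyline;gaussian-sigma=2.0" --align 8'
args = shlex.split(command_line)

# With dgenerate installed, the list could then be passed along, for example:
#   return_code = dgenerate.image_process.invoke_image_process(args)
```

Quoting the processor URI keeps the semicolon-separated arguments together as a single list element.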

dgenerate.image_process.invoke_image_process_events(args: Sequence[str], render_loop: ImageProcessRenderLoop | None = None, config_overrides: dict[str, Any] | None = None, throw: bool = False, log_error: bool = True, help_raises: bool = False, help_name: str = 'image-process', help_desc: str | None = None) → Generator[ImageProcessExitEvent | ImageGeneratedEvent | StartingAnimationEvent | StartingAnimationFileEvent | AnimationFileFinishedEvent | ImageFileSavedEvent | AnimationFinishedEvent | StartingGenerationStepEvent | AnimationETAEvent, None, None][source]

Invoke image-process using its command line arguments and return a stream of events.

image-process is invoked in the current process, this method does not spawn a subprocess.

The exceptions mentioned here are those you may encounter upon iterating, they will not occur upon simple acquisition of the event stream iterator.

Parameters:

args – image-process command line arguments in the form of a list, see: shlex module, or sys.argv
render_loop – ImageProcessRenderLoop instance, if None is provided one will be created. Note that the config object generated by argument parsing will completely overwrite the render loop config.
config_overrides – Optional dictionary of configuration overrides to apply to the render loop config object after argument parsing, this should consist of attribute names with values, the config object generated by argument parsing is of type dgenerate.image_process.arguments.ImageProcessArgs.
throw – Whether to throw known exceptions or handle them.
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – --help raises ImageProcessHelpException ? When True, this will occur even if throw=False
help_name – name used in the --help output
help_desc – description used in the --help output, if None is provided a default value will be used.

Raises:

ImageProcessUsageError –
ImageProcessHelpException –
dgenerate.ImageProcessorArgumentError –
dgenerate.ImageProcessorNotFoundError –
dgenerate.FrameStartOutOfBounds –
dgenerate.MediaIdentificationError –
dgenerate.OutOfMemoryError –
EnvironmentError –

Returns:

InvokeImageProcessEventStream

dgenerate.image_process.parse_args(args: Sequence[str] | None = None, overrides: dict[str, Any] | None = None, help_name: str = 'image-process', help_desc: str = None, throw: bool = True, log_error: bool = True, help_raises: bool = False) → ImageProcessArgs | None[source]

Parse and validate the arguments used for image-process, which is a dgenerate sub-command as well as config directive.

Parameters:

args – command line arguments
overrides – Optional dictionary of overrides to apply to the ImageProcessArgs object after parsing but before validation, this should consist of attribute names with values.
help_name – program name displayed in --help output.
help_desc – program description displayed in --help output.
throw – throw ImageProcessUsageError on error? defaults to True
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – --help raises ImageProcessHelpException ? When True, this will occur even if throw=False

Raises:

ImageProcessUsageError –
ImageProcessHelpException –

Returns:

parsed arguments object

dgenerate.imageprocessors module

Image processors implemented by dgenerate.

This includes many image processing tasks useful for creating diffusion input images, or for postprocessing.

exception dgenerate.imageprocessors.ImageProcessorArgumentError[source]

Bases: PluginArgumentError, ImageProcessorError

Raised when an image processor receives invalid arguments.

exception dgenerate.imageprocessors.ImageProcessorError[source]

Bases: Exception

Generic image processor error base exception.

exception dgenerate.imageprocessors.ImageProcessorImageModeError[source]

Bases: ImageProcessorError

Raised when an image processor cannot support a PIL image's reported mode.

A mode being a mode string such as RGB, BGR, etc.

exception dgenerate.imageprocessors.ImageProcessorNotFoundError[source]

Bases: PluginNotFoundError, ImageProcessorError

Raised when a reference to an unknown image processor name is made.

class dgenerate.imageprocessors.AdetailerProcessor(…)[source]

Bases: ImageProcessor

adetailer, diffusion based post processor for SD1.5, SDXL, Kolors, SD3, and Flux

adetailer can detect features of your image and automatically generate an inpaint mask for them, such as faces, hands etc. and then re-run diffusion over those portions of the image using inpainting to enhance detail.

This image processor may only be used if a diffusion pipeline has been previously executed by dgenerate; that pipeline will be used to process the inpainting done by adetailer. For a single command line invocation you must use --post-processors to use this image processor correctly. In a dgenerate config script, you may use it anywhere, and the last executed diffusion pipeline will be reused for inpainting.

Inpainting will occur on the device used by the last executed diffusion pipeline unless the “device” argument is specified, the detector model can be run on an alternate GPU if desired using the “detector-device” argument, otherwise the detector will run on “device”.

Example:

NOWRAP! --post-processors "adetailer;model=Bingsu/adetailer;weight-name=face_yolov8n.pt;prompt=detailed image of a mans face;negative-prompt=nsfw, blurry, disfigured;guidance-scale=7;inference-steps=30;strength=0.4"

The “model” argument specifies which YOLO model to use. This can be a path to a local model file, a URL to download the model from, or a HuggingFace repository slug / blob link.

The “prompt” argument specifies the positive prompt to use for inpainting.

The “negative-prompt” argument specifies the negative prompt for inpainting.

The “prompt-weighter” argument specifies a prompt weighter plugin for applying prompt weighting to the provided positive and negative prompts. Prompt weighters may have arguments, when supplying URI arguments to a prompt weighter you must use double quoting around the prompt weighter definition, i.e: --post-processors "adetailer;model=…;prompt=test;prompt-weighter='compel;syntax=sdwui'"

The “weight-name” argument specifies the file name in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument.

The “subfolder” argument specifies the subfolder in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument.

The “revision” argument specifies the revision of a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument. For example: “main”

The “token” argument specifies your HuggingFace authentication token explicitly if needed.

The “local-files-only” argument specifies that dgenerate should not attempt to download any model files, and to only look for them locally in the cache or otherwise.

The “seed” argument can be used to specify a specific seed for diffusion when performing inpainting on the input image.

The “inference-steps” argument specifies the amount of inference steps when performing inpainting on the input image.

The “guidance-scale” argument specifies the guidance scale for inpainting.

The “pag-scale” argument indicates the perturbed attention guidance scale, this enables a PAG inpaint pipeline if supported. If the previously used pipeline was a PAG pipeline, PAG is automatically enabled for inpainting if supported and this value defaults to 3.0 if not supplied. The adetailer processor supports PAG with --model-type sd and sdxl.

The “pag-adaptive-scale” argument indicates the perturbed attention guidance adaptive scale, this enables a PAG inpaint pipeline if supported. If the previously used pipeline was a PAG pipeline, PAG is automatically enabled for inpainting if supported and this value defaults to 0.0 if not supplied. The adetailer processor supports PAG with --model-type sd and sdxl.

The “strength” argument is analogous to --image-seed-strengths.

The “class-filter” argument can be used to detect only specific classes. This should be a comma-separated list of class IDs or class names, or a single value, for example: “0,2,person,car”. This filter is applied before “index-filter”.

Example “class-filter” values:

NOWRAP! # Only keep detection class ID 0
class-filter=0

NOWRAP! # Only keep detection class "hand"
class-filter=hand

NOWRAP! # keep class ID 2,3
class-filter=2,3

NOWRAP! # keep class ID 0 & class Name "hand"
# if entry cannot be parsed as an integer
# it is interpreted as a name
class-filter=0,hand

NOWRAP! # "0" is interpreted as a name and not an ID,
# this is not likely to be useful
class-filter="0",hand

NOWRAP! # List syntax is supported, you must quote
# class names
class-filter=[0, "hand"]
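
The ID versus name rule for comma-separated “class-filter” values can be sketched as a small parser. This is an illustrative sketch only: parse_class_filter is a hypothetical name, list syntax is not handled, and dgenerate's own parsing may differ.

```python
def parse_class_filter(value):
    """Hypothetical parser for the comma-separated form of class-filter.

    Returns (ids, names): entries that parse as integers become IDs,
    everything else (including quoted entries) is treated as a name.
    """
    ids, names = set(), set()
    for entry in value.split(","):
        entry = entry.strip()
        if len(entry) >= 2 and entry[0] == entry[-1] and entry[0] in "\"'":
            # Quoted entries are always names, even if they look numeric
            names.add(entry[1:-1])
            continue
        try:
            ids.add(int(entry))
        except ValueError:
            names.add(entry)
    return ids, names


mixed = parse_class_filter("0,hand")
quoted = parse_class_filter('"0",hand')
```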

The “index-filter” argument is a list of values or a single value that indicates which YOLO detection indices to keep; index values start at zero. Detections are sorted by their top-left bounding box coordinate from left to right, top to bottom, then by confidence (descending). The order of detections in the image is identical to the reading order of words on a page (English). Inpainting will only be performed on the specified detection indices; if no indices are specified, inpainting will be performed on all detections.

Example “index-filter” values:

NOWRAP! # keep the first, leftmost, topmost detection
index-filter=0

NOWRAP! # keep detections 1 and 3
index-filter=[1, 3]

NOWRAP! # CSV syntax is supported (tuple)
index-filter=1,3
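
Since “class-filter” is applied before “index-filter”, the indices presumably refer to positions among the detections that survive the class filter. The sketch below illustrates that ordering with a hypothetical helper, filter_detections. It takes detections that are already sorted, and the data is made up.

```python
def filter_detections(detections, class_ids=None, indices=None):
    """Apply a class filter first, then an index filter, to sorted detections.

    detections: list of (class_id, label) tuples, assumed to be already sorted.
    class_ids / indices: collections of integers, or None to keep everything.
    """
    if class_ids:
        detections = [d for d in detections if d[0] in class_ids]
    if indices:
        # Assumption: indices refer to positions after the class filter ran
        detections = [d for i, d in enumerate(detections) if i in indices]
    return detections


detections = [(0, "face A"), (1, "hand A"), (0, "face B"), (1, "hand B")]
kept = filter_detections(detections, class_ids={0}, indices={1})
```

With no filters supplied every detection is kept, which matches the documented default of inpainting all detections.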

The “detector-padding” argument specifies the amount of padding that will be added to the detection rectangle which is used to generate a masked area. The default is 0, you can make the mask area around the detected feature larger with positive padding and smaller with negative padding.

Padding examples:

NOWRAP! 32 (32px Uniform, all sides)

NOWRAP! 10x20 (10px Horizontal, 20px Vertical)

NOWRAP! 10x20x30x40 (10px Left, 20px Top, 30px Right, 40px Bottom)
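
The three padding forms above can be expanded into per-side values with a short parser. This is a sketch under assumptions rather than dgenerate's implementation. The names parse_padding and pad_box are hypothetical.

```python
def parse_padding(value):
    """Hypothetical parser returning (left, top, right, bottom) padding in pixels."""
    parts = [int(part) for part in str(value).split("x")]
    if len(parts) == 1:
        # Uniform padding on all sides
        return (parts[0],) * 4
    if len(parts) == 2:
        # Horizontal, vertical
        horizontal, vertical = parts
        return (horizontal, vertical, horizontal, vertical)
    if len(parts) == 4:
        # Left, top, right, bottom
        return tuple(parts)
    raise ValueError("padding must have 1, 2 or 4 components")


def pad_box(box, padding):
    """Grow (or shrink, for negative padding) an (x0, y0, x1, y1) box."""
    left, top, right, bottom = padding
    x0, y0, x1, y1 = box
    return (x0 - left, y0 - top, x1 + right, y1 + bottom)


padded = pad_box((100, 100, 200, 200), parse_padding("10x20"))
```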

The “mask-padding” argument indicates how much padding to place around the masked area when cropping out the image to be inpainted. This value must be large enough to accommodate any feathering on the edge of the mask caused by “mask-blur” or “mask-dilation” for the best result, the default value is 32. The syntax for specifying this value is identical to “detector-padding”.

The “mask-shape” argument indicates what mask shape adetailer should attempt to draw around a detected feature, the default value is “rectangle”. You may also specify “circle” to generate an ellipsoid shaped mask, which might be helpful for achieving better blending.

The “mask-blur” argument indicates the level of gaussian blur to apply to the generated inpaint mask, which can help with smoothly blending in the inpainted feature.

The “mask-dilation” argument indicates the amount of dilation applied to the inpaint mask, see: cv2.dilate

The “model-masks” argument indicates that masks generated by the model itself should be preferred over masks generated from the detection bounding box. If this is True, and the model itself returns mask data, “mask-shape”, “mask-padding”, and “detector-padding” will all be ignored.

The “confidence” argument can be used to adjust the confidence value for the YOLO detector model. Defaults to: 0.3

The “detector-device” argument can be used to specify a device override for the YOLO detector, i.e. the GPU / Accelerate device the model will run on. Example: cuda:0, cuda:1, cpu

The “size” argument specifies the target size for processing detected areas. When specified, detected areas will always be scaled to this target size (with aspect ratio preserved) for processing, then scaled back to the original size for compositing. This can significantly improve detail quality for small detected features like faces or hands, or reduce processing time for overly large detected areas.

The scaling is based on the larger dimension (width or height) of the detected area. If the detected area’s larger dimension is smaller than the target size, it will be upscaled. If the detected area’s larger dimension is larger than the target size, it will be downscaled. Scaling is always performed when this argument is specified.

The value must be an integer greater than 1. The optimal resampling method is automatically selected based on whether upscaling or downscaling is needed.

Example: size=1024 (always process detected areas at 1024px for the larger dimension)
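
The scaling rule for “size” is that the larger dimension of the detected area ends up equal to the target, with the aspect ratio preserved. As a short calculation it looks like the sketch below. The helper scaled_size is hypothetical, and rounding to the nearest pixel is an assumption.

```python
def scaled_size(area_size, target):
    """Scale (width, height) so the larger dimension equals target."""
    width, height = area_size
    scale = target / max(width, height)
    # Assumption: dimensions are rounded to the nearest pixel
    return (round(width * scale), round(height * scale))


upscaled = scaled_size((200, 100), 1024)
downscaled = scaled_size((2048, 1024), 1024)
```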

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

Parameters:: kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Inheritor must implement.

This method should not be invoked directly, use the class method ImageProcessor.call_post_resize() to invoke it.

Parameters:: image – image to process
Returns:: the processed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Inheritor must implement.

This method should not be invoked directly, use the class method ImageProcessor.call_pre_resize() to invoke it.

Parameters:

image – image to process
resize_resolution – image will be resized to this resolution after this process is complete. If None is passed no resize is going to occur. It is not the duty of the inheritor to resize the image, in fact it should NEVER be resized.

Returns:

the processed image

to(device) → AdetailerProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': 'in'}}

HIDE_ARGS = ['pipe', 'model-offload']

NAMES = ['adetailer']

OPTION_ARGS = {'mask-shape': ['r', 'rect', 'rectangle', 'c', 'circle', 'ellipse']}

class dgenerate.imageprocessors.AnylineProcessor(gaussian_sigma: float = 2.0, intensity_threshold: int = 2, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

anyline, MistoLine Control Every Line image preprocessor, see: https://huggingface.co/TheMistoAI/MistoLine

This is an edge detector based on TEED.

The “gaussian-sigma” argument is the gaussian filter sigma value.

The “intensity-threshold” argument is the pixel value intensity threshold.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
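
The two accepted shapes of a “detect-resolution” value can be illustrated with a small parser. This is a hypothetical sketch: parse_detect_resolution is not a dgenerate function, and treating a single dimension as both width and height is an assumption.

```python
def parse_detect_resolution(value):
    """Hypothetical parser for "512" or "1024x512" style values."""
    parts = value.lower().split("x")
    if len(parts) == 1:
        # Assumption: a single dimension is used for both width and height
        side = int(parts[0])
        return (side, side)
    if len(parts) == 2:
        return (int(parts[0]), int(parts[1]))
    raise ValueError("expected a single dimension or WIDTHxHEIGHT")


square = parse_detect_resolution("512")
wide = parse_detect_resolution("1024x512")
```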

__init__(gaussian_sigma: float = 2.0, intensity_threshold: int = 2, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

gaussian_sigma – gaussian filter sigma value
intensity_threshold – pixel value intensity threshold
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly an anyline detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly an anyline detected image, or the input image

NAMES = ['anyline']

class dgenerate.imageprocessors.CannyEdgeDetectProcessor(lower: int = 50, upper: int = 100, aperture_size: int = 3, L2_gradient: bool = False, blur: bool = False, gray: bool = False, threshold_algo: str | None = None, sigma: float = 0.33, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Process the input image with the Canny edge detection algorithm.

The “lower” argument indicates the lower threshold value for the algorithm, and the “upper” argument indicates the upper threshold. “aperture-size” is the size of the Sobel kernel used to find image gradients, it must be an odd integer from 3 to 7. “L2-gradient” specifies the equation for finding gradient magnitude, if True a more accurate equation is used. See: https://docs.opencv.org/3.4/da/d22/tutorial_py_canny.html.

If “blur” is true, a 3x3 gaussian blur is applied before processing. If “gray” is true, the image is converted to the cv2 “GRAY” format before processing, which does not happen automatically unless you are using a “threshold-algo” value. OpenCV is capable of edge detection on colored images, but converting to its internal grayscale format before processing may or may not give better results.

If “threshold-algo” is one of (“otsu”, “triangle”, “median”), dgenerate will try to calculate the lower and upper thresholds automatically using cv2.threshold or cv2.median in the case of “median”. “sigma” scales the range of the automatic threshold calculation done when a value for “threshold-algo” is selected. “pre-resize” is a boolean value determining if the processing should take place before or after the image is resized by dgenerate.

The “detect-resolution” argument is the resolution the image is resized to inside the processor before detection is run on it. It should be a single dimension, for example “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image is resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output is resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
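As a rough illustration of how “sigma” can scale an automatically chosen threshold range, the sketch below applies the widely used median heuristic to a list of grayscale values. It is an assumption that the “median” mode behaves like this; auto_canny_thresholds is a hypothetical helper and is not part of dgenerate.

```python
from statistics import median


def auto_canny_thresholds(pixels, sigma=0.33):
    """Derive (lower, upper) Canny thresholds from grayscale pixel values.

    Hypothetical helper, not part of dgenerate: it applies the common
    median heuristic, where sigma widens or narrows the band around the median.
    """
    m = median(pixels)
    lower = max(0, round((1.0 - sigma) * m))
    upper = min(255, round((1.0 + sigma) * m))
    return lower, upper


# Flattened 8-bit grayscale values standing in for an image.
sample = [90, 100, 110]
lower, upper = auto_canny_thresholds(sample)
print(lower, upper)
```

A larger sigma gives a wider band between the two thresholds, so more weak edges are considered; a smaller sigma keeps both thresholds close to the median.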

__init__(lower: int = 50, upper: int = 100, aperture_size: int = 3, L2_gradient: bool = False, blur: bool = False, gray: bool = False, threshold_algo: str | None = None, sigma: float = 0.33, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

lower – lower threshold for canny edge detection
upper – upper threshold for canny edge detection
aperture_size – aperture size, an odd integer from 3 to 7
L2_gradient – Use L2_gradient? https://docs.opencv.org/3.4/da/d22/tutorial_py_canny.html
blur – apply a 3x3 gaussian blur before processing?
gray – convert to cv2.GRAY format before processing?
threshold_algo – optional auto thresholding algorithm. One of “otsu”, “triangle”, or “median”. The lower and upper threshold values are determined automatically from the image content if this argument is supplied.
sigma – scales the range of the automatic threshold calculation
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, canny edge detection may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a canny edge detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, canny edge detection may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a canny edge detected image, or the input image

to(device) → CannyEdgeDetectProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['canny']

class dgenerate.imageprocessors.CropProcessor(box: str, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Crop the input image to a specified box region.

The “box” argument specifies the crop region in the format “LEFTxTOPxRIGHTxBOTTOM”, where each value represents pixel coordinates. For example: “100x50x300x400” will crop the image with top left: (x=100, y=50), bottom right: (x=300, y=400)

The “pre-resize” argument determines if the cropping occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is cropped after dgenerate is done resizing it.
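The “box” format can be sketched as a small parser. This is only an illustration of the documented string format; parse_crop_box is a hypothetical helper, and the validation it performs is an assumption rather than dgenerate's own behavior. The resulting tuple uses the same (left, upper, right, lower) order that Pillow's Image.crop() expects.

```python
def parse_crop_box(box):
    """Parse "LEFTxTOPxRIGHTxBOTTOM" into a (left, top, right, bottom) tuple.

    Hypothetical helper for illustration; dgenerate's own parsing may differ.
    """
    parts = box.lower().split("x")
    if len(parts) != 4:
        raise ValueError(f"expected LEFTxTOPxRIGHTxBOTTOM, got: {box!r}")
    left, top, right, bottom = (int(p) for p in parts)
    if right <= left or bottom <= top:
        raise ValueError("box must have positive width and height")
    return left, top, right, bottom


box = parse_crop_box("100x50x300x400")
width, height = box[2] - box[0], box[3] - box[1]
print(box, width, height)
```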

__init__(box: str, pre_resize: bool = False, **kwargs)[source]

Parameters:

box – Crop region in format “LEFTxTOPxRIGHTxBOTTOM”
pre_resize – Whether to crop before or after dgenerate’s resize operation
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Crop the image if pre_resize is False.

Parameters:: image – image to process
Returns:: the cropped image if pre_resize is False, otherwise the original image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Crop the image if pre_resize is True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

the cropped image if pre_resize is True, otherwise the original image

to(device) → CropProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['crop']

class dgenerate.imageprocessors.CropToMaskProcessor(mask: str | None = None, mask_processors: str | None = None, padding: str | int | None = None, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Crop an image to the bounds of a mask downloaded from a URL or loaded from a file.

This processor loads a mask image from a file path or URL and automatically crops the input image to the bounding box of the white areas in the mask. The mask should have white pixels for the area of interest and black/dark pixels for areas to ignore.

The “mask” argument specifies the path to a mask file or URL to download the mask from. If you do not specify “mask”, the processed image is assumed to be a mask.

The “mask-processors” argument allows you to pre-process the “mask” argument with an arbitrary image processor chain, for example: invert, gaussian-blur, etc. This argument's value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “padding” argument can be used to add padding around the detected bounds:

- A single integer (e.g., “10”) applies uniform padding on all sides
- “WIDTHxHEIGHT” format (e.g., “10x20”) applies WIDTH padding horizontally and HEIGHT padding vertically
- “LEFTxTOPxRIGHTxBOTTOM” format (e.g., “5x10x5x15”) applies specific padding to each side

Padding values may be negative if desired.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
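The three “padding” formats described above can be sketched as follows. Both helpers are hypothetical and exist only for illustration; in particular, clamping the padded box to the image bounds is an assumption and may not match what dgenerate does.

```python
def parse_padding(padding):
    """Return (left, top, right, bottom) padding from an int or a string.

    Hypothetical helper; it mirrors the three documented formats only.
    """
    parts = [int(p) for p in str(padding).lower().split("x")]
    if len(parts) == 1:
        return (parts[0],) * 4
    if len(parts) == 2:
        width, height = parts
        return width, height, width, height
    if len(parts) == 4:
        return tuple(parts)
    raise ValueError(f"unsupported padding format: {padding!r}")


def pad_bounds(bounds, padding, image_size):
    """Grow (or shrink, for negative values) a box and clamp it to the image."""
    left, top, right, bottom = bounds
    pad_l, pad_t, pad_r, pad_b = parse_padding(padding)
    img_w, img_h = image_size
    return (
        max(0, left - pad_l),
        max(0, top - pad_t),
        min(img_w, right + pad_r),
        min(img_h, bottom + pad_b),
    )


print(pad_bounds((100, 50, 300, 400), "10x20", (512, 512)))
print(pad_bounds((100, 50, 300, 400), -5, (512, 512)))
```

Positive padding moves each edge outward from the detected bounds, while negative padding moves the edges inward.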

classmethod inheritable_help(loaded_by_name)[source]

__init__(mask: str | None = None, mask_processors: str | None = None, padding: str | int | None = None, pre_resize: bool = False, **kwargs)[source]

Parameters:

mask – Path to mask image file or URL. White pixels indicate areas of interest. Or None indicating that the processed image is the mask.
mask_processors – Pre-process mask with an arbitrary image processor chain.
padding – Padding to apply around the detected bounds. Can be an integer for uniform padding, WIDTHxHEIGHT for horizontal/vertical padding, or LEFTxTOPxRIGHTxBOTTOM for specific side padding.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, cropping may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a cropped image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, cropping may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a cropped image, or the input image

to(device) → CropToMaskProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

FILE_ARGS = {'mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}}

NAMES = ['crop-to-mask']

class dgenerate.imageprocessors.DilateProcessor(size: int | str = 3, steps: int = 1, shape: str = 'rectangle', pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Apply morphological dilation to the input image using OpenCV.

Dilation is a morphological operation that expands white regions (foreground objects) in a binary or grayscale image. It’s commonly used to fill small holes inside objects or to connect nearby objects.

The “size” argument specifies the size of the structuring element used for dilation. It can be either an odd integer (e.g., 3, 5, 7, etc.) representing both width and height, or a string specifying different dimensions like “5x3” for width x height. All dimensions must be odd positive integers.

The “steps” argument specifies how many times the dilation operation is applied. More steps result in more expansion.

The “shape” argument specifies the shape of the structuring element:

- “r” or “rect” or “rectangle”: rectangular kernel (default)
- “c” or “circle” or “ellipse”: elliptical kernel
- “+” or “cross”: cross-shaped kernel

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
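To make the effect of “size” and “steps” concrete, here is a pure-Python sketch of dilation with a rectangular kernel on a small grid of 0/255 values. It is a stand-in for illustration only and is not how the processor is implemented; a real pipeline would normally rely on OpenCV's cv2.dilate.

```python
def dilate(grid, size=3, steps=1):
    """Dilate a 2D grid of 0/255 values with a square (rectangular) kernel.

    Pure-Python stand-in for illustration, not dgenerate's implementation.
    """
    if size < 1 or size % 2 == 0:
        raise ValueError("size must be an odd positive integer")
    radius = size // 2
    height, width = len(grid), len(grid[0])
    for _ in range(steps):
        out = [[0] * width for _ in range(height)]
        for y in range(height):
            for x in range(width):
                # Each output pixel takes the maximum of its neighborhood.
                out[y][x] = max(
                    grid[ny][nx]
                    for ny in range(max(0, y - radius), min(height, y + radius + 1))
                    for nx in range(max(0, x - radius), min(width, x + radius + 1))
                )
        grid = out
    return grid


image = [[0] * 5 for _ in range(5)]
image[2][2] = 255  # a single white pixel in the center

for row in dilate(image, size=3, steps=1):
    print(row)
```

With a 3x3 kernel, one step turns the single white pixel into a 3x3 white block, and a second step grows it to 5x5.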

__init__(size: int | str = 3, steps: int = 1, shape: str = 'rectangle', pre_resize: bool = False, **kwargs)[source]

Parameters:

size – size of the structuring element (must be odd integer or string like “5x3”)
steps – number of times to apply the dilation operation
shape – shape of the structuring element (“r” or “rect” or “rectangle”, “c” or “circle” or “ellipse”, or “+” or “cross”)
pre_resize – process the image before it is resized, or after? default is False (after)
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, dilation may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a dilated image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, dilation may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a dilated image, or the input image

to(device) → DilateProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['dilate']

OPTION_ARGS = {'shape': ['r', 'rect', 'rectangle', 'c', 'circle', 'ellipse', '+', 'cross']}

class dgenerate.imageprocessors.GaussianBlurProcessor(size: int | str = 5, sigma_x: float = 0.0, sigma_y: float = 0.0, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Apply Gaussian blur to the input image using OpenCV.

Gaussian blur is a widely used effect in image processing that reduces image noise and detail by convolving the image with a Gaussian kernel.

The “size” argument specifies the size of the Gaussian kernel. It can be either an odd integer (e.g., 3, 5, 7, etc.) representing both width and height, or a string specifying different dimensions like “5x3” for width x height. All dimensions must be odd positive integers. Larger kernel sizes produce more blur.

The “sigma-x” argument specifies the standard deviation in the X direction. If 0, it’s calculated from the kernel size.

The “sigma-y” argument specifies the standard deviation in the Y direction. If 0, it’s set to the same value as sigma-x.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
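The “size” format and the “calculated from the kernel size” behavior can be sketched as below. parse_kernel_size is a hypothetical helper. The sigma formula is the one given in OpenCV's getGaussianKernel documentation for a non-positive sigma; that it applies here assumes the processor hands these values to OpenCV unchanged.

```python
def parse_kernel_size(size):
    """Parse 5 or "5x3" into a (width, height) tuple of odd positive integers.

    Hypothetical helper for illustration; not dgenerate's own parser.
    """
    parts = [int(p) for p in str(size).lower().split("x")]
    if len(parts) == 1:
        parts = parts * 2
    if len(parts) != 2 or any(p < 1 or p % 2 == 0 for p in parts):
        raise ValueError(f"size must be odd positive integers, got: {size!r}")
    return parts[0], parts[1]


def default_sigma(ksize):
    """Sigma OpenCV documents for a kernel dimension when sigma is not positive."""
    return 0.3 * ((ksize - 1) * 0.5 - 1) + 0.8


width, height = parse_kernel_size("5x3")
print(width, height, default_sigma(width), default_sigma(height))
```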

__init__(size: int | str = 5, sigma_x: float = 0.0, sigma_y: float = 0.0, pre_resize: bool = False, **kwargs)[source]

Parameters:

size – size of the Gaussian kernel (must be odd integer or string like “5x3”)
sigma_x – standard deviation in X direction (0 = calculate from kernel size)
sigma_y – standard deviation in Y direction (0 = same as sigma_x)
pre_resize – process the image before it is resized, or after? default is False (after)
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, Gaussian blur may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a blurred image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, Gaussian blur may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a blurred image, or the input image

to(device) → GaussianBlurProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['gaussian-blur']

class dgenerate.imageprocessors.HEDProcessor(scribble: bool = False, safe: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

HED detection (holistically-nested edge detection). This is an edge detection algorithm that can produce something akin to thick lineart.

The “scribble” argument determines whether scribble mode is enabled, this produces thicker lines.

The “safe” argument enables or disables numerically safe / more precise stepping.

The “detect-resolution” argument is the resolution the image is resized to inside the processor before detection is run on it. It should be a single dimension, for example “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image is resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output is resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
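How “detect-resolution”, “detect-aspect” and “detect-align” might combine can be sketched as below. Both helpers are hypothetical, and several details are assumptions: that a single dimension means a square target, that an aspect correct resize fits inside the target box, and that alignment rounds each dimension down to a multiple of the alignment value.

```python
def parse_detect_resolution(value):
    """Parse "512" or "1024x512" into a (width, height) tuple.

    Hypothetical helper; a single dimension is assumed to mean a square.
    """
    parts = [int(p) for p in value.lower().split("x")]
    if len(parts) == 1:
        return parts[0], parts[0]
    if len(parts) == 2:
        return parts[0], parts[1]
    raise ValueError(f"invalid resolution: {value!r}")


def fit_resolution(image_size, target, aspect_correct=True, align=1):
    """Pick the size to resize to before detection runs.

    Assumptions: an aspect correct resize fits inside the target box, and
    alignment rounds each dimension down to a multiple of `align`.
    """
    width, height = image_size
    target_w, target_h = target
    if aspect_correct:
        scale = min(target_w / width, target_h / height)
        target_w, target_h = round(width * scale), round(height * scale)
    target_w -= target_w % align
    target_h -= target_h % align
    return max(align, target_w), max(align, target_h)


target = parse_detect_resolution("1024x512")
print(fit_resolution((1920, 1080), target, aspect_correct=True, align=8))
```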

__init__(scribble: bool = False, safe: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

scribble – determines whether or not scribble mode is enabled, this produces thicker lines
safe – enables numerically safe / more precise stepping
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly an HED edge detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly an HED edge detected image, or the input image

NAMES = ['hed']

class dgenerate.imageprocessors.ImageProcessor(loaded_by_name: str, device: str | None = None, output_file: str | None = None, output_overwrite: bool = False, model_offload: bool = False, local_files_only: bool = False, **kwargs)[source]

Bases: Plugin, ABC

Abstract base class for image processor implementations.

classmethod inheritable_help(loaded_by_name)[source]

static image_in_filetypes()[source]

Utility for derived classes to get a list of supported image input file types for use with FILE_ARGS.

Returns:: List of supported image input file types, for example ['*.png', '*.jpg'].

static image_out_filetypes()[source]

Utility for derived classes to get a list of supported image output file types for use with FILE_ARGS.

Returns:: List of supported image output file types, for example ['*.png', '*.jpg'].

__init__(loaded_by_name: str, device: str | None = None, output_file: str | None = None, output_overwrite: bool = False, model_offload: bool = False, local_files_only: bool = False, **kwargs)[source]

Parameters:

loaded_by_name – The name the processor was loaded by
device – the device the processor will run on, for example: cpu, cuda, cuda:1. Specifying None causes the device to default to cpu.
output_file – output a debug image to this path
output_overwrite – can the debug image output path be overwritten?
model_offload – if True, any torch modules that the processor has registered are offloaded to the CPU immediately after processing an image
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

get_alignment() → int | None[source]

Overridable method.

Get required input image alignment, which will be forcefully applied.

If this function returns None, specific alignment is not required and will never be forced.

Returns:: integer or None

abstractmethod impl_post_resize(image: Image) → Image[source]

Inheritor must implement.

This method should not be invoked directly, use ImageProcessor.post_resize() to invoke it.

Parameters:: image – image to process
Returns:: the processed image

abstractmethod impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None) → Image[source]

Inheritor must implement.

This method should not be invoked directly, use ImageProcessor.pre_resize() to invoke it.

Parameters:

image – image to process
resize_resolution – image will be resized to this resolution after this process is complete. If None is passed no resize is going to occur. It is not the duty of the inheritor to resize the image, in fact it should NEVER be resized.

Returns:

the processed image

load_object_cached(tag: str, estimated_size: int, method: Callable, memory_guard_device: str | device | None = 'cpu')[source]

Load a potentially large object into the CPU side image_processor object cache.

Parameters:

tag – A unique string within the context of the image processor implementation constructor.
estimated_size – Estimated size in bytes of the object in RAM.
method – A method which loads and returns the object.
memory_guard_device – call ImageProcessor.memory_guard_device() on the specified device before the object is loaded (on cache miss)

Returns:

The loaded object

memory_guard_device(device: str | device, memory_required: int)[source]

Check a specific device against an amount of memory in bytes.

If the device is a gpu device and any of the memory constraints specified by dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_GPU_MEMORY_CONSTRAINTS are met on that device, attempt to remove cached objects off a gpu device to free space.

If the device is a cpu and any of the memory constraints specified by dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_CACHE_GC_CONSTRAINTS are met, attempt to remove cached image processor objects off the device to free space. Then, enforce dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_CACHE_MEMORY_CONSTRAINTS.

Parameters:

device – the device
memory_required – the amount of memory required on the device in bytes

Returns:

True if an attempt was made to free memory, False otherwise.

post_resize(image: Image) → Image[source]

Invoke a processor's ImageProcessor.impl_post_resize() method.

Implements important behaviors depending on if the image was modified.

This is the only appropriate way to invoke a processor manually.

The original image will be closed if the implementation returns a new image instead of modifying it in place. Do not count on the original image being open and usable once this function completes, though it is safe to use the input image in a with context. If you need to retain a copy, pass a copy.

Raises:

dgenerate.OutOfMemoryError – if the execution device runs out of memory
dgenerate.ImageProcessorImageModeError – if a passed image has an invalid format

Parameters:

self – ImageProcessor implementation instance
image – the image to pass

Returns:

processed image, may be the same image or a copy.

pre_resize(image: Image, resize_resolution: tuple[int, int] | None = None) → Image[source]

Invoke a processor's ImageProcessor.impl_pre_resize() method.

Implements important behaviors depending on if the image was modified.

This is the only appropriate way to invoke a processor manually.

The original image will be closed if the implementation returns a new image instead of modifying it in place. Do not count on the original image being open and usable once this function completes, though it is safe to use the input image in a with context. If you need to retain a copy, pass a copy.

Raises:

dgenerate.OutOfMemoryError – if the execution device runs out of memory
dgenerate.ImageProcessorImageModeError – if a passed image has an invalid format

Parameters:

self – ImageProcessor implementation instance
image – the image to pass
resize_resolution – the size that the image is going to be resized to after this step, or None if it is not being resized.

Returns:

processed image, may be the same image or a copy.

process(image: Image, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None)[source]

Perform image processing on an image, including the requested resizing step.

Invokes the image processor pre and post resizing with appropriate arguments and correct resource management.

The original image will be closed if the implementation returns a new image instead of modifying it in place. Do not count on the original image being open and usable once this function completes, though it is safe to use the input image in a with context. If you need to retain a copy, pass a copy.

Raises:

dgenerate.OutOfMemoryError – if the execution device runs out of memory
dgenerate.ImageProcessorImageModeError – if a passed image has an invalid format

Parameters:

image – image to process
resize_resolution – image will be resized to this dimension by this method.
aspect_correct – Should the resize operation be aspect correct?
align – Align by this amount of pixels, if the input image is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.

Returns:

the processed image
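The relationship between process(), the two impl_* hooks and a “pre-resize” style flag can be sketched with stand-in classes. Nothing below is dgenerate code: SketchProcessor and TagProcessor are hypothetical, and images are modeled as (width, height, history) tuples so the example runs without Pillow or dgenerate installed.

```python
from abc import ABC, abstractmethod


class SketchProcessor(ABC):
    """Stand-in for the documented interface; NOT dgenerate's ImageProcessor."""

    @abstractmethod
    def impl_pre_resize(self, image, resize_resolution):
        ...

    @abstractmethod
    def impl_post_resize(self, image):
        ...

    def process(self, image, resize_resolution=None):
        # Order sketched here: pre-resize hook, the resize, post-resize hook.
        image = self.impl_pre_resize(image, resize_resolution)
        if resize_resolution is not None:
            _, _, history = image
            image = (*resize_resolution, history + ["resize"])
        return self.impl_post_resize(image)


class TagProcessor(SketchProcessor):
    """Hypothetical processor that records when it ran, honoring pre_resize."""

    def __init__(self, pre_resize=False):
        self.pre_resize = pre_resize

    def _run(self, image):
        width, height, history = image
        return width, height, history + ["tag"]

    def impl_pre_resize(self, image, resize_resolution):
        # Never resize here; resize_resolution is informational only.
        return self._run(image) if self.pre_resize else image

    def impl_post_resize(self, image):
        return image if self.pre_resize else self._run(image)


print(TagProcessor(pre_resize=False).process((640, 480, []), (512, 512)))
print(TagProcessor(pre_resize=True).process((640, 480, []), (512, 512)))
```

The recorded history shows the work happening after the resize by default and before it when pre_resize is True, which is the “possibly processed image, or the input image” behavior described for the built-in processors.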

register_module(module)[source]

Register torch.nn.Module objects.

These will be brought on to the cpu during finalization.

All of these modules can be cast to a specific device with ImageProcessor.to

Parameters:: module – the module

set_size_estimate(size_bytes: int)[source]

Set the estimated size of this plugin in bytes for memory management heuristics, this is intended to be used by implementors of the ImageProcessor plugin class.

For the best memory optimization, this value should be set very shortly before any associated model enters CPU side RAM, i.e. before it is loaded at all.

Raises:: ValueError – if size_bytes is less than zero.
Parameters:: size_bytes – the size in bytes

to(device: device | str) → ImageProcessor[source]

Move all torch.nn.Module modules registered to this image processor to a specific device.

Raises:: dgenerate.OutOfMemoryError – if there is not enough memory on the specified device
Parameters:: device – The device string, or torch device object
Returns:: the image processor itself

FILE_ARGS = {'output-file': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.pcx', '*.dds', '*.ps', '*.eps', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.tif', '*.tiff', '*.mpo', '*.msp', '*.palm', '*.pdf', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm'])], 'mode': 'out'}}

HIDE_ARGS = ['local-files-only']

property device: str

The rendering device requested for this processor.

Torch modules associated with the processor will not be on this device until the processor is used.

Returns:: device string, for example “cuda”, “cuda:N”, or “cpu”

property image_modes: list[str]

Returns a list of PIL image modes that this processor can handle.

This may be overridden by implementers

Returns:: ['RGB']

property local_files_only: bool: Is this image processor only going to look for resources such as models in cache / on disk?

property model_offload: bool

Model offload status.

Returns:: True or False

property modules_device: device

The rendering device that this processor's modules currently exist on.

This will change with calls to ImageProcessor.to() and possibly when the processor is used.

Returns:: torch.device, using str() on this object will yield a device string such as “cuda”, “cuda:N”, or “cpu”

property size_estimate: int

Estimated size of the models / objects used by this image processor.

Returns:: size in bytes

class dgenerate.imageprocessors.ImageProcessorChain(image_processors: Iterable[ImageProcessor] | None = None)[source]

Bases: ImageProcessor

Implements chainable image processors.

Chains processing steps together in a sequence.

__init__(image_processors: Iterable[ImageProcessor] | None = None)[source]

Parameters:: image_processors – optional initial image processors to fill the chain, accepts an iterable

add_processor(image_processor: ImageProcessor)[source]

Add an image processor implementation to the chain.

Parameters:: image_processor – dgenerate.imageprocessors.imageprocessor.ImageProcessor

impl_post_resize(image: Image)[source]

Invoke post_resize on all image processors in this image processor chain in turn.

Every subsequent invocation receives the last processed image as its argument.

This method should not be invoked directly, use the class method dgenerate.imageprocessors.imageprocessor.ImageProcessor.post_resize() to invoke it.

Parameters:: image – initial image to process
Returns:: the processed image, possibly affected by every imageprocessor in the chain

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Invoke pre_resize on all image processors in this imageprocessor chain in turn.

Every subsequent invocation receives the last processed image as its argument.

This method should not be invoked directly, use the class method dgenerate.imageprocessors.imageprocessor.ImageProcessor.pre_resize() to invoke it.

Parameters:

image – initial image to process
resize_resolution – the size which the image will be resized to after this step, this is only information for the image processors and the image will not be resized by this method. Image processors should never resize images as it is the responsibility of dgenerate to do that for the user.

Returns:

the processed image, possibly affected by every image processor in the chain
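The “every subsequent invocation receives the last processed image” behavior amounts to a left fold over the processors. The sketch below shows that idea with plain callables; run_chain and the two step functions are hypothetical stand-ins, not dgenerate's ImageProcessorChain.

```python
from functools import reduce


def run_chain(processors, image):
    """Feed the image through each processor in order.

    `processors` is any sequence of callables; this is a stand-in for the
    chain's behavior, not dgenerate's ImageProcessorChain itself.
    """
    return reduce(lambda current, step: step(current), processors, image)


def invert(pixels):
    """Hypothetical step: invert 8-bit pixel values."""
    return [255 - p for p in pixels]


def threshold(pixels):
    """Hypothetical step: binarize pixel values at 128."""
    return [255 if p >= 128 else 0 for p in pixels]


print(run_chain([invert, threshold], [10, 200, 130]))
```

Order matters: swapping the two steps in this example would threshold first and invert second, producing a different result.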

to(device: device | str) → ImageProcessorChain[source]

Move all torch.nn.Module modules registered to this image processor to a specific device.

Raises:: dgenerate.OutOfMemoryError – if there is not enough memory on the specified device
Parameters:: device – The device string, or torch device object
Returns:: the image processor itself

HIDDEN = True

class dgenerate.imageprocessors.ImageProcessorLoader[source]

Bases: PluginLoader

Loads dgenerate.imageprocessors.ImageProcessor plugins.

__init__()[source]

Parameters:

base_class – Base class of plugins, will be used for searching modules.
description – Short plugin description / name, used in exception messages.
reserved_args – Constructor arguments that are used by the plugin class which cannot be redefined by implementors of the plugin class. This should be a list of plugin argument descriptors, PluginArg
argument_error_type – This exception type will be raised when the plugin is loaded with invalid URI arguments.
not_found_error_type – This exception type will be raised when a plugin could not be located by a name specified in a loading URI.

load(uri: str | Iterable[str], device: str = 'cpu', local_files_only: bool = False, **kwargs) → ImageProcessor | ImageProcessorChain | None[source]

Load an image processor or multiple image processors. They are loaded by URI, which is their name and any module arguments, for example: canny;lower=50;upper=100

Specifying multiple processors with a list will create an image processor chain object.

Raises:

RuntimeError – if more than one class was found using the provided name mentioned in the URI.
ImageProcessorNotFoundError – if the name mentioned in the URI could not be found.
ImageProcessorArgumentError – if the URI contained invalid arguments.

Parameters:

uri – Processor URI or list of URIs
device – Request a specific rendering device, default is CPU
local_files_only – Should the image processor(s) avoid downloading files from Hugging Face hub and only check the cache or local directories?
kwargs – Default argument values, will be overridden by arguments specified in the URI

Returns:

dgenerate.imageprocessors.ImageProcessor or dgenerate.imageprocessors.ImageProcessorChain
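The URI shape used by load(), a name followed by “;”-separated key=value arguments, can be sketched with a naive parser. parse_processor_uri is a hypothetical helper: it ignores quoting and type conversion, and it only illustrates the documented rule that arguments in the URI override the defaults passed as kwargs.

```python
def parse_processor_uri(uri, **defaults):
    """Split "name;key=value;..." into a name and an argument dict.

    Naive illustration only: it ignores quoting and type conversion, which
    a real URI parser would need to handle.
    """
    name, *pairs = (part.strip() for part in uri.split(";"))
    args = dict(defaults)
    for pair in pairs:
        key, _, value = pair.partition("=")
        # Arguments in the URI override the supplied defaults.
        args[key.strip()] = value.strip()
    return name, args


print(parse_processor_uri("canny;lower=50;upper=100", lower="10", blur="true"))
```

With dgenerate installed, the documented entry point for the real behavior is ImageProcessorLoader.load(), which accepts the same URI string or a list of them.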

class dgenerate.imageprocessors.ImageProcessorMixin(image_processor: ImageProcessor | None = None, *args, **kwargs)[source]

Bases: object

Mixin functionality for objects that can do image processing such as implementors of dgenerate.mediainput.AnimationReader

This object can also be instantiated and used alone.

This object implements resizing functionality identical to an image processor when no image processor is assigned or the image processor is disabled. This is used for, among other things, frame slicing with an image processor involved.

__init__(image_processor: ImageProcessor | None = None, *args, **kwargs)[source]

Parameters:

image_processor – the processor implementation that will be doing the image processing.
args – mixin forwarded args
kwargs – mixin forwarded kwargs

process_image(image: Image, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None)[source]

Perform image processing on an image, including the requested resizing step.

Invokes the assigned image processor on an image.

If no processor is assigned or the processor is disabled, only necessary resizing will be performed based on the given arguments.

The original image will be closed if the processor returns a new image instead of modifying it in place. Do not count on the original image being open and usable once this function completes with a processor assigned and enabled. It is safe to use the input image in a with context. If you need to retain a copy, pass a copy.

Parameters:

image – image to process
resize_resolution – image will be resized to this dimension by this method.
aspect_correct – Should the resize operation be aspect correct?
align – Align by this amount of pixels, if the input image is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.

Returns:

the processed image, processed by the processor assigned in the constructor.
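
The effect of the align argument on an image size can be sketched with plain arithmetic. This is an illustration only: `align_size` is a hypothetical helper, and rounding each dimension down to the nearest multiple is an assumption, since the text above does not say in which direction dgenerate rounds.

```python
from typing import Optional


def align_size(size: tuple[int, int], align: Optional[int]) -> tuple[int, int]:
    """Return a size whose width and height are multiples of ``align``.

    Assumption: dimensions are rounded DOWN. None or 1 disables alignment,
    as described for the ``align`` parameter.
    """
    if align is None or align == 1:
        return size
    if align < 1:
        raise ValueError("align must be a positive integer")
    width, height = size
    return (width - width % align, height - height % align)


print(align_size((513, 770), 8))
print(align_size((513, 770), None))
```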

image_processor: ImageProcessor | None = None

Current image processor.

Images will still be resized as needed/requested if this is not assigned.

image_processor_enabled: bool

Enable or disable image processing.

Images will still be resized as needed/requested with this disabled.

class dgenerate.imageprocessors.InpaintProcessor(model: str, mask: str | None = None, mask_processors: str | None = None, image: str | None = None, image_processors: str | None = None, dtype: str = 'float32', pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Inpaint an image using inpainting model architectures supported by Spandrel (AI-based inpainting).

This processor uses inpainting models such as LaMa, MAT, and other architectures supported by Spandrel for advanced AI-based inpainting to fill in missing or masked areas in images.

This processor requires either a mask or subject image to be provided via the “mask” or “image” arguments. These arguments are mutually exclusive.

When using the “mask” argument, the incoming image is considered the subject image to be inpainted, and the “mask” argument provides a grayscale mask where white pixels (255) indicate areas to inpaint and black pixels (0) indicate areas to preserve.

When using the “image” argument, the incoming image is considered the mask, and the “image” argument provides the subject image to be inpainted. The incoming mask image should be a grayscale image where white pixels (255) indicate areas to inpaint and black pixels (0) indicate areas to preserve.

The “mask” or “image” argument should point to a file path on disk or a URL that can be downloaded. Both local files and remote URLs are supported. The mask or image will be resized to match the dimensions of the corresponding target image if they are not the same size.

The “mask-processors” argument allows you to pre-process the “mask” argument with an arbitrary image processor chain, for example: invert, gaussian-blur, etc. This argument's value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur. You can force alignment using the “resize” image processor if desired.

The “image-processors” argument allows you to pre-process the “image” argument with an arbitrary image processor chain, for example: invert, gaussian-blur, etc. This argument's value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur. You can force alignment using the “resize” image processor if desired.

The “model” argument specifies the path to an inpainting model file supported by Spandrel, such as LaMa, MAT, or other compatible architectures. This can be a local file path or a URL that can be downloaded.

The “dtype” argument can be used to specify the datatype to use for the model in memory, it can be either “float32” or “float16”. Using “float16” will result in a smaller memory footprint if supported by the model. If a model doesn’t support FP16, it will automatically fall back to FP32 with a warning message.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
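
The mutually exclusive “mask” and “image” arguments decide which role the incoming image plays. A minimal sketch of that rule follows. `resolve_inpaint_inputs` is a hypothetical helper, not part of dgenerate, and it works on file names only rather than loaded images.

```python
from typing import Optional


def resolve_inpaint_inputs(
    incoming: str,
    mask: Optional[str] = None,
    image: Optional[str] = None,
) -> dict[str, str]:
    """Decide which input is the subject and which is the mask.

    Hypothetical helper: exactly one of ``mask`` or ``image`` must be given.
    """
    if (mask is None) == (image is None):
        raise ValueError('exactly one of "mask" or "image" must be provided')
    if mask is not None:
        # "mask" given: the incoming image is the subject to inpaint.
        return {"subject": incoming, "mask": mask}
    # "image" given: the incoming image is the mask.
    return {"subject": image, "mask": incoming}


print(resolve_inpaint_inputs("photo.png", mask="hole.png"))
print(resolve_inpaint_inputs("hole.png", image="photo.png"))
```

Both calls describe the same job, a subject of photo.png inpainted where hole.png is white. Only the direction in which the images arrive differs.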

classmethod inheritable_help(loaded_by_name)[source]

__init__(model: str, mask: str | None = None, mask_processors: str | None = None, image: str | None = None, image_processors: str | None = None, dtype: str = 'float32', pre_resize: bool = False, **kwargs)[source]

Parameters:

model – Path to inpainting model file (LaMa, MAT, etc.) supported by Spandrel, or URL.
mask – Path to mask image file or URL. White pixels indicate areas to inpaint.
mask_processors – Pre-process mask with an arbitrary image processor chain.
image – Path to subject image file or URL when incoming image is the mask.
image_processors – Pre-process image with an arbitrary image processor chain.
dtype – The datatype to use for the model in memory, either “float32” or “float16”.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image) → Image[source]

Implementation called after resize if pre_resize is False.

Parameters:: image – Input image (already resized)
Returns:: Processed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None) → Image[source]

Implementation called before resize if pre_resize is True.

Parameters:

image – Input image
resize_resolution – Target resolution for resize

Returns:

Processed image
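
How the pre_resize flag selects between impl_pre_resize and impl_post_resize can be sketched as follows. `SketchProcessor` is a hypothetical stand-in, not dgenerate's ImageProcessor base class, and images are modelled as plain dictionaries so the example runs without any imaging library.

```python
class SketchProcessor:
    """Hypothetical stand-in used only to show the call order."""

    def __init__(self, pre_resize: bool = False):
        self.pre_resize = pre_resize

    def _work(self, image: dict) -> dict:
        # Record the size at which the "processing" happened.
        return {**image, "log": image["log"] + [f"processed at {image['size']}"]}

    def impl_pre_resize(self, image: dict, resize_resolution) -> dict:
        # Only does work when pre_resize is True, otherwise passes through.
        return self._work(image) if self.pre_resize else image

    def impl_post_resize(self, image: dict) -> dict:
        # Only does work when pre_resize is False, otherwise passes through.
        return image if self.pre_resize else self._work(image)

    def run(self, image: dict, resize_resolution) -> dict:
        image = self.impl_pre_resize(image, resize_resolution)
        if resize_resolution is not None:
            image = {**image, "size": resize_resolution}  # the resize step
        return self.impl_post_resize(image)


source = {"size": (1024, 768), "log": []}
after = SketchProcessor(pre_resize=False).run(source, (512, 384))
before = SketchProcessor(pre_resize=True).run(source, (512, 384))
print(after["log"], before["log"])
```

Both hooks are always called. The flag only decides which one does the work, which matches the "possibly ... or the input image" wording used for these methods elsewhere in this module.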

to(device) → InpaintProcessor[source]

Move the processor and its model to the specified device.

Parameters:: device – Target device
Returns:: Self

FILE_ARGS = {'image': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}, 'mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}, 'model': {'filetypes': [('Models', ['*.pt', '*.pth', '*.ckpt', '*.safetensors'])], 'mode': 'in'}}

NAMES = ['inpaint']

OPTION_ARGS = {'dtype': ['float32', 'float16']}
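
The “dtype” fallback described for this processor can be pictured with a short sketch. `choose_dtype` and its `model_supports_fp16` flag are hypothetical. How dgenerate actually detects FP16 support is not shown in this reference.

```python
import warnings

# Mirrors the values listed in OPTION_ARGS['dtype'] above.
SUPPORTED_DTYPES = ["float32", "float16"]


def choose_dtype(requested: str, model_supports_fp16: bool) -> str:
    """Return the dtype to use, falling back to float32 with a warning."""
    if requested not in SUPPORTED_DTYPES:
        raise ValueError(f"dtype must be one of {SUPPORTED_DTYPES}")
    if requested == "float16" and not model_supports_fp16:
        warnings.warn("model does not support float16, falling back to float32")
        return "float32"
    return requested


print(choose_dtype("float16", model_supports_fp16=True))
```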

class dgenerate.imageprocessors.LeresDepthProcessor(threshold_near: int = 0, threshold_far: int = 0, boost: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

LeReS depth detector.

The “threshold-near” argument is the near threshold, analogous to the low threshold of canny.

The “threshold-far” argument is the far threshold, analogous to the high threshold of canny.

The “boost” argument determines if monocular depth boost is used.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
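
The two accepted “detect-resolution” forms can be illustrated with a small parser. `parse_detect_resolution` is a hypothetical helper, not dgenerate's own parsing code, and treating a single value as both width and height is an assumption made for this sketch.

```python
from typing import Optional


def parse_detect_resolution(value: Optional[str]) -> Optional[tuple[int, int]]:
    """Parse "512" or "1024x512" into (width, height), None passes through.

    Assumption: a single dimension is applied to both width and height.
    """
    if value is None:
        return None  # detector runs at the input image's full resolution
    parts = value.lower().split("x")
    if len(parts) == 1:
        width = height = int(parts[0])
    elif len(parts) == 2:
        width, height = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"invalid detect-resolution: {value!r}")
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive")
    return (width, height)


print(parse_detect_resolution("512"))
print(parse_detect_resolution("1024x512"))
```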

__init__(threshold_near: int = 0, threshold_far: int = 0, boost: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

threshold_near – the near threshold, analogous to the low threshold of canny
threshold_far – the far threshold, analogous to the high threshold of canny
boost – determines if monocular depth boost is used
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a LeReS depth detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a LeReS depth detected image, or the input image

NAMES = ['leres']

class dgenerate.imageprocessors.LetterboxProcessor(box_size: str, box_is_padding: bool = False, box_color: str | None = None, inner_size: str | None = None, aspect_correct: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Letterbox an image.

The “box-size” argument is the size of the outer letterbox.

In non-padding mode “box-size” may be specified as the absolute size of the final image “WIDTHxHEIGHT”, or with a single integer denoting both width and height.

The “box-is-padding” argument can be used to indicate that “box-size” should be interpreted as padding.

When in padding mode, “box-size” can be specified as a width / height padding around the original image i.e. “WIDTHxHEIGHT”, (a single integer can also suffice). Or as a four part padding value: “LEFTxTOPxRIGHTxBOTTOM”

The “box-color” argument specifies the color to use for the letter box background, the default is black. This should be specified as a HEX color code. e.g. #FFFFFF or #FFF

The “inner-size” argument specifies the size of the inner image, in the form: “WIDTHxHEIGHT”

The “aspect-correct” argument can be used to determine if the aspect ratio of the inner image is maintained or not.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
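
The “box-size” forms and the HEX “box-color” form can be worked through with plain arithmetic. The helpers below are hypothetical and do not come from dgenerate. In particular, the sketch assumes that a two-part padding “WIDTHxHEIGHT” adds WIDTH to both the left and right sides and HEIGHT to both the top and bottom.

```python
def parse_dimensions(text: str) -> tuple[int, ...]:
    """Split "AxBx..." into a tuple of integers."""
    return tuple(int(part) for part in text.lower().split("x"))


def letterbox_canvas_size(image_size, box_size: str, box_is_padding: bool = False):
    """Compute the outer letterbox size for an image of ``image_size``."""
    parts = parse_dimensions(box_size)
    if not box_is_padding:
        # Absolute size of the final image.
        if len(parts) == 1:
            return (parts[0], parts[0])
        if len(parts) == 2:
            return parts
        raise ValueError("expected SIZE or WIDTHxHEIGHT")
    if len(parts) == 1:
        left = top = right = bottom = parts[0]
    elif len(parts) == 2:
        left = right = parts[0]  # assumption: applied to each side
        top = bottom = parts[1]
    elif len(parts) == 4:
        left, top, right, bottom = parts
    else:
        raise ValueError("expected 1, 2, or 4 padding values")
    width, height = image_size
    return (width + left + right, height + top + bottom)


def parse_hex_color(code: str) -> tuple[int, int, int]:
    """Convert "#FFF" or "#FFFFFF" into an (R, G, B) tuple."""
    digits = code.lstrip("#")
    if len(digits) == 3:
        digits = "".join(ch * 2 for ch in digits)
    if len(digits) != 6:
        raise ValueError(f"invalid HEX color: {code!r}")
    return tuple(int(digits[i:i + 2], 16) for i in (0, 2, 4))


print(letterbox_canvas_size((640, 480), "10x20x30x40", box_is_padding=True))
print(parse_hex_color("#FFF"))
```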

__init__(box_size: str, box_is_padding: bool = False, box_color: str | None = None, inner_size: str | None = None, aspect_correct: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

box_size – Size of the outer letterbox, or padding
box_is_padding – Should the box_size argument be interpreted as padding?
box_color – What color to use for the letterbox background, the default is black. This should be specified as a HEX color code.
inner_size – The size of the inner image
aspect_correct – Should the size of the inner image be aspect correct?

impl_post_resize(image: Image)[source]

Letterbox operation is performed by this method if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the letterboxed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Letterbox operation is performed by this method if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the letterboxed image

to(device) → LetterboxProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['letterbox']

class dgenerate.imageprocessors.LineArtAnimeProcessor(detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Anime line art generator, generate anime line art from an image.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a lineart image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a lineart image, or the input image

NAMES = ['lineart-anime']

class dgenerate.imageprocessors.LineArtProcessor(course: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Line art generator, generate line art from an image.

The “course” argument determines whether to use the coarse model or the normal model.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(course: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

course – determines whether to use the coarse model or the normal model
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a lineart image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a lineart image, or the input image

NAMES = ['lineart']

class dgenerate.imageprocessors.LineArtStandardProcessor(gaussian_sigma: float = 6.0, intensity_threshold: int = 8, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Standard lineart detector, generate lineart from an image.

The “gaussian-sigma” argument is the gaussian filter sigma value.

The “intensity-threshold” argument is the pixel value intensity threshold.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(gaussian_sigma: float = 6.0, intensity_threshold: int = 8, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

gaussian_sigma – gaussian filter sigma value
intensity_threshold – pixel value intensity threshold
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a lineart image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a lineart image, or the input image

to(device) → LineArtStandardProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['lineart-standard']

class dgenerate.imageprocessors.MLSDProcessor(threshold_score: float = 0.1, threshold_distance: float = 0.1, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Machine Learning Model for Detecting Wireframes. Wireframe edge detector, this processor overlays lines on to the edges of objects in an image.

The “threshold-score” argument is the score threshold.

The “threshold-distance” argument is the distance threshold.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(threshold_score: float = 0.1, threshold_distance: float = 0.1, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

threshold_score – score threshold
threshold_distance – distance threshold
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a wireframe detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a wireframe detected image, or the input image

NAMES = ['mlsd']

class dgenerate.imageprocessors.MidasDepthProcessor(normals: bool = False, alpha: float = 6.283185307179586, background_threshold: float = 0.1, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

MiDaS depth detector and normal map generation.

The “normals” argument determines if this processor produces a normal map or a depth image.

The “alpha” argument is related to normal map generation. Its default value, 6.283185307179586, is 2π.

The “background-threshold” argument is related to normal map generation.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(normals: bool = False, alpha: float = 6.283185307179586, background_threshold: float = 0.1, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

normals – Return a generated normals image instead of a depth image?
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a MiDaS depth detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a MiDaS depth detected image, or the input image

NAMES = ['midas']

class dgenerate.imageprocessors.MirrorFlipProcessor(pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Implements the “mirror” and “flip” PIL.ImageOps operations as an image processor.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

static help(loaded_by_name)[source]

__init__(pre_resize: bool = False, **kwargs)[source]

Parameters:: kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Mirrors or flips the image depending on what name was used to invoke this imageprocessor implementation.

Executes if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the mirrored or flipped image.

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Mirrors or flips the image depending on what name was used to invoke this imageprocessor implementation.

Executes if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the mirrored or flipped image.

to(device) → MirrorFlipProcessor[source]

Does nothing for this processor.

Parameters:: device – the device
Returns:: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['mirror', 'flip']
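
Because this one class is registered under two names, its behavior depends on the name it was loaded by. The sketch below shows that dispatch on a nested list standing in for pixel rows. It follows the PIL.ImageOps semantics (mirror flips left to right, flip flips top to bottom), but `apply_by_name` and the list-based image are hypothetical and are not how dgenerate implements the processor.

```python
def mirror(rows: list[list[int]]) -> list[list[int]]:
    """Flip left to right, like PIL.ImageOps.mirror."""
    return [list(reversed(row)) for row in rows]


def flip(rows: list[list[int]]) -> list[list[int]]:
    """Flip top to bottom, like PIL.ImageOps.flip."""
    return [list(row) for row in reversed(rows)]


# Same keys as the NAMES list above.
OPERATIONS = {"mirror": mirror, "flip": flip}


def apply_by_name(loaded_by_name: str, rows: list[list[int]]) -> list[list[int]]:
    """Pick the operation from the name the processor was loaded by."""
    try:
        operation = OPERATIONS[loaded_by_name]
    except KeyError:
        raise ValueError(f"unknown processor name: {loaded_by_name!r}") from None
    return operation(rows)


pixels = [[1, 2, 3],
          [4, 5, 6]]
print(apply_by_name("mirror", pixels))
print(apply_by_name("flip", pixels))
```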

class dgenerate.imageprocessors.NormalBaeProcessor(detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Normal Bae Detector, generate a normal map from an image.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a normal map image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a normal map image, or the input image

NAMES = ['normal-bae']

class dgenerate.imageprocessors.OpenPoseProcessor(include_body: bool = True, include_hand: bool = False, include_face: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Generate an OpenPose rigging from the input image (of a human/humanoid) for use with a ControlNet.

“include-body” is a boolean value indicating if a body rigging should be generated.

“include-hand” is a boolean value indicating if a detailed hand/finger rigging should be generated.

“include-face” is a boolean value indicating if a detailed face rigging should be generated.

The “detect-resolution” argument is the resolution the image is resized to, internal to the processor, before detection is run on it. It should be a single dimension, for example “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
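The effect of “detect-align” on a resize dimension can be sketched as follows. The helper names are hypothetical, and rounding down to the nearest multiple is an assumption about how the alignment is applied; the source only states that 1 means no requested alignment.

```python
def align_dimension(value, align):
    """Round value down to the nearest multiple of align (align=1 is a no-op)."""
    if align < 1:
        raise ValueError("align must be >= 1")
    return value - (value % align)


def align_size(size, align):
    """Apply the same pixel alignment to a (width, height) tuple."""
    width, height = size
    return (align_dimension(width, align), align_dimension(height, align))
```

For example, aligning a 1023x517 image to 8 pixels under this assumption yields 1016x512, while an alignment of 1 leaves the size unchanged.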

__init__(include_body: bool = True, include_hand: bool = False, include_face: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

include_body – generate a body rig?
include_hand – include detailed hand rigging?
include_face – include detailed face rigging?
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, OpenPose rig generation may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly an OpenPose rig image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, OpenPose rig generation may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly an OpenPose rig image, or the input image

NAMES = ['openpose']

class dgenerate.imageprocessors.OutpaintMaskProcessor(box: str, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Create a mask that can be used to extend the incoming image.

The “box” argument is the size of the outer mask as it extends from the inner image. This may be specified as a single integer, which sets the padding width and height simultaneously, as a width and height padding “WIDTHxHEIGHT”, or as a four part padding “LEFTxTOPxRIGHTxBOTTOM”.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
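The three accepted forms of “box” can be illustrated with a small parser. This is a sketch rather than the processor's own code: parse_box and padded_size are hypothetical names, and applying the WIDTH padding to both left and right (and HEIGHT to both top and bottom) is an assumption.

```python
def parse_box(box):
    """Normalize a box string to (left, top, right, bottom) padding in pixels."""
    parts = [int(p) for p in str(box).lower().split("x")]
    if len(parts) == 1:
        # Single integer: the same padding on every side.
        return (parts[0],) * 4
    if len(parts) == 2:
        # Assumption: WIDTH pads left and right, HEIGHT pads top and bottom.
        width, height = parts
        return (width, height, width, height)
    if len(parts) == 4:
        return tuple(parts)
    raise ValueError(f"invalid box: {box!r}")


def padded_size(size, padding):
    """Size of the extended canvas after applying the padding."""
    left, top, right, bottom = padding
    return (size[0] + left + right, size[1] + top + bottom)
```

Under these assumptions, a 512x512 image with “box=32x16” would extend to a 576x544 canvas.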

__init__(box: str, pre_resize: bool = False, **kwargs)[source]

Parameters:: box – Size of the outer mask

impl_post_resize(image: Image)[source]

Operation is performed by this method if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the letterboxed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Operation is performed by this method if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the letterboxed image

to(device) → OutpaintMaskProcessor[source]: Does nothing for this processor. :param device: the device :return: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['outpaint-mask']

class dgenerate.imageprocessors.PasteProcessor(image: str, image_processors: str | None = None, position: str | None = None, feather: int | None = None, feather_shape: str = 'rectangle', mask: str | None = None, mask_processors: str | None = None, position_mask: str | None = None, position_mask_padding: str | int | None = None, position_mask_processors: str | None = None, reverse: bool = False, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Paste an image on top of the incoming image at a specified position.

The “image” argument specifies the path to the image file to paste. This may be a path on disk or a URL link to an image file.

The “image-processors” argument allows you to pre-process “image” with an arbitrary image processor chain. This argument’s value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “position” argument specifies where to paste the image. It can be:

- “LEFTxTOP” format (e.g., “100x50”) to specify the top-left coordinate
- “LEFTxTOPxRIGHTxBOTTOM” format (e.g., “100x50x300x200”) to specify a bounding box where the source image will be resized to fit
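The two “position” formats can be sketched as a parser that returns the paste origin and, for the bounding box form, the size the pasted image would be resized to. The name parse_position is hypothetical and this is not the processor's actual parsing code.

```python
def parse_position(position):
    """Return ((left, top), target_size); target_size is None for "LEFTxTOP"."""
    parts = [int(p) for p in position.lower().split("x")]
    if len(parts) == 2:
        left, top = parts
        return (left, top), None
    if len(parts) == 4:
        left, top, right, bottom = parts
        # The pasted image is resized to fit the bounding box.
        return (left, top), (right - left, bottom - top)
    raise ValueError(f"invalid position: {position!r}")
```

With “100x50x300x200”, the origin is (100, 50) and the bounding box is 200 pixels wide by 150 pixels tall.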

The “feather” argument specifies the feathering radius in pixels for softening edges. This creates smooth transitions from opaque to transparent. If not specified, no feathering is applied. Cannot be used together with the “mask” parameter, as this auto generates a feather mask for you.

The “feather-shape” argument controls the shape of the feathering:

- “r” or “rect” or “rectangle” (default): Rectangular feathering from edges
- “c” or “circle” or “ellipse”: Elliptical feathering from center

Only used when “feather” is specified.
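One way to picture the two feather shapes is as a per-pixel opacity function. The sketch below assumes a linear ramp over the feather radius, which the source does not specify; both function names are hypothetical and the actual processor may compute its mask differently.

```python
import math


def rect_feather_alpha(x, y, width, height, radius):
    """Opacity (0.0 to 1.0) rising linearly with distance from the nearest edge."""
    if radius <= 0:
        return 1.0
    edge_distance = min(x, y, width - 1 - x, height - 1 - y)
    return min(edge_distance / radius, 1.0)


def ellipse_feather_alpha(x, y, width, height, radius):
    """Opacity falling off toward the boundary of the inscribed ellipse."""
    if radius <= 0:
        return 1.0
    cx, cy = (width - 1) / 2, (height - 1) / 2
    # Normalized distance from center: 0.0 at the center, 1.0 on the ellipse.
    d = math.hypot((x - cx) / (width / 2), (y - cy) / (height / 2))
    # Approximate pixel distance inside the ellipse boundary.
    inside = (1.0 - d) * min(width, height) / 2
    return max(0.0, min(inside / radius, 1.0))
```

In both cases pixels far from the boundary are fully opaque, and pixels at or beyond the boundary are fully transparent.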

The “mask” argument allows you to specify a mask image path that will be used to control the transparency of the pasted image. This may be a path on disk or a URL link to an image file. The mask should be a grayscale image where white areas represent full opacity and black areas represent full transparency. Cannot be used together with the “feather” parameter. This mask will always be resized to the size of the pasted image, which may be the “image” argument, or the processed image depending on the value of “reverse”.

The “mask-processors” argument allows you to pre-process the “mask” argument with an arbitrary image processor chain. For example: invert, gaussian-blur, etc. This cannot be used in “feather” mode on the auto generated feather mask, only on user supplied masks. This argument’s value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “position-mask” argument allows you to specify a mask image, which will have its white area bounds detected to determine the value of “position” for pasting. A bounding box will be determined by looking at the image and finding the extents of the white pixels in the mask. If you specify an image, the “position” argument will be ignored. This mask will always be resized to the size of the background image, which may be the processed image or the “image” argument depending on the value of “reverse”.

The “position-mask-padding” argument allows you to specify a padding value for the bounding box detection on “position-mask”. This allows you to add positive or negative padding to the detected mask bounding box. This value should be a single integer (uniform), “WIDTHxHEIGHT” (horizontal and vertical padding), or “LEFTxTOPxRIGHTxBOTTOM”.
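The bounding box detection described for “position-mask” and “position-mask-padding” can be sketched on a plain list-of-rows grayscale mask. The function names, the exclusive right/bottom convention, and the default threshold of 255 for “white” are all assumptions made for illustration.

```python
def white_bounds(mask, threshold=255):
    """Return (left, top, right, bottom) of pixels >= threshold, or None."""
    xs, ys = [], []
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value >= threshold:
                xs.append(x)
                ys.append(y)
    if not xs:
        return None
    # right and bottom are exclusive, like a crop box.
    return (min(xs), min(ys), max(xs) + 1, max(ys) + 1)


def pad_bounds(bounds, padding):
    """Grow (or shrink, for negative values) a box by per-side padding."""
    left, top, right, bottom = bounds
    pad_left, pad_top, pad_right, pad_bottom = padding
    return (left - pad_left, top - pad_top, right + pad_right, bottom + pad_bottom)
```

The padded box would then play the role of a “LEFTxTOPxRIGHTxBOTTOM” position for the paste.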

The “position-mask-processors” argument allows you to pre-process the “position-mask” argument with an arbitrary image processor chain. For example: invert, gaussian-blur, etc. This cannot be used in “feather” mode on the auto generated feather mask, only on user supplied masks. This argument’s value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “reverse” argument allows you to reverse the paste operation, meaning the “image” argument is to be considered the background, and the processed image is to be the pasted content.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

classmethod inheritable_help(loaded_by_name)[source]

Parameters:

image – path to the image file to paste, or paste on to if reverse=True
image_processors – Pre-process image with an arbitrary image processor chain
position – position specification in “LEFTxTOP” or “LEFTxTOPxRIGHTxBOTTOM” format
feather – feathering radius in pixels for softening edges (cannot be used with mask)
feather_shape – shape of feathering (“rectangle”, “rect”, “circle”, or “ellipse”)
mask – path to a mask image file for controlling transparency (cannot be used with feather)
mask_processors – Pre-process mask with an arbitrary image processor chain, not compatible with feather.
position_mask – path to a mask image file for determining paste position from white area bounds
position_mask_padding – padding value for the position mask bounding box (default “0”)
position_mask_processors – Pre-process position_mask with an arbitrary image processor chain
reverse – Reverse the paste operation?
pre_resize – process the image before it is resized, or after? default is False (after)
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Process the image after resizing if pre_resize is False.

Parameters:: image – image to process
Returns:: the processed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Process the image before resizing if pre_resize is True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

the processed image

to(device) → PasteProcessor[source]

Does nothing for this processor since it’s PIL-based.

Parameters:: device – the device (ignored)
Returns:: this processor

FILE_ARGS = {'image': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}, 'mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}, 'position_mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', 
'*.xpm'])], 'mode': 'in'}}

NAMES = ['paste']

OPTION_ARGS = {'feather_shape': ['r', 'rect', 'rectangle', 'c', 'circle', 'ellipse']}

class dgenerate.imageprocessors.PatchMatchProcessor(mask: str | None = None, mask_processors: str | None = None, image: str | None = None, image_processors: str | None = None, patch_size: int = 5, seed: int | None = None, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Inpaint an image with the PatchMatch algorithm (content aware fill).

The PatchMatch algorithm is used in this processor for pyramidical inpainting (filling in missing or masked areas) in images. This processor requires either a mask or subject image to be provided via the “mask” or “image” arguments. These arguments are mutually exclusive.

When using the “mask” argument, the incoming image is considered the subject image to be inpainted, and the “mask” argument provides a grayscale mask where white pixels (255) indicate areas to inpaint and black pixels (0) indicate areas to preserve.

When using the “image” argument, the incoming image is considered the mask, and the “image” argument provides the subject image to be inpainted. The incoming mask image should be a grayscale image where white pixels (255) indicate areas to inpaint and black pixels (0) indicate areas to preserve.

The “mask” or “image” argument should point to a file path on disk or a URL that can be downloaded. Both local files and remote URLs are supported. The mask or image will be resized to match the dimensions of the corresponding target image if they are not the same size.

The “mask-processors” argument allows you to pre-process the “mask” argument with an arbitrary image processor chain, for example: invert, gaussian-blur, etc. This argument’s value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “image-processors” argument allows you to pre-process the “image” argument with an arbitrary image processor chain, for example: invert, gaussian-blur, etc. This argument’s value must be quoted (single or double string quotes) if you intend to supply arguments to the processors in the chain. The pixel alignment of this processor chain defaults to 1, meaning no forced alignment will occur; you can force alignment using the “resize” image processor if desired.

The “patch-size” argument specifies the patch size for the PatchMatch algorithm. Larger patch sizes can provide better coherence but may be slower.

The “seed” argument allows you to specify a random number generator seed for reproducible results.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

classmethod inheritable_help(loaded_by_name)[source]

Parameters:

mask – Path to mask image file or URL. White pixels indicate areas to inpaint.
mask_processors – Pre-process mask with an arbitrary image processor chain.
image – Path to subject image file or URL when incoming image is the mask.
image_processors – Pre-process image with an arbitrary image processor chain.
patch_size – Patch size for PatchMatch algorithm. Default is 5.
seed – Random number generator seed for reproducible results. If None, uses random seed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image) → Image[source]

Implementation called after resize if pre_resize is False.

Parameters:: image – Input image (already resized)
Returns:: Processed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None) → Image[source]

Implementation called before resize if pre_resize is True.

Parameters:

image – Input image
resize_resolution – Target resolution for resize

Returns:

Processed image

to(device) → PatchMatchProcessor[source]

PatchMatch runs on CPU, so device changes are ignored.

Parameters:: device – Target device (ignored)
Returns:: Self

FILE_ARGS = {'image': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}, 'mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}}

NAMES = ['patchmatch']

class dgenerate.imageprocessors.PidiNetProcessor(apply_filter: bool = False, safe: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

PidiNet (Pixel Difference Networks for Efficient Edge Detection) edge detector.

The “apply-filter” argument enables possibly crisper edges / less noise.

The “safe” argument enables or disables numerically safe / more precise stepping.

The “detect-resolution” argument is the resolution the image is resized to, internal to the processor, before detection is run on it. It should be a single dimension, for example “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
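The interaction between “detect-resolution” and “detect-aspect” can be pictured with the following sketch. It assumes that an aspect correct resize means scaling the image to fit inside the requested box while preserving its proportions; that interpretation, and the name fit_within, are assumptions and may not match the processor's exact behavior.

```python
def fit_within(size, target, aspect_correct=True):
    """Compute the detection size for an image of `size` given a target box."""
    if not aspect_correct:
        # Stretch to exactly the requested dimensions.
        return target
    width, height = size
    target_width, target_height = target
    # Use the smaller scale factor so the result fits inside the box.
    scale = min(target_width / width, target_height / height)
    return (max(1, round(width * scale)), max(1, round(height * scale)))
```

Under this interpretation a 2000x1000 input with a 512x512 target is detected at 512x256 when aspect correction is on, and at 512x512 when it is off.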

__init__(apply_filter: bool = False, safe: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

apply_filter – enables possibly crisper edges / less noise
safe – enables numerically safe / more precise stepping
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly an edge detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly an edge detected image, or the input image

NAMES = ['pidi']

class dgenerate.imageprocessors.PosterizeProcessor(bits: int, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Posterize the input image with PIL.ImageOps.posterize.

Accepts the argument “bits”, an integer value from 1 to 8.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
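The effect of “bits” can be shown on a single 8-bit channel value: posterizing keeps only the top “bits” bits of each channel and zeroes the rest. The helper below is a standalone illustration of that arithmetic and does not call PIL; the name posterize_value is hypothetical.

```python
def posterize_value(value, bits):
    """Keep only the `bits` most significant bits of an 8-bit channel value."""
    if not 1 <= bits <= 8:
        raise ValueError("bits must be from 1 to 8")
    # Mask with the top `bits` bits set, e.g. bits=2 gives 0b11000000.
    mask = ~(2 ** (8 - bits) - 1) & 0xFF
    return value & mask
```

With bits=8 every value is left unchanged, while bits=1 reduces each channel to either 0 or 128.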

__init__(bits: int, pre_resize: bool = False, **kwargs)[source]

Parameters:

bits – required argument, integer value from 1 to 8
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Posterize operation is performed by this method if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the posterized image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Posterize operation is performed by this method if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the posterized image

to(device) → PosterizeProcessor[source]: Does nothing for this processor. :param device: the device :return: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['posterize']

class dgenerate.imageprocessors.ResizeProcessor(size: str | None = None, scale: float | tuple[float, float] | None = None, align: int | None = None, aspect_correct: bool = True, algo: str = 'auto', pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Resize an image using basic resampling algorithms.

The “size” argument is the new image size.

The “scale” argument is either a single floating point value to scale both dimensions by, or a tuple of two floating point values to scale x and y dimensions separately. This is mutually exclusive with “size”. When specifying a tuple, you may use CSV, for example: “2,1”, meaning X*2, Y*1.

The “align” argument is the new image alignment.

The “aspect-correct” argument is a boolean argument that determines if the resize is aspect correct.

The “algo” argument is the resize filtering algorithm, which can be one of: “auto”, “nearest”, “box”, “bilinear”, “hamming”, “bicubic”, “lanczos”

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
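The “scale” argument forms described above can be sketched as follows. The function names are hypothetical, and truncating the scaled dimensions to integers is an assumption; the processor may round differently.

```python
def parse_scale(scale):
    """Normalize a scale given as a number, a tuple, or CSV text to (sx, sy)."""
    if isinstance(scale, (int, float)):
        return (float(scale), float(scale))
    if isinstance(scale, tuple):
        sx, sy = scale
        return (float(sx), float(sy))
    parts = [float(p) for p in str(scale).split(",")]
    if len(parts) == 1:
        return (parts[0], parts[0])
    if len(parts) == 2:
        return (parts[0], parts[1])
    raise ValueError(f"invalid scale: {scale!r}")


def scaled_size(size, scale):
    """Apply the scale to a (width, height) tuple."""
    sx, sy = parse_scale(scale)
    # Assumption: fractional pixels are truncated.
    return (int(size[0] * sx), int(size[1] * sy))
```

For a 640x480 image, “2,1” doubles only the width, giving 1280x480.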

__init__(size: str | None = None, scale: float | tuple[float, float] | None = None, align: int | None = None, aspect_correct: bool = True, algo: str = 'auto', pre_resize: bool = False, **kwargs)[source]

Parameters:

size – the new image size, mutually exclusive with scale
scale – a single floating point value to scale both dimensions by, or a tuple of two floating point values to scale the x and y dimensions separately, mutually exclusive with size
align – the new image alignment
aspect_correct – should the resize be aspect correct?
algo – the resize filtering algorithm, one of: “auto”, “nearest”, “box”, “bilinear”, “hamming”, “bicubic”, “lanczos”
pre_resize – corresponds to the “pre-resize” argument described above, default is False
kwargs – forwarded to base class, for example: loaded_by_name, device, output_file, output_overwrite, model_offload, local_files_only

impl_post_resize(image: Image)[source]

Perform the resize operation.

Parameters:: image – The image after being resized by dgenerate.
Returns:: The resized image.

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Does nothing, no-op.

Parameters:

image – the image.
resize_resolution – dimension dgenerate will resize to.

Returns:

The same image.

to(device) → ResizeProcessor[source]: Does nothing for this processor. :param device: the device :return: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['resize']

class dgenerate.imageprocessors.SegmentAnythingProcessor(detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Segment Anything Model, this processor attempts to create cutouts for every distinct object in an image.

Note that this processor is just for use with controlnet models that support SAM annotated input images.

If you want to generate masks or preview segmentation using prompting, use the “u-sam” processor instead.

The “detect-resolution” argument is the resolution the image is resized to, internal to the processor, before detection is run on it. It should be a single dimension, for example “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct, this defaults to true.

The “detect-align” argument determines the pixel alignment of the image resize requested by “detect-resolution”, it defaults to 1 indicating no requested alignment.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(detect_resolution: str | None = None, detect_aspect: bool = True, detect_align: int = 1, pre_resize: bool = False, **kwargs)[source]

Parameters:

detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution or detect_align before processing, will it be an aspect correct resize?
detect_align – the input image is forcefully aligned to this amount of pixels before being processed.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a segment-anything image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a segment-anything image, or the input image

NAMES = ['sam']

class dgenerate.imageprocessors.SimpleColorProcessor(pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Implements the “grayscale” and “invert” PIL.ImageOps operations as an image processor.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
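Because one class serves two names, the operation is chosen by the name the processor was loaded with. The sketch below shows that dispatch on a single RGB pixel using plain Python; the names are hypothetical, and the grayscale formula uses the ITU-R 601-2 luma weights in integer arithmetic, so results may differ from PIL by a rounding step.

```python
def invert_pixel(rgb):
    """Invert each 8-bit channel."""
    return tuple(255 - channel for channel in rgb)


def grayscale_pixel(rgb):
    """Luma of an RGB pixel using ITU-R 601-2 weights (integer arithmetic)."""
    r, g, b = rgb
    return (299 * r + 587 * g + 114 * b) // 1000


# Maps the name the processor was invoked by to the operation it performs.
OPERATIONS = {"invert": invert_pixel, "grayscale": grayscale_pixel}


def process_pixel(loaded_by_name, rgb):
    """Apply the operation selected by the invoking name."""
    return OPERATIONS[loaded_by_name](rgb)
```

The same lookup-by-name idea lets a single processor class register several entries in NAMES.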

static help(loaded_by_name)[source]

__init__(pre_resize: bool = False, **kwargs)[source]

Parameters:: kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Invert or grayscale the image depending on which name was used to invoke this imageprocessor.

Executes if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the inverted or grayscale image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Invert or grayscale the image depending on which name was used to invoke this imageprocessor.

Executes if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the inverted or grayscale image

to(device) → SimpleColorProcessor[source]: Does nothing for this processor. :param device: the device :return: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['grayscale', 'invert']

class dgenerate.imageprocessors.SolarizeProcessor(threshold: int = 128, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Solarize the input image with PIL.ImageOps.solarize.

Accepts the argument “threshold” which is an integer value from 0 to 255.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.
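The role of “threshold” can be shown on a single 8-bit channel value: solarizing inverts every value at or above the threshold and leaves lower values untouched. The helper below illustrates that arithmetic without calling PIL; the name solarize_value is hypothetical.

```python
def solarize_value(value, threshold=128):
    """Invert an 8-bit channel value if it is at or above the threshold."""
    return value if value < threshold else 255 - value
```

A threshold of 0 therefore inverts the whole image, and higher thresholds affect only the brighter values.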

__init__(threshold: int = 128, pre_resize: bool = False, **kwargs)[source]

Parameters:

threshold – integer value from 0 to 255, default is 128
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Solarize operation is performed by this method if pre_resize constructor argument was False.

Parameters:: image – image to process
Returns:: the solarized image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Solarize operation is performed by this method if pre_resize constructor argument was True.

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this imageprocessor

Returns:

the solarized image

to(device) → SolarizeProcessor[source]: Does nothing for this processor. :param device: the device :return: this processor

HIDE_ARGS = ['device', 'model-offload']

NAMES = ['solarize']

class dgenerate.imageprocessors.TEEDProcessor(safe: bool = True, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

TEED, a tiny efficient edge detector.

The “safe” argument enables or disables numerically safe / more precise stepping.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.
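The two accepted forms of “detect-resolution” can be illustrated with a small parser. This is only a sketch: the function name parse_detect_resolution is hypothetical, and treating a single dimension as a square resolution is an assumption, since dgenerate may interpret it differently (for example in combination with “detect-aspect”).

```python
def parse_detect_resolution(value: str) -> tuple[int, int]:
    """Parse "512" or "1024x512" into a (width, height) tuple."""
    parts = value.strip().lower().split("x")
    if len(parts) == 1:
        # ASSUMPTION: a single dimension is used for both width and height
        width = height = int(parts[0])
    elif len(parts) == 2:
        width, height = int(parts[0]), int(parts[1])
    else:
        raise ValueError(f"expected N or WxH, got: {value!r}")
    if width <= 0 or height <= 0:
        raise ValueError("dimensions must be positive integers")
    return width, height
```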

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct. This defaults to True.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(safe: bool = True, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

safe – enables or disables numerically safe / more precise stepping
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution before processing, will it be an aspect correct resize?
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a teed edge detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a teed edge detected image, or the input image

NAMES = ['teed']

Bases: ImageProcessor

static help(loaded_by_name: str)[source]

Parameters:

asset – SAM model asset to use, an Ultralytics asset name
points – list of point prompts - can be nested lists [[x,y], [x,y,label]] or strings [“x,y”, “x,y,label”]
boxes – list of bounding box prompts - can be nested lists [[x1,y1,x2,y2]] or strings [“x1,y1,x2,y2”]
boxes_mask – path or URL to a black and white mask image where white areas will be converted to bounding boxes
boxes_mask_processors – image processor chain to apply to the boxes mask before extracting bounding boxes
font_size – size of label text, if None will be calculated based on image dimensions
line_width – thickness of mask outline lines, if None will be calculated based on image dimensions
line_color – override color for mask outlines and text label backgrounds as hex color code (e.g. “#FF0000” or “#F00”)
masks – generate mask images instead of preview, default is False
outpaint – invert generated masks for outpainting, only effective when masks is True, default is False
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, SAM mask processing may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a SAM mask processed image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, SAM mask processing may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a SAM mask processed image, or the input image

FILE_ARGS = {'boxes-mask': {'filetypes': [('Images', ['*.avif', '*.avifs', '*.blp', '*.bmp', '*.dib', '*.bufr', '*.cur', '*.pcx', '*.dcx', '*.dds', '*.ps', '*.eps', '*.fit', '*.fits', '*.fli', '*.flc', '*.ftc', '*.ftu', '*.gbr', '*.gif', '*.grib', '*.h5', '*.hdf', '*.png', '*.apng', '*.jp2', '*.j2k', '*.jpc', '*.jpf', '*.jpx', '*.j2c', '*.icns', '*.ico', '*.im', '*.iim', '*.jfif', '*.jpe', '*.jpg', '*.jpeg', '*.mpg', '*.mpeg', '*.tif', '*.tiff', '*.msp', '*.pcd', '*.pxr', '*.pbm', '*.pgm', '*.ppm', '*.pnm', '*.pfm', '*.psd', '*.qoi', '*.bw', '*.rgb', '*.rgba', '*.sgi', '*.ras', '*.tga', '*.icb', '*.vda', '*.vst', '*.webp', '*.wmf', '*.emf', '*.xbm', '*.xpm'])], 'mode': 'in'}}

NAMES = ['u-sam']

OPTION_ARGS = {'asset': ['sam_l.pt', 'sam_b.pt', 'mobile_sam.pt', 'sam2_t.pt', 'sam2_s.pt', 'sam2_b.pt', 'sam2_l.pt', 'sam2.1_t.pt', 'sam2.1_s.pt', 'sam2.1_b.pt', 'sam2.1_l.pt']}

class dgenerate.imageprocessors.UpscalerProcessor(model: str, tile: int | str = 512, overlap: int = 32, scale: int | None = None, force_tiling: bool = False, dtype: str = 'float32', pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

Implements tiled upscaling with chaiNNer compatible upscaler models.

The “model” argument should be a path to a chaiNNer compatible upscaler model on disk, such as a model downloaded from https://openmodeldb.info/, or an HTTP/HTTPS URL that points to a raw model file. This may also be a Hugging Face blob link.

For example: “upscaler;model=https://github.com/xinntao/Real-ESRGAN/releases/download/v0.2.5.0/realesr-general-x4v3.pth”

Downloaded models are cached in the dgenerate web cache on disk until the cache expiry time for the file is met.

The “tile” argument can be used to specify the tile size for tiled upscaling, it must be divisible by 2, and defaults to 512. Specifying ‘auto’ indicates that this value should be calculated based off available GPU memory if applicable. Specifying 0 disables tiling entirely.

The “overlap” argument can be used to specify the overlap amount of each tile in pixels, it must be greater than or equal to 0, and defaults to 32.
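To make the relationship between “tile” and “overlap” concrete, the sketch below computes overlapping tile boxes for an image. It is an illustration of the general technique only, not dgenerate's actual tiling code; the function names are hypothetical, and the rule that overlap must be smaller than the tile size is an assumption of this sketch.

```python
def _starts(length: int, tile: int, step: int) -> list[int]:
    """Start offsets along one axis so that tiles cover the whole length."""
    starts = [0]
    while starts[-1] + tile < length:
        starts.append(starts[-1] + step)
    return starts


def tile_boxes(width: int, height: int, tile: int = 512, overlap: int = 32):
    """Return (left, top, right, bottom) boxes for overlapping tiles."""
    if tile == 0:
        # tiling disabled, process the whole image at once
        return [(0, 0, width, height)]
    if tile % 2 != 0:
        raise ValueError("tile must be divisible by 2")
    if overlap < 0 or overlap >= tile:
        # ASSUMPTION: overlap smaller than the tile keeps the step positive
        raise ValueError("overlap must be >= 0 and smaller than tile")
    step = tile - overlap
    return [
        (left, top, min(left + tile, width), min(top + tile, height))
        for top in _starts(height, tile, step)
        for left in _starts(width, tile, step)
    ]
```

Each tile starts tile - overlap pixels after the previous one, so neighbouring tiles share a band of overlap pixels that can be blended when the output is reassembled.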

The “scale” argument can be used to specify the output scale of the image regardless of the model’s scale. This equates to an image resize on each tile output of the model as necessary, with an automatically selected resizing algorithm for the best quality. This is effectively equivalent to basic image resizing of the upscaled output after upscaling, just with somewhat reduced memory overhead as it occurs during tiling. When this argument is not specified, the scale of the model architecture is used and no resizing occurs. This argument must be greater than or equal to 1.

The “force-tiling” argument can be used to force external image tiling for upscaler model architectures which discourage the use of external tiling (SCUNet and MixDehazeNet currently). Discouragement may mean that the model needs information about the whole image to achieve a good result. External tiling breaks up the image into tiles before feeding it to the model and reassembles the images output by the model. This is not the default behavior when a model specifies that tiling is discouraged; tiling is only on by default for models where external tiling is fully supported. Only use this if you run into memory issues with models that discourage external tiling, as using it with such models may result in substandard image output.

The “dtype” argument can be used to specify the datatype to use for the model in memory, it can be either “float32” or “float16”. Using “float16” will result in a smaller memory footprint if supported.

The “pre-resize” argument is a boolean value determining if the processing should take place before or after the image is resized by dgenerate.

Example: “upscaler;model=my-model.pth;tile=256;overlap=16”
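Processor URIs like the example above can be assembled programmatically. The helper below is a minimal sketch and is not part of dgenerate: build_processor_uri is a hypothetical name, and it does not attempt to quote values that themselves contain a “;” character.

```python
def build_processor_uri(name: str, **arguments) -> str:
    """Join a processor name and its arguments into a URI string."""
    parts = [name]
    for key, value in arguments.items():
        # Python style names such as force_tiling become force-tiling
        parts.append(f"{key.replace('_', '-')}={value}")
    return ";".join(parts)


uri = build_processor_uri("upscaler", model="my-model.pth", tile=256, overlap=16)
```

The resulting string could then be handed to dgenerate.imageprocessors.create_image_processor() as its uri argument.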

__init__(model: str, tile: int | str = 512, overlap: int = 32, scale: int | None = None, force_tiling: bool = False, dtype: str = 'float32', pre_resize: bool = False, **kwargs)[source]

Parameters:

model – chaiNNer compatible upscaler model on disk, or at a URL
tile – specifies the tile size for tiled upscaling, it must be divisible by 2, and defaults to 512. Specifying ‘auto’ indicates that this value should be calculated based off available GPU memory if applicable. Specifying 0 disables tiling entirely.
overlap – the overlap amount of each tile in pixels, it must be greater than or equal to 0, and defaults to 32.
scale – the scale factor to use for the upscaler, it must be greater than or equal to 1, and defaults to None (meaning use the model’s scale).
dtype – the datatype to use for the model in memory, it may be either “float32” or “float16”. Using float16 will result in a smaller memory footprint if supported.
force_tiling – Force external image tiling for model architectures that discourage it.
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Inheritor must implement.

This method should not be invoked directly, use the class method ImageProcessor.call_post_resize() to invoke it.

Parameters:: image – image to process
Returns:: the processed image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None) → Image[source]

Inheritor must implement.

This method should not be invoked directly, use the class method ImageProcessor.call_pre_resize() to invoke it.

Parameters:

image – image to process
resize_resolution – image will be resized to this resolution after this process is complete. If None is passed no resize is going to occur. It is not the duty of the inheritor to resize the image, in fact it should NEVER be resized.

Returns:

the processed image

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': 'in'}}

NAMES = ['upscaler']

OPTION_ARGS = {'dtype': ['float32', 'float16']}

Bases: ImageProcessor

Process the input image with Ultralytics YOLO object detection.

This processor operates in two distinct modes:

Detection Mode (default, masks=False):

Returns the original image with bounding boxes or mask outlines drawn around detected objects, along with labels showing the detection index, class ID, and class name. The colors of the boxes and text are automatically chosen to contrast with the background for optimal visibility.

Mask Mode (masks=True):

Returns a single composite mask image containing all detected objects combined together. This is useful for inpainting, outpainting, or other mask-based image processing operations.

The “model” argument specifies which YOLO model to use. This can be a path to a local model file, a URL to download the model from, or a HuggingFace repository slug / blob link.

The “weight-name” argument specifies the file name in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument.

The “subfolder” argument specifies the subfolder in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument.

The “revision” argument specifies the revision of a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument. For example: “main”

The “token” argument specifies your HuggingFace authentication token explicitly if needed for accessing private repositories.

The “local-files-only” argument specifies that dgenerate should not attempt to download any model files, and to only look for them locally in the cache or otherwise.

The “font-size” argument determines the size of the label text. If not specified, it will be automatically calculated based on the image dimensions.

The “line-width” argument controls the thickness of the bounding box lines. If not specified, it will be automatically calculated based on the image dimensions.

The “line-color” argument overrides the color for bounding box lines, mask outlines, and text label backgrounds. This should be specified as a HEX color code, e.g. “#FFFFFF” or “#FFF”. If not specified, colors are automatically chosen to contrast with the background. The text color will always be automatically chosen to contrast with the background for optimal readability.

The “class-filter” argument can be used to detect only specific classes. This should be a comma-separated list of class IDs or class names, or a single value, for example: “0,2,person,car”. This filter is applied before “index-filter”.

Example “class-filter” values:

    # Only keep detection class ID 0
    class-filter=0

    # Only keep detection class "hand"
    class-filter=hand

    # keep class ID 2,3
    class-filter=2,3

    # keep class ID 0 & class Name "hand"
    # if entry cannot be parsed as an integer
    # it is interpreted as a name
    class-filter=0,hand

    # "0" is interpreted as a name and not an ID,
    # this is not likely to be useful
    class-filter="0",hand

    # List syntax is supported, you must quote
    # class names
    class-filter=[0, "hand"]
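The parsing rule described for the comma-separated form of “class-filter” can be sketched as follows. This is an illustration only, not dgenerate's parser: parse_class_filter is a hypothetical name, and the bracketed list syntax is deliberately not handled here.

```python
def parse_class_filter(value: str) -> list:
    """Split a CSV class filter into integer class IDs and string names."""
    result = []
    for entry in value.split(","):
        entry = entry.strip()
        if not entry:
            continue
        if len(entry) >= 2 and entry[0] == entry[-1] and entry[0] in "\"'":
            # a quoted entry is always treated as a class name
            result.append(entry[1:-1])
            continue
        try:
            result.append(int(entry))   # parses as an integer: class ID
        except ValueError:
            result.append(entry)        # anything else: class name
    return result
```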

The “index-filter” argument is a list of values or a single value that indicates which YOLO detection indices to keep; index values start at zero. Detections are sorted by their top left bounding box coordinate from left to right, top to bottom, and then by confidence (descending). The order of detections in the image is identical to the reading order of words on a page (English). Processing is only performed on the specified detection indices; if no indices are specified, processing is performed on all detections.

Example “index-filter” values:

    # keep the first, leftmost, topmost detection
    index-filter=0

    # keep detections 1 and 3
    index-filter=[1, 3]

    # CSV syntax is supported (tuple)
    index-filter=1,3

The “confidence” argument sets the confidence threshold for detections (0.0 to 1.0), defaults to: 0.3

The “model-masks” argument indicates that masks generated by the model itself should be utilized instead of just detection bounding boxes. If this is True, and the model returns mask data (seg models do this), mask outlines will be drawn instead of bounding boxes. And in “masks” mode, these masks will be used for the composited mask that gets generated. This defaults to False, meaning that bounding boxes will be used by default.

The “masks” argument enables mask generation mode. When True, the processor returns a composite mask image instead of the annotated detection image. This defaults to False.

The “outpaint” argument inverts the generated masks, creating inverted masks suitable for outpainting operations. This only has an effect when “masks” is True. This defaults to False.

The “detector-padding” argument specifies the amount of padding that will be added to the detection rectangle for both bounding box drawing and mask generation. The default is 0, you can make the bounding box and mask area around the detected feature larger with positive padding and smaller with negative padding.

Padding examples:

    32 (32px Uniform, all sides)

    10x20 (10px Horizontal, 20px Vertical)

    10x20x30x40 (10px Left, 20px Top, 30px Right, 40px Bottom)
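The three padding forms above can be normalized to one (left, top, right, bottom) tuple and applied to a detection rectangle. The following is a minimal sketch under that reading of the examples; parse_padding and pad_box are hypothetical helpers, not dgenerate functions.

```python
def parse_padding(value: str) -> tuple[int, int, int, int]:
    """Parse "32", "10x20" or "10x20x30x40" into (left, top, right, bottom)."""
    parts = [int(p) for p in value.strip().lower().split("x")]
    if len(parts) == 1:
        return parts[0], parts[0], parts[0], parts[0]
    if len(parts) == 2:
        horizontal, vertical = parts
        return horizontal, vertical, horizontal, vertical
    if len(parts) == 4:
        return parts[0], parts[1], parts[2], parts[3]
    raise ValueError(f"expected 1, 2 or 4 values, got: {value!r}")


def pad_box(box, padding):
    """Grow (or shrink, for negative padding) an (x1, y1, x2, y2) box."""
    x1, y1, x2, y2 = box
    left, top, right, bottom = padding
    return x1 - left, y1 - top, x2 + right, y2 + bottom
```

Positive values enlarge the rectangle around the detected feature and negative values shrink it, matching the description of “detector-padding”.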

The “mask-shape” argument indicates what mask shape should be drawn around a detected feature, the default value is “rectangle”. You may also specify “circle” to generate an ellipsoid shaped mask.

Note: When “model-masks” is True and the model returns mask data, the “detector-padding” and “mask-shape” arguments will be ignored as the model’s own masks are used directly.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

Parameters:

model – YOLO model to use, can be a local path, a URL, or a HuggingFace repository slug
weight_name – file name in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument
subfolder – subfolder in a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument
revision – revision of a HuggingFace repository for the model weights, if you have provided a HuggingFace repository slug to the model argument (e.g. “main”)
token – HuggingFace authentication token if needed for accessing private repositories
font_size – size of label text, if None will be calculated based on image dimensions
line_width – thickness of bounding box lines, if None will be calculated based on image dimensions
line_color – override color for bounding box lines, mask outlines, and text label backgrounds as hex color code (e.g. “#FF0000” or “#F00”)
class_filter – list of class IDs or class names to include (e.g. [0,2,"person","car"])
index_filter – list of detection indices to include (e.g. [0,1,3])
confidence – confidence threshold for detections (0.0 to 1.0)
model_masks – overlay model-generated masks instead of bounding boxes when available, default is False
masks – generate mask images for detected objects, default is False
outpaint – invert generated masks for outpainting, only effective when masks is True, default is False
detector_padding – padding around detection rectangles for both bounding box drawing and mask generation
mask_shape – shape of generated masks (“rectangle” or “circle”)
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, YOLO detection may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a YOLO detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, YOLO detection may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a YOLO detected image, or the input image

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': 'in'}}

NAMES = ['yolo']

OPTION_ARGS = {'mask-shape': ['r', 'rect', 'rectangle', 'c', 'circle', 'ellipse']}

Bases: ImageProcessor

static help(loaded_by_name: str)[source]

Parameters:

yolo_model – YOLO model to use for object detection, can be a local path, a URL, or a HuggingFace repository slug
yolo_weight_name – file name in a HuggingFace repository for the YOLO model weights, if you have provided a HuggingFace repository slug to the yolo_model argument
yolo_subfolder – subfolder in a HuggingFace repository for the YOLO model weights, if you have provided a HuggingFace repository slug to the yolo_model argument
yolo_revision – revision of a HuggingFace repository for the YOLO model weights, if you have provided a HuggingFace repository slug to the yolo_model argument (e.g. “main”)
yolo_token – HuggingFace authentication token if needed for accessing private repositories
sam_asset – SAM model asset to use, an Ultralytics asset name
font_size – size of label text, if None will be calculated based on image dimensions
line_width – thickness of mask outline lines, if None will be calculated based on image dimensions
line_color – override color for mask outlines and text label backgrounds as hex color code (e.g. “#FF0000” or “#F00”)
class_filter – list of class IDs or class names to include (e.g. [0,2,"person","car"])
index_filter – list of detection indices to include (e.g. [0,1,3])
confidence – confidence threshold for YOLO detections (0.0 to 1.0), default is 0.3
masks – generate mask images instead of preview, default is False
outpaint – invert generated masks for outpainting, only effective when masks is True, default is False
detector_padding – padding around YOLO detection rectangles, default is 0
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize, YOLO-SAM processing may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:: image – image to process
Returns:: possibly a YOLO-SAM processed image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize, YOLO-SAM processing may or may not occur here depending on the boolean value of the processor argument “pre-resize”

Parameters:

image – image to process
resize_resolution – purely informational, is unused by this processor

Returns:

possibly a YOLO-SAM processed image, or the input image

FILE_ARGS = {'yolo-model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': 'in'}}

NAMES = ['yolo-sam']

OPTION_ARGS = {'sam-asset': ['sam_l.pt', 'sam_b.pt', 'mobile_sam.pt', 'sam2_t.pt', 'sam2_s.pt', 'sam2_b.pt', 'sam2_l.pt', 'sam2.1_t.pt', 'sam2.1_s.pt', 'sam2.1_b.pt', 'sam2.1_l.pt']}

class dgenerate.imageprocessors.ZoeDepthProcessor(gamma_corrected: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Bases: ImageProcessor

zoe depth detector, a SOTA depth estimation model which produces high-quality depth maps.

The “gamma-corrected” argument determines if gamma correction is performed on the produced depth map.

The “detect-resolution” argument is the resolution the image is resized to internally by the processor before detection is run on it. It should be a single dimension, for example: “detect-resolution=512”, or the X/Y dimensions separated by an “x” character, like so: “detect-resolution=1024x512”. If you do not specify this argument, the detector runs on the input image at its full resolution. After processing, the image will be resized to whatever you have requested dgenerate resize it to via --output-size, or --resize/--align in the case of the image-process sub-command. If you have not requested any resizing, the output will be resized back to the original size of the input image.

The “detect-aspect” argument determines if the image resize requested by “detect-resolution” before detection runs is aspect correct. This defaults to True.

The “pre-resize” argument determines if the processing occurs before or after dgenerate resizes the image. This defaults to False, meaning the image is processed after dgenerate is done resizing it.

__init__(gamma_corrected: bool = False, detect_resolution: str | None = None, detect_aspect: bool = True, pre_resize: bool = False, **kwargs)[source]

Parameters:

gamma_corrected – perform gamma correction on the depth map?
detect_resolution – the input image is resized to this dimension before being processed, providing None indicates it is not to be resized. If there is no resize requested during the processing action via resize_resolution it will be resized back to its original size.
detect_aspect – if the input image is resized by detect_resolution before processing, will it be an aspect correct resize?
pre_resize – process the image before it is resized, or after? default is False (after).
kwargs – forwarded to base class

impl_post_resize(image: Image)[source]

Post resize.

Parameters:: image – image
Returns:: possibly a zoe depth detected image, or the input image

impl_pre_resize(image: Image, resize_resolution: tuple[int, int] | None)[source]

Pre resize.

Parameters:

image – image to process
resize_resolution – resize resolution

Returns:

possibly a zoe depth detected image, or the input image

NAMES = ['zoe']

dgenerate.imageprocessors.create_image_processor(uri: str | Iterable[str], output_file: str | None = None, output_overwrite: bool = True, device: str = 'cpu', model_offload: bool = False, local_files_only: bool = False) → ImageProcessor[source]

Create an image processor implementation using the default ImageProcessorLoader instance.

Providing a collection of URIs will create an ImageProcessorChain object.

Parameters:

output_file – Output path for the processor debug image
output_overwrite – enable overwrite for the processor debug image?
uri – The image processor URI
device – Device to run processing on
model_offload – enable cpu model offloading?
local_files_only – Should the processor avoid downloading files from Hugging Face hub and only check the cache or local directories?

Returns:

An ImageProcessor implementation

dgenerate.imageprocessors.image_processor_exists(uri: str)[source]

Check if an image processor implementation exists for a given URI.

This uses the default ImageProcessorLoader instance.

Parameters:: uri – The image processor URI
Returns:: True or False

dgenerate.imageprocessors.image_processor_help(names: Sequence[str], plugin_module_paths: Sequence[str] | None = None, throw=False, log_error=True)[source]

Implements --image-processor-help command line option

Parameters:

names – arguments (processor names, or empty list)
plugin_module_paths – extra plugin module paths to search
throw – throw on error? or simply print to stderr and return a return code.
log_error – log errors to stderr?

Raises:

Returns:

return-code, anything other than 0 is failure

dgenerate.imageprocessors.image_processor_name_from_uri(uri: str)[source]

Extract just the implementation name from an image processor URI.

Parameters:: uri – the URI
Returns:: the implementation name.

dgenerate.imageprocessors.image_processor_names()[source]

Implementation names for all image processors implemented by dgenerate, which are visible to the default ImageProcessorLoader instance.

Returns:: a list of image processor implementation names.

dgenerate.imageprocessors.constants module

dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_GPU_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear GPU VRAM upon an image processor plugin calling dgenerate.imageprocessors.ImageProcessor.memory_guard_device() on a CUDA device, syntax provided via dgenerate.memory.gpu_memory_constraints()

If any of these constraints are met, an effort is made to clear modules off a GPU which are cached for fast repeat usage but are okay to flush.

The only available extra variable is: memory_required, which is the amount of memory the image processor plugin requested to be available.
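To show how a list of constraint expressions like the one above is meant to be read, the sketch below evaluates them with Python's eval and reports whether any is met. This is purely illustrative: dgenerate evaluates these expressions with its own machinery in dgenerate.memory, the function name any_constraint_met is hypothetical, and eval should never be used this way on untrusted input.

```python
def any_constraint_met(constraints, **variables) -> bool:
    """Return True if any constraint expression evaluates to a true value."""
    for expression in constraints:
        # ASSUMPTION: expressions behave like plain Python boolean expressions
        if eval(expression, {"__builtins__": {}}, dict(variables)):
            return True
    return False


constraints = ["memory_required > (available * 0.70)"]
should_clear = any_constraint_met(constraints, memory_required=8, available=10)
```

Here the processor asks for 8 units of memory while 10 are available; since 8 exceeds 70 percent of 10, the constraint is met and cached modules would be candidates for flushing.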

dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_CACHE_GC_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear objects out of any CPU side cache upon an image processor plugin calling dgenerate.imageprocessors.ImageProcessor.memory_guard_device() on the CPU, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of any named CPU side cache.

The only available extra variable is: memory_required, which is the amount of memory the image processor plugin requested to be available.

dgenerate.imageprocessors.constants.IMAGE_PROCESSOR_CACHE_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear specifically the image processor object cache upon an image processor plugin calling dgenerate.imageprocessors.ImageProcessor.memory_guard_device() on the CPU, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of the image processor object cache.

Available extra variables are: memory_required, which is the amount of memory the image processor plugin requested to be available, and cache_size which is the current size of the image processor object cache.

dgenerate.invoker module

Functions to invoke dgenerate inside the current process using its command line arguments.

class dgenerate.invoker.DgenerateExitEvent(origin, return_code: int)[source]

Bases: Event

Generated in the event stream created by invoke_dgenerate_events()

Exit with return code event for invoke_dgenerate_events()

__init__(origin, return_code: int)[source]

return_code: int

dgenerate.invoker.invoke_dgenerate(args: Sequence[str], render_loop: RenderLoop | None = None, config_overrides: dict[str, Any] | None = None, throw: bool = False, log_error: bool = True, help_raises: bool = False) → int[source]

Invoke dgenerate using its command line arguments and return a return code.

dgenerate is invoked in the current process, this method does not spawn a subprocess.

Meta arguments such as --file, --shell, --no-stdin, and --console are not supported.

Parameters:

args – dgenerate command line arguments in the form of a list, see: shlex module, or sys.argv
render_loop – dgenerate.renderloop.RenderLoop instance, if None is provided one will be created. Note that the config object generated by argument parsing will completely overwrite the render loop config.
config_overrides – Optional dictionary of configuration overrides to apply to the render loop config object after argument parsing, this should consist of attribute names with values, the config object generated by argument parsing is of type dgenerate.arguments.DgenerateArguments.
throw – Whether to throw known exceptions or handle them.
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – should --help raise dgenerate.arguments.DgenerateHelpException? When True, this will occur even if throw=False.

Raises:

dgenerate.DgenerateUsageError –
dgenerate.DgenerateHelpException –
dgenerate.ImageSeedError –
dgenerate.UnknownMimetypeError –
dgenerate.MediaIdentificationError –
dgenerate.FrameStartOutOfBounds –
dgenerate.InvalidModelFileError –
dgenerate.InvalidModelUriError –
dgenerate.ModelUriLoadError –
dgenerate.InvalidSchedulerNameError –
dgenerate.OutOfMemoryError –
dgenerate.ModelNotFoundError –
dgenerate.NonHFModelDownloadError –
dgenerate.UnsupportedPipelineConfigError –
dgenerate.PromptWeightingUnsupported –
dgenerate.PluginNotFoundError –
dgenerate.PluginArgumentError –
dgenerate.ModuleFileNotFoundError –
dgenerate.SpacyModelNotFoundException –
dgenerate.WebFileCacheOfflineModeException –
OSError –

Returns:

integer return-code, anything other than 0 is failure
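A minimal sketch of calling invoke_dgenerate() from Python follows. The command line is split with shlex as the args parameter description suggests. The model name "my-model" is a placeholder and passing it as the first positional argument is an assumption about the command line; build_args and run_dgenerate are hypothetical helper names. The import of dgenerate is deferred so the argument splitting can be tried without dgenerate installed.

```python
import shlex


def build_args(command_line: str) -> list[str]:
    """Split a shell style command line into an argument list."""
    return shlex.split(command_line)


def run_dgenerate(command_line: str) -> int:
    """Invoke dgenerate in the current process and return its return code."""
    # imported lazily, requires dgenerate to be installed
    from dgenerate.invoker import invoke_dgenerate
    return invoke_dgenerate(build_args(command_line), throw=False, log_error=True)


args = build_args("my-model --prompts 'a cat in a hat' --inference-steps 30")
```

With throw=False, known errors are handled internally and reported through the integer return code, where anything other than 0 indicates failure.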

dgenerate.invoker.invoke_dgenerate_events(args: Sequence[str], render_loop: RenderLoop | None = None, config_overrides: dict[str, Any] | None = None, throw: bool = False, log_error: bool = True, help_raises: bool = False) → Generator[ImageGeneratedEvent | StartingAnimationEvent | StartingAnimationFileEvent | AnimationFileFinishedEvent | ImageFileSavedEvent | AnimationFinishedEvent | StartingGenerationStepEvent | AnimationETAEvent | DgenerateExitEvent, None, None][source]

Invoke dgenerate using its command line arguments and return a stream of events, you must iterate over this stream of events to progress through the execution of dgenerate.

dgenerate is invoked in the current process, this method does not spawn a subprocess.

Meta arguments such as --file, --shell, --no-stdin, and --console are not supported.

The exceptions mentioned here are those you may encounter upon iterating, they will not occur upon simple acquisition of the event stream iterator.

Parameters:

args – dgenerate command line arguments in the form of a list, see: shlex module, or sys.argv
render_loop – dgenerate.renderloop.RenderLoop instance, if None is provided one will be created. Note that the config object generated by argument parsing will completely overwrite the render loop config.
config_overrides – Optional dictionary of configuration overrides to apply to the render loop config object after argument parsing, this should consist of attribute names with values, the config object generated by argument parsing is of type dgenerate.arguments.DgenerateArguments.
throw – Whether to throw known exceptions or handle them.
log_error – Write ERROR diagnostics with dgenerate.messages?
help_raises – Should --help raise dgenerate.arguments.DgenerateHelpException? When True, this will occur even if throw=False.

Raises:

dgenerate.DgenerateUsageError –
dgenerate.DgenerateHelpException –
dgenerate.ImageSeedError –
dgenerate.UnknownMimetypeError –
dgenerate.MediaIdentificationError –
dgenerate.FrameStartOutOfBounds –
dgenerate.InvalidModelFileError –
dgenerate.InvalidModelUriError –
dgenerate.ModelUriLoadError –
dgenerate.InvalidSchedulerNameError –
dgenerate.OutOfMemoryError –
dgenerate.ModelNotFoundError –
dgenerate.NonHFModelDownloadError –
dgenerate.UnsupportedPipelineConfigError –
dgenerate.PromptWeightingUnsupported –
dgenerate.PromptEmbeddedArgumentError –
dgenerate.PluginNotFoundError –
dgenerate.PluginArgumentError –
dgenerate.ModuleFileNotFoundError –
dgenerate.SpacyModelNotFoundException –
dgenerate.WebFileCacheOfflineModeException –
OSError –

Returns:

InvokeDgenerateEventStream
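
The note that exceptions surface while iterating, and not when the stream is acquired, is ordinary Python generator behavior. The following is a minimal sketch of that behavior using a hypothetical stand-in generator; fake_event_stream is not part of dgenerate, and the command line shown is illustrative only. It also shows shlex.split producing an argument list of the kind the args parameter expects.

```python
import shlex


def fake_event_stream(args):
    """Hypothetical stand-in for an event generator, NOT dgenerate's API."""
    if "--bad-option" in args:
        # Raised only once iteration begins, because this is a generator.
        raise ValueError("unknown option: --bad-option")
    yield ("starting", list(args))
    yield ("exit", 0)


# Build an argument list from a shell-style string.
args = shlex.split("my-model --prompts 'a cat' --bad-option")

stream = fake_event_stream(args)  # acquiring the stream runs no code yet

try:
    events = list(stream)  # the error surfaces here, during iteration
except ValueError as error:
    events = []
    caught = str(error)
```

With a real event stream the same pattern applies: wrap the loop that consumes events, not the call that creates the stream, in the try block.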

dgenerate.invoker.InvokeDgenerateEventStream

Event stream produced by invoke_dgenerate_events()

dgenerate.invoker.InvokeDgenerateEvents

Events yield-able by invoke_dgenerate_events()

dgenerate.mediainput module

Media input, handles reading videos/animations, static images, and tensor files (.pt, .pth, .safetensors), and creating readers from image seed URIs.

Also provides media download capabilities and temporary caching of web based files.

Provides information about supported input formats including tensor formats for latent data.

Note: Tensor files are loaded as-is without any preprocessing, resizing, or image processing operations.

exception dgenerate.mediainput.FrameStartOutOfBounds[source]

Bases: ValueError

Raised by MultiMediaReader when the provided frame_start frame slicing value is calculated to be out of bounds.

exception dgenerate.mediainput.ImageSeedArgumentError[source]

Bases: ImageSeedError

Raised when image seed URI keyword arguments receive invalid values.

exception dgenerate.mediainput.ImageSeedError[source]

Bases: Exception

Raised on image seed parsing and loading errors.

exception dgenerate.mediainput.ImageSeedFileNotFoundError[source]

Bases: ImageSeedError

Raised when a file on disk in an image seed could not be found.

exception dgenerate.mediainput.ImageSeedParseError[source]

Bases: ImageSeedError

Raised on image seed syntactic parsing error.

exception dgenerate.mediainput.ImageSeedSizeMismatchError[source]

Bases: ImageSeedError

Raised when the constituent image sources of an image seed specification are mismatched in dimension.

exception dgenerate.mediainput.MediaIdentificationError[source]

Bases: Exception

Raised when a media file is being loaded and it fails to load due to containing invalid or unexpected data.

exception dgenerate.mediainput.UnknownMimetypeError[source]

Bases: Exception

Raised when an unsupported mimetype is encountered.

class dgenerate.mediainput.AnimatedImageReader(file: str | BinaryIO, file_source: str, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ImageProcessor = None)[source]

Bases: ImageProcessorMixin, AnimationReader

Implementation of AnimationReader that reads animated image formats using Pillow.

__init__(file: str | BinaryIO, file_source: str, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ImageProcessor = None)[source]

Parameters:

file – a file path or binary file stream
file_source – the source filename for the animated image. This is for informational purposes when reading from a stream or a cached file, and should be provided in every case even if it is only a symbolic value. It should possess a file extension. PIL.Image.Image objects produced by the reader will have this value set to their filename attribute.
resize_resolution – Progressively resize each frame to this resolution while reading. The provided resolution will be aligned by align if it is not None.
aspect_correct – Should resize operations be aspect correct?
align – Align by this amount of pixels, if the input file is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.
image_processor – optionally process every frame with this image processor

Raises:

MediaIdentificationError – If the animated image data is an unknown format or corrupt.

class dgenerate.mediainput.AnimationReader(width: int, height: int, fps: float, frame_duration: float, total_frames: int, **kwargs)[source]

Bases: object

Abstract base class for animation readers.

__init__(width: int, height: int, fps: float, frame_duration: float, total_frames: int, **kwargs)[source]

Parameters:

width – width of the animation, X dimension
height – height of the animation, Y dimension
fps – frames per second
frame_duration – frame duration in milliseconds
total_frames – total frames in the animation
kwargs – for mixins

property fps: float: Frames per second.

property frame_duration: float: Duration of each frame in milliseconds.

property height: int: Height dimension, (Y dimension).

property size: tuple[int, int]: returns (width, height) as a tuple.

property total_frames: int: Total number of frames that can be read.

property width: int: Width dimension, (X dimension).

class dgenerate.mediainput.IPAdapterImageUri(path, resize, aspect, align)[source]

Bases: object

__init__(path, resize, aspect, align)[source]

align: int: Pixel alignment, defaults to 1.

aspect: bool: Aspect correct resizing?

path: str: File path or URL to an image.

resize: str | None: Image resize dimension in the form WIDTHxHEIGHT or WIDTH

scale: float: IP Adapter image scale value.

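class dgenerate.mediainput.ImageSeed(images: Image | None = None, mask_images: Image | None = None, control_images: Sequence[Image] | None = None, floyd_image: Image | None = None, adapter_images: list[Sequence[Image]] | None = None, latents: Sequence[Tensor] | None = None)[source]

Bases: object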

An ImageSeed with attached image data

__init__(images: Image | None = None, mask_images: Image | None = None, control_images: Sequence[Image] | None = None, floyd_image: Image | None = None, adapter_images: list[Sequence[Image]] | None = None, latents: Sequence[Tensor] | None = None)[source]

adapter_images: Sequence[Sequence[Image]] | None: IP Adapter images, or None.

control_images: Sequence[Image] | None: Control guidance images, or None.

floyd_image: Image | None

An optional image from a Deep Floyd IF stage, used for disambiguation in the case of using Deep Floyd for img2img and inpainting, where the un-varied input image is needed as a parameter for both stages. This image is used to define the image that was generated by Deep Floyd in a previous stage and is to be used in the next stage, where ImageSeed.images defines the img2img image that you want a variation of.

This image will never be assigned a value when ImageSeed.control_images has a value, as that is considered incorrect --image-seeds usage.

fps: float | None = None: Frames per second in the case that ImageSeed.is_animation_frame is True

frame_duration: float | None = None: Duration of a frame in milliseconds in the case that ImageSeed.is_animation_frame is True

frame_index: int | None = None: Frame index in the case that ImageSeed.is_animation_frame is True

images: Sequence[Image] | None

Optional images used for img2img mode, or for inpainting mode in combination with ImageSeed.mask_images.

May be None when using IP Adapter only images, IE: the adapter: ... uri syntax.

May also be None when using latents only input, IE: the latents: ... uri syntax.

is_animation_frame: bool: Is this part of an animation?

latents: Sequence[Tensor] | None

Raw latent tensors loaded from .pt, .pth, or .safetensors files, or None.

These tensors are loaded as-is without any image processing, resizing, or alignment operations.

mask_images: Sequence[Image] | None: An optional inpaint mask images, may be None.

total_frames: int | None = None: Total frame count in the case that ImageSeed.is_animation_frame is True

uri: str: The original URI string that this image seed originates from.

class dgenerate.mediainput.ImageSeedInfo(is_animation: bool, total_frames: int | None, fps: float | None, frame_duration: float | None)[source]

Bases: object

Information acquired about an --image-seeds uri

__init__(is_animation: bool, total_frames: int | None, fps: float | None, frame_duration: float | None)[source]

fps: float | None: Animation frames per second in the case that ImageSeedInfo.is_animation is True

frame_duration: float | None: Animation frame duration in milliseconds in the case that ImageSeedInfo.is_animation is True

is_animation: bool: Does this image seed specification result in an animation?

total_frames: int | None: Animation frame count in the case that ImageSeedInfo.is_animation is True

class dgenerate.mediainput.ImageSeedParseResult[source]

Bases: object

The result of parsing an --image-seeds uri

get_control_image_paths() → Sequence[str] | None[source]

Return ImageSeedParseResult.seed_path if ImageSeedParseResult.is_single_spec is True.

If the image seed is not a single specification, return ImageSeedParseResult.control_path.

If ImageSeedParseResult.control_path is not set and the image seed is not a single specification, return None.

Returns:: list of resource paths, or None

adapter_images: Sequence[Sequence[IPAdapterImageUri]] | None = None

IP Adapter image URIs.

In parses such as:

--image-seeds "adapter: adapter-image1.png + adapter-image2.png"

--image-seeds "adapter: adapter-image1.png + adapter-image2.png;control=control.png"

--image-seeds "img2img.png;adapter=adapter-image1.png + adapter-image2.png"

--image-seeds "img2img.png;adapter=adapter-image1.png + adapter-image2.png;control=control.png"

--image-seeds "img2img.png;adapter=adapter-image1.png + adapter-image2.png;mask=inpaint-mask.png;control=control.png"

aspect_correct: bool | None = None: Aspect correct resize setting override from the aspect image seed keyword argument, if this is None it was not specified. This value if defined should override any globally defined aspect correct resize setting.

control_images: Sequence[str] | None = None

Optional controlnet guidance path, or a sequence of controlnet guidance paths. This field is only used when the secondary syntax of --image-seeds is encountered.

IE: This parameter is only filled if the keyword argument control is used.

In parses such as:

--image-seeds "img2img.png;control=control.png"

--image-seeds "img2img.png;control=control1.png, control2.png"

--image-seeds "img2img.png;mask=mask.png;control=control.png"

--image-seeds "img2img.png;mask=mask.png;control=control1.png, control2.png"

floyd_image: str | None = None

Optional path to a result from a Deep Floyd IF stage, used only for img2img and inpainting mode with Deep Floyd. This is the only way to specify the image that was output by a stage in that case.

The arguments floyd and control are mutually exclusive.

In parses such as:

--image-seeds "img2img.png;floyd=stage1-output.png"

--image-seeds "img2img.png;mask=mask.png;floyd=stage1-output.png"

There can only ever be one floyd stage image provided.

frame_end: int | None = None: Optional end frame specification for per-image seed slicing.

frame_start: int | None = None: Optional start frame specification for per-image seed slicing.

property has_ip_adapter_images: bool

images: Sequence[str] | None

Optional image paths that will be used for img2img operations or the base image in inpaint operations.

Or controlnet guidance paths, in the case that images: ... syntax is not being used (multi_image_mode==False) and there are multiple provided images, and is_single_spec is True.

A path being a file path, or an HTTP/HTTPS URL.

property is_single_spec: bool

Is this --image-seeds uri a single resource or resource group specification existing within the seed_path attribute of this object?

For instance, could it be an img2img definition / sequence of img2img images using the images: ... syntax, or a sequence of controlnet guidance images?

This requires that mask_images, control_images, floyd_image, adapter_images, and latents are all undefined.

Possible parses which trigger this condition are:

--image-seeds "img2img.png"

--image-seeds "control-image.png"

--image-seeds "control-image1.png, control-image2.png"

--image-seeds "images: img2img-1.png, img2img-2.png"

Since this is an ambiguous parse, it must be resolved later with the help of other specified arguments. Such as by the specification of --control-nets, which makes the intention unambiguous.

Returns:: bool
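
The condition described above (every secondary component undefined) can be sketched with a small stand-in class. ParseResultSketch is hypothetical and only mirrors a few documented attribute names; it is not the real ImageSeedParseResult.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class ParseResultSketch:
    """Hypothetical stand-in mirroring a few documented attribute names."""

    images: Optional[Sequence[str]] = None
    mask_images: Optional[Sequence[str]] = None
    control_images: Optional[Sequence[str]] = None
    floyd_image: Optional[str] = None
    adapter_images: Optional[Sequence[Sequence[str]]] = None
    latents: Optional[Sequence[str]] = None

    @property
    def is_single_spec(self) -> bool:
        # True only when every secondary component is undefined.
        return all(
            value is None
            for value in (
                self.mask_images,
                self.control_images,
                self.floyd_image,
                self.adapter_images,
                self.latents,
            )
        )
```

In this sketch a lone "img2img.png" is a single specification, while adding a control image makes it a multi-part specification.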

latents: Sequence[str] | None = None

Optional raw latent tensor paths (.pt, .pth, .safetensors files).

In parses such as:

--image-seeds "latents: latents1.pt, latents2.pt"

--image-seeds "img2img.png;latents=latents.pt"

--image-seeds "images: img2img1.png, img2img2.png;latents=latents1.pt, latents2.pt"

Raw latents are loaded as-is without any image processing, resizing, or alignment operations.

mask_images: Sequence[str] | None = None

Optional inpaint mask paths, may be HTTP/HTTPS URLs or file paths.

This may be multiple masks when there are multiple img2img image paths, for example: --image-seeds "images: image1.png, image2.png; mask1.png, mask2.png"

There will always be an equal number of images and mask images.

If a single mask is supplied for multiple images, the mask path is duplicated to match the amount of images.
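
A minimal sketch of the duplication rule described above. match_masks_to_images is a hypothetical helper, and raising ValueError on a count mismatch is an assumption of this example, not documented behavior.

```python
def match_masks_to_images(images, masks):
    """Hypothetical helper: duplicate a lone mask so the counts are equal."""
    if len(masks) == 1 and len(images) > 1:
        # One mask for many images: repeat the same path for each image.
        return list(masks) * len(images)
    if len(masks) != len(images):
        # Assumed error handling for this sketch only.
        raise ValueError("provide one mask, or one mask per image")
    return list(masks)
```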

multi_image_mode: bool

Are there multiple img2img images associated with this image seed?

This indicates that --image-seeds "images: image1.png, image2.png" syntax was used, in order to differentiate from a control image sequence specification.

resize_align: int | None = None

Per image user specified resize alignment for image, mask, and control image components of the --image-seeds specification.

This field is available in parses such as:

--image-seeds "img2img.png;control=control.png;align=64"

--image-seeds "img2img.png;control=control1.png, control2.png;align=64"

--image-seeds "img2img.png;mask=mask.png;control=control.png;align=64"

--image-seeds "img2img.png;mask=mask.png;control=control1.png, control2.png;align=64"

This will overwrite any default global value (usually 1), the provided value must be divisible by the global value defined by the parser, or a parse error will occur.
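
The divisibility requirement can be illustrated as follows. validate_resize_align is a hypothetical helper, and ValueError stands in for whatever parse error the real parser raises.

```python
def validate_resize_align(user_align, global_align=1):
    """Hypothetical check mirroring the documented divisibility rule."""
    if user_align < 1:
        raise ValueError("align must be 1 or greater")
    if user_align % global_align != 0:
        raise ValueError(
            f"align={user_align} is not divisible by the global alignment {global_align}"
        )
    return user_align
```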

resize_resolution: tuple[int, int] | None = None

Per image user specified resize resolution for image, mask, and control image components of the --image-seeds specification.

This field is available in parses such as:

--image-seeds "img2img.png;512x512"

--image-seeds "img2img.png;mask.png;512x512"

--image-seeds "img2img.png;control=control.png;resize=512x512"

--image-seeds "img2img.png;control=control1.png, control2.png;resize=512x512"

--image-seeds "img2img.png;mask=mask.png;control=control.png;resize=512x512"

--image-seeds "img2img.png;mask=mask.png;control=control1.png, control2.png;resize=512x512"

This should override any globally defined resize value.

uri: str: The original URI string the image seed was parsed from.

class dgenerate.mediainput.MediaReader(path: str, image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>)[source]

Bases: AnimationReader

Thin wrapper around MultiMediaReader which simply reads from a single file instead of multiple files simultaneously.

The interface provided by this object is that of AnimationReader

This object can read any media supported by dgenerate for input and supports frame slicing animated media formats and image processors.

Static images are treated as an animation with a single frame.

With the default path opener, URLs will be downloaded, dgenerate’s temporary web cache will be utilized.

__init__(path: str, image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>)[source]

Raises:

ValueError – if frame_start > frame_end
FrameStartOutOfBounds – if frame_start > total_frames - 1

Parameters:

path – File path or URL
resize_resolution – Resize resolution
aspect_correct – Aspect correct resize enabled?
align – Images which are read are aligned to this amount of pixels, None or 1 will disable alignment.
image_processor – Optional image processor associated with the file
frame_start – inclusive frame slice start frame
frame_end – inclusive frame slice end frame
path_opener – opens a binary file stream from paths.
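
The two documented constructor errors for frame slicing can be modelled as below. This is a sketch under assumptions: check_frame_slice and FrameStartOutOfBoundsSketch are hypothetical names, and the order in which the two conditions are checked is not stated in the documentation.

```python
class FrameStartOutOfBoundsSketch(ValueError):
    """Stand-in for the documented FrameStartOutOfBounds exception."""


def check_frame_slice(total_frames, frame_start=0, frame_end=None):
    """Validate an inclusive frame slice against a known frame count."""
    if frame_end is not None and frame_start > frame_end:
        raise ValueError("frame_start must not be greater than frame_end")
    if frame_start > total_frames - 1:
        raise FrameStartOutOfBoundsSketch(
            f"frame_start={frame_start} is past the last frame index {total_frames - 1}"
        )
```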

property frame_end: int: Frame slice end value (inclusive)

property frame_index: int: Current frame index while reading.

property frame_start: int: Frame slice start value (inclusive)

class dgenerate.mediainput.MediaReaderSpec(path: str, image_processor: ImageProcessor | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None)[source]

Bases: object

Used by MultiMediaReader to define resource paths.

__init__(path: str, image_processor: ImageProcessor | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None)[source]

Parameters:

path – File path or URL
resize_resolution – Resize resolution (ignored for tensor files)
aspect_correct – Aspect correct resize enabled? (ignored for tensor files)
align – Images which are read are aligned to this amount of pixels, None or 1 will disable alignment. (ignored for tensor files)
image_processor – Optional image processor associated with the file (ignored for tensor files)

Raises:

ValueError – On align < 1

align: int | None = None

Images which are read are aligned to this amount of pixels, None or 1 will disable alignment.

Note: Alignment is ignored for tensor files.

aspect_correct: bool = True

Aspect correct resize enabled?

Note: Resize operations are ignored for tensor files.

image_processor: ImageProcessor | None = None

Optional image processor associated with the file.

Note: Image processors are ignored for tensor files.

path: str: File path (or HTTP/HTTPS URL with default path_opener)

resize_resolution: tuple[int, int] | None = None

Optional resize resolution.

Note: Resize operations are ignored for tensor files.

class dgenerate.mediainput.MockImageAnimationReader(img: Image, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_repetitions: int = 1, image_processor: ImageProcessor = None)[source]

Bases: ImageProcessorMixin, AnimationReader

Implementation of AnimationReader that repeats a single PIL image as many times as desired in order to mock/emulate an animation.

__init__(img: Image, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_repetitions: int = 1, image_processor: ImageProcessor = None)[source]

Parameters:

img – source image to copy for each frame, the image is immediately copied once upon construction of the mock reader, and then once per frame thereafter. Your copy of the image can be disposed of after the construction of this object.
resize_resolution – the source image will be resized to this dimension with a maintained aspect ratio. This occurs once upon construction, a copy is then yielded for each frame that is read. The provided resolution will be aligned by align if it is not None.
aspect_correct – Should resize operations be aspect correct?
align – Align by this amount of pixels, if the input file is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.
image_repetitions – number of frames that this mock reader provides using a copy of the source image.
image_processor – optionally process the initial image with this image processor, this occurs once.

property total_frames: int

Settable total_frames property.

Returns:: frame count

class dgenerate.mediainput.MockTensorReader(tensor: Tensor, file_source: str, tensor_repetitions: int = 1)[source]

Bases: AnimationReader

Implementation of AnimationReader that yields a single tensor as many times as desired to mock/emulate an animation with tensor data.

This reader is used for .pt, .pth, and .safetensors files containing latent tensors. No image processing, resizing, or alignment operations are performed on tensors.

__init__(tensor: Tensor, file_source: str, tensor_repetitions: int = 1)[source]

Parameters:

tensor – source tensor to yield for each frame
file_source – source filename for the tensor data
tensor_repetitions – number of frames that this mock reader provides using the source tensor

property total_frames: int

Settable total_frames property.

Returns:: frame count

class dgenerate.mediainput.MultiMediaReader(specs: list[~dgenerate.mediainput.MediaReaderSpec], frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>)[source]

Bases: object

Zips together multiple automatically created AnimationReader implementations and allows enumeration over their reads, which are collected into a list of a defined order.

Images when zipped together with animated files will be repeated over the total amount of frames.

The animation with the lowest amount of frames determines the total amount of frames that can be read when animations are involved.

If all paths point to images, then MultiMediaReader.total_frames will be 1.

There is no guarantee that images read from the individual readers are the same size and you must handle that condition.
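
The zipping rules above can be sketched with plain lists standing in for readers. Both functions are hypothetical illustrations and not dgenerate's implementation; a one-element list models a static image.

```python
from itertools import repeat


def zipped_total_frames(frame_counts):
    """Frame count after zipping; a count of 1 models a static image."""
    animated = [count for count in frame_counts if count > 1]
    # The shortest animation wins; all static images yields a single frame.
    return min(animated) if animated else 1


def zip_frames(sources):
    """sources: list of frame lists, zipped in the order given."""
    total = zipped_total_frames([len(source) for source in sources])
    iterators = [
        repeat(source[0], total) if len(source) == 1 else iter(source)
        for source in sources
    ]
    # zip stops at the shortest iterator, which is after `total` frames.
    return [list(frames) for frames in zip(*iterators)]
```

For example, zipping one static image with a three frame and a two frame animation yields two reads, each containing the static image plus one frame from each animation.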

__init__(specs: list[~dgenerate.mediainput.MediaReaderSpec], frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>)[source]

Raises:

ValueError – if frame_start > frame_end
FrameStartOutOfBounds – if frame_start > total_frames - 1

Parameters:

specs – list of MediaReaderSpec
frame_start – inclusive frame slice start frame
frame_end – inclusive frame slice end frame
path_opener – opens a binary file stream from paths mentioned by MediaReaderSpec

height(idx) → int[source]

Height dimension, (Y dimension) of a specific reader index.

Returns:: height

size(idx) → tuple[int, int][source]

returns (width, height) as a tuple of a specific reader index.

Returns:: (width, height)

width(idx) → int[source]

Width dimension, (X dimension) of a specific reader index.

Returns:: width

property fps: float | None: Frames per second, this will be None if there is only a single frame

property frame_duration: float | None: Duration of a frame in milliseconds, this will be None if there is only a single frame

property frame_end: int: Frame slice end value (inclusive)

property frame_index: int: Current frame index while reading.

property frame_start: int: Frame slice start value (inclusive)

property total_frames: int: Total number of frames readable from this reader.

class dgenerate.mediainput.VideoReader(file: str | BinaryIO, file_source: str, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ImageProcessor = None)[source]

Bases: ImageProcessorMixin, AnimationReader

Implementation of AnimationReader that reads video files with PyAV.

__init__(file: str | BinaryIO, file_source: str, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ImageProcessor = None)[source]

Parameters:

file – a file path or binary file stream
file_source – the source filename for the video data. This is for informational purposes when reading from a stream or a cached file, and should be provided in every case even if it is only a symbolic value. It should possess a file extension, as it is used to determine the file format when reading from a byte stream. PIL.Image.Image objects produced by the reader will have this value set to their filename attribute.
resize_resolution – Progressively resize each frame to this resolution while reading. The provided resolution will be aligned by align if it is not None.
aspect_correct – Should resize operations be aspect correct?
align – Align by this amount of pixels, if the input file is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.
image_processor – optionally process every frame with this image processor

Raises:

MediaIdentificationError – If the video data is an unknown format or corrupt, or if file_source lacks a file extension, which is needed to determine the video file format.

dgenerate.mediainput.create_animation_reader(mimetype: str, file_source: str, file: BinaryIO, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ImageProcessor | None = None) → AnimationReader[source]

Create an animation reader object from mimetype specification and binary file stream.

Images will return a MockImageAnimationReader with a total_frames value of 1, which you can then adjust.

VideoReader or AnimatedImageReader will be returned for Video files and Animated Images respectively.

Raises:

UnknownMimetypeError – on unknown mimetype value
MediaIdentificationError – If the file data is an unknown format or corrupt.

Parameters:

mimetype – one of get_supported_mimetypes()
file – the binary file stream
file_source – the source filename for videos/animated images. This is for informational purposes and should be provided in every case, even if it is only a symbolic value. It should possess a file extension. PIL.Image.Image objects produced by the reader will have this value set to their filename attribute.
resize_resolution – Progressively resize each frame to this resolution while reading. The provided resolution will be aligned by align pixels.
align – Align by this amount of pixels, if the input file is not aligned to this amount of pixels, it will be aligned by resizing. Passing None or 1 disables alignment.
aspect_correct – Should resize operations be aspect correct?
image_processor – optionally process every frame with this image processor

Returns:

AnimationReader

dgenerate.mediainput.create_image(path_or_file: BinaryIO | str, file_source: str, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None) → Image[source]

Create an RGB format PIL image from a file path or binary file stream. The image is oriented according to any EXIF directives. The image is aligned to align in every case; specifying None or 1 for align disables alignment.

Raises:

MediaIdentificationError – If the image data is an unknown format or corrupt.

Parameters:

path_or_file – file path or binary IO object
file_source – PIL.Image.Image.filename is set to this value
resize_resolution – Optional resize resolution
aspect_correct – preserve aspect ratio when resize_resolution is specified?
align – Align the image by this amount of pixels, None or 1 indicates no alignment.

Returns:

PIL.Image.Image
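
The align parameter is described as aligning by resizing. One way to compute an aligned size is sketched below, assuming each dimension is rounded down to the nearest multiple of align. The rounding direction is an assumption of this example, and align_size is a hypothetical helper, not part of dgenerate.

```python
def align_size(size, align):
    """Round each dimension down to a multiple of align (assumed direction)."""
    if align is None:
        return tuple(size)  # None disables alignment
    if align < 1:
        raise ValueError("align must be 1 or greater")
    if align == 1:
        return tuple(size)  # 1 also disables alignment
    return tuple(value - (value % align) for value in size)
```

Under this assumption, a 1023x770 image aligned to 8 pixels becomes 1016x768.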

dgenerate.mediainput.create_web_cache_file(url, mime_acceptable_desc: str | None = None, mimetype_is_supported: ~typing.Callable[[str], bool] | None = <function mimetype_is_supported>, local_files_only: bool = False) → tuple[str, str][source]

Download a file from a url and add it to dgenerate’s temporary web cache that is available to all concurrent dgenerate processes.

If the file exists in the cache already, return information for the existing file.

Parameters:

url – The url
mime_acceptable_desc – a string describing what mimetype values are acceptable which is used when UnknownMimetypeError is raised. If None is provided, this string will be generated using get_supported_mimetypes()
mimetype_is_supported – a function that tests whether a mimetype string is supported. If you supply the value None, all mimetypes are considered supported.
local_files_only – if True no downloads will be allowed, only cached files and direct paths to files on disk.

Raises:

UnknownMimetypeError – if a mimetype is considered not supported
requests.RequestException – Can raise any exception raised by requests.get for request related errors.

Returns:

tuple(mimetype_str, filepath)

dgenerate.mediainput.fetch_media_data_stream(uri: str, local_files_only: bool = False) → tuple[str, BinaryIO][source]

Get an open stream to a local file, or file at an HTTP or HTTPS URL, with caching for web files.

Caching for downloaded files is multiprocess safe: multiple processes using this module can share the cache simultaneously, and the last process alive clears the cache when it exits.

Parameters:

uri – Local file path or URL
local_files_only – If True no downloads will be allowed, only cached files and direct paths to files on disk.

Raises:

UnknownMimetypeError – If a remote file serves an unsupported mimetype value

Returns:

(mime-type string, BinaryIO)
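
Several readers in this module accept a path_opener with the same shape as this function: a callable taking a path and returning a (mimetype, binary stream) tuple. A custom opener that only handles local files might look like the sketch below. local_path_opener is hypothetical, it relies on the standard library mimetypes module, and it raises ValueError where the real function documents UnknownMimetypeError.

```python
import mimetypes
import os
import tempfile
from typing import BinaryIO


def local_path_opener(path: str) -> tuple[str, BinaryIO]:
    """Hypothetical opener for local files only; returns (mimetype, stream)."""
    mimetype, _ = mimetypes.guess_type(path)
    if mimetype is None:
        raise ValueError(f"could not determine a mimetype for: {path}")
    return mimetype, open(path, "rb")


# Usage with a throwaway file; the content is not a real image.
with tempfile.TemporaryDirectory() as directory:
    example_path = os.path.join(directory, "example.png")
    with open(example_path, "wb") as handle:
        handle.write(b"not a real image")
    mimetype, stream = local_path_opener(example_path)
    with stream:
        data = stream.read()
```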

dgenerate.mediainput.frame_slice_count(total_frames: int, frame_start: int, frame_end: int | None = None) → int[source]

Calculate the number of frames resulting from frame slicing.

Parameters:

total_frames – Total frames being sliced from
frame_start – The start frame
frame_end – The end frame

Returns:

int
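
A sketch of how such a count could be computed, assuming frame_start and frame_end are inclusive (as documented for MediaReader) and that frame_end is clamped to the last available frame. Both assumptions are mine, and frame_slice_count_sketch is a hypothetical name, not the library function.

```python
def frame_slice_count_sketch(total_frames, frame_start, frame_end=None):
    """Count frames in an inclusive [frame_start, frame_end] slice."""
    last_index = total_frames - 1
    if frame_end is not None:
        # Assumed: an end past the final frame is clamped to the final frame.
        last_index = min(frame_end, last_index)
    return max(0, last_index - frame_start + 1)
```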

dgenerate.mediainput.get_control_image_info(uri: str | ~dgenerate.mediainput.ImageSeedParseResult, frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>) → ImageSeedInfo[source]

Get an informational object from a dgenerate --image-seeds uri that is known to be a control image/video specification.

This can consist of a single resource path or a list of comma separated image and video resource paths, which may be files on disk or remote files (http / https).

This method is to be used when it is known that there is only a control image/video specification in the path, and it handles this specification syntax:

--image-seeds "control1.png"

--image-seeds "control1.png, control2.png"

Parameters:

uri – The path string or ImageSeedParseResult
frame_start – slice start
frame_end – slice end
path_opener – a function that opens a file stream from a path, defaults to dgenerate.mediainput.fetch_media_data_stream().

Returns:

ImageSeedInfo

dgenerate.mediainput.get_image_seed_info(uri: str | ~dgenerate.mediainput.ImageSeedParseResult, frame_start: int = 0, frame_end: int | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>) → ImageSeedInfo[source]

Get an informational object from a dgenerate --image-seeds uri.

Parameters:

uri – The uri string or ImageSeedParseResult
frame_start – slice start
frame_end – slice end
path_opener – a function that opens a file stream from a path, defaults to dgenerate.mediainput.fetch_media_data_stream().

Returns:

ImageSeedInfo

dgenerate.mediainput.get_supported_animated_image_mimetypes() → list[str][source]

Get a list of mimetypes that are considered to be supported animated image mimetypes.

Returns:: list of mimetype strings.

dgenerate.mediainput.get_supported_animation_reader_formats()[source]

Supported animation reader formats, file extensions with no period.

Returns:: list of file extensions.

dgenerate.mediainput.get_supported_image_formats()[source]

What file extensions does PIL/Pillow support for reading?

File extensions are returned without a period.

Returns:: list of file extensions

dgenerate.mediainput.get_supported_image_mimetypes() → list[str][source]

Get all supported --image-seeds image mimetypes, including animated image mimetypes

Returns:: list of strings

dgenerate.mediainput.get_supported_mimetypes() → list[str][source]

Get all supported --image-seeds mimetypes, video mimetype may contain a wildcard.

Returns:: list of strings

dgenerate.mediainput.get_supported_static_image_mimetypes() → list[str][source]

Get a list of mimetypes that are considered to be supported static image mimetypes.

Returns:: list of mimetype strings.

dgenerate.mediainput.get_supported_tensor_formats() → list[str][source]

Get supported tensor file formats for latent loading.

Returns:: list of file extensions without periods

dgenerate.mediainput.get_supported_tensor_mimetypes() → list[str][source]

Get supported tensor mimetypes for latent loading.

Returns:: list of mimetype strings

dgenerate.mediainput.get_supported_video_mimetypes() → list[str][source]

Get all supported --image-seeds video mimetypes, may contain a wildcard

Returns:: list of strings

dgenerate.mediainput.get_web_cache_directory() → str[source]

Get the default web cache directory, or the value of the environment variable DGENERATE_WEB_CACHE if it is set.

Returns:: string (directory path)

dgenerate.mediainput.guess_mimetype(filename) → str | None[source]

Guess the mimetype of a filename.

The filename does not need to exist on disk.

Parameters:: filename – the file name
Returns:: mimetype string or None
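
As a rough illustration of guessing a mimetype from a name alone, the sketch below uses the standard library mimetypes module. The function name guess_mimetype_sketch is hypothetical, and dgenerate's own implementation may recognize additional types (for example tensor files) that this sketch does not.

```python
import mimetypes
from typing import Optional


def guess_mimetype_sketch(filename: str) -> Optional[str]:
    """Guess a mimetype from the file name alone; the file need not exist."""
    # guess_type returns a (type, encoding) tuple; only the type is of interest here.
    mimetype, _encoding = mimetypes.guess_type(filename)
    return mimetype


if __name__ == "__main__":
    print(guess_mimetype_sketch("frame.png"))
    print(guess_mimetype_sketch("no_extension"))
```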

dgenerate.mediainput.is_downloadable_url(string) → bool[source]

Does a string represent a URL that can be downloaded from by dgenerate.mediainput?

Parameters:: string – the string
Returns:: True or False

dgenerate.mediainput.is_tensor_file(path: str) → bool[source]

Check if a file path appears to be a tensor file based on extension.

Parameters:: path – file path or URL
Returns:: True if it appears to be a tensor file
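
An extension-based check of this kind might look like the following sketch. It assumes the tensor extensions are pt, pth, and safetensors (the formats named elsewhere in this module's documentation) and that only http / https URLs need special handling. The name looks_like_tensor_file is hypothetical and this is not the library's implementation.

```python
import os
from urllib.parse import urlparse

# Assumption: the same extensions load_tensor_file() is documented to accept.
TENSOR_EXTENSIONS = {"pt", "pth", "safetensors"}


def looks_like_tensor_file(path: str) -> bool:
    """Return True if a file path or URL ends in a known tensor extension."""
    parsed = urlparse(path)
    if parsed.scheme in ("http", "https"):
        # Ignore the query string and fragment of a URL.
        path = parsed.path
    extension = os.path.splitext(path)[1]
    return extension.lstrip(".").lower() in TENSOR_EXTENSIONS


if __name__ == "__main__":
    print(looks_like_tensor_file("latents.pt"))
    print(looks_like_tensor_file("image.png"))
```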

dgenerate.mediainput.iterate_control_image(uri: str | ~dgenerate.mediainput.ImageSeedParseResult, frame_start: int = 0, frame_end: int | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | ~collections.abc.Sequence[~dgenerate.imageprocessors.imageprocessor.ImageProcessor] | None = None, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>) → Iterator[ImageSeed][source]

Parse and load a control image/video in an --image-seeds uri and return an iterator that produces ImageSeed objects while progressively reading that file.

One or more ImageSeed objects may be yielded depending on whether an animation is being read.

This can consist of a single resource path or a list of comma separated image and video resource paths, which may be files on disk or remote files (http / https).

This method is to be used when it is known that there is only a controlnet guidance resource specification in the path, and it handles this specification syntax:

--image-seeds "control1.png"

--image-seeds "control1.png, control2.png"

--image-seeds "control1.png, control2.png;512x512"

--image-seeds "control1.png, control2.png;resize=512x512"

--image-seeds "control1.png, control2.png;frame-start=2"

--image-seeds "control1.png, control2.png;frame-start=2;frame-end=10"

--image-seeds "control1.png, control2.png;resize=512x512;frame-start=2;frame-end=10"

The image or images read will be available from the ImageSeed.control_images attribute.

Parameters:

uri – --image-seeds uri or ImageSeedParseResult
frame_start – starting frame, inclusive value
frame_end – optional end frame, inclusive value
resize_resolution – optional global resize resolution. The URI syntax of image seeds allows for overriding this value.
aspect_correct – should the global resize operation be aspect correct by default? The URI syntax for image seeds allows for overriding this value with the aspect keyword argument.
align – Images which are read are aligned to this amount of pixels, None or 1 will disable alignment.
image_processor – optional dgenerate.imageprocessors.ImageProcessor or list of them. A list is used to specify processors for individual images in a multi guidance image specification such as uri = “img1.png, img2.png”. If a multi guidance image specification is used and only one processor is given, that processor is applied only to the first image / video in the specification. Images in a guidance specification with no corresponding processor value have their processor set to None. Specifying more processors than there are control guidance image sources will cause ValueError to be raised.
path_opener – opens a binary file stream from paths, defaults to dgenerate.mediainput.fetch_media_data_stream().

Raises:

ImageSeedError – If any other image inputs are specified, such as mask, control, or floyd; if a tensor file is passed in a control guidance image specification (latents input is not supported for controlnet guidance images); or if too many image processor chains are specified for the number of images given.
ValueError – On frame_start > frame_end, or align < 1

Returns:

an iterator over ImageSeed objects
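
Both frame_start and frame_end are documented as inclusive. The sketch below shows one way such an inclusive slice over a stream of frames could be expressed. The helper name slice_frames is hypothetical and works on any iterable; it does not read media files.

```python
from itertools import islice
from typing import Iterable, Iterator, Optional, TypeVar

T = TypeVar("T")


def slice_frames(frames: Iterable[T],
                 frame_start: int = 0,
                 frame_end: Optional[int] = None) -> Iterator[T]:
    """Yield frames from frame_start through frame_end, both inclusive."""
    if frame_end is not None and frame_start > frame_end:
        raise ValueError("frame_start must be less than or equal to frame_end")
    # islice uses an exclusive stop, so add one to make frame_end inclusive.
    stop = None if frame_end is None else frame_end + 1
    return islice(frames, frame_start, stop)


if __name__ == "__main__":
    # Frames are stand-in integers here; real frames would be images.
    print(list(slice_frames(range(10), frame_start=2, frame_end=5)))
```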

dgenerate.mediainput.iterate_image_seed(uri: str | ~dgenerate.mediainput.ImageSeedParseResult, frame_start: int = 0, frame_end: int | None = None, resize_resolution: tuple[int, int] | None = None, aspect_correct: bool = True, align: int | None = None, seed_image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | ~collections.abc.Sequence[~dgenerate.imageprocessors.imageprocessor.ImageProcessor] | None = None, mask_image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | ~collections.abc.Sequence[~dgenerate.imageprocessors.imageprocessor.ImageProcessor] | None = None, control_image_processor: ~dgenerate.imageprocessors.imageprocessor.ImageProcessor | ~collections.abc.Sequence[~dgenerate.imageprocessors.imageprocessor.ImageProcessor] | None = None, check_dimensions_match: bool = True, path_opener: ~typing.Callable[[str], tuple[str, ~typing.BinaryIO]] = <function fetch_media_data_stream>) → Iterator[ImageSeed][source]

Parse and load images/videos/tensors in an --image-seeds uri and return an iterator that produces ImageSeed objects while progressively reading those files.

This method is used to iterate over an --image-seeds uri in the case that the image source mentioned is to be used for img2img / inpaint operations, and handles this syntax:

--image-seeds "img2img.png"

--image-seeds "img2img.png;mask.png"

--image-seeds "img2img.png;mask.png;512x512"

--image-seeds "images: img2img-1.png, img2img-2.png"

--image-seeds "images: img2img-1.png, img2img-2.png; mask1.png, mask2.png"

--image-seeds "images: img2img-1.png, img2img-2.png; mask1.png, mask2.png;512"

Additionally, controlnet guidance resources are handled with keyword arguments:

--image-seeds "img2img.png;control=control1.png, control2.png"

--image-seeds "img2img.png;control=control1.png, control2.png;resize=512x512"

--image-seeds "img2img.png;mask=mask.png;control=control1.png, control2.png"

--image-seeds "img2img.png;mask=mask.png;control=control1.png, control2.png;resize=512x512"

--image-seeds "img2img.png;mask=mask.png;control=control1.png;frame-start=2"

--image-seeds "img2img.png;mask=mask.png;control=control1.png;frame-start=2;frame-end=5"

--image-seeds "images: img2img-1.png, img2img-2.png;control=control1.png, control2.png"

--image-seeds "images: img2img-1.png, img2img-2.png;mask=mask1.png, mask2.png;control=control1.png, control2.png"

IP Adapter Images can be specified in these ways:

--image-seeds "adapter: image.png"

--image-seeds "adapter: adapter1-image.png, adapter2-image.png"

--image-seeds "adapter: image1.png + image2.png"

--image-seeds "adapter: adapter1-image1.png + adapter1-image2.png, adapter2-image1.png + adapter2-image2.png"

--image-seeds "img2img.png;adapter=image.png"

--image-seeds "img2img.png;adapter=adapter1-image.png, adapter2-image.png"

--image-seeds "img2img.png;adapter=image1.png + image2.png"

--image-seeds "img2img.png;adapter=adapter1-image1.png + adapter1-image2.png, adapter2-image1.png + adapter2-image2.png"

--image-seeds "images: img2img-1.png, img2img-2.png;adapter=image.png"

Raw (noisy) latents can be specified in these ways:

--image-seeds "latents: latents1.pt, latents2.pt"

--image-seeds "img2img.png;latents=latents.pt"

--image-seeds "images: img2img1.png, img2img2.png;latents=latents1.pt, latents2.pt"

--image-seeds "latents: latents.safetensors;control=control.png"

The control argument is supported for any IP Adapter image specification.

The mask argument is also supported for img2img with additional IP Adapter images.

Deep Floyd img2img and inpainting mode can be specified in these ways:

--image-seeds "img2img.png;floyd=stage1-image.png"

--image-seeds "img2img.png;mask=mask.png;floyd=stage2-image.png"

Note that all keyword arguments mentioned above can be used together, except for control and floyd, adapter and floyd, or latents and floyd, which are mutually exclusive arguments.

For img2img sources, you may also specify a pt, pth, or safetensors file in order to pass in latents in place of images in pixel space. Image processing is not applied to these inputs: resizing, aspect correction, alignment, and image processors are all ignored with warnings. Latents can be generated by using the option --image-format with the value pt, pth, or safetensors.

One or more ImageSeed objects may be yielded depending on whether an animation is being read.

Parameters:

uri – --image-seeds uri or ImageSeedParseResult
frame_start – starting frame, inclusive value
frame_end – optional end frame, inclusive value
resize_resolution – optional global resize resolution. The URI syntax of image seeds allows for overriding this value.
aspect_correct – should the global resize operation be aspect correct by default? The URI syntax for image seeds allows for overriding this value with the aspect keyword argument.
align – Images which are read are aligned to this amount of pixels, None or 1 will disable alignment.
seed_image_processor – optional dgenerate.imageprocessors.ImageProcessor or list of them. A list is used to specify processors for individual images in a multi img2img image specification such as uri = “images: img2img-1.png, img2img-2.png”. If a multi img2img image specification is used and only one processor is given, that processor is applied only to the first image / video in the specification. Images in a multi img2img specification with no corresponding processor value have their processor set to None. Specifying more processors than there are img2img sources will cause ValueError to be raised.
mask_image_processor – optional dgenerate.imageprocessors.ImageProcessor or list of them. A list is used to specify processors for individual mask images in a multi inpaint mask specification such as uri = “images: img2img-1.png, img2img-2.png;mask=mask-1.png, mask-2.png”. If a multi inpaint mask specification is used and only one processor is given, that processor is applied only to the first image / video in the specification. Images in an inpaint mask specification with no corresponding processor value have their processor set to None. Specifying more processors than there are inpaint mask image sources will cause ValueError to be raised.
control_image_processor – optional dgenerate.imageprocessors.ImageProcessor or list of them. A list is used to specify processors for individual images in a multi guidance image specification such as uri = “img2img.png;control=img1.png, img2.png”. If a multi guidance image specification is used and only one processor is given, that processor is applied only to the first image / video in the specification. Images in a guidance specification with no corresponding processor value have their processor set to None. Specifying more processors than there are control guidance image sources will cause ValueError to be raised.
check_dimensions_match – Check the dimensions of input images, mask images, and control images to confirm that they match? For pipelines like stable cascade, this does not matter, input images can be any dimension as they are used as a style reference and not a noise base similar to IP Adapters.
path_opener – a function that opens a file stream from a path, defaults to dgenerate.mediainput.fetch_media_data_stream().

Raises:

ImageSeedError – If multiple images are passed without using the "images: ..." syntax for batching; if the "adapter: ..." syntax is used with the floyd keyword argument for floyd stage images; or if too many image processor chains are specified for the number of images given.

Returns:

an iterator over ImageSeed objects
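
The processor parameters above share one pairing rule: processors are matched to images in order, images without a processor get None, and surplus processors are an error. A minimal sketch of that rule follows. The helper name assign_processors is hypothetical, plain strings stand in for ImageProcessor objects, and ValueError is used for the surplus case (the documentation above mentions both ValueError and ImageSeedError for it).

```python
from typing import Any, List, Optional, Sequence, Union


def assign_processors(image_paths: Sequence[str],
                      processors: Union[Any, Sequence[Any], None]) -> List[Optional[Any]]:
    """Pair each image path with a processor, padding with None."""
    if processors is None:
        processors = []
    elif not isinstance(processors, (list, tuple)):
        # A single processor applies to the first image only.
        processors = [processors]
    if len(processors) > len(image_paths):
        raise ValueError("more processors were given than image sources")
    padding = [None] * (len(image_paths) - len(processors))
    return list(processors) + padding


if __name__ == "__main__":
    print(assign_processors(["img1.png", "img2.png"], "canny"))
```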

dgenerate.mediainput.load_tensor_file(path_or_file: BinaryIO | str, file_source: str) → Tensor[source]

Load a tensor from a .pt, .pth, or .safetensors file.

Parameters:

path_or_file – file path or binary IO object
file_source – source filename for error reporting

Returns:

loaded tensor

Raises:

MediaIdentificationError – if the file cannot be loaded or if the file format is not supported

dgenerate.mediainput.mimetype_is_animated_image(mimetype: str) → bool[source]

Check if a mimetype is one that dgenerate considers an animated image

Parameters:: mimetype – The mimetype string
Returns:: bool

dgenerate.mediainput.mimetype_is_static_image(mimetype: str) → bool[source]

Check if a mimetype is one that dgenerate considers a static image

Parameters:: mimetype – The mimetype string
Returns:: bool

dgenerate.mediainput.mimetype_is_supported(mimetype: str) → bool[source]

Check if dgenerate supports a given input mimetype

Parameters:: mimetype – The mimetype string
Returns:: bool

dgenerate.mediainput.mimetype_is_tensor(mimetype: str) → bool[source]

Check if a mimetype is one that dgenerate considers a tensor file

Parameters:: mimetype – The mimetype string
Returns:: bool

dgenerate.mediainput.mimetype_is_video(mimetype: str) → bool[source]

Check if a mimetype is a video mimetype supported by dgenerate

Parameters:: mimetype – The mimetype string
Returns:: bool

dgenerate.mediainput.parse_image_seed_uri(uri: str, align: int | None = 8) → ImageSeedParseResult[source]

Parse an --image-seeds uri into its constituents

All URI related errors raised by this function derive from ImageSeedError.

Raises:

ValueError – if align < 1
ImageSeedParseError – on syntactical parsing errors
ImageSeedArgumentError – on image seed URI argument errors
ImageSeedFileNotFoundError – when a file mentioned in an image seed does not exist on disk

Parameters:

uri – --image-seeds uri
align – do not allow per image seed resize resolutions that are not aligned to this value, setting this value to 1 or None disables alignment checks.

Returns:

ImageSeedParseResult
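
To give a feel for the shape of an --image-seeds uri, the sketch below splits one into leading paths, positional parts, and keyword arguments. It is a deliberately simplified illustration under stated assumptions: the name split_image_seed_uri is hypothetical, and it does not handle prefixes such as "images:" or "adapter:", quoting, the "+" adapter syntax, URLs containing semicolons or equals signs, or any validation that the real parser performs.

```python
from typing import Dict, List, Tuple


def split_image_seed_uri(uri: str) -> Tuple[List[str], List[str], Dict[str, str]]:
    """Split a simple image seed uri on ';' into paths, positionals, and keywords."""
    parts = [part.strip() for part in uri.split(";")]
    # The first part holds one or more comma separated resource paths.
    paths = [path.strip() for path in parts[0].split(",")]
    positional: List[str] = []
    keywords: Dict[str, str] = {}
    for part in parts[1:]:
        key, separator, value = part.partition("=")
        if separator:
            keywords[key.strip()] = value.strip()
        else:
            positional.append(part)
    return paths, positional, keywords


if __name__ == "__main__":
    print(split_image_seed_uri("img2img.png;mask=mask.png;resize=512x512"))
```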

dgenerate.mediainput.request_mimetype(url, local_files_only: bool = False) → str[source]

Request the mimetype of a file at a URL. If the file exists in the cache, the known mimetype is returned without connecting to the internet. Otherwise, a connection is made to retrieve the mimetype; this does not update the cache.

Parameters:

url – The url
local_files_only – If True, do not make a request, only check the cache.

Raises:

dgenerate.webcache.WebFileCacheOfflineModeException – If the web cache is in offline mode and the file data is not found in the cache.

Returns:

mimetype string

dgenerate.mediainput.separate_images_and_tensors(items: Sequence[Image] | Sequence[Tensor]) → tuple[list[Image], list[Tensor]][source]

Separate a sequence of images or tensors into separate sequences.

Note: The input should be homogeneous (all images or all tensors), but this function can handle mixed inputs for validation purposes.

Parameters:: items – Sequence of PIL Images or torch Tensors (should be homogeneous)
Returns:: Tuple of (images, tensors) where each can be empty if no items of that type exist

dgenerate.mediainput.url_aware_basename(path)[source]

Get the os.path.basename of a file path or URL.

Parameters:: path – the path
Returns:: basename

dgenerate.mediainput.url_aware_normpath(path)[source]

Only os.path.normpath a file path if it is not a URL.

Parameters:: path – the path
Returns:: normalized file path or unmodified URL

dgenerate.mediainput.url_aware_splitext(path)[source]

Get the os.path.splitext result for a file path or URL.

Parameters:: path – the path
Returns:: base, ext
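
The idea behind the URL-aware path helpers can be sketched with urllib.parse, as below. The function names are hypothetical, only http / https are treated as URLs, and dropping the query string in the basename is a choice made by this sketch rather than documented behavior.

```python
import os
from urllib.parse import urlparse


def _is_url(path: str) -> bool:
    """Treat only http / https strings as URLs in this sketch."""
    return urlparse(path).scheme in ("http", "https")


def basename_sketch(path: str) -> str:
    """Basename of a file path, or of the path component of a URL."""
    if _is_url(path):
        return os.path.basename(urlparse(path).path)
    return os.path.basename(path)


def normpath_sketch(path: str) -> str:
    """Normalize file paths, but return URLs unmodified."""
    if _is_url(path):
        return path
    return os.path.normpath(path)


if __name__ == "__main__":
    print(basename_sketch("https://example.com/media/cat.png?size=large"))
    print(normpath_sketch("https://example.com/a/../b.png"))
```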

dgenerate.mediaoutput module

Media output, handles writing videos, animations, and tensor files (latents).

Provides information about supported output formats including tensor formats for latent data.

exception dgenerate.mediaoutput.UnknownAnimationFormatError[source]

Bases: Exception

Raised by create_animation_writer() when an unknown animation format is provided.

class dgenerate.mediaoutput.AnimatedImageWriter(filename: str, duration: float)[source]

Bases: AnimationWriter

Animation writer for animated images such as GIF and WebP.

__init__(filename: str, duration: float)[source]

Parameters:

filename – Filename to write to.
duration – Frame duration (the duration of a single frame) in milliseconds.

end(new_file: str = None)[source]

write(img: Image)[source]

class dgenerate.mediaoutput.AnimationWriter[source]

Bases: object

Interface for animation writers

__init__()[source]

end(new_file: str = None)[source]

write(pil_img_rgb: Image)[source]

class dgenerate.mediaoutput.MultiAnimationWriter(animation_format: str, filename: str, fps: float, allow_overwrites=False)[source]

Bases: AnimationWriter

Splits writes between N files with generated filename suffixes if necessary depending on how many images were written on the first write.

__init__(animation_format: str, filename: str, fps: float, allow_overwrites=False)[source]

Parameters:

animation_format – One of get_supported_animation_writer_formats()
filename – The desired filename, if multiple images are written a suffix _animation_N will be appended for each file
fps – Frames per second
allow_overwrites – Allow overwriting existing files? If False, a suffix _duplicate_N is appended instead. The overwrite prevention is multiprocess safe between instances of this library.

end(new_file=None)[source]

write(img: Image | Iterable[Image])[source]

class dgenerate.mediaoutput.VideoWriter(filename, fps: float)[source]

Bases: AnimationWriter

Animation writer for MP4 h264 format video

__init__(filename, fps: float)[source]

Parameters:

filename – Filename to write to.
fps – Frame rate, in frames per second.

end(new_file=None)[source]

write(img: Image)[source]

dgenerate.mediaoutput.create_animation_writer(animation_format: str, out_filename: str, fps: float)[source]

Create an animation writer of a given format.

Raises:

UnknownAnimationFormatError – if the provided animation_format is unknown.

Parameters:

animation_format – The animation format, see get_supported_animation_writer_formats()
out_filename – the output file name
fps – FPS

Returns:

AnimationWriter

dgenerate.mediaoutput.get_supported_animation_writer_formats()[source]

Supported animation writer formats, file extensions with no period.

Returns:: list of file extensions.

dgenerate.mediaoutput.get_supported_static_image_formats()[source]

What file extensions does PIL/Pillow support for output of at least one frame?

File extensions are returned without a period.

Returns:: list of file extensions

dgenerate.mediaoutput.get_supported_tensor_formats() → list[str][source]

Get supported tensor file formats for latents output.

Returns:: List of supported tensor formats

dgenerate.mediaoutput.save_tensor_file(tensor: Tensor | ndarray, path_or_file: BinaryIO | str, file_format: str = 'pt') → None[source]

Save a tensor to disk in the specified format.

Parameters:

tensor – The tensor to save (torch.Tensor or numpy.ndarray)
path_or_file – Path to save to or file-like object
file_format – Format to save in (“pt”, “pth”, or “safetensors”)

Raises:

ValueError – If format is not supported

dgenerate.memoize module

Function memoization wrapper and associated hashing tools.

exception dgenerate.memoize.ObjectCacheKeyError[source]

Bases: KeyError

Raised when an object cannot be found in the object cache.

Or upon adding an object to the cache that already exists in the cache.

class dgenerate.memoize.CachedObjectMetadata(skip=False, **kwargs)[source]

Bases: object

Represents optional metadata for an object cached in ObjectCache

This object is a namespace, you can access the metadata attributes using the dot operator.

__init__(skip=False, **kwargs)[source]

class dgenerate.memoize.ObjectCache(name)[source]

Bases: object

A cache for objects with unique memory IDs.

__init__(name)[source]

cache(key, value, metadata: CachedObjectMetadata | None = None, extra_identities: list[Callable[[Any], Any]] = None)[source]

Add an object to the cache. It must not already exist in the cache.

This method triggers callbacks.

Parameters:

key – Cache key
value – Cached object
metadata – Object metadata
extra_identities – Functions which return members of the cached object, these members can be used to identify the object in cache.

Raises:

ObjectCacheKeyError – If the object already exists in the cache.

clear(collect=True)[source]

Clear the cache and trigger callbacks.

Parameters:: collect – call gc.collect() ?

get(key: str, default: Any | None = None)[source]

Get an object by its cache key.

Parameters:

key – the key
default – default value if the key does not exist

Returns:

the object, or default

get_hash_key(value) → str[source]

Get the cache key used for an object that exists in the cache.

Parameters:: value – The object
Raises:: ObjectCacheKeyError – if the object does not exist in the cache.
Returns:: hash key

get_metadata(value) → CachedObjectMetadata | None[source]

Get any metadata that exists for an object in the cache, or None if no metadata exists for the object.

Parameters:: value – The object
Returns:: CachedObjectMetadata

register_on_cache(action: Callable[[ObjectCache, Any], None])[source]

Register a callback for when an object enters the cache.

Parameters:: action – callback action, accepts the ObjectCache and the object entering the cache.

register_on_clear(action: Callable[[ObjectCache], None])[source]

Register a callback for when the cache is cleared.

Parameters:: action – callback action, accepts the ObjectCache as the only parameter

register_on_un_cache(action: Callable[[ObjectCache, Any], None])[source]

Register a callback for when an object exits the cache.

Parameters:: action – callback action, accepts the ObjectCache and the object exiting the cache.

un_cache(value)[source]

Remove an object from the cache by its reference.

This method triggers callbacks.

Parameters:: value – Object
Raises:: ObjectCacheKeyError – If the object is not a member of the cache.

values()[source]: Return all cached objects in a list.

dgenerate.memoize.args_cache_key(args_dict: dict[str, Any], custom_hashes: dict[str, Callable[[Any], str]] = None)[source]

Generate a cache key for a function's arguments to use for memoization.

Parameters:

args_dict – The args dictionary of the function
custom_hashes – Custom hash functions for specific argument names if needed

Returns:

string
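
A cache key of this kind could be built roughly as in the sketch below. The key format, the use of repr() as the default hash, and the sorting of argument names are all assumptions of this sketch; the name args_key_sketch is hypothetical and the real function's output format is not documented here.

```python
from typing import Any, Callable, Dict, Optional


def args_key_sketch(args_dict: Dict[str, Any],
                    custom_hashes: Optional[Dict[str, Callable[[Any], str]]] = None) -> str:
    """Build a deterministic string key from argument names and values."""
    custom_hashes = custom_hashes or {}
    parts = []
    # Sort names so that argument order does not change the key.
    for name in sorted(args_dict):
        hasher = custom_hashes.get(name, repr)
        parts.append(f"{name}={hasher(args_dict[name])}")
    return ",".join(parts)


if __name__ == "__main__":
    print(args_key_sketch({"model_path": "sd-1.5", "revision": None}))
```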

dgenerate.memoize.clear_object_caches(collect: bool = True)[source]

Call ObjectCache.clear() on every object cache.

Parameters:: collect – call gc.collect() ?

dgenerate.memoize.create_object_cache(cache_name, cache_type: type[~dgenerate.memoize.ObjectCacheType] = <class 'dgenerate.memoize.ObjectCache'>) → ObjectCacheType[source]

Create a new object cache.

Parameters:

cache_name – Cache name.
cache_type – ObjectCache implementation.

Raises:

RuntimeError – If the cache name is taken.

Returns:

ObjectCache

dgenerate.memoize.disable_memoization_context(disabled=True)[source]

Context manager which allows you to temporarily disable memoization on functions decorated with memoize()

The default action is to disable memoization.

Parameters:: disabled – You can use this parameter to allow user configuration of the memoization state without writing separate code outside of this context block for that. Setting disabled=False leaves memoization enabled in the context.
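
A context manager of this shape can be sketched with contextlib, as below. The module-level flag, the helper memoization_disabled(), and the behavior when contexts are nested are assumptions of this sketch, not a description of the library's internals.

```python
import contextlib

# Hypothetical module-level switch that a memoizing decorator would consult.
_memoization_disabled = False


def memoization_disabled() -> bool:
    """Report whether memoization is currently switched off."""
    return _memoization_disabled


@contextlib.contextmanager
def disable_memoization_sketch(disabled: bool = True):
    """Temporarily set the memoization switch, restoring it on exit."""
    global _memoization_disabled
    previous = _memoization_disabled
    _memoization_disabled = disabled
    try:
        yield
    finally:
        # Restore the previous state even if the block raised.
        _memoization_disabled = previous


if __name__ == "__main__":
    with disable_memoization_sketch():
        print(memoization_disabled())
```

Passing a user setting straight through as the disabled argument avoids writing two code paths, one inside and one outside the context block.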

dgenerate.memoize.get_object_cache(cache_name) → ObjectCache[source]

Return an object cache by name.

Parameters:: cache_name – the cache name.
Raises:: RuntimeError – if the cache name does not exist.
Returns:: ObjectCache

dgenerate.memoize.get_object_cache_names() → list[str][source]

Return a list of active object cache names.

Returns:: list of names

dgenerate.memoize.memoize(cache: dict[str, ~typing.Any] | ~dgenerate.memoize.ObjectCache, exceptions: set[str] = None, skip_check: ~typing.Callable[[~typing.Any], bool] = None, hasher: ~typing.Callable[[dict[str, ~typing.Any]], str] = <function args_cache_key>, extra_identities: list[~typing.Callable[[~typing.Any], ~typing.Any]] = None, on_hit: ~typing.Callable[[str, ~typing.Any], None] = None, on_create: ~typing.Callable[[str, ~typing.Any], None] = None)[source]

Decorator used to memoize a function using a dictionary or ObjectCache as a value cache.

Parameters:

cache – The dictionary or ObjectCache to serve as a cache
exceptions – Function arguments to ignore
skip_check – Check the created object itself to determine if caching should proceed, should return True if you want to skip caching of the object.
hasher – Responsible for hashing arguments and argument values
extra_identities – List of functions which return member objects of the cached object, that can be used as extra identifiers for the object in cache, the returned objects can be used to retrieve cache metadata or the hash key for the parent object in the cache via the methods on ObjectCache.
on_hit – Called on cache hit for the wrapped function
on_create – Called on cache miss for the wrapped function

Returns:

decorator
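
The sketch below illustrates the general technique with a plain dictionary cache: arguments are bound to their names, ignored names are dropped, and the remainder forms the key. The name memoize_sketch and the key format are hypothetical, and the sketch omits skip_check, hasher, extra_identities, and ObjectCache support.

```python
import functools
import inspect
from typing import Any, Callable, Dict, Optional, Set


def memoize_sketch(cache: Dict[str, Any],
                   exceptions: Optional[Set[str]] = None,
                   on_hit: Optional[Callable[[str, Any], None]] = None,
                   on_create: Optional[Callable[[str, Any], None]] = None):
    """Memoize a function in a dictionary, ignoring the named arguments."""
    ignored = exceptions or set()

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            # Bind to parameter names so positional and keyword calls share a key.
            bound = signature.bind(*args, **kwargs)
            bound.apply_defaults()
            key = ",".join(
                f"{name}={value!r}"
                for name, value in sorted(bound.arguments.items())
                if name not in ignored
            )
            if key in cache:
                if on_hit is not None:
                    on_hit(key, cache[key])
                return cache[key]
            value = func(*args, **kwargs)
            cache[key] = value
            if on_create is not None:
                on_create(key, value)
            return value

        return wrapper

    return decorator
```

Arguments listed in exceptions do not affect the key, so calls that differ only in those arguments share one cached value.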

dgenerate.memoize.property_hasher(obj: Any, custom_hashes: dict[str, Callable[[Any], str]] = None, exclude: set[str] | None = None)[source]

Create a hash string from an object's public decorated properties.

Parameters:

obj – the object
custom_hashes – Custom hash functions for specific property names if needed
exclude – Exclude property by name

Returns:

string

dgenerate.memoize.simple_cache_hit_debug(title: str, cache_key: str, cache_hit: Any)[source]

Basic cache hit debug message for memoize() decorator on_hit parameter.

Messages are printed using dgenerate.messages.debug_log()

Example:: on_hit=lambda key, hit: simple_cache_hit_debug("My Object", key, hit)
Debug Prints:: Cache Hit, Loaded My Object: (fully qualified name of hit object), Cache Key: (key)

Parameters:

title – Object Title
cache_key – cache key
cache_hit – cached object

dgenerate.memoize.simple_cache_miss_debug(title: str, cache_key: str, new: Any)[source]

Basic cache miss debug message for memoize() decorator on_create parameter.

Messages are printed using dgenerate.messages.debug_log()

Example:: on_create=lambda key, new: simple_cache_miss_debug("My Object", key, new)
Debug Prints:: Cache Miss, Created My Object: (fully qualified name of new object), Cache Key: (key)

Parameters:

title – Object Title
cache_key – cache key
new – newly created object

dgenerate.memoize.struct_hasher(obj: Any, custom_hashes: dict[str, Callable[[Any], str]] = None, exclude: set[str] | None = None, properties_only: bool = False) → str[source]

Create a hash string from a simple object's public attributes.

Parameters:

obj – the object
custom_hashes – Custom hash functions for specific attribute names if needed
exclude – Exclude attributes by name
properties_only – Only include public properties, not methods or other attributes

Returns:

string

dgenerate.memory module

System memory information and memory constraint expressions.

exception dgenerate.memory.MemoryConstraintSyntaxError[source]

Bases: Exception

Thrown by memory_constraints() on syntax errors or if an expression returns a non-boolean value

class dgenerate.memory.SizedConstrainedObjectCache(name)[source]

Bases: ObjectCache

An object cache that can track cache memory use via the metadata returned with each cached object.

Your memoized function should return at least: object, dgenerate.memoize.CachedObjectMetadata(size=the_size)

You must return a metadata object with the attribute size at the minimum.

You may attach other metadata to the object as needed.

__init__(name)[source]

enforce_cpu_mem_constraints(constraints: ~typing.Iterable[str], size_var: str, new_object_size: int, mode: ~typing.Callable[[~typing.Iterable], bool] = <built-in function any>)[source]

Clear the cache if these CPU side memory constraints are met.

See: memory_constraints()

The constraint variable cache_size equates to the current cache size.

Parameters:

constraints – Iterable of memory constraint expression strings, see memory_constraints().
size_var – Memory constraint expression variable name containing the new_object_size value.
new_object_size – Size of the new object.
mode – Logical and/or function on constraint expressions, any for or, all for and.

Returns:

True if the cache was cleared, False otherwise

enforce_gpu_mem_constraints(constraints: ~typing.Iterable[str], size_var: str, new_object_size: int, device: str | ~torch.device, mode: ~typing.Callable[[~typing.Iterable], bool] = <built-in function any>)[source]

Clear the cache if these GPU side memory constraints are met.

See: gpu_memory_constraints()

The constraint variable cache_size equates to the current cache size.

Parameters:

constraints – Iterable of memory constraint expression strings, see gpu_memory_constraints().
size_var – Memory constraint expression variable name containing the new_object_size value.
new_object_size – Size of the new object.
device – Device to check
mode – Logical and/or function on constraint expressions, any for or, all for and.

Returns:

True if the cache was cleared, False otherwise
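
The mode parameter combines the results of the individual constraint expressions: any behaves as a logical or, all as a logical and. The sketch below shows that combination step only. It assumes, purely for illustration, that constraints are Python-like boolean expressions, and it evaluates them with eval(), which is not safe for untrusted input. The helper constraints_met and the variable name new_size are hypothetical, and the real expression syntax is defined by memory_constraints().

```python
from typing import Callable, Dict, Iterable


def constraints_met(constraints: Iterable[str],
                    variables: Dict[str, float],
                    mode: Callable[[Iterable], bool] = any) -> bool:
    """Evaluate each expression against variables and combine with mode."""
    results = []
    for expression in constraints:
        # Builtins are removed; only the supplied variables are visible.
        result = eval(expression, {"__builtins__": {}}, dict(variables))
        if not isinstance(result, bool):
            raise ValueError(f"constraint is not boolean: {expression}")
        results.append(result)
    return mode(results)


if __name__ == "__main__":
    values = {"cache_size": 90, "new_size": 20}
    print(constraints_met(["cache_size + new_size > 100"], values))
```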

property size: Return the current cache size.

dgenerate.memory.bytes_best_human_unit(byte_count: int, delimiter='') → str[source]

Return a human-readable string from a byte count using an appropriate unit, e.g. 1KB, 1MB, 1GB etc.

Parameters:

delimiter – add this string between the value and the unit
byte_count – the byte count

Returns:

formatted string
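
One way to pick a readable unit is sketched below. The decimal base of 1000, the two decimal places of rounding, and the list of units are assumptions of this sketch; the name human_bytes_sketch is hypothetical and the library's actual formatting may differ.

```python
def human_bytes_sketch(byte_count: int, delimiter: str = "") -> str:
    """Format a byte count with the largest unit that keeps the value >= 1."""
    units = ["B", "KB", "MB", "GB", "TB"]
    value = float(byte_count)
    for unit in units:
        # Stop once the value fits this unit, or when no larger unit remains.
        if value < 1000 or unit == units[-1]:
            break
        value /= 1000
    # Trim trailing zeros so 1.00 prints as 1 and 1.50 prints as 1.5.
    text = f"{value:.2f}".rstrip("0").rstrip(".")
    return f"{text}{delimiter}{unit}"


if __name__ == "__main__":
    print(human_bytes_sketch(1500, delimiter=" "))
```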

dgenerate.memory.calculate_chunk_size(file_size)[source]

Calculate the chunk size for downloading / copying a file based on the file size and available memory.

Parameters:: file_size – The size of the file to be downloaded / copied.
Returns:: The calculated chunk size.

dgenerate.memory.get_available_memory(unit: str = 'b')[source]

Get the available memory remaining on the system in a selectable unit.

Parameters:: unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)
Returns:: Requested value.
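
The unit names accepted by the memory functions distinguish decimal units (kb, mb, gb) from binary units (kib, mib, gib). The conversion can be sketched as below, using the standard definitions of those units. The helper convert_bytes is hypothetical and it is an assumption that the library applies exactly these factors.

```python
# Standard definitions: decimal units are powers of 1000, binary units powers of 1024.
_UNIT_FACTORS = {
    "b": 1,
    "kb": 1000, "mb": 1000 ** 2, "gb": 1000 ** 3,
    "kib": 1024, "mib": 1024 ** 2, "gib": 1024 ** 3,
}


def convert_bytes(byte_count: int, unit: str = "b") -> float:
    """Convert a byte count to the requested unit (case insensitive)."""
    try:
        factor = _UNIT_FACTORS[unit.lower()]
    except KeyError:
        raise ValueError(f"unknown unit: {unit}") from None
    return byte_count / factor


if __name__ == "__main__":
    print(convert_bytes(8 * 1024 ** 3, "gib"))
```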

dgenerate.memory.get_gpu_allocated_memory(device: str | device, unit: str = 'b')[source]

Return the total memory allocated on a GPU device.

Non GPU devices always return 0.

Parameters:

device – The device.
unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)

Returns:

Requested value.

dgenerate.memory.get_gpu_free_memory(device: str | device, unit: str = 'b')[source]

Return the amount of free memory available on a GPU device.

Non GPU devices always return 0.

Parameters:

device – The device.
unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)

Returns:

Requested value.

dgenerate.memory.get_gpu_reserved_memory(device: str | device, unit: str = 'b')[source]

Return the amount of reserved memory on a GPU device.

Non GPU devices always return 0.

Parameters:

device – The device.
unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)

Returns:

Requested value.

dgenerate.memory.get_gpu_total_memory(device: str | device, unit: str = 'b')[source]

Return the total memory of a GPU device.

Non GPU devices always return 0.

Parameters:

device – The device.
unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)

Returns:

Requested value.

dgenerate.memory.get_total_memory(unit: str = 'b')[source]

Get the total physical memory on the system.

Parameters:: unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)
Returns:: Requested value.

dgenerate.memory.get_used_memory(unit: str = 'b', pid: int | None = None)[source]

Get the memory used by a process in a selectable unit.

Parameters:

unit – one of (case insensitive): b (bytes), kb (kilobytes), mb (megabytes), gb (gigabytes), kib (kibibytes), mib (mebibytes), gib (gibibytes)
pid – The process PID to retrieve this information from, defaults to the current process.

Returns:

Requested value.

dgenerate.memory.get_used_memory_percent(pid: int | None = None) → float[source]

Get the percentage of memory used by a process as a percentage of already used memory plus available virtual memory.

Parameters:: pid – PID of the process, defaults to the current process.
Returns:: A percentage on a 0 to 100 scale, for example: 25.4

dgenerate.memory.get_used_total_memory_percent(pid: int | None = None) → float[source]

Get the percentage of memory used by a process as a percentage of total system memory.

Parameters:: pid – PID of the process, defaults to the current process.
Returns:: A percentage on a 0 to 100 scale, for example: 25.4

dgenerate.memory.gpu_memory_constraints(expressions: ~collections.abc.Iterable[str], extra_vars: dict[str, int | float] | None = None, mode=<built-in function any>, device: str | ~torch.device = 'cuda:0') → bool[source]

Evaluate a user boolean expression involving a GPU device’s memory in bytes, used memory percent, and available VRAM memory in bytes.

If you pass a non GPU device identifier to this method, it will always return False

Available functions are:

kb(bytes to kilobytes)

mb(bytes to megabytes)

gb(bytes to gigabytes)

kib(bytes to kibibytes)

mib(bytes to mebibytes)

gib(bytes to gibibytes)

Available values are:

used / u (memory currently used by the GPU device in bytes)

used_total_percent / utp (memory used by the GPU device, as percent of total VRAM memory, example: 25.4)

available / a (available memory remaining on the GPU device in bytes that can be used)

total / t (total memory on the GPU device in bytes)

Example expressions:

used > gb(1) (when the device has used more than 1GB of memory)

used_total_percent > 25 (when the device has used more than 25 percent of VRAM memory)

available < gb(2) (when the available memory on the device is less than 2GB)

Expressions may not be longer than 128 characters. However, multiple expressions may be provided.

Raises:

ValueError – if extra_vars overwrites a reserved variable name, or if device is not a str or torch.device object.
MemoryConstraintSyntaxError – on syntax errors or if the return value of an expression is not a boolean value.

Parameters:

expressions – a list of expressions, if expressions is None or empty this function will return False.
extra_vars – extra integer or float variables
mode – the standard library function ‘any’ (equating to OR all expressions) or the standard library function ‘all’ (equating to AND all expressions). The default is ‘any’ which ORs all expressions.
device – GPU device string or torch.device object, defaults to ‘cuda:0’. Can be CUDA (e.g., ‘cuda:0’) or XPU (e.g., ‘xpu:0’) devices.

Returns:

Boolean result of the expression

dgenerate.memory.is_supported_gpu_device(device: str | device) → bool[source]

Check if a device is a supported GPU device (CUDA or XPU) that can be used with the GPU memory functions in this module.

MPS statistics are unsupported due to using a unified memory model.

Parameters:: device – The device to check (string like ‘cuda:0’, ‘xpu:1’ or torch.device object)
Returns:: True if the device is a supported GPU device, False otherwise

dgenerate.memory.memory_constraint_syntax_check(expression: str)[source]

Syntax check an expression given to memory_constraints()

Parameters:: expression – the expression string
Raises:: MemoryConstraintSyntaxError – on syntax errors.

dgenerate.memory.memory_constraints(expressions: ~collections.abc.Iterable[str], extra_vars: dict[str, int | float] | None = None, mode=<built-in function any>, pid: int | None = None) → bool[source]

Evaluate a user boolean expression involving the process's used memory in bytes, used memory percent, and available system memory in bytes.

Available functions are:

kb(bytes to kilobytes)
mb(bytes to megabytes)
gb(bytes to gigabytes)
kib(bytes to kibibytes)
mib(bytes to mebibytes)
gib(bytes to gibibytes)

Available values are:

used / u (memory currently used by the process in bytes)
used_total_percent / utp (memory used by the process, as percent of total system memory, example: 25.4)
used_percent / up (memory used by the process, as a percent of used + available memory, example 75.4)
available / a (available memory remaining on the system in bytes that can be used without going to the swap)
total / t (total memory on the system in bytes)

Example expressions:

used > gb(1) (when the process has used more than 1GB of memory)
used_total_percent > 25 (when the process has used more than 25 percent of system memory)
used_percent > 25 (when the process has used more than 25 percent of virtual memory available to it)
available < gb(2) (when the available memory on the system is less than 2GB)

Expressions may not be longer than 128 characters. However, multiple expressions may be provided.

Raises:

ValueError – if extra_vars overwrites a reserved variable name
MemoryConstraintSyntaxError – on syntax errors or if the return value of an expression is not a boolean value.

Parameters:

expressions – a list of expressions, if expressions is None or empty this function will return False.
extra_vars – extra integer or float variables
mode – the standard library function ‘any’ (equating to OR all expressions) or the standard library function ‘all’ (equating to AND all expressions). The default is ‘any’ which ORs all expressions.
pid – PID of the process from which to acquire the ‘used’ and ‘used_percent’ variable values, defaults to the current process.

Returns:

Boolean result of the expression
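
How several expressions combine under mode can be sketched with a toy evaluator. Everything here is an assumption made for illustration: check_constraints is a hypothetical name, the memory values are passed in by hand rather than measured, gb(1) is assumed to yield the number of bytes in one gigabyte (as the example expressions imply), TypeError stands in for MemoryConstraintSyntaxError, and Python's eval with builtins removed is used only for brevity and is not a hardened sandbox.

```python
GB = 1000 ** 3
RESERVED = {'used', 'u', 'available', 'a', 'total', 't', 'gb'}


def check_constraints(expressions, values, extra_vars=None, mode=any):
    """Toy evaluator: True when the expressions hold under any (OR) or all (AND)."""
    expressions = list(expressions or [])
    if not expressions:
        return False  # None or empty input evaluates to False
    extra_vars = dict(extra_vars or {})
    clashes = RESERVED.intersection(extra_vars)
    if clashes:
        raise ValueError(f'extra_vars overwrites reserved names: {sorted(clashes)}')
    # Names visible to expressions: the gb() helper, memory values, extra variables.
    namespace = {'__builtins__': {}, 'gb': lambda n: n * GB}
    namespace.update(values)
    namespace.update(extra_vars)
    results = []
    for expression in expressions:
        if len(expression) > 128:
            raise ValueError('expressions may not be longer than 128 characters')
        result = eval(expression, namespace)
        if not isinstance(result, bool):
            raise TypeError(f'expression is not boolean: {expression!r}')
        results.append(result)
    return mode(results)


# Hand-written sample values standing in for real measurements.
sample = {'used': 2 * GB, 'available': 8 * GB, 'total': 16 * GB}
print(check_constraints(['used > gb(1)', 'available < gb(2)'], sample))
print(check_constraints(['used > gb(1)', 'available < gb(2)'], sample, mode=all))
```

With the default mode of any, one true expression is enough; passing all requires every expression to hold.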

dgenerate.memory.memory_use_debug_string(pid=None)[source]

Return a debug string describing the memory consumption of a process and also available system memory.

Example:

“Used Memory: 465.25MB, Available Memory: 50.94GB, Used Percent: 0.91%, Total Memory: 68.64GB, Used Total Percent: 0.68%”

Where:

Used Memory = get_used_memory()
Available Memory = get_available_memory()
Used Percent = get_used_memory_percent()
Total Memory = get_total_memory()
Used Total Percent = get_used_total_memory_percent()

Parameters:: pid – PID of the process to describe, defaults to the current process.
Returns:: formatted string

dgenerate.memory.torch_gc()[source]: Call torch.cuda.empty_cache() and torch.cuda.ipc_collect() for CUDA, and torch.xpu.empty_cache() for XPU devices.

dgenerate.messages module

Library logging / informational output.

dgenerate.messages.add_logging_handler(callback: Callable[[ParamSpecArgs, int, bool, str], None])[source]

Add your own logging handler callback.

Parameters:: callback – Callback accepting (*args, LEVEL, underline (bool), underline_char)

dgenerate.messages.debug_log(*func_or_str: Callable[[], Any] | Any, underline=False, underline_char='=')[source]

Conditionally log strings or possibly expensive functions if LEVEL is set to DEBUG.

Parameters:

func_or_str – objects to be stringified and printed or callables that return said objects
underline – Underline this message?
underline_char – Underline character.

dgenerate.messages.error(*args: Any, underline=False, underline_char='=')[source]

Write an error message to dgenerate’s log

Parameters:

args – args, objects that will be stringified and joined with a space
underline – Underline this message?
underline_char – Underline character

dgenerate.messages.errors_to_null()[source]: Force dgenerate’s error output to a null file.

dgenerate.messages.get_error_file()[source]: Get the file stream or file like object for dgenerate’s error output.

dgenerate.messages.get_message_file()[source]: Get the file stream or file like object for dgenerate’s normal (non error) output.

dgenerate.messages.log(*args: Any, level=1, underline=False, underline_char='=')[source]

Write a message to dgenerate’s log

Parameters:

args – args, objects that will be stringified and joined with a space
level – Log level, one of: INFO, WARNING, ERROR, DEBUG
underline – Underline this message?
underline_char – Underline character

dgenerate.messages.messages_to_null()[source]: Force dgenerate’s normal output to a null file.

dgenerate.messages.pop_level()[source]

Pop dgenerate.messages.LEVEL value last saved by push_level() and assign it to LEVEL.

If no previous level was saved, no-op.

dgenerate.messages.push_level(level)[source]

Set dgenerate.messages.LEVEL and save the previous value to a stack.

Parameters:: level – one of INFO, WARNING, ERROR, DEBUG

dgenerate.messages.remove_logging_handler(callback: Callable[[ParamSpecArgs, int, bool, str], None])[source]

Remove a logging handler callback by reference.

Parameters:: callback – The previously registered callback

dgenerate.messages.set_error_file(file: TextIO)[source]

Set a file stream or file like object for dgenerate’s error output.

Parameters:: file – The file stream

dgenerate.messages.set_message_file(file: TextIO)[source]

Set a file stream or file like object for dgenerate’s normal (non error) output.

Parameters:: file – The file stream

dgenerate.messages.silence()[source]

Context manager to silence all dgenerate logging / messages.

This will redirect all messages to a null file temporarily.

dgenerate.messages.warning(*args: Any, underline=False, underline_char='=')[source]

Write a warning message to dgenerate’s log

Parameters:

args – args, objects that will be stringified and joined with a space
underline – Underline this message?
underline_char – Underline character

dgenerate.messages.with_level(level)[source]

Context manager which pushes a dgenerate.messages.LEVEL to the stack and pops it when the with context ends.

This affects logging output level within the context.

Parameters:: level – log level

dgenerate.messages.AUTO_FLUSH_MESSAGES = True

Whether to auto flush the output stream when printing to stdout or the output file assigned with set_message_file().

Errors are printed to stderr which is unbuffered by default.

dgenerate.messages.DEBUG = 8: Log Level DEBUG

dgenerate.messages.ERROR = 4: Log Level ERROR

dgenerate.messages.INFO = 1: Log level INFO

dgenerate.messages.LEVEL = 1

Current Log Level (set-able)

Setting to INFO means print all messages except DEBUG messages.

Setting to ERROR means only print ERROR messages.

Setting to WARNING means only print WARNING messages.

Setting to DEBUG means print every message.

Levels are a bitfield, so you can set: LEVEL = WARNING | ERROR etc.
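
One way to read the rules above is as the following filter. It is a guess at the semantics built only from the description and the documented constant values (INFO = 1, WARNING = 2, ERROR = 4, DEBUG = 8); should_print is a hypothetical name and the library's actual check may differ.

```python
INFO, WARNING, ERROR, DEBUG = 1, 2, 4, 8  # values documented for this module


def should_print(level_setting: int, message_level: int) -> bool:
    """Hypothetical filter reproducing the behavior described for LEVEL."""
    if level_setting & DEBUG:
        return True  # DEBUG prints every message
    if level_setting & INFO:
        return message_level != DEBUG  # INFO prints everything except DEBUG
    # Otherwise only messages whose level bit is set in the setting are printed.
    return bool(level_setting & message_level)


print(should_print(WARNING | ERROR, ERROR), should_print(WARNING | ERROR, INFO))
```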

dgenerate.messages.WARNING = 2: Log Level WARNING

dgenerate.pipelinewrapper module

huggingface diffusers pipeline wrapper / driver interface.

All functionality needed from the diffusers library is behind this interface.

exception dgenerate.pipelinewrapper.AdetailerDetectorUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --adetailer-detectors uri

exception dgenerate.pipelinewrapper.ControlNetUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --control-nets uri

exception dgenerate.pipelinewrapper.DiffusionArgumentsHelpException[source]

Bases: Exception

Thrown when a DiffusionArguments attribute that supports passing a help request value (such as DiffusionArguments.scheduler_uri) is passed its help value.

This exception returns the help string to the caller.

exception dgenerate.pipelinewrapper.IPAdapterUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --ip-adapters uri

exception dgenerate.pipelinewrapper.ImageEncoderUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --image-encoder uri

exception dgenerate.pipelinewrapper.InvalidAdetailerDetectorUriError[source]

Bases: InvalidModelUriError

Error in --adetailer-detectors uri

exception dgenerate.pipelinewrapper.InvalidBNBQuantizerUriError[source]

Bases: InvalidModelUriError

Error in --quantizer uri

exception dgenerate.pipelinewrapper.InvalidControlNetUriError[source]

Bases: InvalidModelUriError

Error in --control-nets uri

exception dgenerate.pipelinewrapper.InvalidIPAdapterUriError[source]

Bases: InvalidModelUriError

Error in --ip-adapters uri

exception dgenerate.pipelinewrapper.InvalidImageEncoderUriError[source]

Bases: InvalidModelUriError

Error in --image-encoder uri

exception dgenerate.pipelinewrapper.InvalidLoRAUriError[source]

Bases: InvalidModelUriError

Error in --loras uri

exception dgenerate.pipelinewrapper.InvalidModelFileError[source]

Bases: Exception

Raised when a file is loaded from disk that is an invalid diffusers model format.

This indicates that there was a problem loading the primary diffusion model. This could also refer to an SDXL refiner model or Stable Cascade decoder model, which are considered primary models.

exception dgenerate.pipelinewrapper.InvalidModelUriError[source]

Bases: Exception

Thrown on model path syntax or logical usage error

exception dgenerate.pipelinewrapper.InvalidSCascadeDecoderUriError[source]

Bases: InvalidModelUriError

Error in --s-cascade-decoder uri

exception dgenerate.pipelinewrapper.InvalidSDNQQuantizerUriError[source]

Bases: InvalidModelUriError

Error in --quantizer uri for SDNQ backend

exception dgenerate.pipelinewrapper.InvalidSDXLRefinerUriError[source]

Bases: InvalidModelUriError

Error in --sdxl-refiner uri

exception dgenerate.pipelinewrapper.InvalidSchedulerNameError[source]

Bases: SchedulerLoadError

Unknown scheduler name used.

exception dgenerate.pipelinewrapper.InvalidT2IAdapterUriError[source]

Bases: InvalidModelUriError

Error in --t2i-adapters uri

exception dgenerate.pipelinewrapper.InvalidTextEncoderUriError[source]

Bases: InvalidModelUriError

Error in --text-encoder* uri

exception dgenerate.pipelinewrapper.InvalidTextualInversionUriError[source]

Bases: InvalidModelUriError

Error in --textual-inversions uri

exception dgenerate.pipelinewrapper.InvalidTransformerUriError[source]

Bases: InvalidModelUriError

Error in --transformer uri

exception dgenerate.pipelinewrapper.InvalidUNetUriError[source]

Bases: InvalidModelUriError

Error in --unet uri

exception dgenerate.pipelinewrapper.InvalidVaeUriError[source]

Bases: InvalidModelUriError

Error in --vae uri

exception dgenerate.pipelinewrapper.LoRAUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --loras uri

exception dgenerate.pipelinewrapper.ModelUriLoadError[source]

Bases: Exception

Thrown when a model fails to load from a URI for a reason other than not being found, such as being unsupported.

This exception refers to loadable sub models such as VAEs, LoRAs, ControlNets, Textual Inversions etc.

exception dgenerate.pipelinewrapper.SchedulerArgumentError[source]

Bases: SchedulerLoadError

Scheduler URI argument error.

exception dgenerate.pipelinewrapper.SchedulerLoadError[source]

Bases: Exception

Base class for scheduler loading exceptions.

exception dgenerate.pipelinewrapper.T2IAdapterUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --t2i-adapters uri

exception dgenerate.pipelinewrapper.TextEncoderUriLoadError[source]

Bases: InvalidModelUriError

Error loading --text-encoder* uri

exception dgenerate.pipelinewrapper.TextualInversionUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --textual-inversions uri

exception dgenerate.pipelinewrapper.TransformerUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --transformer uri

exception dgenerate.pipelinewrapper.UNetUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --unet / --second-model-unet uri

exception dgenerate.pipelinewrapper.UnknownQuantizerName[source]

Bases: Exception

Raised upon referencing an unknown quantization backend name.

exception dgenerate.pipelinewrapper.UnsupportedPipelineConfigError[source]

Bases: Exception

Occurs when a diffusers pipeline is requested to be configured in a way that is unsupported by that pipeline.

exception dgenerate.pipelinewrapper.VAEUriLoadError[source]

Bases: ModelUriLoadError

Error while loading model file in --vae uri
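
Because most of the exceptions above derive from either InvalidModelUriError or ModelUriLoadError, a caller can handle whole families with two except clauses. The sketch below uses locally defined stand-in classes that copy only the names and base classes listed above, so that it runs without dgenerate installed; classify is a hypothetical helper.

```python
# Local stand-ins mirroring the documented names and base classes.
class InvalidModelUriError(Exception):
    pass


class ModelUriLoadError(Exception):
    pass


class InvalidVaeUriError(InvalidModelUriError):
    pass


class VAEUriLoadError(ModelUriLoadError):
    pass


class LoRAUriLoadError(ModelUriLoadError):
    pass


def classify(error: Exception) -> str:
    """Hypothetical helper: map an exception to a coarse failure category."""
    try:
        raise error
    except InvalidModelUriError:
        return 'bad uri'       # syntax or logical usage error in a URI
    except ModelUriLoadError:
        return 'load failure'  # URI parsed, but the sub model failed to load
    except Exception:
        return 'other'


print(classify(VAEUriLoadError('unsupported format')))
```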

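class dgenerate.pipelinewrapper.AdetailerDetectorUri(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, confidence: float = 0.3, detector_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None, mask_shape: str | None = None, mask_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None, mask_blur: int | None = None, mask_dilation: int | None = None, model_masks: bool | None = None, index_filter: Collection[int] | None = None, class_filter: Collection[int | str] | None = None, prompt: str | None = None, negative_prompt: str | None = None, device: str | None = None, size: int | None = None)[source]

Bases: object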

Representation of a --adetailer-detectors uri

static help()[source]

static parse(uri: str) → AdetailerDetectorUri[source]

Parse a --adetailer-detectors uri and return an object representing its constituents

Parameters:: uri – string with --adetailer-detectors uri syntax
Raises:: InvalidAdetailerDetectorUriError –
Returns:: AdetailerDetectorUri

__init__(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, confidence: float = 0.3, detector_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None, mask_shape: str | None = None, mask_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None, mask_blur: int | None = None, mask_dilation: int | None = None, model_masks: bool | None = None, index_filter: Collection[int] | None = None, class_filter: Collection[int | str] | None = None, prompt: str | None = None, negative_prompt: str | None = None, device: str | None = None, size: int | None = None)[source]

get_model_path(local_files_only: bool = False, use_auth_token: str | None = None)[source]

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': 'in'}}

NAMES = ['Adetailer Detector']

OPTION_ARGS = {'mask-shape': ['r', 'rect', 'rectangle', 'c', 'circle', 'ellipse']}

property class_filter: Collection[int | str] | None: Process only these YOLO detection classes.

property confidence: float: Confidence value for YOLO detector model.

property detector_padding: int | tuple[int, int] | tuple[int, int, int, int] | None

Optional detector padding

Option 1: Uniform padding
Option 2: (Left/Right, Top/Bottom)
Option 3: (Left, Top, Right, Bottom)

property device: str | None: Model device override

property index_filter: Collection[int] | None: Process these YOLO detection indices.

property mask_blur: int | None: Optional mask blur override.

property mask_dilation: int | None: Optional mask dilation override.

property mask_padding: int | tuple[int, int] | tuple[int, int, int, int] | None

Optional mask padding

Option 1: Uniform padding
Option 2: (Left/Right, Top/Bottom)
Option 3: (Left, Top, Right, Bottom)

property mask_shape: str | None: Optional mask shape override.

property model: str: Model path, huggingface slug, file path

property model_masks: bool | None: Prefer masks generated by the model if available?

property negative_prompt: str | None: Negative prompt override.

property prompt: str | None: Positive prompt override.

property revision: str | None: Model repo revision

property size: int | None: Target size for processing detected areas.

property subfolder: str | None: Model repo subfolder

property weight_name: str | None: Model weight-name

class dgenerate.pipelinewrapper.BNBQuantizerUri(bits: int = 8, bits4_compute_dtype: str | None = None, bits4_quant_type: str = 'fp4', bits4_use_double_quant: bool = False, bits4_quant_storage: str | None = None)[source]

Bases: object

Representation of --quantizer URI.

static help()[source]

static parse(uri: str) → BNBQuantizerUri[source]

__init__(bits: int = 8, bits4_compute_dtype: str | None = None, bits4_quant_type: str = 'fp4', bits4_use_double_quant: bool = False, bits4_quant_storage: str | None = None)[source]

to_config(compute_dtype: str | dtype | None = None) → BitsAndBytesConfig[source]

NAMES = ['bnb', 'bitsandbytes']

OPTION_ARGS = {'bits': [8, 4], 'bits4-compute-dtype': ['float16', 'bfloat16', 'float32', 'float64', 'int8', 'uint8'], 'bits4-quant-storage': ['float16', 'bfloat16', 'float32', 'float64', 'int8', 'uint8'], 'bits4-quant-type': ['fp4', 'nf4']}

class dgenerate.pipelinewrapper.ControlNetUri

Bases: object

Representation of --control-nets URI.

static help()[source]

static parse(uri: str, model_type=ModelType.SD) → ControlNetUri[source]

Parse a --control-nets uri specification and return an object representing its constituents

Parameters:

uri – string with --control-nets uri syntax
model_type – model type that the ControlNet will be attached to.

Raises:

InvalidControlNetUriError –

Returns:

ControlNetUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)
scale – controlnet scale
start – controlnet guidance start value
end – controlnet guidance end value
mode – Flux / SDXL Union controlnet mode.
quantizer – --quantizer URI override
model_type – Model type this ControlNet will be attached to.

Raises:

InvalidControlNetUriError – If dtype is passed an invalid data type string, or if model points to a single file and quantizer is specified (not supported).

Load a diffusers.ControlNetModel from this URI.

Parameters:

dtype_fallback – Fallback datatype if dtype was not specified in the URI.
use_auth_token – Optional huggingface API auth token, used for downloading restricted repos that your account has access to.
local_files_only – Avoid connecting to huggingface to download models and only use cached models?
no_cache – If True, force the returned object not to be cached by the memoize decorator.
device_map – device placement strategy for quantized models, defaults to None
model_class – What class of controlnet model should be loaded? if None is specified, load based off ControlNetUri.model_type and provided URI arguments.

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

diffusers.ControlNetModel, diffusers.SD3ControlNetModel, or diffusers.FluxControlNetModel

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

HIDE_ARGS = {'model-type'}

NAMES = ['Control Net']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property end: float: ControlNet guidance end point, fraction of inference / timesteps.

property mode: int | None: Union ControlNet mode.

property model: str: Model path, huggingface slug

property model_type: ModelType: Model type the ControlNet model is expected to attach to.

property quantizer: str | None: --quantizer URI override

property revision: str | None: Model repo revision

property scale: float: ControlNet guidance scale

property start: float: ControlNet guidance start point, fraction of inference / timesteps.

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo variant

class dgenerate.pipelinewrapper.DataType(value)[source]

Bases: Enum

Represents model precision

AUTO = 0: Auto selection.

BFLOAT16 = 3: 16 bit brain floating point.

FLOAT16 = 1: 16 bit floating point.

FLOAT32 = 2: 32 bit floating point.

class dgenerate.pipelinewrapper.DiffusionArguments[source]

Bases: SetFromMixin

Represents all possible arguments for a DiffusionPipelineWrapper call.

static prompt_embedded_arg_checker(name: str, value: Any)[source]

Checks if a class member / value is forbidden to use with a prompt embedded argument specification.

Parameters:

name – the argument name
value – the argument value

describe_pipeline_wrapper_args() → str[source]

Describe the pipeline wrapper arguments in a pretty, human-readable way, with word wrapping depending on console size or a maximum length depending on what stdout currently is.

Returns:: description string.

determine_pipeline_type()[source]

Determine the dgenerate.pipelinewrapper.PipelineType needed to utilize these arguments.

Returns:: dgenerate.pipelinewrapper.PipelineType

get_pipeline_wrapper_kwargs()[source]

Get the arguments dictionary needed to call DiffusionPipelineWrapper

Returns:: dictionary of argument names with values

adetailer_class_filter: Collection[int | str] | None = None: A collection of class IDs and/or class names that indicates what YOLO detection classes to keep. This filter is applied before index-filter. Detections that don’t match any of the specified classes will be ignored. Integers are treated as IDs, strings are treated as names.

adetailer_detector_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None

This value specifies the amount of padding that will be added to the detection rectangle which is used to generate a masked area. The default is 0. Positive padding makes the mask area around the detected feature larger, and negative padding makes it smaller.

Example:

32 (32px Uniform, all sides)

(10, 20) (10px Horizontal, 20px Vertical)

(10, 20, 30, 40) (10px Left, 20px Top, 30px Right, 40px Bottom)

Defaults to 0.
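
The three accepted padding forms can be reduced to a single (left, top, right, bottom) tuple before being applied to a detection box. A minimal sketch, assuming boxes are (left, top, right, bottom) pixel tuples; normalize_padding and pad_box are hypothetical helpers, and clamping to the image bounds is deliberately left out.

```python
from typing import Tuple, Union

Padding = Union[int, Tuple[int, int], Tuple[int, int, int, int]]
Box = Tuple[int, int, int, int]


def normalize_padding(padding: Padding) -> Box:
    """Hypothetical helper: expand any accepted form to (left, top, right, bottom)."""
    if isinstance(padding, int):
        return (padding, padding, padding, padding)  # uniform, all sides
    if len(padding) == 2:
        horizontal, vertical = padding
        return (horizontal, vertical, horizontal, vertical)
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError('padding must be an int, a 2-tuple, or a 4-tuple')


def pad_box(box: Box, padding: Padding) -> Box:
    """Grow (or, with negative padding, shrink) a detection rectangle."""
    left, top, right, bottom = box
    pad_l, pad_t, pad_r, pad_b = normalize_padding(padding)
    return (left - pad_l, top - pad_t, right + pad_r, bottom + pad_b)


print(pad_box((100, 100, 200, 200), (10, 20)))
```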

adetailer_index_filter: Collection[int] | None = None: A list of index values that indicates which YOLO detection indices to keep; index values start at zero. Detections are sorted by their top left bounding box coordinate from left to right, top to bottom, then by confidence (descending). The order of detections in the image is identical to the reading order of words on a page (English). Inpainting will only be performed on the specified detection indices; if no indices are specified, inpainting will be performed on all detections. This filter is applied after class-filter.
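
The order of operations (class filter, then sort, then index filter) can be sketched as below. The Detection tuple, the filter_detections name, and the exact sort key are assumptions: the key (top, left, descending confidence) is one plausible reading of "reading order" and may not match the library's tie-breaking.

```python
from typing import NamedTuple


class Detection(NamedTuple):
    """Hypothetical detection record; only the fields needed for filtering."""
    left: int
    top: int
    confidence: float
    class_id: int
    class_name: str


def filter_detections(detections, class_filter=None, index_filter=None):
    # 1. Class filter first: integers match class IDs, strings match class names.
    if class_filter:
        detections = [d for d in detections
                      if d.class_id in class_filter or d.class_name in class_filter]
    # 2. Sort in assumed reading order: top to bottom, then left to right,
    #    with higher confidence first when positions tie.
    ordered = sorted(detections, key=lambda d: (d.top, d.left, -d.confidence))
    # 3. Index filter last; with no indices given, every detection is kept.
    if index_filter:
        ordered = [d for i, d in enumerate(ordered) if i in index_filter]
    return ordered


found = [
    Detection(200, 10, 0.9, 0, 'face'),
    Detection(10, 10, 0.8, 0, 'face'),
    Detection(50, 300, 0.7, 1, 'hand'),
]
print(filter_detections(found, class_filter=['face'], index_filter=[1]))
```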

adetailer_mask_blur: int | None = None: Indicates the level of gaussian blur to apply to the inpaint mask generated by adetailer, which can help with smooth blending of the inpainted feature. Defaults to 4.

adetailer_mask_dilation: int | None = None: Indicates the amount of dilation applied to the generated adetailer inpaint mask, see: cv2.dilate. Defaults to 4.

adetailer_mask_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None

This value indicates how much padding to place around the masked area when cropping out the image to be inpainted. For the best result, this value must be large enough to accommodate any feathering on the edge of the mask caused by DiffusionArguments.mask_blur or DiffusionArguments.mask_dilation.

Example:

32 (32px Uniform, all sides)

(10, 20) (10px Horizontal, 20px Vertical)

(10, 20, 30, 40) (10px Left, 20px Top, 30px Right, 40px Bottom)

Defaults to 32.

adetailer_mask_shape: str | None = None: This indicates what mask shape adetailer should attempt to draw around a detected feature, the default value is “rectangle”. You may also specify “circle” to generate an ellipsoid shaped mask, which might be helpful for achieving better blending.

adetailer_model_masks: bool | None = None: Indicates that masks generated by the model itself should be preferred over masks generated from the detection bounding box. If this is True, and the model itself returns mask data, DiffusionArguments.adetailer_mask_shape, DiffusionArguments.adetailer_mask_padding, and DiffusionArguments.adetailer_detector_padding will all be ignored.

adetailer_size: int | None = None: Target size for processing detected areas. When specified, detected areas will always be scaled to this target size (with aspect ratio preserved) for processing, then scaled back to the original size for compositing. This can significantly improve detail quality for small detected features like faces or hands, or reduce processing time for overly large detected areas. The scaling is based on the larger dimension (width or height) of the detected area. The optimal resampling method is automatically selected for both upscaling and downscaling. Must be an integer greater than 1. Defaults to none (process at native resolution).
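
The scaling rule (larger dimension mapped to the target, aspect ratio preserved) amounts to the arithmetic below. scaled_size is a hypothetical helper, and rounding to the nearest whole pixel is an assumption; the library may round or align differently.

```python
from typing import Tuple


def scaled_size(width: int, height: int, target: int) -> Tuple[int, int]:
    """Hypothetical helper: scale so the larger side equals target."""
    if target <= 1:
        raise ValueError('target size must be an integer greater than 1')
    scale = target / max(width, height)
    # Round to whole pixels and never collapse a side to zero.
    return max(1, round(width * scale)), max(1, round(height * scale))


print(scaled_size(64, 32, 512))     # small detection, scaled up
print(scaled_size(2048, 1024, 512))  # large detection, scaled down
```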

aspect_correct: bool = False: When resizing input images according to DiffusionArguments.width and DiffusionArguments.height, should the resize be aspect correct?

batch_size: int | None = None: Number of images to produce in a single generation step on the same GPU.

clip_skip: int | None = None: Number of layers to be skipped from CLIP while computing the prompt embeddings. A value of 1 means that the output of the pre-final layer will be used for computing the prompt embeddings. Only supported for model_type values sd and sdxl, including with controlnet_uris defined.

control_images: Sequence[Image] | None = None

ControlNet guidance images to use if controlnet_uris were given to the constructor of DiffusionPipelineWrapper.

Note: Control images must be PIL Images, tensors are not supported since ControlNet/T2I-Adapter operate in pixel space.

All input images involved in a generation must match in dimension.

All incoming ControlNet images will be aligned by 8 automatically, if they need to be aligned by a value higher than this, a warning will be issued to stdout via dgenerate.messages.
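
Aligning a dimension by 8 means adjusting it to a multiple of 8. A minimal sketch, assuming the adjustment rounds down; the direction of rounding is not stated here, and align_down and align_size are hypothetical names.

```python
from typing import Tuple


def align_down(value: int, alignment: int = 8) -> int:
    """Largest multiple of alignment that is not greater than value."""
    return value - (value % alignment)


def align_size(width: int, height: int, alignment: int = 8) -> Tuple[int, int]:
    """Hypothetical helper: align both image dimensions."""
    return align_down(width, alignment), align_down(height, alignment)


print(align_size(513, 770))
```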

decoded_latents_image_processor_uris: Sequence[str] | None = None

One or more image processor URI strings for processing images decoded from incoming latents.

These processors are applied to images that are decoded from latent tensors provided through the DiffusionArguments.images argument when doing img2img with tensor inputs. The processors are applied in sequence after VAE decoding but before the images are used in the pipeline.

The processing flow is: decoded images → pre-resize processing → resize to user dimensions → post-resize processing. This allows both preprocessing the raw decoded images and postprocessing after they are resized to the target dimensions.

If no processors are specified, images are simply resized to user dimensions without additional processing.

deep_cache: bool = False

Enable DeepCache acceleration for the main model? DeepCache caches the intermediate attention layer outputs to speed up the diffusion process. This is beneficial for higher inference steps.

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

deep_cache_branch_id: int | None = None

Controls which branch ID DeepCache should operate on in the UNet for the main model.

This value must be greater than or equal to 0.

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

Supplying any value implies that DiffusionArguments.deep_cache is enabled.

Defaults to 1.

deep_cache_interval: int | None = None

Controls the frequency of caching intermediate outputs in DeepCache for the main model.

This value must be greater than zero.

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

Supplying any value implies that DiffusionArguments.deep_cache is enabled.

Defaults to 5.
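
The interaction between deep_cache, deep_cache_interval and deep_cache_branch_id described above can be sketched as a small resolver. resolve_deep_cache and the returned dictionary keys are hypothetical; only the defaults (interval 5, branch ID 1), the value checks, and the "any value implies enabled" rule come from the descriptions.

```python
from typing import Optional


def resolve_deep_cache(deep_cache: bool = False,
                       interval: Optional[int] = None,
                       branch_id: Optional[int] = None) -> Optional[dict]:
    """Hypothetical resolver: None when disabled, else the effective settings."""
    # Supplying either value implies that DeepCache is enabled.
    enabled = deep_cache or interval is not None or branch_id is not None
    if not enabled:
        return None
    interval = 5 if interval is None else interval
    branch_id = 1 if branch_id is None else branch_id
    if interval <= 0:
        raise ValueError('deep_cache_interval must be greater than zero')
    if branch_id < 0:
        raise ValueError('deep_cache_branch_id must be greater than or equal to 0')
    return {'interval': interval, 'branch_id': branch_id}


print(resolve_deep_cache(interval=3))
```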

denoising_end: float | None = None

Denoising should end at this fraction of total timesteps (0.0 to 1.0).

This is useful for generating noisy latents that can be saved and passed to other models.

Scheduler Compatibility:

SD 1.5 models: Only stateless schedulers are supported (EulerDiscreteScheduler, LMSDiscreteScheduler, EDMEulerScheduler, DPMSolverMultistepScheduler, DDIMScheduler, DDPMScheduler, PNDMScheduler)
SDXL models: All schedulers supported via native denoising_start/denoising_end
SD3/Flux models: FlowMatchEulerDiscreteScheduler and standard schedulers supported

denoising_start: float | None = None

Denoising should start at this fraction of total timesteps (0.0 to 1.0).

This is useful for continuing denoising on noisy latents generated with DiffusionArguments.denoising_end.

Scheduler Compatibility:

SD 1.5 models: Only stateless schedulers are supported (EulerDiscreteScheduler, LMSDiscreteScheduler, EDMEulerScheduler, DPMSolverMultistepScheduler, DDIMScheduler, DDPMScheduler, PNDMScheduler)
SDXL models: All schedulers supported via native denoising_start/denoising_end
SD3/Flux models: FlowMatchEulerDiscreteScheduler and standard schedulers supported
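
As a rough illustration of how a shared fraction partitions a schedule between a stage that stops at denoising_end and a stage that resumes at denoising_start, consider the sketch below. The helper name split_steps and the choice of rounding are assumptions made for illustration; the actual step accounting depends on the pipeline and scheduler in use.

```python
def split_steps(inference_steps: int, fraction: float) -> tuple:
    """Hypothetical helper: split a step count at ``fraction``.

    The first value is the number of steps run by the stage using
    denoising_end=fraction, the second is the number of steps left for
    the stage using denoising_start=fraction.
    """
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    first = round(inference_steps * fraction)  # rounding is an assumption
    return first, inference_steps - first


print(split_steps(40, 0.8))  # (32, 8)
```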

floyd_image: Image | Tensor | None = None

The output image or tensor of the last stage when performing img2img or inpainting generation with Deep Floyd. When performing txt2img generation DiffusionArguments.image is used.

When a tensor is provided, it represents latent space data from a previous Floyd stage.

Incoming floyd images will be automatically aligned by 8.

freeu_params: tuple[float, float, float, float] | None = None

FreeU is a technique for improving image quality by re-balancing the contributions from the UNet’s skip connections and backbone feature maps.

This can be used with no cost to performance, to potentially improve image quality.

This argument can be used to specify the FreeU parameters: s1, s2, b1, and b2 in that order.

This argument only applies to models that utilize a UNet: SD1.5/2, SDXL, and Kolors

See: https://huggingface.co/docs/diffusers/main/en/using-diffusers/freeu

And: https://github.com/ChenyangSi/FreeU

guidance_rescale: float | None = None

This value is only supported for certain dgenerate.pipelinewrapper.DiffusionPipelineWrapper configurations, an error will be produced when it is unsupported.

Guidance rescale factor proposed by [Common Diffusion Noise Schedules and Sample Steps are Flawed](https://arxiv.org/pdf/2305.08891.pdf). The rescale factor is defined as φ in equation 16 of that paper, and should fix overexposure when using zero terminal SNR.

guidance_scale: float | None = None: A higher guidance scale value encourages the model to generate images closely linked to the text DiffusionArguments.prompt at the expense of lower image quality. Guidance scale is enabled when DiffusionArguments.guidance_scale > 1

height: int | None = None

Output image height.

Will be automatically aligned by 8.

If alignments of more than 8 need to be forced, a warning will be issued to stdout via dgenerate.messages.
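
A minimal sketch of what aligning a dimension by 8 means, assuming that alignment rounds down to the nearest multiple. The helper name align_down is hypothetical, and the rounding direction is an assumption that this documentation does not state.

```python
def align_down(value: int, alignment: int = 8) -> int:
    """Hypothetical helper: round ``value`` down to a multiple of ``alignment``."""
    if alignment <= 0:
        raise ValueError("alignment must be positive")
    return value - (value % alignment)


print(align_down(515))      # 512
print(align_down(512))      # 512
print(align_down(515, 64))  # 512
```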

hi_diffusion: bool = False

Activate HiDiffusion for the primary model?

This can increase the resolution at which the model can output images while retaining quality with no overhead, and possibly improved performance.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

hi_diffusion_no_raunet: bool | None = None

Disable RAU-Net when using HiDiffusion for the primary model?

This disables the Resolution-Aware U-Net component of HiDiffusion.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

hi_diffusion_no_win_attn: bool | None = None

Disable window attention when using HiDiffusion for the primary model?

This disables the MSW-MSA (Multi-Scale Window Multi-Head Self-Attention) component of HiDiffusion.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

image_guidance_scale: float | None = None

This value is only relevant for pix2pix dgenerate.pipelinewrapper.ModelType.

Image guidance scale pushes the generated image towards the initial image DiffusionArguments.image. It is enabled by setting DiffusionArguments.image_guidance_scale > 1. A higher image guidance scale encourages the model to generate images that are closely linked to the source image DiffusionArguments.image, usually at the expense of lower image quality.

image_seed_strength: float | None = None: Image seed strength, which relates to how much an img2img source (image attribute) is used during generation. Between 0.001 (close to zero but not 0) and 1.0, the closer to 1.0 the less the image is used for generation, IE. the more creative freedom the AI has.

images: Sequence[Image] | Sequence[Tensor] | None = None

Images or tensors for img2img operations, or the base for inpainting operations.

All inputs must be either PIL Images or torch Tensors - mixing both types in the same sequence is not supported.

When tensors are provided, they represent latent space data and bypass VAE encoding. Tensor inputs cannot be resized or processed with image processors.

All input images involved in a generation except for ip_adapter_images must match in dimension, except in the case of Stable Cascade, which can accept multiple images of any dimension for the purpose of image based prompting similar to IP Adapters. All other pipelines interpret multiple image inputs as a batching request.

All incoming images will be aligned by 8 automatically; if they need to be aligned by a value higher than this, a warning will be issued.

img2img_latents_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing img2img latents before pipeline execution.

These processors are applied to latent tensors provided through the DiffusionArguments.images argument when doing img2img with tensor inputs. The processors are applied in sequence and may occur before VAE decoding (for models that decode img2img latents) or before direct pipeline usage.

inference_steps: int | None = None: The number of denoising steps. More denoising steps usually lead to a higher quality image at the expense of slower inference.

inpaint_crop: bool = False

Enable cropping to mask bounds for inpainting. When enabled, input images will be automatically cropped to the bounds of their masks (plus any padding) before processing, then the generated result will be pasted back onto the original uncropped image. This allows inpainting at higher effective resolutions for better quality results.

Batching Behavior:

Cannot be used with multiple input images/masks in the same call
Each image/mask pair must be processed individually as different masks may have different crop bounds
However, batch_size > 1 is supported for generating multiple variations of a single crop
Multiple images require separate pipeline calls, not batch processing

Auto-enabling:

This is automatically enabled when DiffusionArguments.inpaint_crop_padding, DiffusionArguments.inpaint_crop_feather, or DiffusionArguments.inpaint_crop_masked are specified.
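
The idea of cropping to mask bounds plus padding can be sketched in plain Python as below. The function name padded_mask_bounds and the list-of-rows mask representation are assumptions chosen to keep the example self-contained; dgenerate itself operates on PIL images, and its exact crop logic is not shown here.

```python
def padded_mask_bounds(mask, padding=32):
    """Hypothetical helper: bounding box of non-zero mask pixels plus padding.

    ``mask`` is a list of rows of 0/1 values. Returns (left, top, right,
    bottom) with right/bottom exclusive, clamped to the mask size, or
    None when the mask is empty.
    """
    height, width = len(mask), len(mask[0])
    rows = [y for y, row in enumerate(mask) if any(row)]
    cols = [x for x in range(width) if any(row[x] for row in mask)]
    if not rows:
        return None
    left = max(cols[0] - padding, 0)
    top = max(rows[0] - padding, 0)
    right = min(cols[-1] + 1 + padding, width)
    bottom = min(rows[-1] + 1 + padding, height)
    return left, top, right, bottom


mask = [[0] * 6 for _ in range(6)]
mask[3][2] = mask[3][3] = 1  # two masked pixels on row 3
print(padded_mask_bounds(mask, padding=1))  # (1, 2, 5, 5)
```

Because each mask yields its own crop box, a sketch like this also shows why image/mask pairs cannot share one batched call.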

inpaint_crop_feather: int | None = None

Feather value to use when pasting the generated result back onto the original image when DiffusionArguments.inpaint_crop is enabled. Feathering creates smooth transitions from opaque to transparent. Cannot be used together with DiffusionArguments.inpaint_crop_masked.

Specifying this value automatically enables DiffusionArguments.inpaint_crop if it is not already enabled.

Note: Inpaint crop cannot be used with multiple input images. See DiffusionArguments.inpaint_crop for batching details.

inpaint_crop_masked: bool = False

When inpaint_crop is enabled, use the mask when pasting the generated result back onto the original image. This means only the masked areas will be replaced. Cannot be used together with DiffusionArguments.inpaint_crop_feather.

Specifying this value automatically enables DiffusionArguments.inpaint_crop if it is not already enabled.

Note: Inpaint crop cannot be used with multiple input images. See DiffusionArguments.inpaint_crop for batching details.

inpaint_crop_padding: int | tuple[int, int] | tuple[int, int, int, int] | None = None

Padding values to use around mask bounds when inpaint_crop is enabled.

Supported formats:

int: Same padding on all sides
tuple[int, int]: (horizontal, vertical) padding
tuple[int, int, int, int]: (left, top, right, bottom) padding

Specifying this value automatically enables DiffusionArguments.inpaint_crop if it is not already enabled. Default value when DiffusionArguments.inpaint_crop is enabled but no padding is specified: 32 pixels.

Note: Inpaint crop cannot be used with multiple input images. See DiffusionArguments.inpaint_crop for batching details.
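
The three supported padding formats can be normalized to a single (left, top, right, bottom) tuple, as in the sketch below. The helper name normalize_padding is hypothetical; the 32 pixel fallback follows the default stated above.

```python
def normalize_padding(padding=None):
    """Hypothetical helper: expand a padding value to (left, top, right, bottom)."""
    if padding is None:
        return (32, 32, 32, 32)  # documented default when no padding is given
    if isinstance(padding, int):
        return (padding,) * 4    # same padding on all sides
    if len(padding) == 2:
        horizontal, vertical = padding
        return (horizontal, vertical, horizontal, vertical)
    if len(padding) == 4:
        return tuple(padding)
    raise ValueError("padding must be an int, a 2-tuple, or a 4-tuple")


print(normalize_padding(16))        # (16, 16, 16, 16)
print(normalize_padding((10, 20)))  # (10, 20, 10, 20)
```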

ip_adapter_images: Sequence[Sequence[Image]] | None = None

IP Adapter images to use if ip_adapter_uris were given to the constructor of DiffusionPipelineWrapper.

Note: IP Adapter images must be PIL Images, tensors are not supported since IP-Adapter operates in pixel space.

This should be a list of Sequence[PIL.Image]

Each list entry corresponds to an IP adapter URI.

Multiple IP Adapter URIs can be provided, each IP Adapter can get its own set of images.

All incoming IP Adapter images will be aligned by 8 automatically, if they need to be aligned by a value higher than this, a warning will be issued to stdout via dgenerate.messages.

latents: Sequence[Tensor] | None = None: Noisy latents to serve as a starting point for generation, this should be a list of tensors in the format [C, H, W] or [B, C, H, W], A list of tensors with a batch dimension will be concatenated intelligently.

latents_post_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing output latents when outputting to latents.

These processors are applied to latents when DiffusionArguments.output_latents is True. The processors are applied in sequence after the diffusion pipeline generates the latents but before they are returned in the result.

latents_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing raw input latents before pipeline execution.

These processors are applied to latents provided through the DiffusionArguments.latents argument (raw latents used as noise initialization). The processors are applied in sequence before the latents are passed to the diffusion pipeline.

mask_images: Sequence[Image] | None = None

Mask images for inpainting operations.

The number of img2img images must be equal to the number of mask_images supplied.

Note: Mask images are always PIL Images, tensor masks are not supported.

All input images involved in a generation except for ip_adapter_images must match in dimension.

All incoming mask images will be aligned by 8 automatically, if they need to be aligned by a value higher than this, a warning will be issued to stdout via dgenerate.messages.

max_sequence_length: int | None = None

Max number of prompt tokens that the T5EncoderModel (text encoder 3) of Stable Diffusion 3 or Flux can handle.

This defaults to 256 for SD3 when not specified, and 512 for Flux.

The maximum value is 512 and the minimum value is 1.

High values result in more resource usage and processing time.
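
The defaulting and range rules above can be sketched as follows. The helper name resolve_max_sequence_length is hypothetical, and the model type strings are assumptions modelled on the --model-type names used elsewhere in this documentation.

```python
from typing import Optional


def resolve_max_sequence_length(model_type: str, value: Optional[int] = None) -> int:
    """Hypothetical helper: apply the documented default and bounds."""
    if value is None:
        # Documented defaults: 512 for Flux, 256 for SD3.
        return 512 if model_type.startswith("flux") else 256
    if not 1 <= value <= 512:
        raise ValueError("max_sequence_length must be between 1 and 512")
    return value


print(resolve_max_sequence_length("sd3"))       # 256
print(resolve_max_sequence_length("flux"))      # 512
print(resolve_max_sequence_length("sd3", 384))  # 384
```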

output_latents: bool = False

Whether to output raw latent tensors instead of decoded PIL Images.

When True, the pipeline will return raw latent tensors instead of decoded images. This is useful for saving latent representations or for chaining multiple pipeline operations.

Defaults to False (outputs PIL Images).

pag_adaptive_scale: float | None = None: Adaptive perturbed attention guidance scale.

pag_scale: float | None = None: Perturbed attention guidance scale.

prompt: Prompt | None = None: Primary prompt

prompt_weighter_uri: str | None = None: Default prompt weighter plugin to use for all models.

ras: bool = False

Activate RAS (Region-Adaptive Sampling) for the primary model?

This can increase inference speed with SD3.

See: https://github.com/microsoft/ras

This is supported for: --model-type sd3.

ras_end_step: int | None = None

Ending step for RAS (Region-Adaptive Sampling).

This controls when RAS stops applying its sampling strategy. Must be greater than or equal to 1.

Defaults to the number of inference steps if not specified.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_error_reset_steps: Sequence[int] | None = None

Dense sampling steps to reset accumulated error in RAS.

The dense sampling steps inserted between the RAS steps to reset the accumulated error. A list of step numbers, e.g. [12, 22].

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_high_ratio: float | None = None

Ratio of high value tokens to be cached in RAS.

The ratio of high-value tokens, ranked by the selected metric, that are chosen to be cached. The default value is 1.0, but it can be set between 0.0 and 1.0 to balance the sample ratio between the main subject and the background.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_index_fusion: bool | None = None

Enable index fusion in RAS (Region-Adaptive Sampling) for the primary model?

This can improve attention computation in RAS for SD3.

See: https://github.com/microsoft/ras

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3, (but not for SD3.5 models)

ras_metric: str | None = None

Metric to use for RAS (Region-Adaptive Sampling).

This controls how RAS measures the importance of tokens for caching. Valid values are “std” (standard deviation) or “l2norm” (L2 norm). Defaults to “std”.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_sample_ratio: float | None = None

Average sample ratio for each RAS step.

For instance, setting this to 0.5 on a sequence of 4096 tokens will result in the noise of 2048 tokens, on average, being updated during each RAS step. Must be between 0.0 and 1.0.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_skip_num_step: int | None = None

Skip steps for RAS (Region-Adaptive Sampling). Controls the number of steps to skip between RAS steps. The actual number of tokens skipped will be rounded down to the nearest multiple of 64 to ensure efficient memory access patterns for attention computation.

When used with DiffusionArguments.ras_skip_num_step_length greater than 0, this value determines how the number of skipped tokens changes over time.

Positive values will increase the number of skipped tokens over time, while negative values will decrease it.

Each value will be tried in turn.

Supplying any values implies DiffusionArguments.ras.

This is supported for: --model-type sd3.
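
The rounding rule mentioned above, where the skipped token count is rounded down to a multiple of 64, amounts to the arithmetic below. The helper name is hypothetical, and how RAS derives the raw token count before rounding is not shown.

```python
def round_down_to_multiple(token_count: int, multiple: int = 64) -> int:
    """Hypothetical helper: round a token count down to a multiple of 64."""
    return (token_count // multiple) * multiple


print(round_down_to_multiple(1000))  # 960
print(round_down_to_multiple(2048))  # 2048
```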

ras_skip_num_step_length: int | None = None

Skip step lengths for RAS (Region-Adaptive Sampling). Controls the length of steps to skip between RAS steps. When set to 0, static dropping is used where the number of skipped tokens remains constant throughout the generation process.

When greater than 0, dynamic dropping is enabled where the number of skipped tokens varies over time based on DiffusionArguments.ras_skip_num_step.

The pattern of skipping will repeat every DiffusionArguments.ras_skip_num_step_length steps.

Each value will be tried in turn.

Supplying any values implies DiffusionArguments.ras.

This is supported for: --model-type sd3.

ras_start_step: int | None = None

Starting step for RAS (Region-Adaptive Sampling).

This controls when RAS begins applying its sampling strategy. Must be greater than or equal to 1.

Defaults to 4 if not specified.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

ras_starvation_scale: float | None = None

Starvation scale for RAS patch selection.

RAS tracks how often a token is dropped and incorporates this count as a scaling factor in the metric for selecting tokens. This scale factor prevents excessive blurring or noise in the final generated image. Larger scaling factor will result in more uniform sampling. Usually set between 0.0 and 1.0.

Supplying any value implies that DiffusionArguments.ras is enabled.

This is supported for: --model-type sd3.

sada: bool = False

Enable SADA (Stability-guided Adaptive Diffusion Acceleration) with default parameters for the primary model.

This is equivalent to setting all SADA parameters to their default values.

See: https://github.com/Ting-Justin-Jiang/sada-icml

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_acc_range: tuple[int, int] | None = None

SADA acceleration range start / end step for the primary model.

Defines the starting and ending steps for SADA acceleration.

Starting step must be at least 3 as SADA leverages third-order dynamics.

Defaults to [10,47].

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_lagrange_int: int | None = None

SADA Lagrangian interpolation interval for the primary model.

Interval for Lagrangian interpolation. Must be compatible with sada_lagrange_step (lagrange_step % lagrange_int == 0).

Model-specific defaults:

SD/SD2: 4
SDXL/Kolors: 4
Flux: 4

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_lagrange_step: int | None = None

SADA Lagrangian interpolation step for the primary model.

Step value for Lagrangian interpolation. Must be compatible with sada_lagrange_int (lagrange_step % lagrange_int == 0).

Model-specific defaults:

SD/SD2: 24
SDXL/Kolors: 24
Flux: 20

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.
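
The compatibility rule between sada_lagrange_step and sada_lagrange_int can be checked as in this sketch. The function name and the defaults table layout are hypothetical; the numeric defaults are the ones listed in this documentation.

```python
# (lagrange_int, lagrange_step) defaults as listed in this documentation.
SADA_LAGRANGE_DEFAULTS = {
    "sd": (4, 24),
    "sdxl": (4, 24),
    "kolors": (4, 24),
    "flux": (4, 20),
}


def lagrange_compatible(lagrange_step: int, lagrange_int: int) -> bool:
    """Hypothetical helper: True when lagrange_step % lagrange_int == 0."""
    if lagrange_int <= 0:
        raise ValueError("lagrange_int must be positive")
    return lagrange_step % lagrange_int == 0


for model, (interval, step) in SADA_LAGRANGE_DEFAULTS.items():
    print(model, lagrange_compatible(step, interval))  # True for every default

print(lagrange_compatible(22, 4))  # False
```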

sada_lagrange_term: int | None = None

SADA Lagrangian interpolation terms for the primary model.

Number of terms to use in Lagrangian interpolation. Set to 0 to disable Lagrangian interpolation.

Model-specific defaults:

SD/SD2: 4
SDXL/Kolors: 4
Flux: 3

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_max_downsample: int | None = None

SADA maximum downsample factor for the primary model.

Controls the maximum downsample factor in the SADA algorithm. Lower values can improve quality but may reduce speedup.

Model-specific defaults:

SD/SD2: 1
SDXL/Kolors: 2
Flux: 0

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

SADA is not compatible with HiDiffusion, DeepCache, or TeaCache.

sada_max_fix: int | None = None

SADA maximum fixed memory for the primary model.

Maximum amount of fixed memory to use in SADA optimization.

Model-specific defaults:

SD/SD2: 5120 (5 * 1024)
SDXL/Kolors: 10240 (10 * 1024)
Flux: 0

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_max_interval: int | None = None

SADA maximum interval for optimization for the primary model.

Maximum interval between optimizations in the SADA algorithm.

Defaults to 4.

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_sx: int | None = None

SADA spatial downsample factor X for the primary model.

Controls the spatial downsample factor in the X dimension. Higher values can increase speedup but may affect quality.

Model-specific defaults:

SD/SD2: 3
SDXL/Kolors: 3
Flux: 0 (not used)

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

sada_sy: int | None = None

SADA spatial downsample factor Y for the primary model.

Controls the spatial downsample factor in the Y dimension. Higher values can increase speedup but may affect quality.

Model-specific defaults:

SD/SD2: 3
SDXL/Kolors: 3
Flux: 0 (not used)

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

scheduler_uri: str | None = None: Primary model scheduler URI

sdxl_aesthetic_score: float | None = None: Optional, defaults to 6.0. This argument is used for img2img and inpainting operations only Used to simulate an aesthetic score of the generated image by influencing the positive text condition. Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952).

sdxl_crops_coords_top_left: tuple[int, int] | None = None: Optional SDXL conditioning parameter. DiffusionArguments.sdxl_crops_coords_top_left can be used to generate an image that appears to be “cropped” from the position DiffusionArguments.sdxl_crops_coords_top_left downwards. Favorable, well-centered images are usually achieved by setting DiffusionArguments.sdxl_crops_coords_top_left to (0, 0). Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952).

sdxl_high_noise_fraction: float | None = None

SDXL high noise fraction. This proportion of timesteps/inference steps is handled by the primary model, while the remaining proportion is handled by the refiner model when an SDXL model_type value is used.

When the refiner is operating in edit mode the number of total inference steps for the refiner will be calculated in a different manner, currently the refiner operates in edit mode during generations involving ControlNets as well as inpainting.

In edit mode, the refiner uses img2img with an image seed strength to add details to the image instead of cooperative denoising, this image seed strength is calculated as (1.0 - DiffusionArguments.sdxl_high_noise_fraction), and the number of inference steps for the refiner is then calculated as (image_seed_strength * inference_steps).
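
A worked example of the edit mode arithmetic described above. The function name is hypothetical, and converting the product to a whole number of steps with round() is an assumption, since the rounding used by dgenerate is not stated here.

```python
def refiner_edit_mode_steps(high_noise_fraction: float, inference_steps: int):
    """Hypothetical helper: refiner image seed strength and step count in edit mode."""
    image_seed_strength = 1.0 - high_noise_fraction
    # round() is an assumption; the documentation only gives the product.
    refiner_steps = round(image_seed_strength * inference_steps)
    return image_seed_strength, refiner_steps


print(refiner_edit_mode_steps(0.75, 40))  # (0.25, 10)
```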

sdxl_negative_aesthetic_score: float | None = None: Negative influence version of DiffusionArguments.sdxl_aesthetic_score

sdxl_negative_crops_coords_top_left: tuple[int, int] | None = None: Negative influence version of DiffusionArguments.sdxl_crops_coords_top_left

sdxl_negative_original_size: tuple[int, int] | None = None

This value is only supported for certain dgenerate.pipelinewrapper.DiffusionPipelineWrapper configurations, an error will be produced when it is unsupported. It is not known to be supported by pix2pix.

Optional SDXL conditioning parameter. To negatively condition the generation process based on a specific image resolution. Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952). For more information, refer to this issue thread: https://github.com/huggingface/diffusers/issues/4208.

sdxl_negative_target_size: tuple[int, int] | None = None

This value is only supported for certain dgenerate.pipelinewrapper.DiffusionPipelineWrapper configurations, an error will be produced when it is unsupported. It is not known to be supported by pix2pix.

Optional SDXL conditioning parameter. To negatively condition the generation process based on a target image resolution. It should be the same as DiffusionArguments.sdxl_target_size for most cases. Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952). For more information, refer to this issue thread: https://github.com/huggingface/diffusers/issues/4208.

sdxl_original_size: tuple[int, int] | None = None: Optional SDXL conditioning parameter. If DiffusionArguments.sdxl_original_size is not the same as DiffusionArguments.sdxl_target_size the image will appear to be down- or up-sampled. DiffusionArguments.sdxl_original_size defaults to (width, height) if not specified or the size of any input images provided. Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952).

sdxl_refiner_aesthetic_score: float | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_aesthetic_score

sdxl_refiner_clip_skip: int | None = None: Clip skip override value for the SDXL refiner, which normally defaults to that of DiffusionArguments.clip_skip when it is defined.

sdxl_refiner_crops_coords_top_left: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_crops_coords_top_left

sdxl_refiner_deep_cache: bool | None = None

Enable DeepCache acceleration for the SDXL Refiner?

This is supported for Stable Diffusion XL and Kolors based models.

sdxl_refiner_deep_cache_branch_id: int | None = None

Controls which branch ID DeepCache should operate on in the UNet for the SDXL Refiner.

This value must be greater than or equal to 0.

This is supported for Stable Diffusion XL and Kolors based models.

Supplying any value implies that DiffusionArguments.sdxl_refiner_deep_cache is enabled.

Defaults to 1.

sdxl_refiner_deep_cache_interval: int | None = None

Controls the frequency of caching intermediate outputs in DeepCache for the SDXL Refiner.

This value must be greater than zero.

This is supported for Stable Diffusion XL and Kolors based models.

Supplying any value implies that DiffusionArguments.sdxl_refiner_deep_cache is enabled.

Defaults to 5.

sdxl_refiner_edit: bool | None = None: Force the SDXL refiner to operate in edit mode instead of cooperative denoising mode.

sdxl_refiner_freeu_params: tuple[float, float, float, float] | None = None

FreeU parameters for the SDXL refiner

See: DiffusionArguments.freeu_params for clarification.

sdxl_refiner_guidance_rescale: float | None = None: Override the guidance rescale value used by the SDXL refiner, which is normally set to the value of DiffusionArguments.guidance_rescale.

sdxl_refiner_negative_aesthetic_score: float | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_negative_aesthetic_score

sdxl_refiner_negative_crops_coords_top_left: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_negative_crops_coords_top_left

sdxl_refiner_negative_original_size: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_negative_original_size

sdxl_refiner_negative_target_size: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_negative_target_size

sdxl_refiner_original_size: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_original_size

sdxl_refiner_pag_adaptive_scale: float | None = None: Adaptive perturbed attention guidance scale for the SDXL refiner.

sdxl_refiner_pag_scale: float | None = None: Perturbed attention guidance scale for the SDXL refiner.

sdxl_refiner_sigmas: Sequence[float] | str | None = None

Sigma values, this is supported when using a DiffusionArguments.second_model_scheduler_uri that supports setting sigmas.

These sigma values control the noise schedule specifically for the SDXL refiner’s diffusion process, allowing for customized denoising behavior during the refinement stage. This can be particularly useful for fine-tuning the level of detail and quality in the refined image.

Format: A list of floating point values in descending order, typically ranging from higher values (more noise) to lower values (less noise).

Or: a string expression involving sigmas from the selected scheduler such as sigmas * 0.95, sigmas will be represented as a numpy array, numpy is available through the namespace np, this uses asteval.

sdxl_refiner_target_size: tuple[int, int] | None = None: Override the refiner value usually taken from DiffusionArguments.sdxl_target_size

sdxl_t2i_adapter_factor: float | None = None: SDXL specific T2I adapter factor, this controls the amount of time-steps for which a T2I adapter applies guidance to an image, this is a value between 0.0 and 1.0. A value of 0.5 for example indicates that the T2I adapter is only active for half the amount of time-steps it takes to completely render an image.

sdxl_target_size: tuple[int, int] | None = None: Optional SDXL conditioning parameter. For most cases, DiffusionArguments.sdxl_target_size should be set to the desired height and width of the generated image. If not specified it will default to (width, height) or the size of any input images provided. Part of SDXL’s micro-conditioning as explained in section 2.2 of [https://huggingface.co/papers/2307.01952](https://huggingface.co/papers/2307.01952).

second_model_guidance_scale: float | None = None: Override the guidance scale used by the SDXL refiner or Stable Cascade decoder.

second_model_inference_steps: int | None = None: Override the default amount of inference steps performed by the SDXL refiner or Stable Cascade decoder.

second_model_prompt: Prompt | None = None: Primary prompt for SDXL Refiner or Stable Cascade Decoder.

second_model_prompt_weighter_uri: str | None = None

The URI of a prompt-weighter implementation supported by dgenerate to use with the SDXL refiner or Stable Cascade Decoder.

Defaults to DiffusionArguments.prompt_weighter_uri if not specified.

This corresponds to the --second-model-prompt-weighter argument of the dgenerate command line tool.

second_model_scheduler_uri: str | None = None: SDXL refiner scheduler / Stable Cascade Decoder URI, if not specified, defaults to DiffusionArguments.scheduler

second_model_second_prompt: Prompt | None = None: Secondary Prompt for SDXL Refiner, the Stable Cascade Decoder does not support this argument.

second_prompt: Prompt | None = None: Secondary Prompt for SDXL, SD3, Flux.

seed: int | None = None: An integer to serve as an RNG seed.

sigmas: Sequence[float] | str | None = None

Sigma values. This is supported when using a DiffusionArguments.scheduler_uri that supports setting sigmas.

Sigma values control the noise schedule in the diffusion process, allowing for fine-grained control over how noise is added and removed during image generation. Custom sigma values can be used to achieve specific artistic effects or to optimize the generation process for particular types of images.

Format: A list of floating point values in descending order, typically ranging from higher values (more noise) to lower values (less noise).

Or: a string expression involving sigmas from the selected scheduler such as sigmas * 0.95, sigmas will be represented as a numpy array, numpy is available through the namespace np, this uses asteval.
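
To illustrate the two formats, the sketch below checks that an explicit list is in descending order and shows the effect of an expression like sigmas * 0.95. It uses plain Python lists as a stand-in for the numpy array that the expression operates on, and both helper names and the sample schedule are hypothetical.

```python
def is_descending(sigmas) -> bool:
    """Hypothetical helper: True when each sigma is lower than the previous one."""
    return all(a > b for a, b in zip(sigmas, sigmas[1:]))


def scale_sigmas(sigmas, factor: float):
    """Plain-Python stand-in for the elementwise expression `sigmas * factor`."""
    return [s * factor for s in sigmas]


schedule = [14.6, 7.0, 3.0, 1.0, 0.0]  # illustrative values only
scaled = scale_sigmas(schedule, 0.95)

print(is_descending(schedule), is_descending(scaled))  # True True
```

Scaling by a positive constant preserves the ordering, so the scaled schedule is still a valid descending list.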

tea_cache: bool = False

Activate TeaCache for the primary model?

This is supported for Flux. TeaCache uses a novel caching mechanism in the forward pass of the Flux transformer to reduce the amount of computation needed to generate an image. This can speed up inference with small amounts of quality loss.

See: https://github.com/ali-vilab/TeaCache

Also see: DiffusionArguments.tea_cache_rel_l1_threshold

This is supported for: --model-type flux*.

tea_cache_rel_l1_threshold: float | None = None

TeaCache relative L1 threshold when DiffusionArguments.tea_cache is enabled.

Higher values mean more speedup.

Defaults to 0.6 (2.0x speedup). Reference values: 0.25 for 1.5x speedup, 0.4 for 1.8x speedup, 0.6 for 2.0x speedup, 0.8 for 2.25x speedup.

See: https://github.com/ali-vilab/TeaCache

Supplying any value implies that DiffusionArguments.tea_cache is enabled.

This is supported for: --model-type flux*.
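
The interaction between tea_cache and tea_cache_rel_l1_threshold can be summarized in a short sketch. The function resolve_tea_cache and the constant names are hypothetical and only restate the rules described above; they are not dgenerate's own code.

```python
# Threshold -> approximate speedup, as listed in the description above.
TEA_CACHE_SPEEDUPS = {0.25: 1.5, 0.4: 1.8, 0.6: 2.0, 0.8: 2.25}

# Documented default threshold.
DEFAULT_THRESHOLD = 0.6


def resolve_tea_cache(tea_cache=False, rel_l1_threshold=None):
    """Return (enabled, threshold) following the documented rules.

    Supplying any threshold implies that TeaCache is enabled, and an
    enabled TeaCache without a threshold uses the default.
    """
    enabled = tea_cache or rel_l1_threshold is not None
    if not enabled:
        return False, None
    if rel_l1_threshold is None:
        return True, DEFAULT_THRESHOLD
    return True, rel_l1_threshold
```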

third_prompt: Prompt | None = None: Tertiary Prompt for SD3.

upscaler_noise_level: int | None = None: Upscaler noise level for the dgenerate.pipelinewrapper.ModelType.UPSCALER_X4 model type only.

vae_slicing: bool = False: Enable VAE slicing?

vae_tiling: bool = False: Enable VAE tiling?

width: int | None = None

Output image width.

Will be automatically aligned by 8.

If alignments of more than 8 need to be forced, a warning will be issued to stdout via dgenerate.messages.
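
As a rough illustration of what alignment means here, the sketch below snaps a dimension to a multiple of the alignment. The helper align_down is hypothetical, and rounding down is an assumption; the text above does not state which direction dgenerate rounds in.

```python
def align_down(value, alignment=8):
    """Snap value to the nearest lower multiple of alignment.

    Assumption: rounding down. The documentation only says the value
    is aligned by 8, not in which direction.
    """
    return value - (value % alignment)


aligned_width = align_down(515)        # already-aligned values are unchanged
forced_width = align_down(1000, 64)    # a larger, forced alignment
```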

class dgenerate.pipelinewrapper.DiffusionPipelineWrapper(model_path: str, model_type: ModelType | str = ModelType.SD, revision: str | None = None, variant: str | None = None, subfolder: str | None = None, dtype: DataType | str = DataType.AUTO, unet_uri: str | None = None, second_model_unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, lora_fuse_scale: float | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, text_encoder_uris: Sequence[str] | None = None, second_model_text_encoder_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, sdxl_refiner_uri: str | None = None, s_cascade_decoder_uri: str | None = None, quantizer_uri: str | None = None, quantizer_map: Sequence[str] | None = None, second_model_quantizer_uri: str | None = None, second_model_quantizer_map: Sequence[str] | None = None, device: str = 'cpu', safety_checker: bool = False, original_config: str | None = None, second_model_original_config: str | None = None, auth_token: str | None = None, local_files_only: bool = False, model_extra_modules: dict[str, Any] = None, second_model_extra_modules: dict[str, Any] = None, model_cpu_offload: bool = False, model_sequential_offload: bool = False, second_model_cpu_offload: bool = False, second_model_sequential_offload: bool = False, prompt_weighter_loader: PromptWeighterLoader | None = None, latents_processor_loader: LatentsProcessorLoader | None = None, decoded_latents_image_processor_loader: ImageProcessorLoader | None = None, adetailer_detector_uris: Sequence[str] | None = None, adetailer_crop_control_image: bool = False)[source]

Bases: object

Monolithic diffusion pipelines wrapper.

static recall_last_used_main_pipeline() → PipelineCreationResult | None[source]

Return a reference to the last dgenerate.pipelinewrapper.pipelines.TorchPipelineCreationResult for the pipeline that successfully executed an image generation.

This may recreate the pipeline if it is not cached.

If no image generation has occurred, this will return None.

Returns:: dgenerate.pipelinewrapper.pipelines.TorchPipelineCreationResult or None

static recall_last_used_secondary_pipeline() → PipelineCreationResult | None[source]

Return a reference to the last dgenerate.pipelinewrapper.pipelines.TorchPipelineCreationResult for the secondary pipeline (refiner / stable cascade decoder) that successfully executed an image generation.

This may recreate the pipeline if it is not cached.

If no image generation has occurred or no secondary pipeline has been called, this will return None.

Returns:: dgenerate.pipelinewrapper.pipelines.TorchPipelineCreationResult or None

__call__(args: DiffusionArguments | None = None, **kwargs) → PipelineWrapperResult[source]

Call the pipeline and generate a result.

Parameters:

args – Optional DiffusionArguments
kwargs – See DiffusionArguments.get_pipeline_wrapper_kwargs(), any keyword arguments given here will override values derived from the DiffusionArguments object given to the args parameter.

Raises:

InvalidModelFileError –
InvalidModelUriError –
InvalidSchedulerNameError –
dgenerate.OutOfMemoryError –
UnsupportedPipelineConfigError –

Returns:

PipelineWrapperResult
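
The override behavior of kwargs can be pictured as a dictionary merge. This is only a sketch of the precedence rule: merge_call_arguments is a hypothetical helper, and the real method derives its values from a DiffusionArguments object rather than from a plain dictionary.

```python
def merge_call_arguments(args_values=None, **kwargs):
    """Combine values derived from an arguments object with keyword
    arguments, letting the keyword arguments win on conflicts."""
    merged = dict(args_values or {})
    merged.update(kwargs)
    return merged


# Values as they might be derived from a DiffusionArguments object.
from_args = {'seed': 1234, 'width': 512}

# The keyword argument replaces the seed, the width is kept.
effective = merge_call_arguments(from_args, seed=42)
```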

__init__(model_path: str, model_type: ModelType | str = ModelType.SD, revision: str | None = None, variant: str | None = None, subfolder: str | None = None, dtype: DataType | str = DataType.AUTO, unet_uri: str | None = None, second_model_unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, lora_fuse_scale: float | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, text_encoder_uris: Sequence[str] | None = None, second_model_text_encoder_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, sdxl_refiner_uri: str | None = None, s_cascade_decoder_uri: str | None = None, quantizer_uri: str | None = None, quantizer_map: Sequence[str] | None = None, second_model_quantizer_uri: str | None = None, second_model_quantizer_map: Sequence[str] | None = None, device: str = 'cpu', safety_checker: bool = False, original_config: str | None = None, second_model_original_config: str | None = None, auth_token: str | None = None, local_files_only: bool = False, model_extra_modules: dict[str, Any] = None, second_model_extra_modules: dict[str, Any] = None, model_cpu_offload: bool = False, model_sequential_offload: bool = False, second_model_cpu_offload: bool = False, second_model_sequential_offload: bool = False, prompt_weighter_loader: PromptWeighterLoader | None = None, latents_processor_loader: LatentsProcessorLoader | None = None, decoded_latents_image_processor_loader: ImageProcessorLoader | None = None, adetailer_detector_uris: Sequence[str] | None = None, adetailer_crop_control_image: bool = False)[source]

This is a monolithic wrapper around all supported diffusion pipelines which handles txt2img, img2img, and inpainting on demand. It spins up the correct pipelines as needed in order to handle provided pipeline arguments using lazy initialization.

Pipelines and user specified sub models are memoized and their lifetimes are managed via heuristics based on system memory and available resources.

All arguments to this constructor should be provided as keyword arguments. Using this constructor in any other fashion could result in breakage between semver compatible versions.

Parameters:

model_path – main model path
model_type – main model type
revision – main model revision
variant – main model variant
subfolder – main model subfolder (huggingface or disk)
dtype – main model dtype
unet_uri – main model UNet URI string
second_model_unet_uri – secondary model unet uri (SDXL Refiner, Stable Cascade decoder)
transformer_uri – Optional transformer URI string for specifying a specific Transformer, currently this is only supported for Stable Diffusion 3 models.
vae_uri – main model VAE URI string
lora_uris – One or more LoRA URI strings
lora_fuse_scale – Optional global LoRA fuse scale value. Once all LoRAs are merged with their individual scales, the merged weights will be fused into the pipeline at this scale. The default value is 1.0.
image_encoder_uri – Image Encoder URI string. Image Encoders are used with IP Adapters and Stable Cascade
ip_adapter_uris – One or more IP Adapter URI strings
textual_inversion_uris – One or more Textual Inversion URI strings
text_encoder_uris – One or more Text Encoder URIs for the main model. Use “+” or None for the default encoder, or “null” to indicate that the encoder should not be loaded.
second_model_text_encoder_uris – One or more Text Encoder URIs for the secondary model (SDXL Refiner or Stable Cascade decoder). Use “+” or None for the default encoder, or “null” to indicate that the encoder should not be loaded.
controlnet_uris – One or more ControlNet URI strings
t2i_adapter_uris – One or more T2IAdapter URI strings
sdxl_refiner_uri – SDXL Refiner model URI string
s_cascade_decoder_uri – Stable Cascade decoder URI string
quantizer_uri – Global --quantizer URI value
quantizer_map – Collection of pipeline submodule names to which quantization should be applied when quantizer_uri is provided. Valid values include: unet, transformer, text_encoder, text_encoder_2, text_encoder_3. If None, all supported modules will be quantized.
second_model_quantizer_uri – Global --second-model-quantizer URI value
second_model_quantizer_map – Collection of pipeline submodule names to which quantization should be applied when second_model_quantizer_uri is provided. Valid values include: unet, transformer, text_encoder, text_encoder_2, text_encoder_3. If None, all supported modules will be quantized.
device – Rendering device string, example: cuda:0 or cuda
safety_checker – Use safety checker model if available? (antiquated, for SD 1/2, Deep Floyd etc.)
original_config – Optional original LDM config .yaml file path when loading a single file checkpoint.
second_model_original_config – Optional original LDM config .yaml file path when loading a single file checkpoint for the secondary model (SDXL Refiner, Stable Cascade Decoder).
auth_token – huggingface authentication token.
local_files_only – Do not attempt to download files from huggingface?
model_extra_modules – Raw extra diffusers modules for the main pipeline
second_model_extra_modules – Raw extra diffusers modules for the secondary pipeline (SDXL Refiner, Stable Cascade decoder)
model_cpu_offload – Use model CPU offloading for the main pipeline via the accelerate module?
model_sequential_offload – Use sequential CPU offloading for the main pipeline via the accelerate module?
second_model_cpu_offload – Use CPU offloading for the SDXL Refiner or Stable Cascade Decoder via the accelerate module?
second_model_sequential_offload – Use sequential CPU offloading for the SDXL Refiner or Stable Cascade Decoder via the accelerate module?
prompt_weighter_loader – Plugin loader for prompt weighter implementations, if you pass None a default instance will be created.
latents_processor_loader – Plugin loader for latents processor implementations, if you pass None a default instance will be created.
decoded_latents_image_processor_loader – Plugin loader for image processor implementations that process images decoded from incoming latents, if you pass None a default instance will be created.
adetailer_detector_uris – adetailer subject detection model URIs. Specifying this argument implicitly indicates img2img mode: the pipeline wrapper will accept a single image and perform the adetailer inpainting algorithm on it using the provided detector URIs.
adetailer_crop_control_image – Should adetailer crop any provided ControlNet control image in the same way that it crops the generated mask to the detection area? Otherwise, use the full control image resized down to the size of the detection area. If you enable this and your control image is not the same size as your input image, a warning will be issued and resizing will be used instead of cropping.

Raises:

UnsupportedPipelineConfigError –
InvalidModelUriError –

decode_latents(latents: Sequence[Tensor] | Tensor) → list[Image][source]

Decode latents using the main pipeline’s VAE.

A generation must have occurred at least once for this method to be usable.

You must be using a model type that utilizes a VAE, Stable Cascade and Deep Floyd model types are not supported by this method.

Parameters:: latents – Latents to decode, can be a sequence of tensors (batched), or a single tensor. A single tensor with a batch dimension [B, C, H, W] will be assumed to be a batch of latents and batched if the batch dimension is > 1, [C, H, W] will be assumed to be a single latent tensor. For Flux models, latents should be in unpacked format [B, C, H, W] where C=16.
Raises:: dgenerate.pipelinewrapper.UnsupportedPipelineConfigError – If decoding the latents is not supported.
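
The shape rules for the latents parameter can be sketched without any tensor library by looking only at the shape tuple. The helper count_latents is hypothetical and is not part of dgenerate; it only restates how a [B, C, H, W] shape differs from a [C, H, W] shape.

```python
def count_latents(shape):
    """Return how many latents a tensor of the given shape represents.

    A 3-dimensional shape [C, H, W] is a single latent, and a
    4-dimensional shape [B, C, H, W] is a batch of B latents.
    """
    if len(shape) == 3:
        return 1
    if len(shape) == 4:
        return shape[0]
    raise ValueError(f'expected 3 or 4 dimensions, got {len(shape)}')


single = count_latents((4, 64, 64))          # one [C, H, W] latent
batched = count_latents((2, 16, 128, 128))   # batch of two, C=16 as for Flux
```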

gen_dgenerate_command(args: DiffusionArguments | None = None, extra_opts: Sequence[tuple[str] | tuple[str, Any]] | None = None, omit_device: bool = False, overrides: dict[str, Any] = None)[source]

Generate a valid dgenerate command line invocation that reproduces the arguments associated with DiffusionArguments.

This does not reproduce --image-seeds, you must include that value in extra_opts, this is because there is not enough information in DiffusionArguments to accurately reproduce it.

Parameters:

args – DiffusionArguments object to take values from
extra_opts – Extra option pairs to be added to the end of reconstructed options of the dgenerate invocation, this should be a sequence of tuples of length 1 (switch only) or length 2 (switch with args)
omit_device – Omit the --device option? A shareable configuration may be better off without a device specification, so that it simply falls back to the default device, which is generally cuda
overrides – pipeline wrapper keyword arguments, these will override values derived from any DiffusionArguments object given to the args argument. See: DiffusionArguments.get_pipeline_wrapper_kwargs

Returns:

A string containing the dgenerate command line needed to reproduce this result.

gen_dgenerate_config(args: DiffusionArguments | None = None, extra_opts: Sequence[tuple[str] | tuple[str, Any]] | None = None, extra_comments: Iterable[str] | None = None, omit_device: bool = False, overrides: dict[str, Any] = None)[source]

Generate a valid dgenerate config file with a single invocation that reproduces the arguments associated with DiffusionArguments.

This does not reproduce --image-seeds, you must include that value in extra_opts, this is because there is not enough information in DiffusionArguments to accurately reproduce it.

Parameters:

args – DiffusionArguments object to take values from
extra_opts – Extra option pairs to be added to the end of reconstructed options of the dgenerate invocation, this should be a sequence of tuples of length 1 (switch only) or length 2 (switch with args)
extra_comments – Extra strings to use as comments after the initial version check directive
omit_device – Omit the --device option? A shareable configuration may be better off without a device specification, so that it simply falls back to the default device, which is generally cuda
overrides – pipeline wrapper keyword arguments, these will override values derived from any DiffusionArguments object given to the args argument. See: DiffusionArguments.get_pipeline_wrapper_kwargs

Returns:

The configuration as a string

get_decoded_latents_size(latents: Tensor) → tuple[int, int][source]

Given a latent tensor return the expected decoded image (width, height) in pixels.

Parameters:: latents – Latent tensor of shape [B, C, H, W] or [C, H, W].
Returns:: width, height
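
The relationship between latent shape and decoded size can be sketched as a multiplication of the spatial dimensions by the VAE scale factor. The helper decoded_size is hypothetical, and the scale factor of 8 is an assumption that holds for common Stable Diffusion VAEs; the real method uses whatever the loaded pipeline dictates.

```python
def decoded_size(shape, vae_scale_factor=8):
    """Return (width, height) in pixels for a latent shape.

    The shape is [B, C, H, W] or [C, H, W], so the last two entries
    are always height and width. The scale factor of 8 is an assumed
    default, not a value taken from dgenerate.
    """
    height, width = shape[-2], shape[-1]
    return width * vae_scale_factor, height * vae_scale_factor


landscape = decoded_size((1, 4, 64, 96))
square = decoded_size((4, 128, 128))
```

Note that the result is ordered (width, height), while the tensor shape is ordered height first.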

recall_main_pipeline() → PipelineCreationResult[source]

Fetch the last used main pipeline creation result. The pipeline may be recreated if it is no longer in the in-memory cache. If no pipeline has been created yet, which is the case when no image has been generated, RuntimeError is raised.

Raises:: RuntimeError –
Returns:: dgenerate.pipelinewrapper.PipelineCreationResult

recall_secondary_pipeline() → PipelineCreationResult[source]

Fetch the last used refiner / stable cascade decoder pipeline creation result. The pipeline may be recreated if it is no longer in the in-memory cache. If no refiner / decoder pipeline has been created yet, which is the case when no image has been generated or no refiner / decoder model was specified, RuntimeError is raised.

Raises:: RuntimeError –
Returns:: dgenerate.pipelinewrapper.PipelineCreationResult

reconstruct_dgenerate_opts(args: DiffusionArguments | None = None, extra_opts: Sequence[tuple[str] | tuple[str, Any]] | None = None, omit_device: bool = False, shell_quote: bool = True, overrides: dict[str, Any] = None) → list[tuple[str] | tuple[str, Any]][source]

Reconstruct dgenerate’s command line arguments from a particular set of pipeline wrapper call arguments.

This does not reproduce --image-seeds, you must include that value in extra_opts, this is because there is not enough information in DiffusionArguments to accurately reproduce it.

Parameters:

args – DiffusionArguments object to take values from
extra_opts – Extra option pairs to be added to the end of reconstructed options, this should be a sequence of tuples of length 1 (switch only) or length 2 (switch with args)
omit_device – Omit the --device option? A shareable configuration may be better off without a device specification, so that it simply falls back to the default device, which is generally cuda
shell_quote – Shell quote and format the argument values? or return them raw.
overrides – pipeline wrapper keyword arguments, these will override values derived from any DiffusionArguments object given to the args argument. See: DiffusionArguments.get_pipeline_wrapper_kwargs

Returns:

List of tuples of length 1 or 2 representing the options
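
The tuple format shared by extra_opts and the return value can be illustrated with a small sketch that joins such tuples into a command line fragment. The function render_opts is hypothetical and uses the standard library's shlex.quote for quoting; dgenerate's own quoting and formatting may differ.

```python
import shlex


def render_opts(opts):
    """Join option tuples into a single shell-quoted string.

    Each tuple is either (switch,) for a switch without arguments or
    (switch, value) for a switch with an argument.
    """
    parts = []
    for opt in opts:
        if len(opt) == 1:
            parts.append(opt[0])
        elif len(opt) == 2:
            switch, value = opt
            parts.append(switch)
            parts.append(shlex.quote(str(value)))
        else:
            raise ValueError(f'option tuple must have length 1 or 2: {opt!r}')
    return ' '.join(parts)


# --image-seeds is not reconstructed automatically, so it is a
# natural candidate for extra_opts.
fragment = render_opts([('--image-seeds', 'my image.png'), ('--verbose',)])
```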

property adetailer_crop_control_image: bool: Should adetailer crop any provided control image in the same way that it crops the generated mask to the detection area? Otherwise, use the full control image resized down to the size of the detection area.

property adetailer_detector_uris: Sequence[str] | None: List of supplied --adetailer-detectors URI strings or an empty list.

property auth_token: str | None: Current --auth-token value or None.

property controlnet_uris: Sequence[str] | None: List of supplied --control-nets URI strings or an empty list.

property decoded_latents_image_processor_loader: ImageProcessorLoader: Current decoded latents image processor loader.

property device: str: Currently set --device string.

property dtype: DataType: Currently set --dtype enum value for the main model.

property dtype_string: str: Currently set --dtype string value for the main model.

property image_encoder_uri: str | None: Selected --image-encoder uri for the main model or None.

property ip_adapter_uris: Sequence[str] | None: List of supplied --ip-adapters URI strings or an empty list.

property latents_processor_loader: LatentsProcessorLoader: Current latents processor loader.

property local_files_only: bool: Currently set value for local_files_only.

property lora_fuse_scale: float: Supplied --lora-fuse-scale value.

property lora_uris: Sequence[str] | None: List of supplied --loras uri strings or an empty list.

property model_cpu_offload: bool: Current --model-cpu-offload value.

property model_path: str: Model path for the main model.

property model_sequential_offload: bool: Current --model-sequential-offload value.

property model_type: ModelType: Currently set --model-type enum value.

property model_type_string: str: Currently set --model-type string value.

property original_config: str | None: Current --original-config value.

property prompt_weighter_loader: PromptWeighterLoader: Current prompt weighter loader.

property quantizer_map: Sequence[str] | None: Current --quantizer-map value.

property quantizer_uri: str | None: Current --quantizer value.

property revision: str | None: Currently set --revision for the main model or None.

property s_cascade_decoder_uri: str | None: Model URI for the Stable Cascade decoder or None.

property safety_checker: bool: Safety checker enabled status.

property sdxl_refiner_uri: str | None: Model URI for the SDXL refiner or None.

property second_model_cpu_offload: bool: Current --second-model-cpu-offload value.

property second_model_original_config: str | None: Current --second-model-original-config value.

property second_model_quantizer_map: Sequence[str] | None: Current --second-model-quantizer-map value.

property second_model_quantizer_uri: str | None: Current --second-model-quantizer value.

property second_model_sequential_offload: bool: Current --second-model-sequential-offload value.

property second_model_text_encoder_uris: Sequence[str] | None: List of supplied --second-model-text-encoders URI strings or an empty list.

property second_model_unet_uri: str | None: Selected --second-model-unet uri for the SDXL refiner or Stable Cascade decoder model or None.

property subfolder: str | None: Selected model --subfolder for the main model, (remote repo subfolder or local) or None.

property t2i_adapter_uris: Sequence[str] | None: List of supplied --t2i-adapters URI strings or an empty list.

property text_encoder_uris: Sequence[str] | None: List of supplied --text-encoders URI strings or an empty list.

property textual_inversion_uris: Sequence[str] | None: List of supplied --textual-inversions URI strings or an empty list.

property transformer_uri: str | None: Model URI for the SD3 Transformer or None.

property unet_uri: str | None: Selected --unet uri for the main model or None.

property vae_uri: str | None: Selected --vae uri for the main model or None.

property variant: str | None: Currently set --variant for the main model or None.

class dgenerate.pipelinewrapper.FluxControlNetUnionUriModes(value)[source]

Bases: IntEnum

Represents controlnet modes associated with the Flux Union controlnet.

BLUR = 3

CANNY = 0

DEPTH = 2

GRAY = 5

LQ = 6

POSE = 4

TILE = 1
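
Because the modes form an IntEnum, each one can be looked up by name and used directly as an integer. The sketch below uses a stand-in enum, UnionMode, that mirrors the values listed above so that it runs without dgenerate installed; mode_from_name is a hypothetical helper.

```python
from enum import IntEnum


class UnionMode(IntEnum):
    """Stand-in mirroring the documented FluxControlNetUnionUriModes values."""
    CANNY = 0
    TILE = 1
    DEPTH = 2
    BLUR = 3
    POSE = 4
    GRAY = 5
    LQ = 6


def mode_from_name(name):
    """Look up a mode by case-insensitive name."""
    try:
        return UnionMode[name.strip().upper()]
    except KeyError:
        raise ValueError(f'unknown union controlnet mode: {name!r}') from None


depth_mode = mode_from_name('depth')
```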

class dgenerate.pipelinewrapper.IPAdapterUri(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, scale: float = 1.0)[source]

Bases: object

Representation of a --ip-adapters uri

static help()[source]

static load_on_pipeline(pipeline: DiffusionPipeline, uris: Iterable[IPAdapterUri | str], use_auth_token: str | None = None, local_files_only: bool = False)[source]

Load IP Adapter weights on to a pipeline using this URI

Parameters:

pipeline – diffusers.DiffusionPipeline
uris – IP Adapter URIs to load on to the pipeline
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug

Raises:

ModelNotFoundError – If the model could not be found.
dgenerate.pipelinewrapper.uris.exceptions.InvalidIPAdapterUriError – On URI parsing errors.
dgenerate.pipelinewrapper.uris.exceptions.IPAdapterUriLoadError – On loading errors.

static parse(uri: str) → IPAdapterUri[source]

Parse a --ip-adapters uri and return an object representing its constituents

Parameters:: uri – string with --ip-adapters uri syntax
Raises:: InvalidIPAdapterUriError –
Returns:: IPAdapterUri

__init__(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, scale: float = 1.0)[source]

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['IP Adapter']

property model: str: Model path, huggingface slug, file path

property revision: str | None: Model repo revision

property scale: float: IP Adapter scale

property subfolder: str | None: Model repo subfolder

property weight_name: str | None: Model weight-name

class dgenerate.pipelinewrapper.ImageEncoderUri

Bases: object

Representation of --image-encoder URI.

static help()[source]

static parse(uri: str) → ImageEncoderUri[source]

Parse a --image-encoder uri and return an object representing its constituents

Parameters:: uri – string with --image-encoder uri syntax
Raises:: InvalidImageEncoderUriError –
Returns:: ImageEncoderUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)

Raises:

InvalidImageEncoderUriError – If model points to a single file, single file loads are not supported. Or if dtype is passed an invalid data type string.

load(dtype_fallback: DataType = DataType.AUTO, use_auth_token: str | None = None, local_files_only: bool = False, no_cache: bool = False, image_encoder_class: type[CLIPVisionModelWithProjection] | type[SiglipImageEncoder] = CLIPVisionModelWithProjection) → type[CLIPVisionModelWithProjection] | type[SiglipImageEncoder][source]

Load an Image Encoder Model of type transformers.CLIPVisionModelWithProjection

Parameters:

dtype_fallback – If the URI does not specify a dtype, use this dtype.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug or blob link
no_cache – If True, force the returned object not to be cached by the memoize decorator.
image_encoder_class – Image Encoder class to load.

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

transformers.CLIPVisionModelWithProjection

FILE_ARGS = {'model': {'mode': 'dir'}}

NAMES = ['Image Encoder']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug, file path, or blob link

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

class dgenerate.pipelinewrapper.LoRAUri(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, scale: float = 1.0)[source]

Bases: object

Representation of a --loras uri

static help()[source]

static load_on_pipeline(pipeline: DiffusionPipeline, uris: Iterable[LoRAUri | str], fuse_scale: float = 1.0, use_auth_token: str | None = None, local_files_only: bool = False)[source]

Load LoRA weights on to a pipeline using this URI

Parameters:

pipeline – diffusers.DiffusionPipeline
uris – Iterable of LoRAUri or str LoRA URIs to load
fuse_scale – Global scale for the fused LoRAs, all LoRAs are fused together using their individual scale value, and then fused into the main model using this scale.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug

Raises:

dgenerate.ModelNotFoundError – If the model could not be found.
dgenerate.pipelinewrapper.uris.exceptions.InvalidLoRAUriError – On URI parsing errors.
dgenerate.pipelinewrapper.uris.exceptions.LoRAUriLoadError – On loading errors.

static parse(uri: str) → LoRAUri[source]

Parse a --loras uri and return an object representing its constituents

Parameters:: uri – string with --loras uri syntax
Raises:: InvalidLoRAUriError –
Returns:: LoRAUri

__init__(model: str, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None, scale: float = 1.0)[source]

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['LoRA']

property model: str: Model path, huggingface slug, file path

property revision: str | None: Model repo revision

property scale: float: LoRA scale

property subfolder: str | None: Model repo subfolder

property weight_name: str | None: Model weight-name

class dgenerate.pipelinewrapper.ModelType(value)[source]

Bases: Enum

Enum representation of --model-type

FLUX = 13: Flux pipeline

FLUX_FILL = 14: Flux infill / outfill pipeline

FLUX_KONTEXT = 15: Flux Kontext pipeline

IF = 3: Deep Floyd IF stage 1

IFS = 4: Deep Floyd IF superscaler (stage 2)

IFS_IMG2IMG = 5: Deep Floyd IF superscaler (stage 2) image to image / variation mode.

KOLORS = 16: Kolors (SDXL + ChatGLM)

PIX2PIX = 1: Stable Diffusion pix2pix prompt guided editing.

SD = 0: Stable Diffusion, such as SD 1.0 - 2.x

SD3 = 11: Stable Diffusion 3

SD3_PIX2PIX = 12: Stable Diffusion 3 pix2pix prompt guided editing.

SDXL = 2: Stable Diffusion XL

SDXL_PIX2PIX = 6: Stable Diffusion XL pix2pix prompt guided editing.

S_CASCADE = 9: Stable Cascade prior

S_CASCADE_DECODER = 10: Stable Cascade decoder

UPSCALER_X2 = 7: Stable Diffusion X2 upscaler

UPSCALER_X4 = 8: Stable Diffusion X4 upscaler

class dgenerate.pipelinewrapper.PipelineCreationResult(model_path: str, pipeline: DiffusionPipeline, parsed_unet_uri: UNetUri | None, parsed_transformer_uri: TransformerUri | None, parsed_vae_uri: VAEUri | None, parsed_image_encoder_uri: ImageEncoderUri | None, parsed_lora_uris: Sequence[LoRAUri], parsed_ip_adapter_uris: Sequence[IPAdapterUri], parsed_textual_inversion_uris: Sequence[TextualInversionUri], parsed_controlnet_uris: Sequence[ControlNetUri], parsed_t2i_adapter_uris: Sequence[T2IAdapterUri])[source]

Bases: object

__init__(model_path: str, pipeline: DiffusionPipeline, parsed_unet_uri: UNetUri | None, parsed_transformer_uri: TransformerUri | None, parsed_vae_uri: VAEUri | None, parsed_image_encoder_uri: ImageEncoderUri | None, parsed_lora_uris: Sequence[LoRAUri], parsed_ip_adapter_uris: Sequence[IPAdapterUri], parsed_textual_inversion_uris: Sequence[TextualInversionUri], parsed_controlnet_uris: Sequence[ControlNetUri], parsed_t2i_adapter_uris: Sequence[T2IAdapterUri])[source]

call(device: device | str | None = 'cpu', prompt_weighter: PromptWeighter | None = None, **kwargs) → BaseOutput[source]

Call pipeline, see: call_pipeline()

Parameters:

device – move the pipeline to this device before calling
prompt_weighter – Optional prompt weighter for weighted prompt syntaxes
kwargs – forward kwargs to pipeline

Returns:

A subclass of diffusers.utils.BaseOutput

get_pipeline_modules(names: Iterable[str])[source]

Get the requested pipeline modules, such as vae, as a dictionary mapping each module name to its module value.

Possible module names:

unet
vae
transformer
text_encoder
text_encoder_2
text_encoder_3
tokenizer
tokenizer_2
tokenizer_3
safety_checker
feature_extractor
image_encoder
adapter
controlnet
scheduler

If a module is not present, or its name is not recognized, a ValueError is raised describing the module that is not part of the pipeline.

Raises:: ValueError –
Parameters:: names – module names, such as vae, text_encoder
Returns:: dictionary
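
The lookup contract can be sketched with a plain dictionary standing in for a pipeline. The function get_modules and the placeholder module values are hypothetical; only the module names and the ValueError behavior come from the description above.

```python
# Module names listed in the documentation above.
KNOWN_MODULE_NAMES = (
    'unet', 'vae', 'transformer', 'text_encoder', 'text_encoder_2',
    'text_encoder_3', 'tokenizer', 'tokenizer_2', 'tokenizer_3',
    'safety_checker', 'feature_extractor', 'image_encoder', 'adapter',
    'controlnet', 'scheduler',
)


def get_modules(pipeline_modules, names):
    """Return {name: module} for each requested name.

    Raises ValueError when a name is not recognized or the stand-in
    pipeline does not have that module.
    """
    result = {}
    for name in names:
        if name not in KNOWN_MODULE_NAMES:
            raise ValueError(f'unrecognized module name: {name}')
        if pipeline_modules.get(name) is None:
            raise ValueError(f'module not part of the pipeline: {name}')
        result[name] = pipeline_modules[name]
    return result


# Placeholder strings stand in for real module objects.
fake_pipeline = {'vae': 'vae-module', 'text_encoder': 'text-encoder-module'}
selected = get_modules(fake_pipeline, ['vae', 'text_encoder'])
```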

load_scheduler(scheduler_uri: str | None)[source]

Load a scheduler onto the pipeline using a URI specification.

Passing None to the URI reloads the original scheduler that the model was loaded with, if no new scheduler has been set since then, this is a no-op.

Parameters:: scheduler_uri – The scheduler URI

set_vae_tiling_and_slicing(vae_tiling: bool, vae_slicing: bool)[source]

Set the VAE tiling and slicing status of the pipeline.

Parameters:

vae_tiling – vae tiling?
vae_slicing – vae slicing?

model_path: str | None: Path the model was loaded from.

parsed_controlnet_uris: Sequence[ControlNetUri]: Parsed ControlNet URIs if any were present

parsed_image_encoder_uri: ImageEncoderUri | None: Parsed ImageEncoder URI if one was present

parsed_ip_adapter_uris: Sequence[IPAdapterUri]: Parsed IP Adapter URIs if any were present

parsed_lora_uris: Sequence[LoRAUri]: Parsed LoRA URIs if any were present

parsed_t2i_adapter_uris: Sequence[T2IAdapterUri]: Parsed T2IAdapter URIs if any were present

parsed_textual_inversion_uris: Sequence[TextualInversionUri]: Parsed Textual Inversion URIs if any were present

parsed_transformer_uri: TransformerUri | None: Parsed Transformer URI if one was present

parsed_unet_uri: UNetUri | None: Parsed UNet URI if one was present

parsed_vae_uri: VAEUri | None: Parsed VAE URI if one was present

property pipeline

class dgenerate.pipelinewrapper.PipelineFactory(model_path: str, model_type: ModelType = ModelType.SD, pipeline_type: PipelineType = PipelineType.TXT2IMG, revision: str | None = None, variant: str | None = None, subfolder: str | None = None, dtype: DataType = DataType.AUTO, unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, lora_fuse_scale: float | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, text_encoder_uris: Sequence[str] | None = None, quantizer_uri: str | None = None, quantizer_map: Sequence[str] | None = None, pag: bool = False, safety_checker: bool = False, original_config: str | None = None, auth_token: str | None = None, device: str = 'cpu', extra_modules: dict[str, Any] | None = None, model_cpu_offload: bool = False, sequential_cpu_offload: bool = False, local_files_only: bool = False)[source]

Bases: object

Turns create_diffusion_pipeline() into a factory that can repeatedly create a pipeline with the same arguments, possibly from cache.

__call__() → PipelineCreationResult[source]

Raises:

InvalidModelFileError –
ModelNotFoundError –
InvalidModelUriError –
InvalidSchedulerNameError –
UnsupportedPipelineConfigError –
dgenerate.NonHFModelDownloadError –
dgenerate.NonHFConfigDownloadError –

Returns:

PipelineCreationResult

__init__(model_path: str, model_type: ModelType = ModelType.SD, pipeline_type: PipelineType = PipelineType.TXT2IMG, revision: str | None = None, variant: str | None = None, subfolder: str | None = None, dtype: DataType = DataType.AUTO, unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, lora_fuse_scale: float | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, text_encoder_uris: Sequence[str] | None = None, quantizer_uri: str | None = None, quantizer_map: Sequence[str] | None = None, pag: bool = False, safety_checker: bool = False, original_config: str | None = None, auth_token: str | None = None, device: str = 'cpu', extra_modules: dict[str, Any] | None = None, model_cpu_offload: bool = False, sequential_cpu_offload: bool = False, local_files_only: bool = False)[source]
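The factory idea can be illustrated without dgenerate itself. The sketch below is a minimal stand-in and not dgenerate's implementation: make_factory and fake_create_pipeline are hypothetical names, and the plain dictionary cache only approximates the "possibly from cache" behavior described above.

```python
from typing import Any, Callable

# Hypothetical stand-in for an expensive constructor such as
# create_diffusion_pipeline(); it counts how often real work happens.
calls = {"count": 0}
_cache: dict = {}


def fake_create_pipeline(model_path: str, device: str = "cpu") -> dict:
    key = (model_path, device)
    if key not in _cache:
        calls["count"] += 1  # only incremented on a cache miss
        _cache[key] = {"model_path": model_path, "device": device}
    return _cache[key]


def make_factory(create: Callable[..., Any], **bound_kwargs: Any) -> Callable[[], Any]:
    """Bind the arguments once, then create with them on every call."""
    def factory() -> Any:
        return create(**bound_kwargs)
    return factory


factory = make_factory(fake_create_pipeline, model_path="some/model", device="cpu")
first = factory()
second = factory()  # same arguments, so the cached object is returned
```

Binding the arguments up front means callers only need to hold the factory object, and repeated calls stay cheap whenever the underlying creation function caches its results.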

class dgenerate.pipelinewrapper.PipelineType(value)[source]

Bases: Enum

Represents possible diffusers pipeline types.

IMG2IMG = 2: Image to image mode. Generation seeded / controlled with an image in some fashion.

INPAINT = 3: Inpainting mode. Generation seeded / controlled with an image and a mask in some fashion.

TXT2IMG = 1: Text to image mode. Prompt only generation.

class dgenerate.pipelinewrapper.PipelineWrapperResult(images: Sequence[Image] | None = None, latents: MutableSequence[Tensor] | None = None)[source]

Bases: object

The result of calling DiffusionPipelineWrapper

__init__(images: Sequence[Image] | None = None, latents: MutableSequence[Tensor] | None = None)[source]

image_grid(cols_rows: tuple[int, int])[source]

Render an image grid from the images in this result.

Raises:

ValueError – if no images are present on this object. This is impossible if this object was produced by DiffusionPipelineWrapper.
ValueError – if this result contains latents instead of images. Image grids can only be created from decoded images, not raw latent tensors.

Parameters:

cols_rows – columns and rows (WxH) desired as a tuple

Returns:

PIL.Image.Image
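The cols_rows tuple can be read as a layout request. The sketch below computes the canvas size and paste offsets for such a grid in pure Python. It is an illustration only: grid_canvas_size and grid_offsets are hypothetical helpers, and the row-major fill order and equal image sizes are assumptions, not a statement about how image_grid() is implemented.

```python
def grid_canvas_size(image_size, cols_rows):
    """Size of the canvas needed for a grid of equally sized images."""
    width, height = image_size
    cols, rows = cols_rows
    return cols * width, rows * height


def grid_offsets(image_size, cols_rows, count):
    """Top-left paste position for each image, filled row by row."""
    width, height = image_size
    cols, rows = cols_rows
    if count < 1:
        raise ValueError("no images to lay out")
    if count > cols * rows:
        raise ValueError("grid is too small for the number of images")
    return [((i % cols) * width, (i // cols) * height) for i in range(count)]
```

With an imaging library, each offset would be used as the paste position of the corresponding image on a canvas of the computed size.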

property has_images: bool

Whether this result contains images.

Returns:: bool

property has_latents: bool

Whether this result contains latents.

Returns:: bool

property image: Image | None

The first image in the batch of requested batch size.

Returns:: PIL.Image.Image

property image_count: int

The number of images produced.

Returns:: int

images: MutableSequence[Image] | None

property latent: Tensor | None

The first latent in the batch of requested batch size.

Returns:: torch.Tensor

latents: MutableSequence[Tensor] | None

property latents_count: int

The number of latents produced.

Returns:: int

property output_count: int

The number of outputs produced (images or latents).

Returns:: int

Bases: object

Representation of --s-cascade-decoder uri

static help()[source]

static parse(uri: str) → SCascadeDecoderUri[source]

Parse an --s-cascade-decoder uri and return an object representing its constituents

Parameters:: uri – string with --s-cascade-decoder uri syntax
Returns:: SCascadeDecoderUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['Stable Cascade Decoder']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

class dgenerate.pipelinewrapper.SDNQQuantizerUri(type: str = 'int8', group_size: int = 0, quant_conv: bool = False, quantized_matmul: bool = False, quantized_matmul_conv: bool = False)[source]

Bases: object

Representation of --quantizer URI for SDNQ backend.

static help()[source]

static parse(uri: str) → SDNQQuantizerUri[source]

__init__(type: str = 'int8', group_size: int = 0, quant_conv: bool = False, quantized_matmul: bool = False, quantized_matmul_conv: bool = False)[source]

to_config(compute_dtype: str | dtype | None = None) → SDNQConfig[source]

NAMES = ['sdnq']

OPTION_ARGS = {'type': ['int8', 'int7', 'int6', 'int5', 'int4', 'int3', 'int2', 'uint8', 'uint7', 'uint6', 'uint5', 'uint4', 'uint3', 'uint2', 'uint1', 'bool', 'float8_e4m3fn', 'float8_e4m3fnuz', 'float8_e5m2', 'float8_e5m2fnuz']}

class dgenerate.pipelinewrapper.SDXLControlNetUnionUriModes(value)[source]

Bases: IntEnum

Represents controlnet modes associated with the SDXL Union controlnet.

ANIME_LINEART = 3

CANNY = 3

DEPTH = 1

HED = 2

LINEART = 3

MLSD = 3

NORMAL = 4

OPENPOSE = 0

PIDI = 2

SCRIBBLE = 2

SEGMENT = 5

TED = 2
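Several members above share a value. In a Python IntEnum, members with the same value are aliases of the first one defined, which affects iteration and lookup by value. The sketch below shows this with a hypothetical subset, UnionModes; the definition order used here is an assumption, since the listing above is alphabetical.

```python
from enum import IntEnum


class UnionModes(IntEnum):
    # Hypothetical subset for illustration; definition order is assumed.
    OPENPOSE = 0
    DEPTH = 1
    HED = 2
    SCRIBBLE = 2  # alias of HED
    CANNY = 3
    LINEART = 3   # alias of CANNY
    NORMAL = 4
    SEGMENT = 5


# Iteration yields only canonical members, aliases are skipped.
canonical_names = [m.name for m in UnionModes]

# __members__ includes aliases as well.
all_names = list(UnionModes.__members__)
```

Looking up an alias by name returns the canonical member, so code that prints a mode's name may show a different name than the one the user supplied.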

Bases: object

Representation of --sdxl-refiner uri

static help()[source]

static parse(uri: str) → SDXLRefinerUri[source]

Parse an --sdxl-refiner uri and return an object representing its constituents

Parameters:: uri – string with --sdxl-refiner uri syntax
Raises:: InvalidSDXLRefinerUriError –
Returns:: SDXLRefinerUri
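The parse() methods in this module all turn a URI string into an object with named fields. The sketch below shows the general shape of such a parser. It assumes a syntax of the form model;key=value;key=value, which is not spelled out in this section, and every name in it (ModelUri, InvalidUriError, parse_model_uri) is hypothetical and not part of dgenerate.

```python
from dataclasses import dataclass
from typing import Optional


class InvalidUriError(Exception):
    """Hypothetical stand-in for the Invalid*UriError exceptions."""


@dataclass(frozen=True)
class ModelUri:
    model: str
    revision: Optional[str] = None
    variant: Optional[str] = None
    subfolder: Optional[str] = None
    dtype: Optional[str] = None


_KNOWN_KEYS = {"revision", "variant", "subfolder", "dtype"}
_KNOWN_DTYPES = {"float16", "bfloat16", "float32"}


def parse_model_uri(uri: str) -> ModelUri:
    # Assumed syntax: "model;key=value;key=value"
    parts = [p.strip() for p in uri.split(";")]
    model = parts[0]
    if not model:
        raise InvalidUriError("missing model path")
    fields = {}
    for part in parts[1:]:
        key, sep, value = part.partition("=")
        key = key.strip()
        if not sep or key not in _KNOWN_KEYS:
            raise InvalidUriError(f"unknown or malformed argument: {part!r}")
        fields[key] = value.strip()
    dtype = fields.get("dtype")
    if dtype is not None and dtype not in _KNOWN_DTYPES:
        raise InvalidUriError(f"invalid dtype: {dtype!r}")
    return ModelUri(model=model, **fields)
```

Validating dtype during parsing matches the documented behavior of raising a URI error when dtype is passed an invalid data type string.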

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)

Raises:

InvalidSDXLRefinerUriError – If dtype is passed an invalid data type string.

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['SDXL Refiner']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

Bases: object

Representation of --t2i-adapters URI.

static help()[source]

static parse(uri: str) → T2IAdapterUri[source]

Parse a --t2i-adapters uri specification and return an object representing its constituents

Parameters:: uri – string with --t2i-adapters uri syntax
Raises:: InvalidT2IAdapterUriError –
Returns:: T2IAdapterUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)
scale – t2i adapter scale

Raises:

InvalidT2IAdapterUriError – If dtype is passed an invalid data type string.

load(dtype_fallback: DataType = DataType.AUTO, use_auth_token: str | None = None, local_files_only: bool = False, no_cache: bool = False) → T2IAdapter[source]

Load a diffusers.T2IAdapter from this URI.

Parameters:

dtype_fallback – Fallback datatype if dtype was not specified in the URI.
use_auth_token – Optional huggingface API auth token, used for downloading restricted repos that your account has access to.
local_files_only – Avoid connecting to huggingface to download models and only use cached models?
no_cache – If True, force the returned object not to be cached by the memoize decorator.

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

diffusers.T2IAdapter

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['T2I Adapter']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug

property revision: str | None: Model repo revision

property scale: float: T2IAdapter scale

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

Bases: object

Representation of --text-encoders URI.

static help()[source]

static parse(uri: str) → TextEncoderUri[source]

Parse a --text-encoders* uri and return an object representing its constituents

Parameters:: uri – string with --text-encoders* uri syntax
Raises:: InvalidTextEncoderUriError –
Returns:: TextEncoderUri

static supported_encoder_names() → list[str][source]

Parameters:

encoder – encoder class name, for example CLIPTextModel
model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)
mode – model loading mode, for example clip-l for single file ‘clip-l’ checkpoints.

Raises:

InvalidTextEncoderUriError – If dtype is passed an invalid data type string, or if model points to a single file and the specified encoder class name does not support loading from a single file.

Load a torch Text Encoder of type transformers.models.clip.CLIPTextModel, transformers.models.clip.CLIPTextModelWithProjection, transformers.models.t5.T5EncoderModel, or diffusers.pipelines.kolors.ChatGLMModel from this URI

Parameters:

variant_fallback – If the URI does not specify a variant, use this variant.
dtype_fallback – If the URI does not specify a dtype, use this dtype.
original_config – Path to original model configuration for single file checkpoints, URL or .yaml file on disk.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug or blob link
no_cache – If True, force the returned object not to be cached by the memoize decorator.
missing_ok – If True, when a text encoder is not found inside a single file checkpoint as a sub model, just return None instead of throwing an error.
device_map – device placement strategy for quantized models, defaults to None

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

transformers.models.clip.CLIPTextModel, transformers.models.clip.CLIPTextModelWithProjection, transformers.models.t5.T5EncoderModel, or diffusers.pipelines.kolors.ChatGLMModel

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['Text Encoder']

OPTION_ARGS = {'dtype': ['auto', 'float16', 'bfloat16', 'float32'], 'encoder': ['CLIPTextModel', 'CLIPTextModelWithProjection', 'T5EncoderModel', 'DistillT5EncoderModel', 'ChatGLMModel'], 'mode': ('clip-l', 'clip-l-sd3', 'clip-g-sd3', 'clip-l-sd35-large', 'clip-g-sd35-large', 't5-xxl', 't5-xxl-sd3')}

property dtype: DataType | None: Model dtype (precision)

property encoder: str: Encoder class name such as “CLIPTextModel”

property mode: str | None

Model loading mode for single file checkpoints, for example ‘clip-l’, ‘clip-g’, or ‘t5-xxl’

The default behavior is to extract the sub model from an assumed to be combined checkpoint, which is not compatible with quantization.

property model: str: Model path, huggingface slug

property quantizer: str | None: –quantizer URI override

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

class dgenerate.pipelinewrapper.TextualInversionUri(model: str, token: str | None = None, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None)[source]

Bases: object

Representation of --textual-inversions uri

static help()[source]

static load_on_pipeline(pipeline: DiffusionPipeline, uris: Iterable[TextualInversionUri | str], use_auth_token: str | None = None, local_files_only: bool = False)[source]

Load Textual Inversion weights on to a pipeline using one or more URIs

Parameters:

pipeline – diffusers.DiffusionPipeline
uris – Iterable of TextualInversionUri or str Textual Inversion URIs to load
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug

Raises:

ModelNotFoundError – If the model could not be found.
dgenerate.pipelinewrapper.uris.exceptions.InvalidTextualInversionUriError – On URI parsing errors.
dgenerate.pipelinewrapper.uris.exceptions.TextualInversionUriLoadError – On loading errors.

static parse(uri: str) → TextualInversionUri[source]

Parse a --textual-inversions uri and return an object representing its constituents

Parameters:: uri – string with --textual-inversions uri syntax
Raises:: InvalidTextualInversionUriError –
Returns:: TextualInversionUri

__init__(model: str, token: str | None = None, revision: str | None = None, subfolder: str | None = None, weight_name: str | None = None)[source]

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['Textual Inversion']

property model: str: Model path, huggingface slug, file path

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property token: str | None: Prompt keyword

property weight_name: str | None: Model weight-name

Bases: object

Representation of --transformer URI.

static help()[source]

static parse(uri: str) → TransformerUri[source]

Parse a --transformer uri and return an object representing its constituents

Parameters:: uri – string with --transformer uri syntax
Raises:: InvalidTransformerUriError –
Returns:: TransformerUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)

Raises:

InvalidTransformerUriError – If dtype is passed an invalid data type string.

load(variant_fallback: str | None = None, dtype_fallback: ~dgenerate.pipelinewrapper.enums.DataType = DataType.AUTO, original_config: str | None = None, use_auth_token: str | None = None, local_files_only: bool = False, no_cache: bool = False, device_map: str | None = None, transformer_class: type[~diffusers.models.transformers.transformer_sd3.SD3Transformer2DModel] | type[~diffusers.models.transformers.transformer_flux.FluxTransformer2DModel] = <class 'diffusers.models.transformers.transformer_sd3.SD3Transformer2DModel'>) → SD3Transformer2DModel | FluxTransformer2DModel[source]

Load a torch diffusers.SD3Transformer2DModel or diffusers.FluxTransformer2DModel from a URI.

Parameters:

variant_fallback – If the URI does not specify a variant, use this variant.
dtype_fallback – If the URI does not specify a dtype, use this dtype.
original_config – Path to original model configuration for single file checkpoints, URL or .yaml file on disk.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug or blob link
no_cache – If True, force the returned object not to be cached by the memoize decorator.
device_map – device placement strategy for quantized models, defaults to None
transformer_class – Transformer class type.

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

diffusers.SD3Transformer2DModel or diffusers.FluxTransformer2DModel

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['Transformer']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug

property quantizer: str | None: –quantizer URI override

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

Bases: object

Representation of --unet URI.

static help()[source]

static parse(uri: str) → UNetUri[source]

Parse a --unet uri and return an object representing its constituents

Parameters:: uri – string with --unet uri syntax
Raises:: InvalidUNetUriError –
Returns:: UNetUri

Parameters:

model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
dtype – model data type (precision)

Raises:

InvalidUNetUriError – If model points to a single file, single file loads are not supported. Or if dtype is passed an invalid data type string.

load(variant_fallback: str | None = None, dtype_fallback: ~dgenerate.pipelinewrapper.enums.DataType = DataType.AUTO, original_config: str | None = None, use_auth_token: str | None = None, local_files_only: bool = False, no_cache: bool = False, device_map: str | None = None, unet_class=<class 'diffusers.models.unets.unet_2d_condition.UNet2DConditionModel'>)[source]

Load a UNet of type diffusers.UNet2DConditionModel

Parameters:

variant_fallback – If the URI does not specify a variant, use this variant.
dtype_fallback – If the URI does not specify a dtype, use this dtype.
original_config – Path to original model configuration for single file checkpoints, URL or .yaml file on disk.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug or blob link
no_cache – If True, force the returned object not to be cached by the memoize decorator.
device_map – device placement strategy for quantized models, defaults to None
unet_class – UNet class

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

diffusers.UNet2DConditionModel

FILE_ARGS = {'model': {'mode': 'dir'}}

NAMES = ['UNet']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32']}

property dtype: DataType | None: Model dtype (precision)

property model: str: Model path, huggingface slug, file path, or blob link

property quantizer: str | None: –quantizer URI override

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

Bases: object

Representation of --vae URI.

static help()[source]

static parse(uri: str) → VAEUri[source]

Parse a --vae uri and return an object representing its constituents

Parameters:: uri – string with --vae uri syntax
Raises:: InvalidVaeUriError –
Returns:: VAEUri

static supported_encoder_names() → list[str][source]

Parameters:

encoder – encoder class name, for example AutoencoderKL
model – model path
revision – model revision (branch name)
variant – model variant, for example fp16
subfolder – model subfolder
extract – Extract the VAE from a single file checkpoint that contains other models, such as a UNet or Text Encoders.
dtype – model data type (precision)

Raises:

InvalidVaeUriError – If dtype is passed an invalid data type string, or if model points to a single file and the specified encoder class name does not support loading from a single file.

load(dtype_fallback: DataType = DataType.AUTO, original_config: str | None = None, use_auth_token: str | None = None, local_files_only: bool = False, no_cache: bool = False, missing_ok: bool = False) → AutoencoderKL | AsymmetricAutoencoderKL | AutoencoderTiny | ConsistencyDecoderVAE | None[source]

Load a VAE of type diffusers.AutoencoderKL, diffusers.AsymmetricAutoencoderKL, diffusers.AutoencoderTiny, or diffusers.ConsistencyDecoderVAE from this URI

Parameters:

dtype_fallback – If the URI does not specify a dtype, use this dtype.
original_config – Path to original model configuration for single file checkpoints, URL or .yaml file on disk.
use_auth_token – optional huggingface auth token.
local_files_only – avoid downloading files and only look for cached files when the model path is a huggingface slug or blob link
no_cache – If True, force the returned object not to be cached by the memoize decorator.
missing_ok – If True, when a VAE is not found inside a single file checkpoint as a sub model, just return None instead of throwing an error.

Raises:

ModelNotFoundError – If the model could not be found.

Returns:

diffusers.AutoencoderKL, diffusers.AsymmetricAutoencoderKL, diffusers.AutoencoderTiny, or diffusers.ConsistencyDecoderVAE. May be None if missing_ok is True and no VAE was found.

FILE_ARGS = {'model': {'filetypes': [('Models', ['*.safetensors', '*.pt', '*.pth', '*.cpkt', '*.bin'])], 'mode': ['in', 'dir']}}

NAMES = ['VAE']

OPTION_ARGS = {'dtype': ['float16', 'bfloat16', 'float32'], 'encoder': ['AutoencoderKL', 'AsymmetricAutoencoderKL', 'AutoencoderTiny', 'ConsistencyDecoderVAE']}

property dtype: DataType | None: Model dtype (precision)

property encoder: str: Encoder class name such as “AutoencoderKL”

property extract: bool: Extract from a single file checkpoint containing multiple components?

property model: str: Model path, huggingface slug

property revision: str | None: Model repo revision

property subfolder: str | None: Model repo subfolder

property variant: str | None: Model repo revision

dgenerate.pipelinewrapper.call_pipeline(pipeline: DiffusionPipeline, device: device | str | None = 'cpu', prompt_weighter: PromptWeighter = None, **kwargs)[source]

Call a diffusers pipeline. If a different pipeline was called last, that pipeline is offloaded to the CPU before this one is called.

Parameters:

pipeline – The pipeline
device – The device to move the pipeline to before calling, if it is not already on that device. This argument is ignored for pipelines that cannot be moved to a specific device: sequentially offloaded pipelines cannot move at all, and CPU offloaded pipelines can only move to the CPU.
kwargs – diffusers pipeline keyword arguments
prompt_weighter – Optional prompt weighter for weighted prompt syntaxes

Raises:

dgenerate.OutOfMemoryError – if there is not enough memory on the specified device
UnsupportedPipelineConfigError – If the pipeline is missing certain required modules, such as text encoders.

Returns:

the result of calling the diffusers pipeline
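The "offload the previous pipeline" behavior can be sketched with plain Python objects. This is an illustration of the bookkeeping pattern only, not dgenerate's code: FakePipeline and call_tracked are hypothetical, and device names are treated as opaque strings.

```python
class FakePipeline:
    """Hypothetical stand-in for a diffusers pipeline."""

    def __init__(self, name):
        self.name = name
        self.device = "cpu"

    def to(self, device):
        self.device = device
        return self

    def __call__(self, **kwargs):
        return f"{self.name} ran on {self.device}"


_last_called = None  # module level reference to the last pipeline called


def call_tracked(pipeline, device="cpu", **kwargs):
    global _last_called
    # A different pipeline was called last: move it back to the CPU first.
    if _last_called is not None and _last_called is not pipeline:
        _last_called.to("cpu")
    # Move the requested pipeline only if it is not already on the device.
    if pipeline.device != device:
        pipeline.to(device)
    _last_called = pipeline
    return pipeline(**kwargs)
```

Calling the same pipeline repeatedly leaves it on its device, so the cost of moving weights is only paid when switching between pipelines.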

dgenerate.pipelinewrapper.create_diffusion_pipeline(model_path: str, model_type: ModelType = ModelType.SD, pipeline_type: PipelineType = PipelineType.TXT2IMG, revision: str | None = None, variant: str | None = None, subfolder: str | None = None, dtype: DataType = DataType.AUTO, unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, lora_fuse_scale: float | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, text_encoder_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, quantizer_uri: str | None = None, quantizer_map: Sequence[str] | None = None, pag: bool = False, safety_checker: bool = False, original_config: str | None = None, auth_token: str | None = None, device: str = 'cpu', extra_modules: dict[str, Any] | None = None, model_cpu_offload: bool = False, sequential_cpu_offload: bool = False, local_files_only: bool = False, missing_submodules_ok: bool = False) → PipelineCreationResult[source]

Create a diffusers.DiffusionPipeline in dgenerate’s in-memory caching system.

Parameters:

model_type – dgenerate.pipelinewrapper.ModelType enum value
model_path – huggingface slug, huggingface blob link, path to folder on disk, path to file on disk
pipeline_type – dgenerate.pipelinewrapper.PipelineType enum value
revision – huggingface repo revision (branch)
variant – model weights name variant, for example ‘fp16’
subfolder – huggingface repo subfolder if applicable
dtype – Optional dgenerate.pipelinewrapper.DataType enum value
unet_uri – Optional --unet URI string for specifying a specific UNet
transformer_uri – Optional --transformer URI string for specifying a specific Transformer, currently this is only supported for Stable Diffusion 3 and Flux models.
vae_uri – Optional --vae URI string for specifying a specific VAE
lora_uris – Optional --loras URI strings for specifying LoRA weights
lora_fuse_scale – Optional --lora-fuse-scale global LoRA fuse scale value. Once all LoRAs are merged with their individual scales, the merged weights will be fused into the pipeline at this scale. The default value is 1.0.
image_encoder_uri – Optional --image-encoder URI for use with IP Adapter weights or Stable Cascade
ip_adapter_uris – Optional --ip-adapters URI strings for specifying IP Adapter weights
textual_inversion_uris – Optional --textual-inversions URI strings for specifying Textual Inversion weights
text_encoder_uris – Optional user specified --text-encoders URIs that will be loaded on to the pipeline in order. A URI value of + or None means use the default encoder, and a string value of null means explicitly do not load any encoder at all.
controlnet_uris – Optional --control-nets URI strings for specifying ControlNet models
t2i_adapter_uris – Optional --t2i-adapters URI strings for specifying T2IAdapter models
quantizer_uri – Optional --quantizer URI value
quantizer_map – Collection of pipeline submodule names to which quantization should be applied when quantizer_uri is provided. Valid values include: unet, transformer, text_encoder, text_encoder_2, text_encoder_3, and controlnet. If None, all supported modules will be quantized, except for controlnet.
pag – Use perturbed attention guidance?
safety_checker – Safety checker enabled? default is False
original_config – Optional original training config .yaml file path when loading a single file checkpoint.
auth_token – Optional huggingface API token for accessing repositories that are restricted to your account
device – Optional --device string, defaults to “cpu”
extra_modules – Extra module arguments to pass directly into diffusers.DiffusionPipeline.from_single_file() or diffusers.DiffusionPipeline.from_pretrained()
model_cpu_offload – This pipeline has model_cpu_offloading enabled?
sequential_cpu_offload – This pipeline has sequential_cpu_offloading enabled?
local_files_only – Only look in the huggingface cache and do not connect to download models?
missing_submodules_ok – It is okay if Text Encoders or VAE is missing from the checkpoint?
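The default rule described for quantizer_map (quantize every supported module except controlnet when the argument is None) can be written out as a small selection function. This is a sketch of that rule only: resolve_quantizer_map and the present_modules argument are hypothetical and do not exist in dgenerate.

```python
from typing import Optional, Sequence

# Module names listed in the quantizer_map description.
VALID_MODULES = (
    "unet", "transformer", "text_encoder",
    "text_encoder_2", "text_encoder_3", "controlnet",
)


def resolve_quantizer_map(
    quantizer_map: Optional[Sequence[str]],
    present_modules: Sequence[str],
) -> list:
    """Return the modules of a pipeline that should be quantized."""
    if quantizer_map is None:
        # Default: every supported module except controlnet.
        return [m for m in present_modules
                if m in VALID_MODULES and m != "controlnet"]
    unknown = [m for m in quantizer_map if m not in VALID_MODULES]
    if unknown:
        raise ValueError(f"unknown quantizer_map entries: {unknown}")
    # Explicit map: only the named modules, controlnet included if listed.
    return [m for m in present_modules if m in quantizer_map]
```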

Raises:

InvalidModelFileError –
InvalidModelUriError –
InvalidSchedulerNameError –
UnsupportedPipelineConfigError –
dgenerate.ModelNotFoundError –
dgenerate.ConfigNotFoundError –
dgenerate.NonHFModelDownloadError –
dgenerate.NonHFConfigDownloadError –
dgenerate.WebFileCacheOfflineModeException –

Returns:

PipelineCreationResult

dgenerate.pipelinewrapper.destroy_last_called_pipeline(collect=True)[source]

Move to CPU and dereference the globally cached pipeline last called with call_pipeline().

This is a no-op if a pipeline has never been called with call_pipeline()

Parameters:: collect – call gc.collect and dgenerate.memory.torch_gc() if there is a pipeline to dereference?

dgenerate.pipelinewrapper.enable_model_cpu_offload(pipeline: DiffusionPipeline, device: device | str = 'cpu')[source]

Enable model CPU offload on a torch pipeline, in a way dgenerate can keep track of.

Parameters:

pipeline – the pipeline
device – the device

dgenerate.pipelinewrapper.enable_sequential_cpu_offload(pipeline: DiffusionPipeline, device: device | str = 'cpu')[source]

Enable sequential offloading on a torch pipeline, in a way dgenerate can keep track of.

Parameters:

pipeline – the pipeline
device – the device

dgenerate.pipelinewrapper.estimate_pipeline_cache_footprint(model_path: str, model_type: ModelType, revision: str = 'main', variant: str | None = None, subfolder: str | None = None, include_unet_or_transformer: bool = False, include_vae: bool = False, include_text_encoders: bool = False, lora_uris: Sequence[str] | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, safety_checker: bool = False, auth_token: str | None = None, extra_args: dict[str, Any] | None = None, local_files_only: bool = False) → int[source]

Estimate the CPU side cache memory use of a pipeline.

Unless the corresponding include_* argument is set, this does not include the UNet / Transformer, VAE, or Text Encoders, as those have their own individual caches.

Parameters:

model_path – huggingface slug, blob link, path to folder on disk, path to model file.
model_type – dgenerate.pipelinewrapper.ModelType
revision – huggingface repo revision if using a huggingface slug
variant – model file variant desired, for example “fp16”
subfolder – huggingface repo subfolder if using a huggingface slug. This is currently only supported for Stable Diffusion 3 and Flux models.
include_unet_or_transformer – Include the unet / transformer? Under most conditions this is loaded separately and put into a cache of its own.
include_text_encoders – Include text encoders? Under most conditions these are loaded separately and put into a cache of their own.
include_vae – Include VAE? Under most conditions this is loaded separately and put into a cache of its own.
lora_uris – optional user specified --loras URIs that will be loaded on to the pipeline
image_encoder_uri – optional user specified --image-encoder URI that will be loaded on to the pipeline
ip_adapter_uris – optional user specified --ip-adapters URIs that will be loaded on to the pipeline
textual_inversion_uris – optional user specified --textual-inversions URIs that will be loaded on to the pipeline
safety_checker – consider the safety checker? dgenerate usually loads the safety checker and then retroactively disables it if needed, so it usually considers the size of the safety checker model.
auth_token – optional huggingface auth token to access restricted repositories that your account has access to.
extra_args – extra arguments, as they would be passed to create_diffusion_pipeline()
local_files_only – Only ever attempt to look in the local huggingface cache? if False the huggingface API will be contacted when necessary.

Returns:

size estimate in bytes.

dgenerate.pipelinewrapper.get_compatible_schedulers(pipeline_cls: type[DiffusionPipeline]) → list[type[SchedulerMixin]][source]

Finds all compatible scheduler classes for a given diffusers pipeline class without instantiating it.

Parameters:: pipeline_cls – The pipeline class, for example diffusers.StableDiffusionPipeline

Returns:: A list of compatible scheduler class types

dgenerate.pipelinewrapper.get_data_type_enum(id_str: DataType | str | None) → DataType[source]

Convert a --dtype string to its DataType enum value

Parameters:: id_str – --dtype string
Raises:: ValueError – if an invalid string value (name) is passed
Returns:: DataType

dgenerate.pipelinewrapper.get_data_type_string(data_type_enum: DataType) → str[source]

Convert a DataType enum value to its --dtype string

Parameters:: data_type_enum – DataType value
Returns:: --dtype string
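The two conversion functions above are inverses of each other. The sketch below shows the round trip with a local enum. DType, dtype_from_string and dtype_to_string are hypothetical stand-ins; the member names other than AUTO, the None handling, and the case-insensitive matching are assumptions rather than documented behavior.

```python
from enum import Enum
from typing import Union


class DType(Enum):
    # Hypothetical stand-in for DataType; only AUTO appears in this section.
    AUTO = "auto"
    FLOAT16 = "float16"
    BFLOAT16 = "bfloat16"
    FLOAT32 = "float32"


def dtype_from_string(id_str: Union[DType, str, None]) -> DType:
    """String (or enum, or None) to enum value."""
    if isinstance(id_str, DType):
        return id_str  # already an enum value, pass it through
    if id_str is None:
        return DType.AUTO  # assumption: None falls back to AUTO
    try:
        return DType(id_str.strip().lower())
    except ValueError:
        raise ValueError(f"invalid dtype string: {id_str!r}") from None


def dtype_to_string(value: DType) -> str:
    """Enum value back to its command line string."""
    return value.value
```

Accepting an enum value as input makes the conversion safe to call on arguments that may or may not have been converted already.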

dgenerate.pipelinewrapper.get_last_called_pipeline() → DiffusionPipeline | None[source]

Get a reference to the globally cached pipeline last called with call_pipeline().

This value may be None if a pipeline was never called.

Returns:: diffusion pipeline object

dgenerate.pipelinewrapper.get_model_type_enum(id_str: ModelType | str) → ModelType[source]

Convert a --model-type string to its ModelType enum value

Parameters:: id_str – --model-type string
Raises:: ValueError – if an invalid string value (name) is passed
Returns:: ModelType

dgenerate.pipelinewrapper.get_model_type_string(model_type_enum: ModelType) → str[source]

Convert a ModelType enum value to its --model-type string

Parameters:: model_type_enum – ModelType value
Returns:: --model-type string

dgenerate.pipelinewrapper.get_pipeline_class(model_type: ModelType = ModelType.SD, pipeline_type: PipelineType = PipelineType.TXT2IMG, unet_uri: str | None = None, transformer_uri: str | None = None, vae_uri: str | None = None, lora_uris: Sequence[str] | None = None, image_encoder_uri: str | None = None, ip_adapter_uris: Sequence[str] | None = None, textual_inversion_uris: Sequence[str] | None = None, controlnet_uris: Sequence[str] | None = None, t2i_adapter_uris: Sequence[str] | None = None, pag: bool = False, help_mode: bool = False) → Type[DiffusionPipeline][source]

Get an appropriate diffusers pipeline class for the provided arguments.

Parameters:

model_type – dgenerate.pipelinewrapper.ModelType enum value
pipeline_type – dgenerate.pipelinewrapper.PipelineType enum value
unet_uri – Optional --unet URI string for specifying a specific UNet
transformer_uri – Optional --transformer URI string for specifying a specific Transformer, currently this is only supported for Stable Diffusion 3 and Flux models.
vae_uri – Optional --vae URI string for specifying a specific VAE
lora_uris – Optional --loras URI strings for specifying LoRA weights
image_encoder_uri – Optional --image-encoder URI for use with IP Adapter weights or Stable Cascade
ip_adapter_uris – Optional --ip-adapters URI strings for specifying IP Adapter weights
textual_inversion_uris – Optional --textual-inversions URI strings for specifying Textual Inversion weights
controlnet_uris – Optional --control-nets URI strings for specifying ControlNet models
t2i_adapter_uris – Optional --t2i-adapters URI strings for specifying T2IAdapter models
pag – Use perturbed attention guidance?
help_mode – Return the class even if it does not support the selected pipeline_type

Raises:

UnsupportedPipelineConfigError –

dgenerate.pipelinewrapper.get_pipeline_modules(pipeline: DiffusionPipeline)[source]

Get all component modules of a torch diffusers pipeline.

Parameters:: pipeline – the pipeline
Returns:: dictionary of modules by name

dgenerate.pipelinewrapper.get_pipeline_type_enum(id_str: PipelineType | str | None) → PipelineType[source]

Get a PipelineType enum value from a string.

Parameters:: id_str – one of: “txt2img”, “img2img”, or “inpaint”
Raises:: ValueError – if an invalid string value (name) is passed
Returns:: PipelineType

dgenerate.pipelinewrapper.get_pipeline_type_string(pipeline_type_enum: PipelineType)[source]

Convert a PipelineType enum value to a string.

Parameters:: pipeline_type_enum – PipelineType value
Returns:: one of: “txt2img”, “img2img”, or “inpaint”

dgenerate.pipelinewrapper.get_quantizer_uri_class(uri: str, exception: type[Exception] = <class 'dgenerate.pipelinewrapper.uris.util.UnknownQuantizerName'>)[source]

Get the URI parser class needed for a particular quantizer URI.

Parameters:

uri – The URI
exception – Exception type to raise on unsupported quantization backend.

Returns:

Class from dgenerate.pipelinewrapper.uris

dgenerate.pipelinewrapper.get_scheduler_help(pipeline_cls, help_args: bool = False, indent: int = 0)[source]

Generate a help string containing info about a pipeline class's compatible schedulers.

Parameters:

pipeline_cls – The pipeline class
help_args – Show individual scheduler arguments that can be specified via URI?
indent – Indent all text output by this amount of spaces.

Returns:

help string

dgenerate.pipelinewrapper.get_scheduler_uri_schema(scheduler: type[SchedulerMixin] | list[type[SchedulerMixin]])[source]

Return a schema describing initialization arguments from a diffusers scheduler type, or list of scheduler types.

This returns a set of schemas keyed by scheduler name, which are identical to the schema format returned by dgenerate.plugin.Plugin.get_accepted_args_schema().

Arguments which cannot be passed through a URI such as class references are omitted.

Parameters:: scheduler – diffusers scheduler type, or list of them.
Returns:: dict schema.

dgenerate.pipelinewrapper.get_torch_device(component: DiffusionPipeline | Module) → device[source]

Get the device that a pipeline or pipeline component exists on.

Parameters:: component – pipeline or pipeline component.
Returns:: torch.device

dgenerate.pipelinewrapper.get_torch_device_string(component: DiffusionPipeline | Module) → str[source]

Get the device string that a pipeline or pipeline component exists on.

Parameters:: component – pipeline or pipeline component.
Returns:: device string

dgenerate.pipelinewrapper.get_torch_dtype(dtype: DataType | dtype | str | None) → dtype | None[source]

Return a torch.dtype datatype from a DataType value, or a string, or a torch.dtype datatype itself.

Passing None results in None being returned.

Passing ‘auto’ or DataType.AUTO results in None being returned.

Parameters:: dtype – DataType, string, torch.dtype, None
Raises:: ValueError – if an invalid string value (name) is passed
Returns:: torch.dtype

dgenerate.pipelinewrapper.get_uri_accepted_args_schema(uri_cls: type) → dict[source]

Get the accepted arguments as a schema dict for a URI class.

This function introspects the __init__ method of the URI class and returns an argument schema.

The HIDE_ARGS static class attribute (list or set) can be used to hide arguments from the schema, in the same fashion as the dgenerate plugin API. This is useful for arguments that are not part of the parseable URI, but are used internally as supplemental arguments when parsing the URI. Arguments such as model_type (indicating parent model type for sub-models) etc. are examples of such arguments.

The OPTION_ARGS static class attribute (dict) can be used to specify that an argument's value must be one of a set of valid options, such as specific strings.

The FILE_ARGS static class attribute (dict) can be used to specify that an argument accepts a file or directory, and can contain metadata about the filetypes.

The static metadata attributes for URI classes are defined identically to plugins, aside from the absent loaded_by_name API. The schema output is identical to dgenerate plugin schema output.

Keyed by argument name, content keys include:

default contains any default value, this key may not exist if the argument has no default value.

types contains all accepted types for the argument in string form.

optional can the argument accept the value None?

options contains a list of valid options for the argument, if the argument is an option argument annotated with the OPTION_ARGS static class attribute. This list may contain None if the argument can accept the value None, i.e. it is optional.

files contains a dict with metadata about what sort of file types (and/or directory) the argument accepts if applicable.

Parameters:: uri_cls – The URI class to introspect, must have an __init__ method.
Returns:: dict
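
As a rough illustration of the kind of introspection described above, the following standalone sketch builds a schema dict from an `__init__` signature using only the standard library. It is not dgenerate's implementation: the function name `accepted_args_schema` and the `ExampleUri` class are hypothetical, only the `default`, `types`, `optional`, and `options` keys are produced, and the `files` key is omitted.

```python
import inspect
import types
import typing

# typing.Optional[...] and the "X | None" syntax report different origins
# on some Python versions, so accept both.
_UNION_ORIGINS = (typing.Union, getattr(types, 'UnionType', typing.Union))


def accepted_args_schema(uri_cls):
    """Build {arg_name: {...}} from the __init__ signature of uri_cls."""
    hidden = set(getattr(uri_cls, 'HIDE_ARGS', ()))
    options = getattr(uri_cls, 'OPTION_ARGS', {})
    hints = typing.get_type_hints(uri_cls.__init__)
    schema = {}
    for name, param in inspect.signature(uri_cls.__init__).parameters.items():
        if name == 'self' or name in hidden:
            continue
        hint = hints.get(name, typing.Any)
        if typing.get_origin(hint) in _UNION_ORIGINS:
            members = typing.get_args(hint)
        else:
            members = (hint,)
        optional = type(None) in members
        entry = {
            'types': [getattr(t, '__name__', str(t))
                      for t in members if t is not type(None)],
            'optional': optional,
        }
        # The 'default' key only exists when a default value is declared.
        if param.default is not inspect.Parameter.empty:
            entry['default'] = param.default
        if name in options:
            entry['options'] = list(options[name]) + ([None] if optional else [])
        schema[name] = entry
    return schema


class ExampleUri:
    # Hypothetical URI class used only to exercise the sketch.
    HIDE_ARGS = ['model_type']
    OPTION_ARGS = {'variant': ['fp16', 'fp32']}

    def __init__(self, model: str, variant: typing.Optional[str] = None,
                 scale: float = 1.0, model_type: str = 'sd'):
        self.model = model


schema = accepted_args_schema(ExampleUri)
```

Here `model_type` is dropped because it is listed in `HIDE_ARGS`, and `variant` gains an `options` list that includes `None` because its type hint is optional.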

dgenerate.pipelinewrapper.get_uri_help(uri_cls: type, wrap_width: int | None = None) → str | None[source]

Get the help text for a URI class.

This function introspects the help method of the URI class, if it exists.

Returns:: str

dgenerate.pipelinewrapper.get_uri_names(uri_cls: type) → list[source]

Return human names / loading names for a URI type.

Parameters:: uri_cls – the URI class
Returns:: list of names, guaranteed to be len() > 0

dgenerate.pipelinewrapper.is_model_cpu_offload_enabled(module: DiffusionPipeline | Module)[source]

Test if a pipeline or torch neural net module created by dgenerate has model cpu offload enabled.

Parameters:: module – the module object
Returns:: True or False

dgenerate.pipelinewrapper.is_sequential_cpu_offload_enabled(module: DiffusionPipeline | Module)[source]

Test if a pipeline or torch neural net module created by dgenerate has sequential offload enabled.

Parameters:: module – the module object
Returns:: True or False

dgenerate.pipelinewrapper.load_scheduler(pipeline: DiffusionPipeline, scheduler_uri: str | None)[source]

Load a specific compatible scheduler class name onto a huggingface diffusers pipeline object.

Passing None as the URI reloads the original scheduler that the pipeline was loaded with. If no new scheduler has been set since then, this is a no-op.

Raises:

InvalidSchedulerNameError – If an invalid scheduler name is specified.
SchedulerArgumentError – If invalid arguments are supplied to the scheduler via the URI.

Parameters:

pipeline – pipeline object
scheduler_uri – Compatible scheduler URI.

dgenerate.pipelinewrapper.model_type_is_floyd(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Floyd “if” or “ifs” type model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_floyd_if(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Floyd “if” type model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_floyd_ifs(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Floyd “ifs” type model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_flux(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Flux model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_kolors(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Kolors model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_pix2pix(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a pix2pix type model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_s_cascade(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent a Stable Cascade related model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_sd15(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent an SD1.5 model?

These model types may also be able to load SD2 checkpoints, specifically: ModelType.SD can.

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_sd2(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent an SD 2.X compatible model?

These model types may also be able to load SD1.5 checkpoints, specifically: ModelType.SD can.

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_sd3(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent an SD3 model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_sdxl(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent an SDXL model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.model_type_is_upscaler(model_type: ModelType | str) → bool[source]

Does a --model-type string or ModelType enum value represent an upscaler model?

Parameters:: model_type – --model-type string or ModelType enum value
Returns:: bool

dgenerate.pipelinewrapper.pipeline_to(pipeline, device: device | str | None)[source]

Move a diffusers pipeline to a device if possible, in a way that dgenerate can keep track of.

This calls methods associated with updating the cache statistics such as dgenerate.pipelinewrapper.pipeline_off_cpu_update_cache_info() and dgenerate.pipelinewrapper.pipeline_to_cpu_update_cache_info() for you, as well as the associated cache update functions for the pipeline's individual components as needed.

If device==None this is a no-op.

Modules which are meta tensors will not be moved (sequentially offloaded modules)

Modules which have model cpu offload enabled will not be moved unless they are moving to “cpu”

Raises:

dgenerate.OutOfMemoryError – if there is not enough memory on the specified device

Parameters:

pipeline – the pipeline
device – the device

Returns:

the moved pipeline

dgenerate.pipelinewrapper.quantizer_help(names: Sequence[str], throw: bool = False, log_error: bool = True) → int[source]

Implements --quantizer-help command line argument.

Parameters:

names – backend names, may be an empty list.
throw – Raise UnknownQuantizerName instead of handling unknown names?
log_error – Log errors to stderr?

Returns:

return code, 0 for success, 1 for failure

dgenerate.pipelinewrapper.scheduler_is_help(name: str | None)[source]

Is this scheduler URI simply a request for help, i.e. "help" or "helpargs"?

Parameters:: name – string to test
Returns:: True or False

dgenerate.pipelinewrapper.scheduler_is_help_args(name: str | None)[source]

Is this scheduler URI explicitly requesting argument help, i.e. "helpargs"?

Parameters:: name – string to test
Returns:: True or False

dgenerate.pipelinewrapper.set_vae_tiling_and_slicing(pipeline: DiffusionPipeline, tiling: bool, slicing: bool)[source]

Set the vae_slicing and vae_tiling status on a diffusers pipeline.

Raises:

UnsupportedPipelineConfigError – if the pipeline does not support one or both of the provided values for vae_tiling and vae_slicing

Parameters:

pipeline – pipeline object
tiling – tiling status
slicing – slicing status

dgenerate.pipelinewrapper.supported_data_type_enums() → list[DataType][source]: Return a list of supported DataType enum values

dgenerate.pipelinewrapper.supported_data_type_strings()[source]: Return a list of supported --dtype strings

dgenerate.pipelinewrapper.supported_model_type_enums() → list[ModelType][source]: Return a list of supported ModelType enum values

dgenerate.pipelinewrapper.supported_model_type_strings()[source]: Return a list of supported --model-type strings

dgenerate.pipelinewrapper.text_encoder_help(pipeline_class: type[DiffusionPipeline], indent: int = 0) → str[source]

Describe compatible text encoders for a pipeline in terms of --text-encoders argument position and type.

Parameters:

pipeline_class – Diffusers pipeline class
indent – Text indent level

Returns:

help string

dgenerate.pipelinewrapper.text_encoder_is_help(text_encoder_uris: Sequence[str] | None)[source]

Is the text encoder URIs specification simply a request for help, i.e. "help"?

Parameters:: text_encoder_uris – list of text encoder URIs to test
Returns:: True or False

dgenerate.pipelinewrapper.uri_hash_with_parser(parser, exclude: set[str] | None = None)[source]

Create a hash function from a particular URI parser function that hashes a URI string.

The URI is parsed and then the object that results from parsing is hashed with dgenerate.memoize.property_hasher().

If the parser returns a string, it is regarded as the hash value instead of being passed to dgenerate.memoize.property_hasher().

Parameters:

parser – The URI parser function
exclude – URI argument names to exclude from hashing

Returns:

a hash function compatible with dgenerate.memoize.memoize()
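
The parse-then-hash behavior described above can be sketched without dgenerate as follows. This is an approximation under stated assumptions: `make_uri_hasher`, `ParsedUri`, and `parse_example_uri` are hypothetical names, the `model;key=value` URI syntax is assumed for the example, and a sorted `repr` of public attributes stands in for dgenerate.memoize.property_hasher().

```python
def make_uri_hasher(parser, exclude=None):
    """Return a function that hashes a URI string via the parsed object."""
    excluded = set(exclude or ())

    def hasher(uri):
        parsed = parser(uri)
        # A parser that returns a string supplies the hash value directly.
        if isinstance(parsed, str):
            return parsed
        # Stand-in for a property hasher: stable text built from the
        # public attributes that are not excluded.
        props = {k: v for k, v in vars(parsed).items()
                 if not k.startswith('_') and k not in excluded}
        return '{' + ', '.join(f'{k}={v!r}' for k, v in sorted(props.items())) + '}'

    return hasher


class ParsedUri:
    # Hypothetical parse result.
    def __init__(self, model, scale, token=None):
        self.model = model
        self.scale = scale
        self.token = token


def parse_example_uri(uri):
    # Assumed "model;key=value;..." syntax, for illustration only.
    model, *pairs = uri.split(';')
    args = dict(pair.split('=', 1) for pair in pairs)
    return ParsedUri(model, float(args.get('scale', 1.0)), args.get('token'))


hasher = make_uri_hasher(parse_example_uri, exclude={'token'})
```

Excluding an argument such as `token` means two URIs that differ only in that argument hash identically, which is the point of the `exclude` parameter when the hash is used as a cache key.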

dgenerate.pipelinewrapper.uri_list_hash_with_parser(parser, exclude: set[str] | None = None)[source]

Create a hash function from a particular URI parser function that hashes a list of URIs.

Parameters:

parser – The URI parser function
exclude – URI argument names to exclude from hashing

Returns:

a hash function compatible with dgenerate.memoize.memoize()

dgenerate.pipelinewrapper.constants module

dgenerate.plugin module

URI based plugin loading system base implementations.

exception dgenerate.plugin.ModuleFileNotFoundError[source]

Bases: FileNotFoundError

Raised by load_modules() if a module could not be found on disk.

exception dgenerate.plugin.PluginArgumentError[source]

Bases: Exception

Raised when a plugin encounters an error in the arguments it is loaded by.

Or errors in arguments used for execution.

exception dgenerate.plugin.PluginNotFoundError[source]

Bases: Exception

Raised when a plugin could not be located by a name.

class dgenerate.plugin.Plugin(loaded_by_name: str | None = None, argument_error_type: type[~dgenerate.plugin.PluginArgumentError] = <class 'dgenerate.plugin.PluginArgumentError'>, **kwargs)[source]

Bases: object

classmethod get_accepted_args(loaded_by_name: str, include_bases: bool = False)[source]

Retrieve the argument signature of a plugin implementation.

Parameters:

loaded_by_name – The name used to load the plugin. Argument signature may vary by name used to load.
include_bases – Include all base classes except Plugin?

Returns:

List of argument descriptors, PluginArg

classmethod get_accepted_args_schema(loaded_by_name: str, include_bases: bool = False)[source]

Reduce the accepted arguments to a schema dict.

Keyed by argument name, content keys include:

default contains any default value, this key may not exist if the argument has no default value.

types contains all accepted types for the argument in string form.

optional can the argument accept the value None?

Parameters:

loaded_by_name – Plugin loaded by name
include_bases – Include all base classes except Plugin?

Returns:

dict

classmethod get_bases() → list[Type[Plugin]][source]

Return a list of base classes, except for Plugin

Returns:: list of class type objects

classmethod get_default_args(loaded_by_name: str) → list[PluginArg][source]

Get the names and values of arguments for this plugin that possess default values.

Parameters:: loaded_by_name – The name used to load the plugin. Default arguments may vary by name used to load.
Returns:: list of arguments with default value: (name, value)

classmethod get_file_args(loaded_by_name: str) → dict[str, dict[str, Any]][source]

Get argument names that have an associated list of valid file types, i.e. FILE_ARGS.

This returns metadata provided by the plugin about specific arguments which accept a limited set of file types, such as model file types or config file types.

FILE_ARGS should be a dictionary where keys are argument names, for example:

FILE_ARGS = {
    "model": {
        "mode": "in/out",
        "filetypes": [('Model', ['*.safetensors', '*.pt'])]
    }
}

# only accepts directories

FILE_ARGS = {
    "model": {
        "mode": "dir"
    }
}

# accepts input files or directories

FILE_ARGS = {
    "model": {
        "mode": ["in", "dir"],
        "filetypes": [('Model', ['*.safetensors', '*.pt'])]
    }
}

# output files or directories
# (mutually exclusive with "in" mode)

FILE_ARGS = {
    "model": {
        "mode": ["out", "dir"],
        "filetypes": [('Model', ['*.safetensors', '*.pt'])]
    }
}

If the plugin supports multiple names for loading, this can be a dict of dicts, where the outer keys are the names used to load the plugin, and the inner keys are the argument names.

FILE_ARGS = {
    "plugin_name": {
        "model": {
            "mode": "in/out",
            "filetypes": [('Model', ['*.safetensors', '*.pt'])]
        }
    }
}

These values can be inherited.

Parameters:: loaded_by_name – The name used to load the plugin. Argument signature may vary by name used to load.
Returns:: file argument metadata, keyed by argument name
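
One way a consumer might use FILE_ARGS metadata is to check a file name against the declared `filetypes` patterns. The sketch below assumes the `(label, [patterns])` layout shown in the examples above and uses the standard library `fnmatch` module; `matches_filetypes` is a hypothetical helper, not part of the plugin API, and case-insensitive matching is a choice made for this example.

```python
import fnmatch

FILE_ARGS = {
    'model': {
        'mode': 'in',
        'filetypes': [('Model', ['*.safetensors', '*.pt'])],
    },
    # Directory-only argument, so no filetypes are declared.
    'cache': {
        'mode': 'dir',
    },
}


def matches_filetypes(meta, filename):
    """True if filename matches any glob pattern declared in meta."""
    patterns = [pattern
                for _label, pattern_list in meta.get('filetypes', [])
                for pattern in pattern_list]
    return any(fnmatch.fnmatchcase(filename.lower(), pattern.lower())
               for pattern in patterns)


ok = matches_filetypes(FILE_ARGS['model'], 'weights.safetensors')
```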

classmethod get_help(loaded_by_name: str, wrap_width: int | None = None, include_bases: bool = False) → str[source]

Get formatted help information about the plugin.

This includes any implemented help strings and an auto formatted description of the plugin's accepted arguments.

Parameters:

loaded_by_name – The name used to load the plugin. Help may vary depending on how many names the plugin implementation handles and what loading it by a certain name does.
wrap_width – wrap paragraphs to this width.
include_bases – include argument names and inherited help from base classes?

Returns:

Formatted help string

classmethod get_hidden_args(loaded_by_name: str) → set[str][source]

Get argument names that have been explicitly hidden from use or disabled by the plugin for URI use. i.e. HIDE_ARGS.

These may be unsupported arguments inherited from a base class, or just arguments the plugin does not want you to use via a URI.

These arguments can still be passed manually from code in the interest of maintaining a generic interface, but they may be ignored by the plugin implementation.

HIDE_ARGS = ['hide-me', 'dont-use-me']

# or

HIDE_ARGS = {
    "plugin_name": ['hide-me', 'dont-use-me'],
    "alt_plugin_name": ['hide-me-2']
}

Parameters:: loaded_by_name – The name used to load the plugin. Argument signature may vary by name used to load.
Returns:: hidden argument names

classmethod get_names() → list[str][source]

Get the names that this class can be loaded by.

Returns:

classmethod get_option_args(loaded_by_name: str) → dict[str, list][source]

Get argument names that have an associated list of valid option values, i.e. OPTION_ARGS.

This returns metadata provided by the plugin about specific arguments which accept a limited set of values, such as shape names like circle or rectangle etc.

OPTION_ARGS should be in the form of a dict, where the keys are argument names, and the values are lists of valid values for that argument.

OPTION_ARGS = {
    "shape": ['circle', 'rectangle', 'triangle'],
}

If the plugin supports multiple names for loading, this can be a dict of dicts, where the outer keys are the names used to load the plugin, and the inner keys are the argument names.

OPTION_ARGS = {
    "plugin_name": {
        "shape": ['circle', 'rectangle', 'triangle']
    }
}

These values can be inherited.

Parameters:: loaded_by_name – The name used to load the plugin. Argument signature may vary by name used to load.
Returns:: options argument names
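
A sketch of how the flat and per-name OPTION_ARGS forms might be told apart and then used to validate a value is shown below. The helper names are hypothetical, and the rule used to detect the nested form (every outer key is one of the plugin's loading names) is an assumption of this example rather than documented behavior.

```python
def options_for(option_args, names, loaded_by_name):
    """Return {arg_name: valid_values} for one loading name."""
    # Assumed rule: the dict is nested if all outer keys are loading names.
    if option_args and all(key in names for key in option_args):
        return dict(option_args.get(loaded_by_name, {}))
    return dict(option_args)


def is_valid_option(option_args, names, loaded_by_name, arg, value):
    """True if arg has no option list, or value is in its option list."""
    valid = options_for(option_args, names, loaded_by_name).get(arg)
    return valid is None or value in valid


names = ['draw', 'paint']
flat = {'shape': ['circle', 'rectangle', 'triangle']}
nested = {
    'draw': {'shape': ['circle', 'rectangle', 'triangle']},
    'paint': {'shape': ['circle']},
}

paint_options = options_for(nested, names, 'paint')
```

Note that this detection rule is ambiguous if an argument happens to share a name with the plugin, which is one reason to treat it as a sketch only.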

classmethod get_required_args(loaded_by_name: str) → list[PluginArg][source]

Get a list of required arguments for this plugin class.

Parameters:: loaded_by_name – The name used to load the plugin. Required arguments may vary by name used to load.
Returns:: list of argument names

__init__(loaded_by_name: str | None = None, argument_error_type: type[~dgenerate.plugin.PluginArgumentError] = <class 'dgenerate.plugin.PluginArgumentError'>, **kwargs)[source]

Parameters:

loaded_by_name – The name the plugin was loaded by, will be passed by the loader. If None is passed, the first name mentioned by the plugin implementation will be used. This can simplify using some plugin classes directly without loading them through a loader implementation.
argument_error_type – This exception type will be raised upon argument errors (invalid arguments) when loading a plugin using a PluginLoader implementation. It should match the argument_error_type given to the PluginLoader implementation being used to load the inheritor of this class.
kwargs – Additional arguments that may arise when using an ARGS static signature definition with multiple NAMES in your implementation.

argument_error(msg: str)[source]

Return a constructed exception that is suitable for raising as an argument error for this plugin.

Example: raise self.argument_error('oops!')

Parameters:: msg – exception message
Returns:: the exception object, you must raise it.
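
The pattern of returning (rather than raising) an exception of a configurable type can be sketched as follows. The classes below are local stand-ins written for this example, not dgenerate's classes; `ImageProcessorArgumentError` and `Blur` are hypothetical.

```python
class PluginArgumentError(Exception):
    """Local stand-in for the base argument error type."""


class ImageProcessorArgumentError(PluginArgumentError):
    """Hypothetical, more specific error type chosen by a loader."""


class SketchPlugin:
    def __init__(self, loaded_by_name='sketch',
                 argument_error_type=PluginArgumentError):
        self._loaded_by_name = loaded_by_name
        self._argument_error_type = argument_error_type

    def argument_error(self, msg):
        # Construct and return the exception; the caller must raise it.
        return self._argument_error_type(msg)


class Blur(SketchPlugin):
    def __init__(self, radius=1, **kwargs):
        super().__init__(**kwargs)
        if radius < 0:
            raise self.argument_error('radius must be >= 0')
        self.radius = radius


try:
    Blur(radius=-1, argument_error_type=ImageProcessorArgumentError)
except PluginArgumentError as error:
    caught = error
```

Because the plugin never names a concrete exception class, the same implementation raises whatever error type the code that loaded it expects to catch.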

property loaded_by_name: str

The name the plugin was loaded by.

Returns:: name

class dgenerate.plugin.PluginArg(name: str, type: type = typing.Any, **kwargs)[source]

Bases: object

__init__(name: str, type: type = typing.Any, **kwargs)[source]

name_dashdown() → PluginArg[source]

name_dashup() → PluginArg[source]

parse_by_type(value: str | Any)[source]

type_string()[source]

validate_non_parsed(value: Any)[source]

property base_type: type

property hinted_optional_type

property is_hinted_optional

class dgenerate.plugin.PluginLoader(base_class=<class 'dgenerate.plugin.Plugin'>, description: str = 'plugin', reserved_args: list[~dgenerate.plugin.PluginArg] | None = None, argument_error_type: type[~dgenerate.plugin.PluginArgumentError] = <class 'dgenerate.plugin.PluginArgumentError'>, not_found_error_type: type[~dgenerate.plugin.PluginNotFoundError] = <class 'dgenerate.plugin.PluginNotFoundError'>)[source]

Bases: object

__init__(base_class=<class 'dgenerate.plugin.Plugin'>, description: str = 'plugin', reserved_args: list[~dgenerate.plugin.PluginArg] | None = None, argument_error_type: type[~dgenerate.plugin.PluginArgumentError] = <class 'dgenerate.plugin.PluginArgumentError'>, not_found_error_type: type[~dgenerate.plugin.PluginNotFoundError] = <class 'dgenerate.plugin.PluginNotFoundError'>)[source]

Parameters:

base_class – Base class of plugins, will be used for searching modules.
description – Short plugin description / name, used in exception messages.
reserved_args – Constructor arguments that are used by the plugin class which cannot be redefined by implementors of the plugin class. This should be a list of plugin argument descriptors, PluginArg
argument_error_type – This exception type will be raised when the plugin is loaded with invalid URI arguments.
not_found_error_type – This exception type will be raised when a plugin could not be located by a name specified in a loading URI.

add_class(cls: type[Plugin])[source]

Add an implementation class to this loader.

Raises:: RuntimeError – If the added class specifies a name that already exists in this loader.
Parameters:: cls – the class

add_search_module(module: ModuleType) → list[type[Plugin]][source]

Directly add a module object that will be searched for implementations.

Parameters:: module – the module object
Raises:: ValueError – If module is not a python module object.
Returns:: list of classes that were newly discovered

add_search_module_string(string: str) → list[type[Plugin]][source]

Add a module string (in sys.modules) that will be searched for implementations.

Parameters:: string – the module string
Returns:: list of classes that were newly discovered

get_accepted_args_schema(include_bases: bool = False) → dict[str, dict[str, Any]][source]

Get a Plugin.get_accepted_args_schema() for every plugin class, keyed by callable plugin name.

Parameters:: include_bases – Include base class arguments? This excludes the base Plugin
Returns:: dict

get_all_names() → Sequence[str][source]

Get all plugin names that this loader can see.

Returns:: list of names (strings)

get_available_classes() → list[type[Plugin]][source]

Get classes seen by this plugin loader.

Returns:: list of classes (types)

get_class_by_name(plugin_name: str) → type[Plugin][source]

Get a plugin class by one of its names.

That is, one of the names listed in its NAMES static attribute.

Parameters:: plugin_name – a name associated with a plugin class
Raises:: PluginNotFoundError – If the plugin name could not be found.
Returns:: class (type)
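
The name-registration behavior described for add_class() and get_class_by_name() can be approximated with a small registry. This standalone sketch uses local stand-in classes and is not the PluginLoader implementation; `SketchLoader`, `Blur`, and `OtherBlur` are hypothetical.

```python
class PluginNotFoundError(Exception):
    """Local stand-in for the exception of the same name."""


class SketchLoader:
    def __init__(self):
        self._classes_by_name = {}

    def add_class(self, cls):
        # Check every name first so a conflict leaves the registry untouched.
        for name in cls.NAMES:
            if name in self._classes_by_name:
                raise RuntimeError(f'plugin name already exists: {name}')
        for name in cls.NAMES:
            self._classes_by_name[name] = cls

    def get_class_by_name(self, plugin_name):
        try:
            return self._classes_by_name[plugin_name]
        except KeyError:
            raise PluginNotFoundError(
                f'plugin not found: {plugin_name}') from None

    def get_all_names(self):
        return list(self._classes_by_name)


class Blur:
    NAMES = ['blur', 'gaussian-blur']


class OtherBlur:
    NAMES = ['blur']


loader = SketchLoader()
loader.add_class(Blur)

try:
    loader.add_class(OtherBlur)
except RuntimeError as error:
    duplicate_error = error
```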

get_help(plugin_name: str, wrap_width: int | None = None, include_bases: bool = False) → str[source]

Get a formatted help string for a plugin by one of its loadable names.

Parameters:

plugin_name – a name associated with the plugin class
wrap_width – wrap paragraphs to this width.
include_bases – include argument names and inherited help from base classes?

Raises:

PluginNotFoundError – If the plugin name could not be found.

Returns:

formatted string

load(uri: str, **kwargs) → Plugin[source]

Load a plugin using a URI string containing its name and arguments.

Parameters:

uri – The URI string
kwargs – default argument values, will be overridden by arguments specified in the URI

Raises:

ValueError – If uri is None
RuntimeError – If a plugin is discovered to be using a reserved argument name upon loading it.
dgenerate.plugin.PluginArgumentError – If there is an error in the loading arguments for the plugin.
dgenerate.plugin.PluginNotFoundError – If the plugin name mentioned in the URI could not be found.

Returns:

plugin instance
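
The precedence rule for kwargs (defaults that are overridden by URI arguments) can be illustrated with a short sketch. The `name;key=value` URI syntax, the `parse_uri` and `load` helpers, and the `Resize` class are assumptions made for this example; argument values stay as strings because no type parsing is attempted here.

```python
class PluginNotFoundError(Exception):
    """Local stand-in for the exception of the same name."""


def parse_uri(uri):
    """Split an assumed 'name;key=value;...' URI into (name, args)."""
    if uri is None:
        raise ValueError('uri must not be None')
    name, *pairs = [part.strip() for part in uri.split(';')]
    args = {}
    for pair in pairs:
        key, separator, value = pair.partition('=')
        if not separator:
            raise ValueError(f'malformed URI argument: {pair!r}')
        args[key.strip()] = value.strip()
    return name, args


def load(registry, uri, **kwargs):
    """Instantiate the class registered under the name found in the URI."""
    name, uri_args = parse_uri(uri)
    if name not in registry:
        raise PluginNotFoundError(f'plugin not found: {name}')
    # Later entries win, so URI arguments override the kwargs defaults.
    merged = {**kwargs, **uri_args}
    return registry[name](**merged)


class Resize:
    # Hypothetical plugin class.
    def __init__(self, size='512', device='cpu'):
        self.size = size
        self.device = device


plugin = load({'resize': Resize}, 'resize;size=256', device='cuda', size='64')
```

In this run `size` comes from the URI and `device` comes from the keyword default, since the URI does not mention it.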

load_plugin_modules(paths: Iterable[str]) → list[type[Plugin]][source]

Modules that will be loaded from disk, or the python environment, and searched for implementations.

Either python files, or module directories containing __init__.py, or names of python modules installed in the environment.

It can be a mix of these.

Raises:: ModuleFileNotFoundError – If a module path could not be found on disk, or when a module could not be loaded from the python environment.
Parameters:: paths – list of folder/file paths, or references to python modules installed in the environment
Returns:: list of classes that were newly discovered

loader_help(names: Sequence[str], plugin_module_paths: Sequence[str] | None = None, title='plugin', title_plural='plugins', throw=False, log_error=True, include_bases: bool = False)[source]

Implements --sub-command-help and --image-processor-help command line options for example.

Parameters:

names – arguments (sub-command names, or empty list)
plugin_module_paths – extra plugin module paths to search
title – plugin title, used in messages
title_plural – plural plugin title, used in messages
throw – throw on error?
log_error – log errors to stderr?
include_bases – include argument names from base classes?

Raises:

PluginNotFoundError – names contained an unknown plugin name
ModuleFileNotFoundError – plugin_module_paths contained a missing module

Returns:

return-code, anything other than 0 is failure

property plugin_module_paths: frozenset[str]

Every module path ever seen by PluginLoader.load_plugin_modules().

Returns:: frozen set

dgenerate.plugin.import_plugins(paths: Iterable[str])[source]

Set plugin paths that will be considered by all plugin loader instances.

Parameters:: paths – environment modules, python script paths, directory paths

dgenerate.plugin.load_modules(paths: Iterable[str]) → list[ModuleType][source]

Load python modules from a folder, directly from a .py file, or from a python module installed in the environment. Cache them so that repeat requests for loading return an already loaded module.

Raises:: ModuleFileNotFoundError – If a module path could not be found on disk, or when a module could not be loaded from the python environment.
Parameters:: paths – list of folder/file paths, or references to python modules installed in the environment
Returns:: list of types.ModuleType
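
Loading a module from a file path or from the environment, with a cache for repeat requests, can be sketched with the standard library `importlib` machinery. This is a simplified illustration and not the load_modules() implementation: `load_module` and `_MODULE_CACHE` are hypothetical names, it handles one path at a time, and package directories that rely on relative imports would need additional handling.

```python
import importlib
import importlib.util
import os
import tempfile

_MODULE_CACHE = {}


class ModuleFileNotFoundError(FileNotFoundError):
    """Local stand-in for the exception of the same name."""


def load_module(path_or_name):
    """Load a module from a .py file, a package folder, or by import name."""
    if path_or_name in _MODULE_CACHE:
        return _MODULE_CACHE[path_or_name]
    if os.path.isdir(path_or_name):
        file_path = os.path.join(path_or_name, '__init__.py')
    else:
        file_path = path_or_name
    if os.path.isfile(file_path):
        base = os.path.basename(os.path.normpath(path_or_name))
        module_name = os.path.splitext(base)[0]
        spec = importlib.util.spec_from_file_location(module_name, file_path)
        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)
    else:
        try:
            module = importlib.import_module(path_or_name)
        except ImportError as error:
            raise ModuleFileNotFoundError(
                f'module not found: {path_or_name}') from error
    _MODULE_CACHE[path_or_name] = module
    return module


with tempfile.TemporaryDirectory() as folder:
    plugin_file = os.path.join(folder, 'my_plugin.py')
    with open(plugin_file, 'w') as handle:
        handle.write('VALUE = 42\n')
    from_file = load_module(plugin_file)
    from_cache = load_module(plugin_file)

from_env = load_module('json')
```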

dgenerate.plugin.LOADED_PLUGIN_MODULES: dict[str, ModuleType] = {}: In-memory cache of loaded plugin modules

dgenerate.plugin.PLUGIN_PATHS = {}

Plugin paths that are considered by all PluginLoader instances.

This should be updated with import_plugins()

dgenerate.prompt module

Prompt representation object / prompt parsing.

exception dgenerate.prompt.PromptEmbeddedArgumentError[source]

Bases: Exception

Error involving a prompt embedded argument other than weighter

class dgenerate.prompt.Prompt

Bases: object

Represents a combined positive and optional negative prompt split by a delimiter character.

static copy(prompt: Prompt)[source]

Return a copy of another prompt.

Parameters:: prompt – The prompt to copy.
Returns:: A copy of the provided prompt.

static parse(value: str, delimiter=';', parse_embedded_args: bool = True, embedded_arg_names: list[str] | None = None) → Prompt[source]

Parse the positive and negative prompt from a string and return a prompt object.

Parameters:

value – the string
delimiter – The prompt delimiter character
parse_embedded_args – parse embedded args? < arg: value >
embedded_arg_names – list of embedded argument names to parse, if None, all are parsed.

Raises:

ValueError – if value is None

Returns:

Prompt (returns self)
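
The parsing described above (a positive and optional negative prompt split by a delimiter, plus embedded `<arg: value>` arguments) can be approximated as follows. This is a sketch under stated assumptions, not the Prompt.parse() implementation: the regular expression, the whitespace normalization, splitting at the first delimiter only, and the tuple return value are all choices made for this example.

```python
import re

# Assumed shape of an embedded argument: <name: value>
_EMBEDDED_ARG = re.compile(r'<\s*([A-Za-z_][\w-]*)\s*:\s*(.*?)\s*>')


def parse_prompt(value, delimiter=';', parse_embedded_args=True):
    """Return (positive, negative, embedded_args) parsed from a string."""
    if value is None:
        raise ValueError('value must not be None')
    embedded = []
    if parse_embedded_args:
        # Collect the arguments, then strip them from the prompt text.
        embedded = _EMBEDDED_ARG.findall(value)
        value = _EMBEDDED_ARG.sub('', value)
    positive, _, negative = value.partition(delimiter)
    positive = ' '.join(positive.split())
    negative = ' '.join(negative.split())
    return positive or None, negative or None, embedded


parsed = parse_prompt('a cat <steps: 30>; blurry')
```

Removing the embedded arguments before splitting keeps a delimiter character inside an argument value from being mistaken for the positive/negative separator.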

Parameters:

positive – positive prompt component.
negative – negative prompt component.
delimiter – delimiter for stringification.
weighter – --prompt-weighter plugin URI.
upscaler – --prompt-upscaler plugin URI.
embedded_args – embedded prompt arguments parsed from <argument: value_text>.

__str__()[source]: Return str(self).

copy_embedded_args_from(prompt: Prompt)[source]

set_embedded_args_on(on_object: Any, forbidden_checker: Callable[[str, Any], bool] | None = None, validate_only: bool = False)[source]

Set the other embedded arguments parsed out of a prompt on to an object.

The object should be type hinted using types from dgenerate.types

Specifically, any of:

dgenerate.types.Size
dgenerate.types.Sizes
dgenerate.types.OptionalSize
dgenerate.types.OptionalSizes
dgenerate.types.Padding
dgenerate.types.Paddings
dgenerate.types.OptionalPadding
dgenerate.types.OptionalPaddings
dgenerate.types.Boolean
dgenerate.types.OptionalBoolean
dgenerate.types.Float
dgenerate.types.Floats
dgenerate.types.OptionalFloat
dgenerate.types.OptionalFloats
dgenerate.types.Integer
dgenerate.types.Integers
dgenerate.types.OptionalInteger
dgenerate.types.OptionalIntegers
dgenerate.types.String
dgenerate.types.Strings
dgenerate.types.OptionalString
dgenerate.types.OptionalStrings
dgenerate.types.Name
dgenerate.types.Names
dgenerate.types.OptionalName
dgenerate.types.OptionalNames
dgenerate.types.Uri
dgenerate.types.Uris
dgenerate.types.OptionalUri
dgenerate.types.OptionalUris

Raises:

PromptEmbeddedArgumentError – If there was a problem applying the embedded arguments to the object.

Parameters:

on_object – The object to set values on.
forbidden_checker – This is a function that should return True if an argument name / value is forbidden to use.
validate_only – Only run validation and do not set any values?
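
A forbidden_checker is any callable matching Callable[[str, Any], bool]. The sketch below shows one possible checker; forbid_underscored and the sample argument list are hypothetical, and what happens to a forbidden argument inside set_embedded_args_on is not shown here.

```python
from typing import Any


def forbid_underscored(name: str, value: Any) -> bool:
    """Return True when the argument must not be used."""
    return name.startswith("_")


# Hypothetical embedded arguments, shaped like Prompt.embedded_args.
embedded_args = [("width", "512"), ("_secret", "1")]

# Names the checker would reject.
forbidden = [name for name, value in embedded_args if forbid_underscored(name, value)]

# With a real Prompt object the checker would be passed as:
#   prompt.set_embedded_args_on(target, forbidden_checker=forbid_underscored)
```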

property delimiter: str: Positive / Negative delimiter for this prompt, for example “;”

property embedded_args: list[tuple[str, str]]: Other embedded arguments parsed out of the prompt.

property negative: str | None: Negative prompt value.

property positive: str | None: Positive prompt value.

property upscaler: str | Sequence[str] | None: Embedded prompt upscaler URI argument for this prompt if any.

property weighter: str | None: Embedded prompt weighter URI argument for this prompt if any.

dgenerate.promptupscalers module

exception dgenerate.promptupscalers.PromptUpscalerArgumentError[source]

Bases: PluginArgumentError, PromptUpscalerError

Thrown when a dgenerate.promptupscalers.PromptUpscaler implementation is loaded with an invalid argument.

exception dgenerate.promptupscalers.PromptUpscalerError[source]

Bases: Exception

Generic Prompt Upscaler error base exception.

exception dgenerate.promptupscalers.PromptUpscalerNotFoundError[source]

Bases: PluginNotFoundError, PromptUpscalerError

Thrown when a dgenerate.promptupscalers.PromptUpscaler implementation could not be found for a given name.

exception dgenerate.promptupscalers.PromptUpscalerProcessingError[source]

Bases: PromptUpscalerError

Thrown when a dgenerate.promptupscalers.PromptUpscaler implementation runs into an issue processing a prompt.

class dgenerate.promptupscalers.AttentionUpscaler(part: str = 'both', min: int = 0.1, max: int = 0.9, seed: int | None = None, lang: str = 'en', syntax: str = 'sd-embed', **kwargs)[source]

Bases: PromptUpscaler

Add random attention values to your prompt tokens.

This is meant for use with --prompt-weighter plugins such as “sd-embed” or “compel”.

The “part” argument indicates which parts of the prompt to act on, possible values are: “both”, “positive”, and “negative”

The “min” argument sets the minimum value for random attention added. The default value is 0.1

The “max” argument sets the maximum value for random attention added. The default value is 0.9

The “seed” argument can be used to specify a seed for the random attention values that are added to your prompt.

The “lang” argument can be used to specify the prompt language, the default value is ‘en’ for English, this can be one of: ‘en’, ‘de’, ‘fr’, ‘es’, ‘it’, ‘nl’, ‘pt’, ‘ru’, ‘zh’.

The “syntax” argument specifies the token attention value syntax, this can be one of “sd-embed” (SD Web UI Syntax) or “compel” (InvokeAI Syntax).
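
The two syntaxes differ only in where the weight is written. The sketch below assumes the common forms (token:weight) for SD Web UI and (token)weight for compel; format_attention is a hypothetical helper, and how this plugin selects tokens or maps “min” and “max” to a final weight is not specified here.

```python
import random


def format_attention(token: str, weight: float, syntax: str = "sd-embed") -> str:
    """Wrap a token with an explicit attention weight in the chosen syntax."""
    if syntax == "sd-embed":
        return f"({token}:{weight})"
    if syntax == "compel":
        return f"({token}){weight}"
    raise ValueError(f"unknown syntax: {syntax}")


# A seeded generator makes the drawn value reproducible between runs.
rng = random.Random(42)
drawn = round(rng.uniform(0.1, 0.9), 2)

sd_embed_form = format_attention("cat", 1.2, "sd-embed")
compel_form = format_attention("cat", 1.2, "compel")
```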

__init__(part: str = 'both', min: int = 0.1, max: int = 0.9, seed: int | None = None, lang: str = 'en', syntax: str = 'sd-embed', **kwargs)[source]

Parameters:: kwargs – child class forwarded arguments

upscale(prompt: Prompt) → Prompt | Sequence[Prompt][source]

Upscale a prompt / prompts and return them modified.

Parameters:: prompt – The incoming prompt or prompts
Returns:: Modified prompt / prompts, you may return multiple prompts (an iterable) to indicate expansion

HIDE_ARGS = ['device']

NAMES = ['attention']

OPTION_ARGS = {'lang': ['en', 'de', 'fr', 'es', 'it', 'nl', 'pt', 'ru', 'zh'], 'part': ['both', 'positive', 'negative'], 'syntax': ['sd-embed', 'compel']}

class dgenerate.promptupscalers.DynamicPromptsUpscaler(part: str = 'both', random: bool = False, seed: int | None = None, variations: int | None = None, wildcards: str | None = None, **kwargs)[source]

Bases: PromptUpscaler

Upscale prompts with the dynamicprompts library.

This upscaler allows you to use a special syntax for combinatorial prompt variations.

See: https://github.com/adieyal/dynamicprompts

The “part” argument indicates which parts of the prompt to act on, possible values are: “both”, “positive”, and “negative”

The “random” argument specifies that instead of strictly combinatorial output, dynamicprompts should produce N random variations of your prompt given the possibilities you have provided.

The “seed” argument can be used to specify a seed for the “random” prompt generation.

The “variations” argument specifies how many variations should be produced when “random” is set to true. This argument cannot be used without specifying “random”. The default value is 1.

The “wildcards” argument can be used to specify a wildcards directory for dynamicprompts wildcard syntax.
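
Combinatorial expansion can be pictured with the sketch below, which expands flat {a|b} variant groups into every combination. It is a simplified illustration and does not use the dynamicprompts library; expand_variants is a hypothetical helper that ignores nesting, wildcards, and the other syntax the library supports.

```python
import itertools
import re

# A flat variant group such as {red|blue}; nesting is not handled.
_VARIANT = re.compile(r"\{([^{}]*)\}")


def expand_variants(template: str) -> list:
    """Expand every {a|b|c} group into all combinations."""
    options = [m.group(1).split("|") for m in _VARIANT.finditer(template)]
    skeleton = _VARIANT.sub("{}", template)
    return [skeleton.format(*combo) for combo in itertools.product(*options)]


expanded = expand_variants("a {red|blue} {cat|dog}")
```

Two groups of two options yield four prompts, which is why a single input prompt can expand into several.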

__init__(part: str = 'both', random: bool = False, seed: int | None = None, variations: int | None = None, wildcards: str | None = None, **kwargs)[source]

Parameters:: kwargs – child class forwarded arguments

upscale(prompt: Prompt) → Prompt | Sequence[Prompt][source]

Upscale a prompt / prompts and return them modified.

Parameters:: prompt – The incoming prompt or prompts
Returns:: Modified prompt / prompts, you may return multiple prompts (an iterable) to indicate expansion

HIDE_ARGS = ['device']

NAMES = ['dynamicprompts']

OPTION_ARGS = {'part': ['both', 'positive', 'negative']}

class dgenerate.promptupscalers.MagicPromptUpscaler(part: str = 'both', model: str = 'Gustavosta/MagicPrompt-Stable-Diffusion', dtype: str = 'float32', seed: int | None = None, variations: int = 1, max_length: int = 100, temperature: float = 0.7, top_k: int = 50, top_p: float = 1.0, system: str | None = None, preamble: str | None = None, remove_prompt: bool = False, prepend_prompt: bool = False, batch: bool = True, max_batch: int | None = 50, quantizer: str | None = None, block_regex: str | None = None, max_attempts: int = 10, smart_truncate: bool = False, cleanup_config: str | None = None, **kwargs)[source]

Bases: LLMPromptUpscalerMixin, PromptUpscaler

Upscale prompts using magicprompt or other LLMs via transformers.

The “part” argument indicates which parts of the prompt to act on, possible values are: “both”, “positive”, and “negative”

The “model” specifies the model path for magicprompt, the default value is: “Gustavosta/MagicPrompt-Stable-Diffusion”. This can be a folder on disk or a Hugging Face repository slug.

The “dtype” argument specifies the torch dtype (compute dtype) to load the model with, this defaults to: float32, and may be one of: float32, float16, or bfloat16.

The “seed” argument can be used to specify a seed for prompt generation.

The “variations” argument specifies how many variations should be produced.

The “max-length” argument is the max prompt length for a generated prompt, this value defaults to 100.

The “temperature” argument sets the sampling temperature to use when generating prompts. Larger values increase creativity but decrease factuality.

The “top_k” argument sets the “top_k” generation value, i.e. randomly sample from the “top_k” most likely tokens at each generation step. Set this to 1 for greedy decoding.

The “top_p” argument sets the “top_p” generation value, i.e. randomly sample at each generation step from the top most likely tokens whose probabilities add up to “top_p”.
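
The effect of “top_k” and “top_p” on a single generation step can be sketched in plain Python. This is a simplified illustration over an explicit probability table; filter_top_k_top_p is a hypothetical helper, and generation libraries apply these filters to logits and may differ in detail.

```python
from typing import Dict


def filter_top_k_top_p(probs: Dict[str, float], top_k: int, top_p: float) -> Dict[str, float]:
    """Keep the top_k most likely tokens, then the shortest prefix of them
    whose probability mass reaches top_p, and renormalize what is left."""
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:top_k]
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append((token, p))
        total += p
        if total >= top_p:
            break
    return {token: p / total for token, p in kept}


step_probs = {"a": 0.5, "b": 0.3, "c": 0.15, "d": 0.05}
candidates = filter_top_k_top_p(step_probs, top_k=3, top_p=0.7)
greedy = filter_top_k_top_p(step_probs, top_k=1, top_p=1.0)
```

With top_k set to 1 only the single most likely token survives, which is the greedy decoding case mentioned above.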

The “system” argument sets the system instruction for the LLM.

The “preamble” argument sets a text input preamble for the LLM, this preamble will be removed from the output generated by the LLM.

The “remove-prompt” argument specifies whether to remove the original prompt from the generated text.

The “prepend-prompt” argument specifies whether to forcefully prepend the original prompt to the generated prompt, this might be necessary if you want a continuation with some models, the original prompt will be prepended with a space at the end.

The “batch” argument enables and disables batching prompt text into the LLM, setting this to False tells the plugin that you only want the LLM to ever process one prompt at a time, this might be useful if you are memory constrained, but processing is much slower.

The “max-batch” argument allows you to adjust how many prompts can be processed by the LLM simultaneously. Processing too many prompts at once will run your system out of memory, while processing too few prompts at once will be slow. Specifying “None” indicates unlimited batch size.

The “quantizer” argument allows you to specify a quantization backend for loading the LLM, this is the same syntax and supported backends as with the dgenerate --quantizer argument.

The “block-regex” argument is a python syntax regex that will block prompts that match the regex, the prompt will be regenerated until the regex does not match, up to “max-attempts”. This regex is case-insensitive.

The “max-attempts” argument specifies how many times to reattempt to generate a prompt if it is blocked by “block-regex”
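
The interaction of “block-regex” and “max-attempts” amounts to a bounded retry loop. The sketch below is an assumption-laden illustration: generate_unblocked and the generate callable are hypothetical, the pattern is applied with re.search, and returning the last candidate once attempts run out is a choice made for the sketch rather than documented behavior.

```python
import re
from typing import Callable


def generate_unblocked(generate: Callable[[], str], block_regex: str, max_attempts: int = 10) -> str:
    """Regenerate until the case-insensitive pattern no longer matches,
    or until max_attempts candidates have been produced."""
    pattern = re.compile(block_regex, re.IGNORECASE)
    candidate = generate()
    attempts = 1
    while pattern.search(candidate) and attempts < max_attempts:
        candidate = generate()
        attempts += 1
    return candidate


# Deterministic stand-in for an LLM call.
_outputs = iter(["A BAD prompt", "bad again", "a fine prompt"])
accepted = generate_unblocked(lambda: next(_outputs), r"\bbad\b")
```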

The “smart-truncate” argument enables intelligent truncation of the prompt generated by the LLM, i.e. it will remove incomplete sentences from the end of the prompt utilizing spaCy NLP.

The “cleanup-config” argument allows you to specify a custom LLM output cleanup configuration file in .json, .toml, or .yaml format. This file can be used to run custom pattern substitutions or python functions over the LLM's raw output, and overrides the built-in cleanup excluding “smart-truncate” which occurs before your configuration.

__init__(part: str = 'both', model: str = 'Gustavosta/MagicPrompt-Stable-Diffusion', dtype: str = 'float32', seed: int | None = None, variations: int = 1, max_length: int = 100, temperature: float = 0.7, top_k: int = 50, top_p: float = 1.0, system: str | None = None, preamble: str | None = None, remove_prompt: bool = False, prepend_prompt: bool = False, batch: bool = True, max_batch: int | None = 50, quantizer: str | None = None, block_regex: str | None = None, max_attempts: int = 10, smart_truncate: bool = False, cleanup_config: str | None = None, **kwargs)[source]

Parameters:: kwargs – child class forwarded arguments

upscale(prompts: Prompt | Sequence[Prompt]) → Prompt | Sequence[Prompt][source]

Upscale a prompt / prompts and return them modified.

Parameters:: prompts – The incoming prompt or prompts
Returns:: Modified prompt / prompts, you may return multiple prompts (an iterable) to indicate expansion

FILE_ARGS = {'cleanup-config': {'filetypes': [('Cleanup Config', ('*.json', '*.toml', '*.yaml', '*.yml'))], 'mode': 'in'}, 'model': {'mode': 'dir'}}

NAMES = ['magicprompt']

OPTION_ARGS = {'dtype': ['float32', 'float16', 'bfloat16'], 'part': ['both', 'positive', 'negative']}

property accepts_batch: bool: This prompt upscaler can accept a batch of prompts for efficient execution.

Returns:: True, unless the constructor argument batch was passed False

class dgenerate.promptupscalers.PromptUpscaler(loaded_by_name: str, device: str | None = None, local_files_only: bool = False, **kwargs)[source]

Bases: Plugin, ABC

Abstract base class for prompt upscaler implementations.

classmethod inheritable_help(loaded_by_name)[source]

__init__(loaded_by_name: str, device: str | None = None, local_files_only: bool = False, **kwargs)[source]

Parameters:

loaded_by_name – The name the prompt upscaler was loaded by
device – Torch device string for running any models, passing None defaults the device to cpu
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

load_object_cached(tag: str, estimated_size: int, method: Callable, memory_guard_device: str | device | None = 'cpu')[source]

Load a potentially large object into the CPU side prompt_upscaler object cache.

Parameters:

tag – A unique string within the context of the prompt upscaler implementation constructor.
estimated_size – Estimated size in bytes of the object in RAM.
method – A method which loads and returns the object.
memory_guard_device – call PromptUpscaler.memory_guard_device() on the specified device before the object is loaded (on cache miss)

Returns:

The loaded object

memory_guard_device(device: str | device, memory_required: int)[source]

Check a specific device against an amount of memory in bytes.

If the device is a gpu device and any of the memory constraints specified by dgenerate.promptupscalers.constants.PROMPT_UPSCALER_GPU_MEMORY_CONSTRAINTS are met on that device, attempt to remove cached objects off a gpu device to free space.

If the device is a cpu and any of the memory constraints specified by dgenerate.promptupscalers.constants.PROMPT_UPSCALER_CACHE_GC_CONSTRAINTS are met, attempt to remove cached prompt upscaler objects off the device to free space. Then, enforce dgenerate.promptupscalers.constants.PROMPT_UPSCALER_CACHE_MEMORY_CONSTRAINTS.

Parameters:

device – the device
memory_required – the amount of memory required on the device in bytes

Returns:

True if an attempt was made to free memory, False otherwise.

set_size_estimate(size_bytes: int)[source]

Set the estimated size of this plugin in bytes for memory management heuristics, this is intended to be used by implementors of the PromptUpscaler plugin class.

For the best memory optimization, this value should be set very shortly before any associated model enters CPU side RAM, i.e. before it is loaded at all.

Raises:: ValueError – if size_bytes is less than zero.
Parameters:: size_bytes – the size in bytes

abstractmethod upscale(prompt: Prompt | Sequence[Prompt]) → Prompt | Sequence[Prompt][source]

Upscale a prompt / prompts and return them modified.

Parameters:: prompt – The incoming prompt or prompts
Returns:: Modified prompt / prompts, you may return multiple prompts (an iterable) to indicate expansion

HIDE_ARGS = ['local-files-only']

property accepts_batch

Can this prompt upscaler accept a batch of prompts?

The implementor must override this property, this is a default implementation.

Returns:: Default: False

property device: str: Device that will be used for any text processing models.

property local_files_only: bool: Is this prompt upscaler only going to look for resources such as models in cache / on disk?

property size_estimate: int: Estimated size of the models / objects used by this prompt upscaler.

Returns:: size in bytes

class dgenerate.promptupscalers.PromptUpscalerLoader[source]

Bases: PluginLoader

Loads dgenerate.promptupscalers.PromptUpscaler plugins.

__init__()[source]

Parameters:

base_class – Base class of plugins, will be used for searching modules.
description – Short plugin description / name, used in exception messages.
reserved_args – Constructor arguments that are used by the plugin class which cannot be redefined by implementors of the plugin class. This should be a list of plugin argument descriptors, PluginArg
argument_error_type – This exception type will be raised when the plugin is loaded with invalid URI arguments.
not_found_error_type – This exception type will be raised when a plugin could not be located by a name specified in a loading URI.

load(uri: str, device: str = 'cpu', local_files_only: bool = False, **kwargs) → PromptUpscaler[source]

Parameters:

uri – prompt upscaler URI
device – Device used for any text processing models.
local_files_only – Should the prompt upscaler avoid downloading files from Hugging Face hub and only check the cache or local directories?
kwargs – Additional plugin arguments

Returns:

dgenerate.promptupscalers.PromptUpscaler

class dgenerate.promptupscalers.TranslatePromptUpscaler(input: str, output: str = 'en', part: str = 'both', provider: str = 'argos', batch: bool = True, max_batch: int | None = 50, **kwargs)[source]

Bases: PromptUpscaler

Local language translation using argostranslate or Helsinki-NLP opus (mariana).

Please note that translation models require a one time download, so run at least once with --offline-mode disabled to download the desired model.

argostranslate (argos) offers lightweight translation via CPU inference.

Helsinki-NLP (mariana) offers slightly more heavy duty (accurate) CPU or GPU inference.

The “input” argument indicates the input language code (IETF) e.g. “en”, “zh”, or literal name of the language for example: “english”, “chinese”.

The “output” argument indicates the output language code (IETF), or literal name of the language, this value defaults to “en” (English).

The “provider” argument indicates the translation provider, which may be one of “argos” or “mariana”. The default value is “argos”, indicating argostranslate. argos will only ever use the “cpu” regardless of the current --device or “device” argument value. Mariana will default to using the value of --device which will usually be a GPU.

The “batch” argument enables and disables batching prompt text into the translator, setting this to False tells the plugin that you only want to ever process one prompt at a time, this might be useful if you are memory constrained and using the provider “mariana”, but processing is much slower.

The “max-batch” argument allows you to adjust how many prompts can be processed by the model simultaneously. Processing too many prompts at once will run your system out of memory (specifically for the mariana translation provider), while processing too few prompts at once will be slow. Specifying “None” indicates unlimited batch size. This argument has no effect on argostranslate performance.
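
The batching limit can be pictured as simple chunking of the incoming prompt list. The sketch below is illustrative only; batches is a hypothetical helper, and plain strings stand in for prompt objects.

```python
from typing import Iterator, List, Optional, Sequence


def batches(items: Sequence[str], max_batch: Optional[int] = 50) -> Iterator[List[str]]:
    """Yield consecutive chunks of at most max_batch items.
    None means a single, unlimited batch."""
    if not items:
        return
    size = len(items) if max_batch is None else max_batch
    for start in range(0, len(items), size):
        yield list(items[start:start + size])


limited = list(batches(["a", "b", "c"], max_batch=2))
unlimited = list(batches(["a", "b", "c"], max_batch=None))
```

A smaller limit produces more, smaller calls into the model, trading speed for lower peak memory use.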

__init__(input: str, output: str = 'en', part: str = 'both', provider: str = 'argos', batch: bool = True, max_batch: int | None = 50, **kwargs)[source]

Parameters:: kwargs – child class forwarded arguments

accepts_batch() → bool[source]

Can this prompt upscaler accept a batch of prompts?

The implementor must override this property, this is a default implementation.

Returns:: Default: False

upscale(prompts: Sequence[Prompt]) → Prompt | Sequence[Prompt][source]

Upscale a prompt / prompts and return them modified.

Parameters:: prompts – The incoming prompt or prompts
Returns:: Modified prompt / prompts, you may return multiple prompts (an iterable) to indicate expansion

NAMES = ['translate']

OPTION_ARGS = {'part': ['both', 'positive', 'negative'], 'provider': ['argos', 'mariana']}

dgenerate.promptupscalers.create_prompt_upscaler(uri: str, device: str = 'cpu', local_files_only: bool = False) → PromptUpscaler[source]

Create a prompt upscaler implementation using the default PromptUpscalerLoader instance.

Parameters:

uri – The prompt upscaler URI
device – Device used for any text processing models.
local_files_only – Should the prompt upscaler avoid downloading files from Hugging Face hub and only check the cache or local directories?

Returns:

A PromptUpscaler implementation

dgenerate.promptupscalers.prompt_upscaler_exists(uri: str)[source]

Check if a prompt upscaler implementation exists for a given URI.

This uses the default PromptUpscalerLoader instance.

Parameters:: uri – The prompt upscaler URI
Returns:: True or False

dgenerate.promptupscalers.prompt_upscaler_help(names: Sequence[str], plugin_module_paths: Sequence[str] | None = None, throw=False, log_error=True)[source]

Implements --prompt-upscaler-help command line option

Parameters:

names – arguments (prompt upscaler names, or empty list)
plugin_module_paths – extra plugin module paths to search
throw – throw on error? or simply print to stderr and return a return code.
log_error – log errors to stderr?

Raises:

Returns:

return-code, anything other than 0 is failure

dgenerate.promptupscalers.prompt_upscaler_name_from_uri(uri: str)[source]

Extract just the implementation name from a prompt upscaler URI.

Parameters:: uri – the URI
Returns:: the implementation name.
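
Plugin URIs in this documentation take the form name;arg=value, for example ‘compel;syntax=sdwui’. Under that assumption, extracting the name can be sketched as below; split_plugin_uri is a hypothetical helper that does not handle quoting or values containing a semicolon.

```python
from typing import Dict, Tuple


def split_plugin_uri(uri: str) -> Tuple[str, Dict[str, str]]:
    """Split 'name;arg=value;...' into the name and a dict of raw string arguments."""
    name, *parts = [part.strip() for part in uri.split(";")]
    args = dict(part.split("=", 1) for part in parts if part)
    return name, args


name, args = split_plugin_uri("magicprompt;variations=2;part=positive")
```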

dgenerate.promptupscalers.prompt_upscaler_names()[source]

Implementation names for all prompt upscalers implemented by dgenerate, which are visible to the default PromptUpscalerLoader instance.

Returns:: a list of prompt upscaler implementation names.

dgenerate.promptupscalers.upscale_prompts(prompts: Sequence[Prompt], default_upscaler_uri: str | Sequence[str] | None = None, device: str = 'cpu', local_files_only: bool = False)[source]

Apply prompt upscaling to a list of prompts and return a possibly expanded list of prompts.

Parameters:

prompts – Input prompt objects.
default_upscaler_uri – The default upscaler plugin URI, or a list of URIs (chain upscalers together sequentially)
device – Execution device for upscalers that can utilize hardware acceleration
local_files_only – Should all prompt upscalers involved avoid downloading files and only check the cache or locally specified files?

Returns:

Altered prompts list
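
Chaining upscalers sequentially, with each stage allowed to expand one prompt into several, can be sketched as follows. Plain strings and plain callables stand in for prompt objects and loaded plugins; chain_upscalers is a hypothetical helper and does not model per-prompt embedded upscaler arguments.

```python
from typing import Callable, Iterable, List, Union

Upscaler = Callable[[str], Union[str, Iterable[str]]]


def chain_upscalers(prompts: List[str], upscalers: List[Upscaler]) -> List[str]:
    """Feed the output of each upscaler into the next, flattening expansions."""
    current = list(prompts)
    for upscale in upscalers:
        expanded = []
        for prompt in current:
            result = upscale(prompt)
            # A single string is one prompt; any other iterable is an expansion.
            expanded.extend([result] if isinstance(result, str) else result)
        current = expanded
    return current


chained = chain_upscalers(
    ["cat"],
    [lambda p: [p + " red", p + " blue"], str.upper],
)
```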

dgenerate.promptupscalers.constants module

dgenerate.promptupscalers.constants.PROMPT_UPSCALER_GPU_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear gpu VRAM upon a prompt upscaler plugin calling dgenerate.promptupscalers.PromptUpscaler.memory_guard_device() on a cuda device, syntax provided via dgenerate.memory.gpu_memory_constraints()

If any of these constraints are met, an effort is made to clear modules off a GPU which are cached for fast repeat usage but are okay to flush.

The only available extra variable is: memory_required, which is the amount of memory the prompt upscaler plugin requested to be available.
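
The constraint strings are boolean expressions over named variables. The sketch below evaluates one by hand to show how the default reads; it is not the evaluator provided by dgenerate.memory, any_constraint_met is a hypothetical helper, and eval is used purely for illustration with builtins disabled.

```python
from typing import List


def any_constraint_met(constraints: List[str], **variables: float) -> bool:
    """Return True if at least one expression evaluates to True."""
    # Illustration only: expressions see just the supplied variables.
    return any(eval(expr, {"__builtins__": {}}, dict(variables)) for expr in constraints)


CONSTRAINTS = ["memory_required > (available * 0.70)"]

# Requesting 8 units with 10 available exceeds the 70% threshold of 7.
should_clear = any_constraint_met(CONSTRAINTS, memory_required=8, available=10)
# Requesting 6 units with 10 available does not.
should_keep = any_constraint_met(CONSTRAINTS, memory_required=6, available=10)
```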

dgenerate.promptupscalers.constants.PROMPT_UPSCALER_CACHE_GC_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear objects out of any CPU side cache upon a prompt upscaler plugin calling dgenerate.promptupscalers.PromptUpscaler.memory_guard_device() on the cpu, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of any named CPU side cache.

The only available extra variable is: memory_required, which is the amount of memory the prompt upscaler plugin requested to be available.

dgenerate.promptupscalers.constants.PROMPT_UPSCALER_CACHE_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear specifically the prompt upscaler object cache upon a prompt upscaler plugin calling dgenerate.promptupscalers.PromptUpscaler.memory_guard_device() on the cpu, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of the prompt upscaler object cache.

Available extra variables are: memory_required, which is the amount of memory the prompt upscaler plugin requested to be available, and cache_size which is the current size of the prompt upscaler object cache.

dgenerate.promptweighters module

exception dgenerate.promptweighters.PromptWeighterArgumentError[source]

Bases: PluginArgumentError, PromptWeighterError

Thrown when a dgenerate.promptweighters.PromptWeighter implementation is loaded with an invalid argument.

exception dgenerate.promptweighters.PromptWeighterError[source]

Bases: Exception

Generic Prompt Weighter error base exception.

exception dgenerate.promptweighters.PromptWeighterNotFoundError[source]

Bases: PluginNotFoundError, PromptWeighterError

Thrown when a dgenerate.promptweighters.PromptWeighter implementation could not be found for a given name.

exception dgenerate.promptweighters.PromptWeightingUnsupported[source]

Bases: PromptWeighterError

Thrown when a dgenerate.promptweighters.PromptWeighter implementation cannot handle a specific pipeline or combination of pipeline arguments.

class dgenerate.promptweighters.CompelPromptWeighter(syntax: str = 'compel', **kwargs)[source]

Bases: PromptWeighter

Implements prompt weighting syntax for Stable Diffusion 1/2 and Stable Diffusion XL using compel. The default syntax is “compel” which is analogous to the syntax used by InvokeAI.

Specifying the syntax “sdwui” will translate your prompt from Stable Diffusion Web UI syntax into compel / InvokeAI syntax before generating the prompt embeddings.

If you wish to use prompt syntax for weighting tokens that is similar to ComfyUI, Automatic1111, or CivitAI for example, use: ‘compel;syntax=sdwui’

The underlying weighting behavior for tokens is not exactly the same as other software that uses the more common “sdwui” syntax, so your prompt may need adjusting if you are reusing a prompt from those other pieces of software.
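
The flavor of the syntax translation can be shown with a small regular expression. This is a simplified sketch, not the translation this plugin performs: sdwui_to_compel is a hypothetical helper that rewrites only flat, explicitly weighted groups of the form (text:weight) and leaves everything else untouched.

```python
import re

# Matches a flat "(text:1.3)" group; nested groups are not handled.
_SDWUI_WEIGHT = re.compile(r"\(([^():]+):([0-9]*\.?[0-9]+)\)")


def sdwui_to_compel(prompt: str) -> str:
    """Rewrite (text:weight) as (text)weight."""
    return _SDWUI_WEIGHT.sub(r"(\1)\2", prompt)


translated = sdwui_to_compel("a (red:1.3) car, (shiny chrome:0.8)")
```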

You can read about compel here: https://github.com/damian0815/compel

And InvokeAI here: https://github.com/invoke-ai/InvokeAI

This prompt weighter supports the model types:

--model-type sd
--model-type pix2pix
--model-type upscaler-x4
--model-type sdxl
--model-type sdxl-pix2pix
--model-type s-cascade
--model-type flux
--model-type flux-fill
--model-type flux-kontext

The secondary prompt option for SDXL and Flux --second-prompts is supported by this prompt weighter implementation. However, --second-model-second-prompts is not supported and will be ignored with a warning message.

For Flux models, the main prompt is processed by the T5 text encoder, while the secondary prompt (style prompt) is processed by the CLIP text encoder to generate pooled embeddings.

__init__(syntax: str = 'compel', **kwargs)[source]

Parameters:

loaded_by_name – The name the prompt weighter was loaded by
model_type – Model type enum dgenerate.ModelType
dtype – Data type enum dgenerate.DataType
device – The device the prompt weighter should operate on
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

cleanup()[source]: Perform any cleanup required after translating the pipeline arguments to embeds

get_extra_supported_args() → list[str][source]

Overridable method.

Return a list of extra supported prompt arguments that are not typically supported by the given model that embed generation is occurring for.

This can be used to make use of the --second-prompts, --third-prompts, --second-model-second-prompts, or --second-model-third-prompts arguments for additional textual inputs to the prompt weighter plugin.

This works even if the model you are generating embeds for would normally reject these inputs.

Particularly for arguments that dgenerate.pipelinewrapper.DiffusionPipelineWrapper will automatically reject based on the underlying pipeline argument signature.

You should return a list of pipeline argument names, the values can be any of the following:

prompt_2
negative_prompt_2
prompt_3
negative_prompt_3
clip_skip

If you return a value outside the set of values listed here, a RuntimeError will be raised by dgenerate.pipelinewrapper.DiffusionPipelineWrapper upon attempting to consume the values from this method when calling a pipeline.

Returns:: List of pipeline argument names.
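
The validation rule described above can be sketched as a membership check against the listed names. check_extra_supported_args is a hypothetical helper shown for illustration; the actual check lives in dgenerate.pipelinewrapper.DiffusionPipelineWrapper and its error message is not reproduced here.

```python
from typing import List

# The argument names listed above.
ALLOWED_EXTRA_ARGS = {
    "prompt_2",
    "negative_prompt_2",
    "prompt_3",
    "negative_prompt_3",
    "clip_skip",
}


def check_extra_supported_args(names: List[str]) -> List[str]:
    """Return names unchanged, or raise RuntimeError for any unknown name."""
    unknown = [name for name in names if name not in ALLOWED_EXTRA_ARGS]
    if unknown:
        raise RuntimeError(f"unsupported extra prompt arguments: {unknown}")
    return names


accepted_args = check_extra_supported_args(["prompt_2", "negative_prompt_2"])
```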

translate_to_embeds(pipeline: DiffusionPipeline, args: dict[str, Any])[source]

Override me to implement.

Translate the pipeline prompt arguments to prompt_embeds and pooled_prompt_embeds as needed.

Parameters:

pipeline – The pipeline object
args – Call arguments to the pipeline

Returns:

args, supplemented with prompt embedding arguments

NAMES = ['compel']

OPTION_ARGS = {'syntax': ['compel', 'sdwui']}

class dgenerate.promptweighters.LLM4GENPromptWeighter(encoder: str = 'xl-all', projector: str = 'Shui-VL/LLM4GEN-models', projector_subfolder: str | None = None, projector_revision: str | None = None, projector_weight_name: str = 'projector.pth', weighter: str | None = None, llm_dtype: str = 'float32', llm_quantizer: str | None = None, token: str | None = None, **kwargs)[source]

Bases: PromptWeighter

LLM4GEN prompt weighter specifically for Stable Diffusion 1.5, See: https://github.com/YUHANG-Ma/LLM4GEN

Stable Diffusion 2.* is not supported.

This prompt weighter supports the model types:

--model-type sd
--model-type pix2pix
--model-type upscaler-x4

You may use the --second-prompts argument of dgenerate to pass a prompt explicitly to the T5 rankgen encoder, which uses the primary prompt by default otherwise.

The “encoder” argument specifies the T5 rankgen encoder model variant.

The encoder variant specified must be one of:

* base-all
* large-all
* xl-all
* xl-pg19

The “projector” argument specifies a Hugging Face repo or file path to the LLM4GEN projector (CAM) model.

The “projector-revision” argument specifies the revision of the Hugging Face projector repository, for example “main”.

The “projector-subfolder” argument specifies the subfolder for the projector file in a Hugging Face repository.

The “projector-weight-name” argument specifies the weight name of the projector file in a Hugging Face repository.

The “weighter” argument can be used to specify a prompt weighter that will be used for CLIP embedding generation, this may be one of “sd-embed” or “compel”. Weighting does not occur for the rankgen encoder, and if you do not pass --second-prompts to dgenerate while using this argument, the rankgen encoder will receive the primary prompt with all weighting syntax filtered out. This automatic filtering only occurs when you specify “weighter” without specifying --second-prompts to dgenerate.

The “llm-dtype” argument specifies the precision for the rankgen encoder and llm4gen CAM projector model, changing this to ‘float16’ or ‘bfloat16’ will cut memory use in half at the possible cost of output quality.
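
The halving follows from element size: float32 stores each weight in 4 bytes, while float16 and bfloat16 use 2. The sketch below works through the arithmetic for a hypothetical parameter count; weights_size_bytes is an illustrative helper and the count is not a measurement of any rankgen model.

```python
# Bytes used to store one element of each dtype.
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2}


def weights_size_bytes(parameter_count: int, dtype: str = "float32") -> int:
    """Approximate memory needed to hold the weights alone."""
    return parameter_count * BYTES_PER_ELEMENT[dtype]


# Hypothetical model with one billion parameters.
full_precision = weights_size_bytes(1_000_000_000, "float32")
half_precision = weights_size_bytes(1_000_000_000, "float16")
```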

The “llm-quantizer” argument specifies the quantization backend to use when loading the rankgen encoder, this argument uses dgenerate --quantizer syntax.

The “token” argument allows you to explicitly specify a Hugging Face auth token for downloads.

@misc{liu2024llm4genleveragingsemanticrepresentation,
    title={LLM4GEN: Leveraging Semantic Representation of LLMs for Text-to-Image Generation},
    author={Mushui Liu and Yuhang Ma and Xinfeng Zhang and Yang Zhen and Zeng Zhao and Zhipeng Hu and Bai Liu and Changjie Fan},
    year={2024},
}

__init__(encoder: str = 'xl-all', projector: str = 'Shui-VL/LLM4GEN-models', projector_subfolder: str | None = None, projector_revision: str | None = None, projector_weight_name: str = 'projector.pth', weighter: str | None = None, llm_dtype: str = 'float32', llm_quantizer: str | None = None, token: str | None = None, **kwargs)[source]

Parameters:

loaded_by_name – The name the prompt weighter was loaded by
model_type – Model type enum dgenerate.ModelType
dtype – Data type enum dgenerate.DataType
device – The device the prompt weighter should operate on
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

cleanup()[source]: Perform any cleanup required after translating the pipeline arguments to embeds

get_extra_supported_args() → list[str][source]

Overridable method.

Return a list of extra supported prompt arguments that are not typically supported by the given model that embed generation is occurring for.

This can be used to make use of the --second-prompts, --third-prompts, --second-model-second-prompts, or --second-model-third-prompts arguments for additional textual inputs to the prompt weighter plugin.

This works even if the model you are generating embeds for would normally reject these inputs.

Particularly for arguments that dgenerate.pipelinewrapper.DiffusionPipelineWrapper will automatically reject based on the underlying pipeline argument signature.

You should return a list of pipeline argument names, the values can be any of the following:

prompt_2
negative_prompt_2
prompt_3
negative_prompt_3
clip_skip

If you return a value outside the set of values listed here, a RuntimeError will be raised by dgenerate.pipelinewrapper.DiffusionPipelineWrapper upon attempting to consume the values from this method when calling a pipeline.

Returns:: List of pipeline argument names.

translate_to_embeds(pipeline: DiffusionPipeline, args: dict[str, Any])[source]

Override me to implement.

Translate the pipeline prompt arguments to prompt_embeds and pooled_prompt_embeds as needed.

Parameters:

pipeline – The pipeline object
args – Call arguments to the pipeline

Returns:

args, supplemented with prompt embedding arguments

FILE_ARGS = {'projector': {'mode': 'dir'}}

NAMES = ['llm4gen']

OPTION_ARGS = {'encoder': ['base-all', 'large-all', 'xl-all', 'xl-pg19'], 'llm_dtype': ['float32', 'float16', 'bfloat16']}

class dgenerate.promptweighters.PromptWeighter(loaded_by_name: str, model_type: ModelType, dtype: DataType, device: str | None = None, local_files_only: bool = False, **kwargs)[source]

Bases: Plugin, ABC

Abstract base class for prompt weighter implementations.

classmethod inheritable_help(loaded_by_name)[source]

static move_text_encoders(pipeline, device: str)[source]

Utility for moving all of a pipeline’s text encoders to a device.

Parameters:

pipeline – The diffusion pipeline.
device – The desired device.

__init__(loaded_by_name: str, model_type: ModelType, dtype: DataType, device: str | None = None, local_files_only: bool = False, **kwargs)[source]

Parameters:

loaded_by_name – The name the prompt weighter was loaded by
model_type – Model type enum dgenerate.ModelType
dtype – Data type enum dgenerate.DataType
device – The device the prompt weighter should operate on
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

cleanup()[source]: Perform any cleanup required after translating the pipeline arguments to embeds

get_extra_supported_args() → list[str][source]

Overridable method.

Return a list of extra supported prompt arguments that are not typically supported by the given model that embed generation is occurring for.

This can be used to make use of the --second-prompts, --third-prompts, --second-model-second-prompts, or --second-model-third-prompts arguments for additional textual inputs to the prompt weighter plugin.

This works even if the model you are generating embeds for would normally reject these inputs.

Particularly for arguments that dgenerate.pipelinewrapper.DiffusionPipelineWrapper will automatically reject based on the underlying pipeline argument signature.

You should return a list of pipeline argument names, the values can be any of the following:

prompt_2
negative_prompt_2
prompt_3
negative_prompt_3
clip_skip

If you return a value outside the set of values listed here, a RuntimeError will be raised by dgenerate.pipelinewrapper.DiffusionPipelineWrapper upon attempting to consume the values from this method when calling a pipeline.

Returns:: List of pipeline argument names.
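
The rule above can be sketched in plain Python. This is only an illustration of the documented behaviour, not the check performed inside dgenerate.pipelinewrapper.DiffusionPipelineWrapper; the names ALLOWED_EXTRA_ARGS and check_extra_supported_args are hypothetical.

```python
# Illustrative only: mirrors the documented rule, not dgenerate's actual code.
ALLOWED_EXTRA_ARGS = frozenset({
    "prompt_2",
    "negative_prompt_2",
    "prompt_3",
    "negative_prompt_3",
    "clip_skip",
})


def check_extra_supported_args(names):
    """Return a copy of names, or raise RuntimeError on an unlisted name."""
    unknown = [name for name in names if name not in ALLOWED_EXTRA_ARGS]
    if unknown:
        raise RuntimeError(f"unsupported extra prompt arguments: {unknown}")
    return list(names)
```

A plugin that wants secondary prompts and clip skip would return ['prompt_2', 'negative_prompt_2', 'clip_skip'] from get_extra_supported_args(); any other name would be rejected as described above.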

load_object_cached(tag: str, estimated_size: int, method: Callable, memory_guard_device: str | device | None = 'cpu')[source]

Load a potentially large object into the CPU side prompt_weighter object cache.

Parameters:

tag – A unique string within the context of the prompt weighter implementation constructor.
estimated_size – Estimated size in bytes of the object in RAM.
method – A method which loads and returns the object.
memory_guard_device – call PromptWeighter.memory_guard_device() on the specified device before the object is loaded (on cache miss)

Returns:

The loaded object
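
The load-on-miss behaviour described for load_object_cached() can be pictured with a small stand-in. This sketch assumes a simple dictionary keyed by tag; the real cache also tracks estimated_size and runs the memory guard, which is omitted here. The names _object_cache, load_cached, and load_encoder are hypothetical.

```python
from typing import Any, Callable, Dict

# Hypothetical stand-in for the prompt weighter object cache.
_object_cache: Dict[str, Any] = {}


def load_cached(tag: str, method: Callable[[], Any]) -> Any:
    """Return the object stored under tag, calling method only on a cache miss."""
    if tag not in _object_cache:
        # A real implementation would run its memory guard here, before loading.
        _object_cache[tag] = method()
    return _object_cache[tag]


calls = []


def load_encoder() -> dict:
    """Pretend to load a large model and record that loading happened."""
    calls.append("loaded")
    return {"name": "encoder"}


first = load_cached("encoder", load_encoder)
second = load_cached("encoder", load_encoder)  # served from the cache
```

The second call returns the same object without invoking the loader again, which is why the tag needs to be unique within the constructor.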

memory_guard_device(device: str | device, memory_required: int)[source]

Check a specific device against an amount of memory in bytes.

If the device is a gpu device and any of the memory constraints specified by dgenerate.promptweighters.constants.PROMPT_WEIGHTER_GPU_MEMORY_CONSTRAINTS are met on that device, attempt to remove cached objects off a gpu device to free space.

If the device is a cpu and any of the memory constraints specified by dgenerate.promptweighters.constants.PROMPT_WEIGHTER_CACHE_GC_CONSTRAINTS are met, attempt to remove cached prompt weighter objects off the device to free space. Then, enforce dgenerate.promptweighters.constants.PROMPT_WEIGHTER_CACHE_MEMORY_CONSTRAINTS.

Parameters:

device – the device
memory_required – the amount of memory required on the device in bytes

Returns:

True if an attempt was made to free memory, False otherwise.

set_size_estimate(size_bytes: int)[source]

Set the estimated size of this plugin in bytes for memory management heuristics, this is intended to be used by implementors of the PromptWeighter plugin class.

For the best memory optimization, this value should be set very shortly before any associated model enters CPU side RAM, i.e. before it is loaded at all.

Raises:: ValueError – if size_bytes is less than zero.
Parameters:: size_bytes – the size in bytes

abstractmethod translate_to_embeds(pipeline: DiffusionPipeline, args: dict[str, Any])[source]

Override me to implement.

Translate the pipeline prompt arguments to prompt_embeds and pooled_prompt_embeds as needed.

Parameters:

pipeline – The pipeline object
args – Call arguments to the pipeline

Returns:

args, supplemented with prompt embedding arguments

HIDE_ARGS = ['model-type', 'dtype', 'local-files-only', 'device']

property device: str: The device the prompt weighter operates on.

property dtype: DataType: Embeddings data type.

property local_files_only: bool: Is this prompt weighter only going to look for resources such as model files in cache / on disk?

property model_type: ModelType: Model type that will use the embeddings.

property size_estimate: int: Estimated size of the models / objects used by this prompt weighter, in bytes.

class dgenerate.promptweighters.PromptWeighterLoader[source]

Bases: PluginLoader

Loads dgenerate.promptweighters.PromptWeighter plugins.

__init__()[source]

Parameters:

base_class – Base class of plugins, will be used for searching modules.
description – Short plugin description / name, used in exception messages.
reserved_args – Constructor arguments that are used by the plugin class which cannot be redefined by implementors of the plugin class. This should be a list of plugin argument descriptors, PluginArg
argument_error_type – This exception type will be raised when the plugin is loaded with invalid URI arguments.
not_found_error_type – This exception type will be raised when a plugin could not be located by a name specified in a loading URI.

load(uri: str, device: str | None = None, local_files_only: bool = False, **kwargs) → PromptWeighter[source]

Parameters:

uri – prompt weighter URI
device – The device the prompt weighter should operate on
local_files_only – Should the prompt weighter avoid downloading files from Hugging Face hub and only check the cache or local directories?
kwargs – Additional plugin arguments

Returns:

dgenerate.promptweighters.PromptWeighter

class dgenerate.promptweighters.SdEmbedPromptWeighter(**kwargs)[source]

Bases: PromptWeighter

Implements prompt weighting syntax for Stable Diffusion 1/2, Stable Diffusion XL, Stable Diffusion 3, and Flux using sd_embed.

sd_embed uses a Stable Diffusion Web UI compatible prompt syntax.

See: https://github.com/xhinker/sd_embed

@misc{sd_embed_2024,
    author = {Shudong Zhu(Andrew Zhu)},
    title = {Long Prompt Weighted Stable Diffusion Embedding},
    howpublished = {\url{https://github.com/xhinker/sd_embed}},
    year = {2024},
}

This prompt weighter supports the model types:

--model-type sd
--model-type pix2pix
--model-type upscaler-x4
--model-type sdxl
--model-type sdxl-pix2pix
--model-type s-cascade
--model-type sd3
--model-type flux
--model-type flux-fill
--model-type flux-kontext

The secondary prompt option for SDXL --second-prompts is supported by this prompt weighter implementation. However, --second-model-second-prompts is not supported and will be ignored with a warning message.

The secondary prompt option for SD3 --second-prompts is not supported by this prompt weighter implementation. Neither is --third-prompts. The prompts from these arguments will be ignored.

The secondary prompt option for Flux --second-prompts is supported by this prompt weighter.

Flux does not support negative prompting in either prompt.

__init__(**kwargs)[source]

Parameters:

loaded_by_name – The name the prompt weighter was loaded by
model_type – Model type enum dgenerate.ModelType
dtype – Data type enum dgenerate.DataType
device – The device the prompt weighter should operate on
local_files_only – if True, the plugin should never try to download models from the internet automatically, and instead only look for them in cache / on disk.
kwargs – child class forwarded arguments

cleanup()[source]: Perform any cleanup required after translating the pipeline arguments to embeds

get_extra_supported_args() → list[str][source]

Overridable method.

Return a list of extra supported prompt arguments that are not typically supported by the given model that embed generation is occurring for.

This can be used to make use of the --second-prompts, --third-prompts, --second-model-second-prompts, or --second-model-third-prompts arguments for additional textual inputs to the prompt weighter plugin.

This works even if the model you are generating embeds for would normally reject these inputs.

Particularly for arguments that dgenerate.pipelinewrapper.DiffusionPipelineWrapper will automatically reject based on the underlying pipeline argument signature.

You should return a list of pipeline argument names, the values can be any of the following:

prompt_2
negative_prompt_2
prompt_3
negative_prompt_3
clip_skip

If you return a value outside the set of values listed here, a RuntimeError will be raised by dgenerate.pipelinewrapper.DiffusionPipelineWrapper upon attempting to consume the values from this method when calling a pipeline.

Returns:: List of pipeline argument names.

translate_to_embeds(pipeline: DiffusionPipeline, args: dict[str, Any])[source]

Override me to implement.

Translate the pipeline prompt arguments to prompt_embeds and pooled_prompt_embeds as needed.

Parameters:

pipeline – The pipeline object
args – Call arguments to the pipeline

Returns:

args, supplemented with prompt embedding arguments

NAMES = ['sd-embed']

dgenerate.promptweighters.create_prompt_weighter(uri: str, model_type: ModelType, dtype: DataType, device: str | None = None, local_files_only: bool = False) → PromptWeighter[source]

Create a prompt weighter implementation using the default PromptWeighterLoader instance.

Parameters:

uri – The prompt weighter URI
model_type – Model type the prompt weighter is expected to handle
dtype – The dtype of the pipeline
device – The device the prompt weighter should operate on
local_files_only – Should the prompt weighter avoid downloading files from Hugging Face hub and only check the cache or local directories?

Returns:

A PromptWeighter implementation

dgenerate.promptweighters.prompt_weighter_exists(uri: str)[source]

Check if a prompt weighter implementation exists for a given URI.

This uses the default PromptWeighterLoader instance.

Parameters:: uri – The prompt weighter URI
Returns:: True or False

dgenerate.promptweighters.prompt_weighter_help(names: Sequence[str], plugin_module_paths: Sequence[str] | None = None, throw=False, log_error=True)[source]

Implements --prompt-weighter-help command line option

Parameters:

names – arguments (prompt weighter names, or empty list)
plugin_module_paths – extra plugin module paths to search
throw – throw on error? or simply print to stderr and return a return code.
log_error – log errors to stderr?

Raises:

Returns:

return-code, anything other than 0 is failure

dgenerate.promptweighters.prompt_weighter_name_from_uri(uri: str)[source]

Extract just the implementation name from a prompt weighter URI.

Parameters:: uri – the URI
Returns:: the implementation name.

dgenerate.promptweighters.prompt_weighter_names()[source]

Implementation names for all prompt weighters implemented by dgenerate, which are visible to the default PromptWeighterLoader instance.

Returns:: a list of prompt weighter implementation names.

dgenerate.promptweighters.constants module

dgenerate.promptweighters.constants.PROMPT_WEIGHTER_GPU_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear cuda VRAM upon a prompt weighter plugin calling dgenerate.promptweighters.PromptWeighter.memory_guard_device() on a cuda device, syntax provided via dgenerate.memory.gpu_memory_constraints()

If any of these constraints are met, an effort is made to clear modules off a GPU which are cached for fast repeat usage but are okay to flush.

The only available extra variable is: memory_required, which is the amount of memory the prompt weighter plugin requested to be available.

dgenerate.promptweighters.constants.PROMPT_WEIGHTER_CACHE_GC_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear objects out of any CPU side cache upon a prompt weighter plugin calling dgenerate.promptweighters.PromptWeighter.memory_guard_device() on the cpu, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of any named CPU side cache.

The only available extra variable is: memory_required, which is the amount of memory the prompt weighter plugin requested to be available.

dgenerate.promptweighters.constants.PROMPT_WEIGHTER_CACHE_MEMORY_CONSTRAINTS = ['memory_required > (available * 0.70)']

Cache constraint expressions for when to attempt to clear specifically the prompt weighter object cache upon a prompt weighter plugin calling dgenerate.promptweighters.PromptWeighter.memory_guard_device() on the cpu, syntax provided via dgenerate.memory.memory_constraints()

If any of these constraints are met, an effort is made to clear objects out of the prompt weighter object cache.

Available extra variables are: memory_required, which is the amount of memory the prompt weighter plugin requested to be available, and cache_size which is the current size of the prompt weighter object cache.
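
To make the "any of these constraints are met" semantics concrete, the following sketch evaluates a list of constraint expressions against named variables. It is an assumption-laden illustration: the real evaluation is provided by dgenerate.memory.memory_constraints() and dgenerate.memory.gpu_memory_constraints(), and the helper name any_constraint_met is hypothetical.

```python
def any_constraint_met(constraints, **variables):
    """Return True if at least one constraint expression evaluates truthy."""
    # eval() with builtins removed is for illustration only; it is not a sandbox.
    scope = {"__builtins__": {}}
    return any(bool(eval(expr, scope, dict(variables))) for expr in constraints)


constraints = ["memory_required > (available * 0.70)"]

# Requesting 800 bytes with 1000 available exceeds the 70% threshold.
needs_cleanup = any_constraint_met(constraints, memory_required=800, available=1000)

# Requesting 600 bytes with 1000 available does not.
no_cleanup = any_constraint_met(constraints, memory_required=600, available=1000)
```

Because the constraints are a list, adding a second expression widens the set of conditions that trigger a cleanup attempt rather than narrowing it.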

dgenerate.pygments module

This module provides a pygments lexer for the dgenerate config / shell language.

This can be used for syntax highlighting.

class dgenerate.pygments.DgenerateLexer(*args, **kwds)[source]

Bases: RegexLexer

pygments lexer for dgenerate configuration / script

get_tokens_unprocessed(text, stack=('root',))

Split text into (tokentype, text) pairs.

stack is the initial stack (default: ['root'])

aliases = ['dgenerate']: A list of short, unique identifiers that can be used to look up the lexer from a list, e.g., using get_lexer_by_name().

filenames = ['*.dgen']: A list of fnmatch patterns that match filenames which contain content for this lexer. The patterns in this list should be unique among all lexers.

name = 'DgenerateLexer': Full name of the lexer, in human-readable form

dgenerate.spacycache module

Tools for downloading spaCy models to arbitrary locations, compatible with dgenerate’s frozen environment.

exception dgenerate.spacycache.SpacyModelNotFoundError[source]

Bases: ModelNotFoundError

Raised when a spacy model cannot be loaded, due to being unable to locate it either online or in the cache.

dgenerate.spacycache.disable_offline_mode()[source]

Disable global offline mode for the spacy cache.

This will allow network requests to be made again.

dgenerate.spacycache.enable_offline_mode()[source]

Enable global offline mode for the spacy cache.

This will prevent any network requests from being made, and will only use files that are already in the spacy cache.

dgenerate.spacycache.get_spacy_cache_directory() → str[source]

Get the default spacy model cache directory, or the value of the environment variable DGENERATE_CACHE joined with spacy.

Returns:: string (directory path)

dgenerate.spacycache.is_offline_mode() → bool[source]

Check if the global offline mode for the spacy cache is enabled.

Returns:: True if offline mode is enabled, False otherwise.

dgenerate.spacycache.load_spacy_model(name: str, *, vocab: Vocab | bool = True, disable: str | Iterable[str] = [], enable: str | Iterable[str] = [], exclude: str | Iterable[str] = [], config: dict[str, Any] | Config = {}, local_files_only: bool = False) → Language[source]

Load a spaCy model, possibly downloading it if needed.

Parameters:

name – Name of the spaCy model.
vocab – A Vocab object. If True, a vocab is created.
disable – Name(s) of pipeline component(s) to disable. Disabled pipes will be loaded but won’t be run unless explicitly enabled using nlp.enable_pipe.
enable – Name(s) of pipeline component(s) to enable. All other pipes will be disabled but can be enabled later using nlp.enable_pipe.
exclude – Name(s) of pipeline component(s) to exclude. Excluded components won’t be loaded.
config – Config overrides as a nested dict or a dict keyed by section values in dot notation.
local_files_only – Avoid connecting to the internet? look in the cache only.

Returns:

The loaded nlp object. spacy.Language

dgenerate.spacycache.offline_mode_context(enabled=True)[source]

Context manager to temporarily enable or disable global offline mode for the spacy cache.

Parameters:: enabled – If True, enables offline mode. If False, disables it.
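
A context manager of this kind is typically built by saving the current flag, setting the requested value, and restoring the saved value on exit. The sketch below shows that pattern with a stand-in flag; it assumes the previous state is restored on exit, which the description above implies but does not spell out. The _state dictionary is hypothetical and is not the module's real storage.

```python
from contextlib import contextmanager

# Hypothetical stand-in for the module's global offline flag.
_state = {"offline": False}


def is_offline_mode() -> bool:
    """Report the current value of the stand-in flag."""
    return _state["offline"]


@contextmanager
def offline_mode_context(enabled=True):
    """Temporarily set the flag, restoring the previous value afterwards."""
    previous = _state["offline"]
    _state["offline"] = enabled
    try:
        yield
    finally:
        # Restore even if the body raised an exception.
        _state["offline"] = previous


with offline_mode_context():
    inside = is_offline_mode()
after = is_offline_mode()
```

Restoring in a finally block means an exception raised while a model is loading does not leave offline mode stuck in the temporary state.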

dgenerate.resources module

Package resources, version, pre-release and latest release information, icon, etc.

This module can be imported without incurring a large import overhead.

class dgenerate.resources.CurrentReleaseInfo(version: str, commit: str | None, branch: str | None, pre_release: bool)[source]

Bases: object

classmethod json_load(fo: IO[str])[source]

__init__(version: str, commit: str | None, branch: str | None, pre_release: bool)[source]

copy()[source]

json_dump(fo: IO[str])[source]

json_dumps() → str[source]

branch: str

commit: str

pre_release: bool

version: str

class dgenerate.resources.LatestReleaseInfo(tag_name: str, release_name: str, release_url: str)[source]

Bases: object

Latest release info from github.

__init__(tag_name: str, release_name: str, release_url: str)[source]

release_name: str

release_url: str

tag_name: str

class dgenerate.resources.VersionComparison(value)[source]

Bases: Enum

Version comparison result.

SAME = 2

V1_NEWER = 0

V2_NEWER = 1

dgenerate.resources.check_latest_release() → LatestReleaseInfo | None[source]

Get the latest release of this software.

Returns:: LatestReleaseInfo or None

dgenerate.resources.compare_versions(version1: str, version2: str) → VersionComparison[source]

Python PEP 440 version comparison utility.

Parameters:

version1 – left version
version2 – right version

Returns:

VersionComparison
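
The three-way result can be illustrated with a simplified stand-in. This sketch only handles plain dotted release numbers such as MAJOR.MINOR.PATCH, which is a small subset of PEP 440 (no pre-release, post-release, or epoch handling); the enum is redeclared locally with the values listed above, and compare_release_versions is a hypothetical name.

```python
from enum import Enum


class VersionComparison(Enum):
    """Local stand-in using the member values documented above."""
    V1_NEWER = 0
    V2_NEWER = 1
    SAME = 2


def compare_release_versions(version1: str, version2: str) -> VersionComparison:
    """Compare dotted numeric versions; a simplification of PEP 440."""
    a = tuple(int(part) for part in version1.split("."))
    b = tuple(int(part) for part in version2.split("."))
    # Pad with zeros so that "1.2" and "1.2.0" compare as equal.
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    if a > b:
        return VersionComparison.V1_NEWER
    if a < b:
        return VersionComparison.V2_NEWER
    return VersionComparison.SAME
```

Comparing integer tuples rather than strings is what makes 1.10.0 sort after 1.9.0.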

dgenerate.resources.get_icon_data() → bytes[source]: Get dgenerate’s .ico icon file as an array of bytes.

dgenerate.resources.get_icon_path() → str[source]: Get a path to dgenerate’s .ico icon file.

dgenerate.resources.get_release_info() → CurrentReleaseInfo[source]: Return release information, commit and branch will be None inside the development environment.

dgenerate.resources.version()[source]: Code version. In the form MAJOR.MINOR.PATCH.

dgenerate.renderloop module

The main dgenerate render loop, which implements the primary functionality of dgenerate.

exception dgenerate.renderloop.RenderLoopConfigError[source]

Bases: Exception

Raised by RenderLoopConfig.check() on configuration errors.

class dgenerate.renderloop.AnimationETAEvent(origin, frame_index: int, total_frames: int, eta: timedelta)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when there is an update about the estimated finish time of an animation being generated.

__init__(origin, frame_index: int, total_frames: int, eta: timedelta)[source]

eta: timedelta: Current estimated time to complete the animation.

frame_index: int: Frame index at which the ETA was calculated.

total_frames: int: Total frames needed for the animation to complete.

class dgenerate.renderloop.AnimationFileFinishedEvent(origin: RenderLoop, path: str, config_filename: str, starting_event: StartingAnimationFileEvent)[source]

Bases: Event

Generated in the event stream of RenderLoop.events()

Occurs when an animation (video or animated image) has finished being written to disk.

__init__(origin: RenderLoop, path: str, config_filename: str, starting_event: StartingAnimationFileEvent)[source]

config_filename: str | None: Path to a dgenerate config file if output_configs is enabled.

path: str: Path on disk where the video/animated image was saved.

starting_event: StartingAnimationFileEvent: Animation StartingAnimationFileEvent related to this file finished event.

class dgenerate.renderloop.AnimationFinishedEvent(origin, starting_event: StartingAnimationEvent)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are done generating.

This occurs whether an animation was written to disk or not.

__init__(origin, starting_event: StartingAnimationEvent)[source]

starting_event: StartingAnimationEvent: Animation StartingAnimationEvent related to this finished event.

class dgenerate.renderloop.ImageFileSavedEvent(origin: RenderLoop, generated_event: ImageGeneratedEvent, path: str, config_filename: str | None = None)[source]

Bases: Event

Generated in the event stream of RenderLoop.events()

Occurs when an image file is written to disk.

__init__(origin: RenderLoop, generated_event: ImageGeneratedEvent, path: str, config_filename: str | None = None)[source]

config_filename: str | None = None: Path to a dgenerate config file if output_configs is enabled.

generated_event: ImageGeneratedEvent: The ImageGeneratedEvent for the image that was saved.

path: str: Path to the saved image.

class dgenerate.renderloop.ImageGeneratedEvent(origin: RenderLoop, image: Image | None, latents: Tensor | None, generation_step: int, batch_index: int, suggested_directory: str, suggested_filename: str, diffusion_args: DiffusionArguments, image_seed: ImageSeed, command_string: str, config_string: str)[source]

Bases: Event

Generated in the event stream of RenderLoop.events()

Occurs when an image is generated (but not saved yet).

__init__(origin: RenderLoop, image: Image | None, latents: Tensor | None, generation_step: int, batch_index: int, suggested_directory: str, suggested_filename: str, diffusion_args: DiffusionArguments, image_seed: ImageSeed, command_string: str, config_string: str)[source]

batch_index: int: The index in the image batch for this image. Will only ever be greater than zero if RenderLoopConfig.batch_size > 1 and RenderLoopConfig.batch_grid_size is None.

command_string: str

Reproduction of a command line that can be used to reproduce this image.

This does not include the --device argument.

config_string: str

Reproduction of a dgenerate config file that can be used to reproduce this image.

This does not include the --device argument.

diffusion_args: DiffusionArguments: Diffusion argument object, contains dgenerate.pipelinewrapper.DiffusionPipelineWrapper arguments used to produce this image.

property frame_index: int | None: The frame index if this is an animation frame. Also available through image_seed.frame_index, though here for convenience.

generation_step: int: The current generation step. (zero indexed)

image: Image | None: The generated image. Will be None if latent output is being used.

image_seed: ImageSeed | None: If an --image-seeds specification was used in the generation of this image, this object represents that image seed and contains the images that contributed to the generation of this image.

property is_animation_frame: bool: Is this image a frame in an animation?

property is_image_output: bool: Is this event representing image output?

property is_latents: bool: Is this event representing latents tensor output?

latents: Tensor | None: The generated latents tensor. Will be None if image output is being used.

suggested_directory: str

A suggested directory path for saving this image in.

A value of '.' may be present, this indicates the current working directory.

suggested_filename: str: A suggested filename for saving this image as. This filename will be unique to the render loop run / configuration. This filename will not contain RenderLoopConfig.output_path, it is the suggested filename by itself.

class dgenerate.renderloop.RenderLoop(config: RenderLoopConfig | None = None, image_processor_loader: ImageProcessorLoader | None = None, prompt_weighter_loader: PromptWeighterLoader | None = None, model_extra_modules: dict[str, Any] = None, second_model_extra_modules: dict[str, Any] = None, disable_writes: bool = False)[source]

Bases: object

Render loop which implements the bulk of dgenerate’s rendering capability.

This object handles the scatter gun iteration over requested diffusion parameters, the generation of animations, and writing images and media to disk or providing those to library users through callbacks.

__init__(config: RenderLoopConfig | None = None, image_processor_loader: ImageProcessorLoader | None = None, prompt_weighter_loader: PromptWeighterLoader | None = None, model_extra_modules: dict[str, Any] = None, second_model_extra_modules: dict[str, Any] = None, disable_writes: bool = False)[source]

Parameters:

config – RenderLoopConfig or dgenerate.arguments.DgenerateArguments. If None is provided, a RenderLoopConfig instance will be created and assigned to RenderLoop.config.
image_processor_loader – dgenerate.imageprocessors.ImageProcessorLoader. If None is provided, an instance will be created and assigned to RenderLoop.image_processor_loader.
prompt_weighter_loader – dgenerate.promptweighters.PromptWeighterLoader. If None is provided, an instance will be created and assigned to RenderLoop.prompt_weighter_loader.
model_extra_modules – Extra raw diffusers modules to use in the creation of the main model pipeline.
second_model_extra_modules – Extra raw diffusers modules to use in the creation of any refiner or stable cascade decoder model pipeline.
disable_writes – Disable or enable all writes to disk, if you intend to only ever use the event stream of the render loop when using dgenerate as a library, this is a useful option. RenderLoop.written_images and RenderLoop.written_animations will not be available if writes to disk are disabled.

events() → RenderLoopEventStream[source]

Run the render loop, and iterate over a stream of event objects produced by the render loop.

This calls RenderLoopConfig.check() on a copy of your configuration prior to running.

Event objects are of the union type RenderLoopEvent

The exceptions mentioned here are those you may encounter upon iterating, they will not occur upon simple acquisition of the event stream iterator.

Raises:

Returns:

RenderLoopEventStream
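
The note about exceptions surfacing only during iteration matches how a Python generator behaves, and the event objects are usually told apart with isinstance checks. The sketch below demonstrates both points with stand-in classes; it assumes a generator-style stream, and the dataclasses and event_stream function are hypothetical simplifications that share only their names and a few fields with the real events.

```python
from dataclasses import dataclass


@dataclass
class ImageGeneratedEvent:  # stand-in with a subset of the documented fields
    generation_step: int
    suggested_filename: str


@dataclass
class ImageFileSavedEvent:  # stand-in with a subset of the documented fields
    path: str


def event_stream(fail: bool = False):
    """Hypothetical event stream; errors are raised lazily, on first iteration."""
    if fail:
        raise ValueError("configuration error")
    yield ImageGeneratedEvent(0, "image_0.png")
    yield ImageFileSavedEvent("output/image_0.png")


# Acquiring the iterator raises nothing, even for a failing configuration.
stream = event_stream(fail=True)
try:
    next(stream)
    raised = False
except ValueError:
    raised = True

# Dispatch on event type while iterating a working stream.
generated, saved = [], []
for event in event_stream():
    if isinstance(event, ImageGeneratedEvent):
        generated.append(event.suggested_filename)
    elif isinstance(event, ImageFileSavedEvent):
        saved.append(event.path)
```

In practice this means a try/except intended to catch render loop errors has to wrap the loop that consumes the events, not just the call that obtains the stream.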

run()[source]

Run the diffusion loop, this calls RenderLoopConfig.check() on a copy of your configuration prior to running.

Raises:

config: RenderLoopConfig: The render loop’s generation related configuration.

disable_writes: bool = False

Disable or enable all writes to disk, if you intend to only ever use the event stream of the render loop when using dgenerate as a library, this is a useful option.

RenderLoop.written_images and RenderLoop.written_animations will not be available if writes to disk are disabled.

property generation_step: Returns the current generation step, (zero indexed)

image_processor_loader: ImageProcessorLoader: Responsible for loading any image processors referenced in the render loop configuration.

model_extra_modules: dict[str, Any] = None: Extra raw diffusers modules to use in the creation of the main model pipeline.

property pipeline_wrapper: DiffusionPipelineWrapper

Get the last used dgenerate.pipelinewrapper.DiffusionPipelineWrapper instance.

Will be None if RenderLoop.run() has never been called.

Returns:: dgenerate.pipelinewrapper.DiffusionPipelineWrapper or None

prompt_weighter_loader: PromptWeighterLoader: Responsible for loading any prompt weighters referenced in the render loop configuration.

second_model_extra_modules: dict[str, Any] = None: Extra raw diffusers modules to use in the creation of any refiner or stable cascade decoder model pipeline.

property written_animations: Iterable[str]: Iterable over animation filenames written by the last run

property written_images: Iterable[str]: Iterable over image filenames written by the last run

class dgenerate.renderloop.RenderLoopConfig[source]

Bases: SetFromMixin

This object represents configuration for RenderLoop.

It nearly directly maps to dgenerate’s command line arguments.

See subclass dgenerate.arguments.DgenerateArguments

__init__()[source]

apply_prompt_upscalers()[source]

Apply requested prompt upscaling operations to all prompts in the configuration.

This potentially modifies the configuration in place, specifically the prompt arguments.

Raises:

dgenerate.promptupscalers.PromptUpscalerNotFoundError –
dgenerate.promptupscalers.PromptUpscalerArgumentError –
dgenerate.promptupscalers.PromptUpscalerProcessingError –

calculate_generation_steps() → int[source]

Calculate the number of generation steps that this configuration results in.

This factors in diffusion parameter combinations as well as scheduler combinations.

Returns:: int

check(attribute_namer: Callable[[str], str] | None = None)[source]

Check the configuration for type and logical usage errors, set defaults for certain values when needed and not specified.

This may modify the configuration.

Parameters:: attribute_namer – Callable for naming attributes mentioned in exception messages

copy() → RenderLoopConfig[source]

Create a deep copy of this RenderLoopConfig instance.

Returns:: RenderLoopConfig instance that is a deep copy of this instance.

is_output_latents() → bool[source]

Check if the current image_format results in outputting latents.

Returns:: True if the image output format indicates to output latents.

iterate_diffusion_args(**overrides) → Iterator[DiffusionArguments][source]

Iterate over dgenerate.pipelinewrapper.DiffusionArguments argument objects using every combination of argument values provided for that object by this configuration.

Parameters:: overrides – use key word arguments to override specific attributes of this object with a new list value.
Returns:: an iterator over dgenerate.pipelinewrapper.DiffusionArguments
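
"Every combination of argument values" is a Cartesian product over the configured lists. The sketch below shows the idea with itertools.product on three of the list-valued settings; it is not the library's implementation, the nesting order used by dgenerate is not stated here, and iterate_combinations is a hypothetical helper.

```python
from itertools import product
from math import prod

guidance_scales = [5.0, 7.5]
inference_steps = [20, 30]
seeds = [1, 2, 3]


def iterate_combinations(**value_lists):
    """Yield one dict per combination of the supplied value lists."""
    names = list(value_lists)
    for values in product(*value_lists.values()):
        yield dict(zip(names, values))


combos = list(iterate_combinations(
    guidance_scale=guidance_scales,
    inference_steps=inference_steps,
    seed=seeds,
))

# The number of combinations is the product of the list lengths.
step_count = prod(len(v) for v in (guidance_scales, inference_steps, seeds))
```

This is also why the step count grows multiplicatively: adding a third guidance scale here would raise the total from 12 to 18. The real calculate_generation_steps() additionally factors in scheduler combinations.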

adetailer_class_filter: Collection[int | str] | None = None: A collection of class IDs and/or class names that indicates which YOLO detection classes to keep. This filter is applied before index-filter. Detections that don’t match any of the specified classes will be ignored. Integers are treated as IDs, strings are treated as names.

adetailer_crop_control_image: bool | None = None

Should adetailer crop any control image the same way that it crops the mask?

This is only relevant when using adetailer with ControlNet models.

When enabled, control images will be cropped to match the detected region before being passed to the inpainting pipeline. This can help ensure that the control guidance is properly aligned with the area being inpainted.

When disabled (default), control images will be resized to match the cropped region size without cropping.

This corresponds to the --adetailer-crop-control-image argument of the dgenerate command line tool.

adetailer_detector_paddings: Sequence[int | tuple[int, int] | tuple[int, int, int, int]] | None = None

One or more adetailer detector padding values.

This value specifies the amount of padding that will be added to the detection rectangle which is used to generate a masked area. The default is 0. Positive padding makes the mask area around the detected feature larger, and negative padding makes it smaller.

Example:

32 (32px Uniform, all sides)

(10, 20) (10px Horizontal, 20px Vertical)

(10, 20, 30, 40) (10px Left, 20px Top, 30px Right, 40px Bottom)

Defaults to [0].
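
The three padding forms listed in the example can all be expanded to a single (left, top, right, bottom) tuple. This is a minimal sketch of that expansion; normalize_padding is a hypothetical helper and not a dgenerate function.

```python
def normalize_padding(padding):
    """Expand a padding value to a (left, top, right, bottom) tuple."""
    if isinstance(padding, int):
        # Uniform padding on all sides.
        return (padding, padding, padding, padding)
    if len(padding) == 2:
        # (horizontal, vertical): horizontal applies to left and right,
        # vertical applies to top and bottom.
        horizontal, vertical = padding
        return (horizontal, vertical, horizontal, vertical)
    if len(padding) == 4:
        # Already (left, top, right, bottom).
        return tuple(padding)
    raise ValueError(f"unsupported padding value: {padding!r}")


uniform = normalize_padding(32)
two_sided = normalize_padding((10, 20))
four_sided = normalize_padding((10, 20, 30, 40))
```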

adetailer_detector_uris: Sequence[str] | None = None

One or more adetailer YOLO detector model URIs. Corresponds directly to --adetailer-detectors.

Specification of this argument enables the adetailer inpainting algorithm and requires the use of RenderLoopConfig.image_seeds

adetailer_index_filter: Collection[int] | None = None: A list of index values that indicates which YOLO detection indices to keep. Index values start at zero. Detections are sorted by their top left bounding box coordinate from left to right, top to bottom, by (confidence descending). The order of detections in the image is identical to the reading order of words on a page (English). Inpainting will only be performed on the specified detection indices. If no indices are specified, inpainting will be performed on all detections. This filter is applied after class-filter.
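
The ordering of the two filters can be sketched as follows. This is an illustration only: detections are plain dictionaries assumed to be sorted already, filter_detections is a hypothetical helper, and it assumes index values refer to positions in the list that remains after class filtering.

```python
def filter_detections(detections, class_filter=None, index_filter=None):
    """Apply the class filter first, then the index filter."""
    kept = list(detections)
    if class_filter:
        # Integers match class IDs, strings match class names.
        kept = [
            d for d in kept
            if d["class_id"] in class_filter or d["class_name"] in class_filter
        ]
    if index_filter:
        # Assumption: indices count positions after class filtering.
        kept = [d for i, d in enumerate(kept) if i in index_filter]
    return kept


detections = [
    {"label": "a", "class_id": 0, "class_name": "face"},
    {"label": "b", "class_id": 1, "class_name": "hand"},
    {"label": "c", "class_id": 0, "class_name": "face"},
]

selected = filter_detections(detections, class_filter={"face"}, index_filter={1})
```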

adetailer_mask_blurs: Sequence[int] | None = None: Indicates the level of gaussian blur to apply to the inpaint mask generated by adetailer, which can help with smooth blending of the inpainted feature. Defaults to [4].

adetailer_mask_dilations: Sequence[int] | None = None: Indicates the amount of dilation applied to the generated adetailer inpaint mask, see: cv2.dilate. Defaults to [4].

adetailer_mask_paddings: Sequence[int | tuple[int, int] | tuple[int, int, int, int]] | None = None

One or more adetailer mask padding values.

This value indicates how much padding to place around the masked area when cropping out the image to be inpainted, this value must be large enough to accommodate any feathering on the edge of the mask caused by RenderLoopConfig.adetailer_mask_blurs or RenderLoopConfig.adetailer_mask_dilations for the best result.

Example:

32 (32px Uniform, all sides)

(10, 20) (10px Horizontal, 20px Vertical)

(10, 20, 30, 40) (10px Left, 20px Top, 30px Right, 40px Bottom)

Defaults to [32].

adetailer_mask_shapes: Sequence[str] | None = None

One or more adetailer mask shapes to try.

This indicates what mask shape adetailer should attempt to draw around a detected feature, the default value is “rectangle”. You may also specify “circle” to generate an ellipsoid shaped mask, which might be helpful for achieving better blending.

Valid values are: (“r”, “rect”, “rectangle”), or (“c”, “circle”, “ellipse”)

adetailer_model_masks: bool | None = None: Indicates that masks generated by the model itself should be preferred over masks generated from the detection bounding box. If this is True, and the model itself returns mask data, RenderLoopConfig.adetailer_mask_shapes, RenderLoopConfig.adetailer_mask_paddings, and RenderLoopConfig.adetailer_detector_paddings will all be ignored.

adetailer_sizes: Sequence[int] | None = None: One or more target sizes for processing detected areas. When specified, detected areas will always be scaled to this target size (with aspect ratio preserved) for processing, then scaled back to the original size for compositing. This can significantly improve detail quality for small detected features like faces or hands, or reduce processing time for overly large detected areas. The scaling is based on the larger dimension (width or height) of the detected area. The optimal resampling method is automatically selected for both upscaling and downscaling. Each value must be an integer greater than 1. Defaults to none (process at native resolution).
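
Scaling by the larger dimension with the aspect ratio preserved can be sketched with simple arithmetic. This is a minimal illustration under the assumption that sizes are rounded to whole pixels; scale_to_target is a hypothetical helper, and the actual resizing code may round differently.

```python
def scale_to_target(width, height, target_size):
    """Scale (width, height) so the larger dimension equals target_size."""
    larger = max(width, height)
    # Both dimensions use the same ratio so the aspect ratio is preserved.
    new_width = max(1, round(width * target_size / larger))
    new_height = max(1, round(height * target_size / larger))
    return new_width, new_height


upscaled = scale_to_target(100, 50, 512)
downscaled = scale_to_target(2000, 1000, 1024)
```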

animation_format: str = 'mp4': Format for any rendered animations, see: dgenerate.mediaoutput.supported_animation_writer_formats(). This value may also be set to ‘frames’ to indicate that only individual frames should be output and no animation file coalesced. This corresponds to the --animation-format argument of the dgenerate command line tool.

auth_token: str | None = None: Optional huggingface API token which will allow the download of restricted repositories that your huggingface account has been granted access to. This corresponds to the --auth-token argument of the dgenerate command line tool.

batch_grid_size: tuple[int, int] | None = None: Optional image grid size specification for when batch_size is greater than one. This is the --batch-grid-size argument of the dgenerate command line tool.

batch_size: int | None = None: Image generation batch size, --batch-size argument of dgenerate command line tool.

clip_skips: Sequence[int] | None = None: List of clip skip values. Clip skip is the number of layers to be skipped from CLIP while computing the prompt embeddings. A value of 1 means that the output of the pre-final layer will be used for computing the prompt embeddings. Only supported for model_type values sd and sdxl, including with controlnet_uris defined.

control_image_processors: Sequence[str] | None = None: Corresponds to the --control-image-processors argument of the dgenerate command line tool verbatim, including the grouping syntax using the “+” symbol, the plus symbol should be used as its own list element, IE: it is a token.

controlnet_uris: Sequence[str] | None = None: Optional user specified ControlNet URIs, this corresponds to the --control-nets argument of the dgenerate command line tool.

deep_cache: bool = False

Activate DeepCache for the main model?

DeepCache caches intermediate attention layer outputs to speed up the diffusion process. This is beneficial for higher inference steps.

See: https://github.com/horseee/DeepCache

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

deep_cache_branch_ids: Sequence[int] | None = None

Branch IDs to try for DeepCache for the main model.

Controls which branches of the UNet attention blocks the caching is applied to. Advanced usage only.

This value must be greater than or equal to 0.

Each value supplied will be tried in turn.

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

Supplying any value implies that RenderLoopConfig.deep_cache is enabled.

(default: 1)

deep_cache_intervals: Sequence[int] | None = None

Cache intervals to try for DeepCache for the main model.

Controls how frequently the attention layers are cached during the diffusion process. Lower values cache more frequently, potentially resulting in more speedup but using more memory.

This value must be greater than zero.

Each value supplied will be tried in turn.

This is supported for Stable Diffusion, Stable Diffusion XL, Stable Diffusion Upscaler X4, Kolors, and Pix2Pix variants.

Supplying any value implies that RenderLoopConfig.deep_cache is enabled.

(default: 5)

denoising_end: float | None = None

Denoising should end at this fraction of total timesteps (0.0 to 1.0).

This is useful for generating noisy latents that can be saved and passed to other models.

Scheduler Compatibility:

SD 1.5 models: Only stateless schedulers are supported (EulerDiscreteScheduler, LMSDiscreteScheduler, EDMEulerScheduler, DPMSolverMultistepScheduler, DDIMScheduler, DDPMScheduler, PNDMScheduler)
SDXL models: All schedulers supported via native denoising_start/denoising_end
SD3/Flux models: FlowMatchEulerDiscreteScheduler and standard schedulers supported

This corresponds to the --denoising-end argument of the dgenerate command line tool.

denoising_start: float | None = None

Denoising should start at this fraction of total timesteps (0.0 to 1.0).

This is useful for continuing denoising on noisy latents generated with RenderLoopConfig.denoising_end

Scheduler Compatibility:

SD 1.5 models: Only stateless schedulers are supported (EulerDiscreteScheduler, LMSDiscreteScheduler, EDMEulerScheduler, DPMSolverMultistepScheduler, DDIMScheduler, DDPMScheduler, PNDMScheduler)
SDXL models: All schedulers supported via native denoising_start/denoising_end
SD3/Flux models: FlowMatchEulerDiscreteScheduler and standard schedulers supported

This corresponds to the --denoising-start argument of the dgenerate command line tool.
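
As a rough arithmetic illustration of how a shared fraction divides work between two stages, consider a first stage that stops at a given denoising_end and a second stage that resumes with the same value as denoising_start. This is only an approximation: real pipelines apply the fraction to scheduler timesteps, so exact step counts may differ, and split_steps is a hypothetical helper.

```python
def split_steps(inference_steps, fraction):
    """Approximate how many steps run before and after the split fraction."""
    if not 0.0 <= fraction <= 1.0:
        raise ValueError("fraction must be between 0.0 and 1.0")
    first_stage = int(round(inference_steps * fraction))
    return first_stage, inference_steps - first_stage


first_stage, second_stage = split_steps(30, 0.8)
```

With 30 steps and a fraction of 0.8, roughly 24 steps run in the first stage and the remaining 6 in the second.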

device: str = 'cpu'

Processing device specification, for example “cuda” or “cuda:N” where N is an alternate GPU id as reported by nvidia-smi if you want to specify a specific GPU. This corresponds to the --device argument of the dgenerate command line tool.

The default device on MacOS is “mps”.

“xpu” is an option for intel GPUs, for which device indices are also supported.

dtype: DataType = 0: Primary model data type specification, IE: integer precision. Default is auto selection. Lower precision datatypes result in less GPU memory usage. This corresponds to the --dtype argument of the dgenerate command line tool.

frame_end: int | None = None: Optional end frame inclusive frame slice for any rendered animations. This corresponds to the --frame-end argument of the dgenerate command line tool.

frame_start: int = 0: Start frame inclusive frame slice for any rendered animations. This corresponds to the --frame-start argument of the dgenerate command line tool.
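
Because both ends of the frame slice are inclusive, the equivalent Python slice adds one to the end index. This is a minimal sketch over a plain list of frames; slice_frames is a hypothetical helper.

```python
def slice_frames(frames, frame_start=0, frame_end=None):
    """Return frames from frame_start through frame_end, both inclusive."""
    if frame_end is None:
        # No end frame means the slice runs to the last frame.
        return frames[frame_start:]
    return frames[frame_start:frame_end + 1]


frames = list(range(10))
selected_frames = slice_frames(frames, frame_start=2, frame_end=5)
```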

freeu_params: Sequence[tuple[float, float, float, float]] | None = None

FreeU is a technique for improving image quality by re-balancing the contributions from the UNet’s skip connections and backbone feature maps.

This can be used with no cost to performance, to potentially improve image quality.

This argument can be used to specify the FreeU parameters: s1, s2, b1, and b2 in that order.

You can specify the FreeU parameters as a list / sequence of tuples that will be tried in turn for generation.

This argument only applies to models that utilize a UNet: SD1.5/2, SDXL, and Kolors

See: https://huggingface.co/docs/diffusers/main/en/using-diffusers/freeu

And: https://github.com/ChenyangSi/FreeU

guidance_rescales: Sequence[float] | None = None: List of floating point guidance rescale values which are supported by some pipelines, (there will be an error if it is unsupported upon running), this corresponds to the --guidance-rescales argument of the dgenerate command line tool.

guidance_scales: Sequence[float]: List of floating point guidance scales, this corresponds to the --guidance-scales argument of the dgenerate command line tool.

hi_diffusion: bool = False

Activate HiDiffusion for the primary model?

This can increase the resolution at which the model can output images while retaining quality with no overhead, and possibly improved performance.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

hi_diffusion_no_raunet: bool | None = None

Disable RAU-Net when using HiDiffusion for the primary model?

This disables the Resolution-Aware U-Net component of HiDiffusion.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

hi_diffusion_no_win_attn: bool | None = None

Disable window attention when using HiDiffusion for the primary model?

This disables the MSW-MSA (Multi-Scale Window Multi-Head Self-Attention) component of HiDiffusion.

See: https://github.com/megvii-research/HiDiffusion

This is supported for: --model-type sd, sdxl, and kolors.

image_encoder_uri: str | None = None

Optional user specified Image Encoder URI when using IP Adapter models or Stable Cascade. This corresponds to the --image-encoder argument of the dgenerate command line tool.

If none of your specified --ip-adapters URIs point to a model which contains an Image Encoder model, you will need to specify one manually using this argument.

image_format: str = 'png'

Format for any images that are written including animation frames.

Anything other than “png”, “jpg”, or “jpeg” is not compatible with output_metadata=True and a RenderLoopConfigError will be raised upon running the render loop if output_metadata=True and this value is not one of those mentioned formats.

image_guidance_scales: Sequence[float] | None = None: Optional list of floating point image guidance scales, used for pix2pix model types, this corresponds to the --image-guidance-scales argument of the dgenerate command line tool.

image_seed_strengths: Sequence[float] | None = None: Optional list of floating point image seed strengths, this corresponds to the --image-seed-strengths argument of the dgenerate command line tool.

image_seeds: Sequence[str] | None = None: List of --image-seeds URI strings.

img2img_latents_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing img2img latents before pipeline execution.

These processors are applied to latent tensors provided through the RenderLoopConfig.image_seeds argument when doing img2img with tensor inputs. The processors are applied in sequence and may occur before VAE decoding (for models that decode img2img latents) or before direct pipeline usage.

This corresponds to the --img2img-latents-processors argument of the dgenerate command line tool.

inference_steps: Sequence[int]: List of inference steps values, this corresponds to the --inference-steps argument of the dgenerate command line tool.

inpaint_crop: bool = False

Enable cropping to mask bounds for inpainting. When enabled, input images will be automatically cropped to the bounds of their masks (plus any padding) before processing, then the generated result will be pasted back onto the original uncropped image. This allows inpainting at higher effective resolutions for better quality results.

Note: Inpaint crop cannot be used with multiple input images. Each image/mask pair must be processed individually for optimal cropping, as different masks may have different bounds. However, batch_size > 1 is supported for generating multiple variations of a single crop.

This corresponds to the --inpaint-crop argument of the dgenerate command line tool.

inpaint_crop_feathers: Sequence[int] | None = None

One or more feather values to use when pasting the generated result back onto the original image for inpaint cropping. Specifying this automatically enables RenderLoopConfig.inpaint_crop. Each value will be tried in turn (combinatorial). Feathering creates smooth transitions from opaque to transparent. Cannot be used together with RenderLoopConfig.inpaint_crop_masked.

This corresponds to the --inpaint-crop-feathers argument of the dgenerate command line tool.

inpaint_crop_masked: bool = False

Use the mask when pasting the generated result back onto the original image for inpaint cropping. Specifying this automatically enables RenderLoopConfig.inpaint_crop. This means only the masked areas will be replaced. Cannot be used together with RenderLoopConfig.inpaint_crop_feathers.

This corresponds to the --inpaint-crop-masked argument of the dgenerate command line tool.

inpaint_crop_paddings: Sequence[int | tuple[int, int] | tuple[int, int, int, int]] | None = None

One or more padding values to use around mask bounds for inpaint cropping. Each value will be tried in turn (combinatorial). Specifying this automatically enables RenderLoopConfig.inpaint_crop.

Padding can be specified as:

A single integer (e.g., 32) for uniform padding on all sides
“WIDTHxHEIGHT” format (e.g., “10x20”) for horizontal and vertical padding
“LEFTxTOPxRIGHTxBOTTOM” format (e.g., “5x10x5x15”) for specific side padding

This corresponds to the --inpaint-crop-paddings argument of the dgenerate command line tool.
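
The string forms listed above resemble command line values, while the type annotation of this attribute takes integers and tuples. A minimal sketch of a converter between the two follows; parse_padding is a hypothetical helper, not a dgenerate function.

```python
def parse_padding(text):
    """Convert '32', '10x20', or '5x10x5x15' into an int or a tuple of ints."""
    parts = [int(part) for part in text.lower().split("x")]
    if len(parts) == 1:
        # Single integer: uniform padding on all sides.
        return parts[0]
    if len(parts) in (2, 4):
        # (horizontal, vertical) or (left, top, right, bottom).
        return tuple(parts)
    raise ValueError(f"unsupported padding specification: {text!r}")


paddings = [parse_padding(value) for value in ("32", "10x20", "5x10x5x15")]
```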

ip_adapter_uris: Sequence[str] | None = None: Optional user specified IP Adapter URIs, this corresponds to the --ip-adapters argument of the dgenerate command line tool.

latents: Sequence[Tensor] | None = None: Optional list of tensors containing noisy latents to use as starting points for diffusion. These latents can be generated by using --denoising-end with --image-format pt/pth/safetensors to save intermediate noisy latents from a previous generation. This allows for advanced workflows where you can pass partially denoised latents between different models or generation stages.

latents_post_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing output latents when outputting to latents.

These processors are applied to latents when RenderLoopConfig.image_format is set to a tensor format (pt, pth, safetensors). The processors are applied in sequence after the diffusion pipeline generates the latents but before they are returned in the result.

This corresponds to the --latents-post-processors argument of the dgenerate command line tool.

latents_processors: Sequence[str] | None = None

One or more latents processor URI strings for processing raw input latents before pipeline execution.

These processors are applied to latents provided through the RenderLoopConfig.latents argument (raw latents used as noise initialization). The processors are applied in sequence before the latents are passed to the diffusion pipeline.

This corresponds to the --latents-processors argument of the dgenerate command line tool.

lora_fuse_scale: float | None = None

Optional global LoRA fuse scale, this corresponds to the --lora-fuse-scale argument of the dgenerate command line tool.

LoRA weights are merged into the main model at this scale.

When specifying multiple LoRA models, they are fused together into one set of weights using their individual scale values, after which they are fused into the main model at this scale value.

The default value when None is specified is 1.0.

lora_uris: Sequence[str] | None = None: Optional user specified LoRA URIs, this corresponds to the --loras argument of the dgenerate command line tool.

mask_image_processors: Sequence[str] | None = None: Corresponds to the --mask-image-processors argument of the dgenerate command line tool verbatim.

max_sequence_length: int | None = None

Max number of prompt tokens that the T5EncoderModel (text encoder 3) of Stable Diffusion 3 or Flux can handle.

This defaults to 256 for SD3 when not specified, and 512 for Flux.

The maximum value is 512 and the minimum value is 1.

High values result in more resource usage and processing time.
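
The defaults and limits described above can be summarized in a small resolver. This is a minimal sketch; resolve_max_sequence_length and the "sd3" / "flux" keys are hypothetical names used for illustration, and the real validation lives inside dgenerate.

```python
def resolve_max_sequence_length(model_family, value=None):
    """Return the effective max sequence length for 'sd3' or 'flux'."""
    defaults = {"sd3": 256, "flux": 512}
    if value is None:
        # Fall back to the model-specific default.
        return defaults[model_family]
    if not 1 <= value <= 512:
        raise ValueError("max_sequence_length must be between 1 and 512")
    return value


sd3_default = resolve_max_sequence_length("sd3")
flux_default = resolve_max_sequence_length("flux")
```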

model_cpu_offload: bool = False: Force model cpu offloading for the main pipeline, this may reduce memory consumption and allow large models to run when they would otherwise not fit in your GPU’s VRAM. Inference will be slower. Mutually exclusive with RenderLoopConfig.model_sequential_offload

model_path: str | None = None: Primary diffusion model path, model_path argument of dgenerate command line tool.

model_sequential_offload: bool = False: Force sequential model offloading for the main pipeline, this may drastically reduce memory consumption and allow large models to run when they would otherwise not fit in your GPU’s VRAM. Inference will be much slower. Mutually exclusive with RenderLoopConfig.model_cpu_offload

model_type: ModelType = 0: Corresponds to the --model-type argument of the dgenerate command line tool.

no_aspect: bool = False: Should Seed, Mask, and Control guidance images specified in RenderLoopConfig.image_seeds definitions (--image-seeds) have their aspect ratio ignored when being resized to RenderLoopConfig.output_size (--output-size) ?

no_frames: bool = False: Should individual frames not be output when rendering an animation? Defaults to False. This corresponds to the --no-frames argument of the dgenerate command line tool. Using this option when RenderLoopConfig.animation_format is equal to "frames" will cause a RenderLoopConfigError to be raised.

offline_mode: bool = False: Avoid ever connecting to the internet to download anything? This can be used if all your models / media are cached or if you are only ever using resources that exist on disk already. Corresponds to the --offline-mode argument of the dgenerate command line tool.

original_config: str | None = None: This option can be used to supply an original LDM config .yaml file that was provided with a single file checkpoint.

output_auto1111_metadata: bool = False: Write Automatic1111 compatible metadata to the metadata of all written images? this data is not written to animated files, only PNGs and JPEGs. This corresponds to the --output-metadata argument of the dgenerate command line tool.

output_configs: bool = False: Output a config text file next to every generated image or animation? This file will contain configuration that is pipeable to dgenerate stdin, which will reproduce said file. This corresponds to the --output-configs argument of the dgenerate command line tool.

output_metadata: bool = False: Write config text to the metadata of all written images? This data is not written to animated files, only PNGs and JPEGs. This corresponds to the --output-metadata argument of the dgenerate command line tool.

output_overwrite: bool = False: Allow overwrites of files? or avoid this with a file suffix in a multiprocess safe manner? This corresponds to the --output-overwrite argument of the dgenerate command line tool.

output_path: str = 'output': Render loop write folder, where images and animations will be written. This corresponds to the --output-path argument of the dgenerate command line tool.

output_prefix: str | None = None: Output filename prefix, add an optional prefix string to all written files. This corresponds to the --output-prefix argument of the dgenerate command line tool.

output_size: tuple[int, int] | None = None: Desired output size, sizes not aligned by 8 pixels will result in an error message. This corresponds to the --output-size argument of the dgenerate command line tool.
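
The alignment requirement amounts to both dimensions being divisible by 8. A minimal sketch of such a check follows; is_aligned is a hypothetical helper, not a dgenerate function.

```python
def is_aligned(size, alignment=8):
    """Return True when both width and height are multiples of alignment."""
    width, height = size
    return width % alignment == 0 and height % alignment == 0


aligned = is_aligned((512, 768))
misaligned = is_aligned((500, 512))
```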

pag: bool = False: Use perturbed attention guidance?

pag_adaptive_scales: Sequence[float] | None = None: List of floating point adaptive perturbed attention guidance scales, this corresponds to the --pag-adaptive-scales argument of the dgenerate command line tool.

pag_scales: Sequence[float] | None = None: List of floating point perturbed attention guidance scales, this corresponds to the --pag-scales argument of the dgenerate command line tool.

parsed_image_seeds: Sequence[ImageSeedParseResult] | None = None: The results of parsing URIs mentioned in RenderLoopConfig.image_seeds, will only be available if RenderLoopConfig.check() has been called.

post_processors: Sequence[str] | None = None: Corresponds to the --post-processors argument of the dgenerate command line tool verbatim.

prompt_upscaler_uri: str | Sequence[str] | None = None

The URI of a prompt-upscaler implementation supported by dgenerate.

This may also be a list of URIs, the prompt upscalers will be chained together sequentially.

This corresponds to the --prompt-upscaler argument of the dgenerate command line tool.

prompt_weighter_uri: str | None = None

The URI of a prompt-weighter implementation supported by dgenerate.

This corresponds to the --prompt-weighter argument of the dgenerate command line tool.

prompts: Sequence[Prompt]: List of prompt objects, this corresponds to the --prompts argument of the dgenerate command line tool.

quantizer_map: Sequence[str] | None = None: Collection of pipeline submodule names to which quantization should be applied when RenderLoopConfig.quantizer_uri is provided. Valid values include: unet, transformer, text_encoder, text_encoder_2, text_encoder_3. If None, all supported modules will be quantized.

quantizer_uri: str | None = None

Global quantizer URI for the main pipeline, this corresponds to the --quantizer argument of the dgenerate command line tool.

The quantization backend and settings specified by this URI will be used globally on the most appropriate models associated with the main diffusion pipeline.

ras: bool = False

Activate RAS (Region-Adaptive Sampling) for the primary model?

This can increase inference speed with SD3.

See: https://github.com/microsoft/ras

This is supported for: --model-type sd3.

ras_end_steps: Sequence[int] | None = None

Ending steps to try for RAS (Region-Adaptive Sampling).

This controls when RAS stops applying its sampling strategy. Must be greater than or equal to 1. Defaults to the number of inference steps if not specified.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_error_reset_steps: Sequence[Sequence[int]] | None = None

Error reset step patterns to try for RAS (Region-Adaptive Sampling).

The dense sampling steps inserted between the RAS steps to reset the accumulated error. Should be a list of lists of step numbers, e.g. [[12, 22], …].

Each list will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_high_ratios: Sequence[float] | None = None

High ratios to try for RAS (Region-Adaptive Sampling).

Based on the metric selected, the ratio of the high value chosen to be cached. Default value is 1.0, but can be set between 0 and 1 to balance the sample ratio between the main subject and the background.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_index_fusion: bool | None = None

Enable index fusion in RAS (Region-Adaptive Sampling) for the primary model?

This can improve attention computation in RAS for SD3.

See: https://github.com/microsoft/ras

Setting to True implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3, (but not for SD3.5 models)

ras_metrics: Sequence[str] | None = None

One or more RAS metrics to try.

This controls how RAS measures the importance of tokens for caching. Valid values are “std” (standard deviation) or “l2norm” (L2 norm). Defaults to “std”.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_sample_ratios: Sequence[float] | None = None

Sample ratios to try for RAS (Region-Adaptive Sampling).

For instance, setting this to 0.5 on a sequence of 4096 tokens will result in the noise of 2048 tokens, on average, being updated during each RAS step. Must be between 0 and 1.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_skip_num_step_lengths: Sequence[int] | None = None

Skip step lengths to try for RAS (Region-Adaptive Sampling).

This controls the length of steps to skip between RAS steps. Must be greater than or equal to 0.

When set to 0, static dropping is used where the same number of tokens are skipped at each step (except for error reset steps and steps before RenderLoopConfig.ras_start_steps).

When greater than 0, dynamic dropping is used where the number of skipped tokens varies over time based on RenderLoopConfig.ras_skip_num_steps. The pattern repeats every RenderLoopConfig.ras_skip_num_step_lengths steps.

Each value will be tried in turn.

Supplying any value implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

(default: 0)

ras_skip_num_steps: Sequence[int] | None = None

Skip steps to try for RAS (Region-Adaptive Sampling).

This controls the number of steps to skip between RAS steps.

The actual number of tokens skipped will be rounded down to the nearest multiple of 64. This ensures efficient memory access patterns for the attention computation.

When used with RenderLoopConfig.ras_skip_num_step_lengths > 0, this value determines how much to increase/decrease the number of skipped tokens over time. A positive value will increase the number of skipped tokens, while a negative value will decrease it.

Each value will be tried in turn.

Supplying any value implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

(default: 0)
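
Rounding a token count down to the nearest multiple of 64, as described above, is a single integer division. This is a minimal sketch of the arithmetic only; round_down_to_multiple is a hypothetical helper and does not reproduce the RAS implementation.

```python
def round_down_to_multiple(count, multiple=64):
    """Round count down to the nearest multiple of the given value."""
    return (count // multiple) * multiple


rounded = [round_down_to_multiple(n) for n in (63, 100, 128, 200)]
```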

ras_start_steps: Sequence[int] | None = None

Starting steps to try for RAS (Region-Adaptive Sampling).

This controls when RAS begins applying its sampling strategy. Must be greater than or equal to 1. Defaults to 4 if not specified.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

ras_starvation_scales: Sequence[float] | None = None

Starvation scales to try for RAS (Region-Adaptive Sampling).

RAS tracks how often a token is dropped and incorporates this count as a scaling factor in the metric for selecting tokens. This scale factor prevents excessive blurring or noise in the final generated image. A larger scaling factor will result in more uniform sampling. Usually set between 0.0 and 1.0.

Each value will be tried in turn.

Supplying any values implies that RenderLoopConfig.ras is enabled.

This is supported for: --model-type sd3.

revision: str = 'main': Repo revision selector for the main model when loading from a huggingface repository. This corresponds to the --revision argument of the dgenerate command line tool.

s_cascade_decoder_uri: str | None = None: Stable Cascade model URI, --s-cascade-decoder argument of dgenerate command line tool.

sada: bool = False

Enable SADA (Stability-guided Adaptive Diffusion Acceleration) with default parameters for the primary model.

Specifying this alone is equivalent to setting all SADA parameters to their model-specific default values:

SD/SD2: sada_max_downsamples=1, sada_sxs=3, sada_sys=3, sada_lagrange_terms=4, sada_lagrange_ints=4, sada_lagrange_steps=24, sada_max_fixes=5120
SDXL/Kolors: sada_max_downsamples=2, sada_sxs=3, sada_sys=3, sada_lagrange_terms=4, sada_lagrange_ints=4, sada_lagrange_steps=24, sada_max_fixes=10240
Flux: sada_max_downsamples=0, sada_lagrange_terms=3, sada_lagrange_ints=4, sada_lagrange_steps=20, sada_max_fixes=0

SADA is not compatible with HiDiffusion, DeepCache, or TeaCache.
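
The model-specific defaults listed above can be collected into a lookup table. This is a minimal sketch using the values from that list; the family keys ("sd", "sdxl", "flux") and the sada_defaults helper are hypothetical names chosen for illustration.

```python
SADA_DEFAULTS = {
    # SD/SD2
    "sd": dict(sada_max_downsamples=1, sada_sxs=3, sada_sys=3,
               sada_lagrange_terms=4, sada_lagrange_ints=4,
               sada_lagrange_steps=24, sada_max_fixes=5120),
    # SDXL/Kolors
    "sdxl": dict(sada_max_downsamples=2, sada_sxs=3, sada_sys=3,
                 sada_lagrange_terms=4, sada_lagrange_ints=4,
                 sada_lagrange_steps=24, sada_max_fixes=10240),
    # Flux (no sada_sxs / sada_sys defaults are listed for it)
    "flux": dict(sada_max_downsamples=0, sada_lagrange_terms=3,
                 sada_lagrange_ints=4, sada_lagrange_steps=20,
                 sada_max_fixes=0),
}


def sada_defaults(model_family, **overrides):
    """Return the defaults for a model family with overrides applied."""
    return {**SADA_DEFAULTS[model_family], **overrides}


flux_settings = sada_defaults("flux", sada_lagrange_terms=0)
```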

sada_acc_ranges: Sequence[tuple[int, int]] | None = None

SADA acceleration range start / end steps for the primary model.

Defines the starting / ending step for SADA acceleration.

Starting step must be at least 3 as SADA leverages third-order dynamics.

Defaults to [[10,47]].

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_lagrange_ints: Sequence[int] | None = None

SADA Lagrangian interpolation interval for the primary model.

Interval for Lagrangian interpolation. Must be compatible with sada_lagrange_steps (lagrange_step % lagrange_int == 0).

Model-specific defaults:

SD/SD2: 4
SDXL/Kolors: 4
Flux: 4

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_lagrange_steps: Sequence[int] | None = None

SADA Lagrangian interpolation step for the primary model.

Step value for Lagrangian interpolation. Must be compatible with sada_lagrange_ints (lagrange_step % lagrange_int == 0).

Model-specific defaults:

SD/SD2: 24
SDXL/Kolors: 24
Flux: 20

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.
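The compatibility rule lagrange_step % lagrange_int == 0 can be used to pre-filter candidate values before they are tried. The helper below is a hypothetical sketch; how dgenerate itself combines or rejects these values is not shown here.

```python
from itertools import product


def compatible_lagrange_pairs(steps, ints):
    """Return (step, interval) pairs where step is a multiple of interval."""
    return [
        (step, interval)
        for step, interval in product(steps, ints)
        # Guard against a zero interval before taking the modulus.
        if interval > 0 and step % interval == 0
    ]


pairs = compatible_lagrange_pairs([24, 20], [4, 3])
```

With these inputs, the pair (20, 3) is dropped because 20 is not a multiple of 3.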

sada_lagrange_terms: Sequence[int] | None = None

SADA Lagrangian interpolation terms for the primary model.

Number of terms to use in Lagrangian interpolation. Set to 0 to disable Lagrangian interpolation.

Model-specific defaults:

SD/SD2: 4
SDXL/Kolors: 4
Flux: 3

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_max_downsamples: Sequence[int] | None = None

SADA maximum downsample factor for the primary model.

Controls the maximum downsample factor in the SADA algorithm. Lower values can improve quality but may reduce speedup.

Model-specific defaults:

SD/SD2: 1
SDXL/Kolors: 2
Flux: 0

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_max_fixes: Sequence[int] | None = None

SADA maximum fixed memory for the primary model.

Maximum amount of fixed memory to use in SADA optimization.

Model-specific defaults:

SD/SD2: 5120 (5 * 1024)
SDXL/Kolors: 10240 (10 * 1024)
Flux: 0

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_max_intervals: Sequence[int] | None = None

SADA maximum interval for optimization for the primary model.

Maximum interval between optimizations in the SADA algorithm.

Defaults to 4.

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_sxs: Sequence[int] | None = None

SADA spatial downsample factor X for the primary model.

Controls the spatial downsample factor in the X dimension. Higher values can increase speedup but may affect quality.

Model-specific defaults:

SD/SD2: 3
SDXL/Kolors: 3
Flux: 0 (not used)

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

sada_sys: Sequence[int] | None = None

SADA spatial downsample factor Y for the primary model.

Controls the spatial downsample factor in the Y dimension. Higher values can increase speedup but may affect quality.

Model-specific defaults:

SD/SD2: 3
SDXL/Kolors: 3
Flux: 0 (not used)

See: https://github.com/Ting-Justin-Jiang/sada-icml

Supplying any SADA parameter implies that SADA is enabled.

This is supported for: --model-type sd, sdxl, kolors, flux*.

Each value supplied will be tried in turn.

safety_checker: bool = False: Enable safety checker? --safety-checker

scheduler_uri: str | Sequence[str] | None = None

Optional primary model scheduler/sampler class name specification, this corresponds to the --scheduler argument of the dgenerate command line tool. Setting this to ‘help’ will yield a help message to stdout describing scheduler names compatible with the current configuration upon running. Passing ‘helpargs’ will yield a help message with a list of overridable arguments for each scheduler and their typical defaults.

This may be a list of schedulers, indicating to try each scheduler in turn.

sdxl_aesthetic_scores: Sequence[float] | None = None: Optional list of SDXL aesthetic-score conditioning values, this corresponds to the --sdxl-aesthetic-scores argument of the dgenerate command line tool.

sdxl_crops_coords_top_left: Sequence[tuple[int, int]] | None = None: Optional list of SDXL top-left-crop-coords micro-conditioning parameters, this corresponds to the --sdxl-crops-coords-top-left argument of the dgenerate command line tool.

sdxl_high_noise_fractions: Sequence[float] | None = None: Optional list of SDXL refiner high noise fractions (floats). Each value is the fraction of inference steps that the base model handles; the remaining fraction is handled by the refiner model. This corresponds to the --sdxl-high-noise-fractions argument of the dgenerate command line tool.
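As a rough illustration of how a high noise fraction divides the work, the sketch below splits a step count between the base model and the refiner. The function name is hypothetical and the rounding is an assumption; the underlying pipeline may round differently.

```python
def split_steps(total_steps: int, high_noise_fraction: float) -> tuple:
    """Split inference steps into (base_steps, refiner_steps)."""
    if not 0.0 <= high_noise_fraction <= 1.0:
        raise ValueError("high noise fraction must be between 0.0 and 1.0")
    # Assumption: round to the nearest whole step.
    base_steps = round(total_steps * high_noise_fraction)
    return base_steps, total_steps - base_steps


base, refiner = split_steps(30, 0.8)
```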

sdxl_negative_aesthetic_scores: Sequence[float] | None = None: Optional list of negative influence SDXL aesthetic-score conditioning values, this corresponds to the --sdxl-negative-aesthetic-scores argument of the dgenerate command line tool.

sdxl_negative_crops_coords_top_left: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL top-left crop coords micro-conditioning parameters, this corresponds to the --sdxl-negative-crops-coords-top-left argument of the dgenerate command line tool.

sdxl_negative_original_sizes: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL original-size micro-conditioning parameters, this corresponds to the --sdxl-negative-original-sizes argument of the dgenerate command line tool.

sdxl_negative_target_sizes: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL target-size micro-conditioning parameters, this corresponds to the --sdxl-negative-target-sizes argument of the dgenerate command line tool.

sdxl_original_sizes: Sequence[tuple[int, int]] | None = None: Optional list of SDXL original-size micro-conditioning parameters, this corresponds to the --sdxl-original-sizes argument of the dgenerate command line tool.

sdxl_refiner_aesthetic_scores: Sequence[float] | None = None: Optional list of SDXL-refiner override aesthetic-score conditioning values, this corresponds to the --sdxl-refiner-aesthetic-scores argument of the dgenerate command line tool.

sdxl_refiner_clip_skips: Sequence[int] | None = None: Clip skip override values for the SDXL refiner, which normally defaults to the clip skip value for the main model when it is defined.

sdxl_refiner_crops_coords_top_left: Sequence[tuple[int, int]] | None = None: Optional list of SDXL-refiner override top-left-crop-coords micro-conditioning parameters, this corresponds to the --sdxl-refiner-crops-coords-top-left argument of the dgenerate command line tool.

sdxl_refiner_deep_cache: bool | None = None

Activate DeepCache for the SDXL Refiner?

See: RenderLoopConfig.deep_cache

This is supported for Stable Diffusion XL and Kolors based models.

sdxl_refiner_deep_cache_branch_ids: Sequence[int] | None = None

Branch IDs to try for DeepCache for the SDXL Refiner.

Controls which branches of the UNet attention blocks the caching is applied to. Advanced usage only.

This value must be greater than or equal to 0.

Each value supplied will be tried in turn.

This is supported for Stable Diffusion XL and Kolors based models.

Supplying any value implies that RenderLoopConfig.sdxl_refiner_deep_cache is enabled.

(default: 1)

sdxl_refiner_deep_cache_intervals: Sequence[int] | None = None

Cache intervals to try for DeepCache for the SDXL Refiner.

Controls how frequently the attention layers are cached during the diffusion process. Lower values cache more frequently, potentially resulting in more speedup but using more memory.

This value must be greater than zero.

Each value supplied will be tried in turn.

This is supported for Stable Diffusion XL and Kolors based models.

Supplying any value implies that RenderLoopConfig.sdxl_refiner_deep_cache is enabled.

(default: 5)

sdxl_refiner_edit: bool | None = None: Force the SDXL refiner to operate in edit mode instead of cooperative denoising mode.

sdxl_refiner_freeu_params: Sequence[tuple[float, float, float, float]] | None = None

List / sequence of FreeU parameters to try for the SDXL refiner.

See: RenderLoopConfig.freeu_params for clarification.

sdxl_refiner_guidance_rescales: Sequence[float] | None = None: Optional list of guidance rescale value overrides for the SDXL refiner or Stable Cascade decoder, this corresponds to the --sdxl-refiner-guidance-rescales argument of the dgenerate command line tool.

sdxl_refiner_negative_aesthetic_scores: Sequence[float] | None = None: Optional list of negative influence SDXL-refiner override aesthetic-score conditioning values, this corresponds to the --sdxl-refiner-negative-aesthetic-scores argument of the dgenerate command line tool.

sdxl_refiner_negative_crops_coords_top_left: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL-refiner top-left crop coords micro-conditioning parameters, this corresponds to the --sdxl-refiner-negative-crops-coords-top-left argument of the dgenerate command line tool.

sdxl_refiner_negative_original_sizes: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL-refiner override original-size micro-conditioning parameters, this corresponds to the --sdxl-refiner-negative-original-sizes argument of the dgenerate command line tool.

sdxl_refiner_negative_target_sizes: Sequence[tuple[int, int]] | None = None: Optional list of negative influence SDXL-refiner override target-size micro-conditioning parameters, this corresponds to the --sdxl-refiner-negative-target-sizes argument of the dgenerate command line tool.

sdxl_refiner_original_sizes: Sequence[tuple[int, int]] | None = None: Optional list of SDXL-refiner override original-size micro-conditioning parameters, this corresponds to the --sdxl-refiner-original-sizes argument of the dgenerate command line tool.

sdxl_refiner_pag: bool | None = None: Use perturbed attention guidance in the SDXL refiner?

sdxl_refiner_pag_adaptive_scales: Sequence[float] | None = None: List of floating point adaptive perturbed attention guidance scales to try with the SDXL refiner, this corresponds to the --sdxl-refiner-pag-adaptive-scales argument of the dgenerate command line tool.

sdxl_refiner_pag_scales: Sequence[float] | None = None: List of floating point perturbed attention guidance scales to try with the SDXL refiner, this corresponds to the --sdxl-refiner-pag-scales argument of the dgenerate command line tool.

sdxl_refiner_sigmas: Sequence[Sequence[float] | str] | None = None

One or more lists of sigma values to try with the SDXL refiner. This is supported when using a RenderLoopConfig.second_model_scheduler_uri that supports setting sigmas.

Or: string expressions involving sigmas from the selected scheduler, such as sigmas * 0.95. In an expression, sigmas is represented as a numpy array and numpy is available through the namespace np. Expressions are evaluated with asteval.

Lists of floats and strings representing expressions can be intermixed.

Sigma values control the noise schedule in the diffusion process, allowing for fine-grained control over how noise is added and removed during image generation.

This corresponds to the --sdxl-refiner-sigmas command line argument, which accepts multiple comma-separated lists of floating point values, or singular values, or expressions denoted with: expr: ....

You do not need to specify expr: when passing this value in the library, simply pass a string instead of a list of floats.

Example: [[1.0,2.0,3.0], 'sigmas * 0.95']

sdxl_refiner_target_sizes: Sequence[tuple[int, int]] | None = None: Optional list of SDXL-refiner override target-size micro-conditioning parameters, this corresponds to the --sdxl-refiner-target-sizes argument of the dgenerate command line tool.

sdxl_refiner_uri: str | None = None: SDXL Refiner model URI, --sdxl-refiner argument of dgenerate command line tool.

sdxl_t2i_adapter_factors: Sequence[float] | None = None: Optional list of SDXL specific T2I adapter factors to try, this controls the amount of time-steps for which a T2I adapter applies guidance to an image, this is a value between 0.0 and 1.0. A value of 0.5 for example indicates that the T2I adapter is only active for half the amount of time-steps it takes to completely render an image.

sdxl_target_sizes: Sequence[tuple[int, int]] | None = None: Optional list of SDXL target-size micro-conditioning parameters, this corresponds to the --sdxl-target-sizes argument of the dgenerate command line tool.

second_model_cpu_offload: bool | None = None: Force model cpu offloading for the SDXL refiner or Stable Cascade decoder pipeline. This may reduce memory consumption and allow large models to run when they would otherwise not fit in your GPU's VRAM. Inference will be slower. Mutually exclusive with RenderLoopConfig.second_model_sequential_offload.

second_model_guidance_scales: Sequence[float] | None = None: Optional list of guidance scale value overrides for the SDXL refiner or Stable Cascade decoder, this corresponds to the --second-model-guidance-scales argument of the dgenerate command line tool.

second_model_inference_steps: Sequence[int] | None = None: Optional list of inference steps value overrides for the SDXL refiner, this corresponds to the --second-model-inference-steps argument of the dgenerate command line tool.

second_model_original_config: str | None = None: This option can be used to supply an original LDM config .yaml file that was provided with a single file checkpoint for the secondary model, i.e. the SDXL Refiner or Stable Cascade Decoder.

second_model_prompt_upscaler_uri: str | Sequence[str] | None = None

The URI of a prompt-upscaler implementation supported by dgenerate to use with the SDXL refiner or Stable Cascade decoder.

Defaults to RenderLoopConfig.prompt_upscaler_uri if not specified.

This may also be a list of URIs, the prompt upscalers will be chained together sequentially.

This corresponds to the --second-model-prompt-upscaler argument of the dgenerate command line tool.

second_model_prompt_weighter_uri: str | None = None

The URI of a prompt-weighter implementation supported by dgenerate to use with the SDXL refiner or Stable Cascade decoder.

Defaults to RenderLoopConfig.prompt_weighter_uri if not specified.

This corresponds to the --second-model-prompt-weighter argument of the dgenerate command line tool.

second_model_prompts: Sequence[Prompt] | None = None: Optional list of SDXL refiner or Stable Cascade decoder prompt overrides, this corresponds to the --second-model-prompts argument of the dgenerate command line tool.

second_model_quantizer_map: Sequence[str] | None = None: Collection of secondary pipeline submodule names to which quantization should be applied when RenderLoopConfig.second_model_quantizer_uri is provided. Valid values include: unet, transformer, text_encoder, text_encoder_2, text_encoder_3. If None, all supported modules will be quantized.

second_model_quantizer_uri: str | None = None

Global quantizer URI for secondary pipeline (SDXL Refiner or Stable Cascade decoder), this corresponds to the --second-model-quantizer argument of the dgenerate command line tool.

The quantization backend and settings specified by this URI will be used globally on the most appropriate models associated with the secondary diffusion pipeline (SDXL Refiner, Stable Cascade Decoder).

second_model_scheduler_uri: str | Sequence[str] | None = None

Optional SDXL Refiner / Stable Cascade Decoder model scheduler/sampler class name specification, this corresponds to the --second-model-scheduler argument of the dgenerate command line tool. Setting this to ‘help’ will yield a help message to stdout describing scheduler names compatible with the current configuration upon running. Passing ‘helpargs’ will yield a help message with a list of overridable arguments for each scheduler and their typical defaults.

This may be a list of schedulers, indicating to try each scheduler in turn.

second_model_second_prompt_upscaler_uri: str | Sequence[str] | None = None

The URI of a prompt-upscaler implementation supported by dgenerate to use with the SDXL refiner secondary prompts, i.e. RenderLoopConfig.second_model_second_prompts.

Defaults to RenderLoopConfig.prompt_upscaler_uri if not specified.

This may also be a list of URIs, the prompt upscalers will be chained together sequentially.

This corresponds to the --second-model-second-prompt-upscaler argument of the dgenerate command line tool.

second_model_second_prompts: Sequence[Prompt] | None = None: Optional list of SDXL refiner secondary prompt overrides, this corresponds to the --second-model-second-prompts argument of the dgenerate command line tool. The Stable Cascade Decoder does not support this argument.

second_model_sequential_offload: bool | None = None: Force sequential model offloading for the SDXL refiner or Stable Cascade decoder pipeline. This may drastically reduce memory consumption and allow large models to run when they would otherwise not fit in your GPU's VRAM. Inference will be much slower. Mutually exclusive with RenderLoopConfig.second_model_cpu_offload.

second_model_text_encoder_uris: Sequence[str] | None = None: Optional user specified Text Encoder URIs, this corresponds to the --second-model-text-encoders argument of the dgenerate command line tool. This specifies text encoders for the SDXL refiner or Stable Cascade decoder.

second_model_unet_uri: str | None = None: Optional user specified second UNet URI, this corresponds to the --second-model-unet argument of the dgenerate command line tool. This UNet uri will be used for the SDXL refiner or Stable Cascade decoder model.

second_prompt_upscaler_uri: str | Sequence[str] | None = None

The URI of a prompt-upscaler implementation supported by dgenerate that applies to RenderLoopConfig.second_prompts

Defaults to RenderLoopConfig.prompt_upscaler_uri if not specified.

This may also be a list of URIs, the prompt upscalers will be chained together sequentially.

This corresponds to the --second-prompt-upscaler argument of the dgenerate command line tool.

second_prompts: Sequence[Prompt] | None = None: Optional list of SD3 / Flux secondary prompts, this corresponds to the --second-prompts argument of the dgenerate command line tool.

seed_image_processors: Sequence[str] | None = None: Corresponds to the --seed-image-processors argument of the dgenerate command line tool verbatim.

seeds: Sequence[int]: List of integer seeds, this corresponds to the --seeds argument of the dgenerate command line tool.

seeds_to_images: bool = False: Should RenderLoopConfig.seeds be interpreted as seeds for each image input instead of being combined combinatorially? This includes control images.
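The difference between the combinatorial and per-image interpretations can be pictured with a small sketch. The helper name, the pairing order, and the length check are all assumptions for illustration; they are not taken from dgenerate's implementation.

```python
from itertools import product


def pair_seeds(seeds, images, seeds_to_images=False):
    """Pair seeds with image inputs under the two interpretations."""
    if seeds_to_images:
        # Assumption: one seed per image input, so the counts must match.
        if len(seeds) != len(images):
            raise ValueError("need exactly one seed per image input")
        return list(zip(seeds, images))
    # Combinatorial: every seed is tried with every image input.
    return list(product(seeds, images))


combinatorial = pair_seeds([1, 2], ["a.png", "b.png"])
per_image = pair_seeds([1, 2], ["a.png", "b.png"], seeds_to_images=True)
```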

sigmas: Sequence[Sequence[float] | str] | None = None

One or more lists of sigma values to try. This is supported when using a RenderLoopConfig.scheduler_uri that supports setting sigmas.

Or: string expressions involving sigmas from the selected scheduler, such as sigmas * 0.95. In an expression, sigmas is represented as a numpy array and numpy is available through the namespace np. Expressions are evaluated with asteval.

Lists of floats and strings representing expressions can be intermixed.

Sigma values control the noise schedule in the diffusion process, allowing for fine-grained control over how noise is added and removed during image generation.

This corresponds to the --sigmas command line argument, which accepts multiple comma-separated lists of floating point values, or singular values, or expressions denoted with: expr: ....

You do not need to specify expr: when passing this value in the library, simply pass a string instead of a list of floats.

Example: [[1.0,2.0,3.0], 'sigmas * 0.95']
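To show how command-line style values relate to the library value, the sketch below converts tokens into the mixed list form from the example above. The helper sigmas_token_to_value is hypothetical; dgenerate's own argument parsing may differ in details such as whitespace handling.

```python
from typing import List, Union


def sigmas_token_to_value(token: str) -> Union[List[float], str]:
    """Convert one command-line style sigmas token to a library value."""
    token = token.strip()
    if token.startswith("expr:"):
        # In the library, an expression is passed as a plain string
        # without the expr: prefix.
        return token[len("expr:"):].strip()
    # Otherwise treat the token as comma-separated floats (or one value).
    return [float(part) for part in token.split(",")]


sigmas = [sigmas_token_to_value(t) for t in ("1.0,2.0,3.0", "expr: sigmas * 0.95")]
```

The resulting list has the same shape as the documented example and could be assigned to the sigmas attribute of a configuration object.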

subfolder: str | None = None: Primary model subfolder argument, --subfolder argument of dgenerate command line tool.

t2i_adapter_uris: Sequence[str] | None = None: Optional user specified T2IAdapter URIs, this corresponds to the --t2i-adapters argument of the dgenerate command line tool.

tea_cache: bool = False

Activate TeaCache for the primary model?

This is supported for Flux. TeaCache uses a caching mechanism in the forward pass of the Flux transformer to reduce the amount of computation needed to generate an image, which can speed up inference with small amounts of quality loss.

See: https://github.com/ali-vilab/TeaCache

Also see: RenderLoopConfig.tea_cache_rel_l1_thresholds

This is supported for: --model-type flux*.

tea_cache_rel_l1_thresholds: Sequence[float] | None = None

TeaCache relative L1 thresholds to try when RenderLoopConfig.tea_cache is enabled.

This should be one or more float values between 0.0 and 1.0, each value will be tried in turn. Higher values mean more speedup.

Defaults to 0.6 (2.0x speedup). Reference values: 0.25 for 1.5x speedup, 0.4 for 1.8x speedup, 0.6 for 2.0x speedup, 0.8 for 2.25x speedup.

See: https://github.com/ali-vilab/TeaCache

Supplying any value implies that RenderLoopConfig.tea_cache is enabled.

This is supported for: --model-type flux*.
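The documented threshold range and reference speedups can be captured in a few lines. The names below are hypothetical and the table only restates the figures quoted above; actual speedups depend on the model and hardware.

```python
# Reference speedups quoted in the documentation above (threshold -> speedup).
TEA_CACHE_REFERENCE_SPEEDUPS = {0.25: 1.5, 0.4: 1.8, 0.6: 2.0, 0.8: 2.25}


def validate_tea_cache_thresholds(values):
    """Check that every threshold lies between 0.0 and 1.0."""
    checked = []
    for value in values:
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"threshold out of range: {value}")
        checked.append(value)
    return checked


thresholds = validate_tea_cache_thresholds([0.25, 0.6])
```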

text_encoder_uris: Sequence[str] | None = None: Optional user specified Text Encoder URIs, this corresponds to the --text-encoders argument of the dgenerate command line tool.

textual_inversion_uris: Sequence[str] | None = None: Optional user specified Textual Inversion URIs, this corresponds to the --textual-inversions argument of the dgenerate command line tool.

third_prompt_upscaler_uri: str | Sequence[str] | None = None

The URI of a prompt-upscaler implementation supported by dgenerate that applies to RenderLoopConfig.third_prompts

Defaults to RenderLoopConfig.prompt_upscaler_uri if not specified.

This may also be a list of URIs, the prompt upscalers will be chained together sequentially.

This corresponds to the --third-prompt-upscaler argument of the dgenerate command line tool.

third_prompts: Sequence[Prompt] | None = None: Optional list of SD3 tertiary prompts, this corresponds to the --third-prompts argument of the dgenerate command line tool. Flux does not support this argument.

transformer_uri: str | None = None

Optional user specified Transformer URI, this corresponds to the --transformer argument of the dgenerate command line tool.

This is currently only supported for Stable Diffusion 3 and Flux models.

unet_uri: str | None = None: Optional user specified UNet URI, this corresponds to the --unet argument of the dgenerate command line tool.

upscaler_noise_levels: Sequence[int] | None = None: Optional list of integer upscaler noise levels, this corresponds to the --upscaler-noise-levels argument of the dgenerate command line tool that is used for the dgenerate.pipelinewrapper.ModelType.UPSCALER_X4 model type only.

vae_slicing: bool = False: Enable VAE slicing? --vae-slicing

vae_tiling: bool = False: Enable VAE tiling? --vae-tiling

vae_uri: str | None = None: Optional user specified VAE URI, this corresponds to the --vae argument of the dgenerate command line tool.

variant: str | None = None: Primary model weights variant string. This corresponds to the --variant argument of the dgenerate command line tool.

class dgenerate.renderloop.StartingAnimationEvent(origin, total_frames: int, fps: float, frame_duration: float)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are about to start being generated.

This occurs whether an animation is going to be written to disk or not.

__init__(origin, total_frames: int, fps: float, frame_duration: float)[source]

fps: float: FPS of the generated file.

frame_duration: float: Frame duration of the generated file (the time a frame is visible, in milliseconds)

total_frames: int: Number of frames written.

class dgenerate.renderloop.StartingAnimationFileEvent(origin, path: str, total_frames: int, fps: float, frame_duration: float)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a sequence of images that belong to an animation are about to start being written to a file.

__init__(origin, path: str, total_frames: int, fps: float, frame_duration: float)[source]

fps: float: FPS of the generated file.

frame_duration: float: Frame duration of the generated file (the time a frame is visible, in milliseconds)

path: str: File path where the animation will reside.

total_frames: int: Number of frames written.

class dgenerate.renderloop.StartingGenerationStepEvent(origin, generation_step: int, total_steps: int)[source]

Bases: Event

Common event stream object produced by the events() event stream of a render loop.

Occurs when a generation step is starting. A generation step may produce multiple images and/or an animation.

__init__(origin, generation_step: int, total_steps: int)[source]

generation_step: int: The generation step number.

total_steps: int: The total number of steps that are needed to complete the render loop.

dgenerate.renderloop.gen_seeds(n: int) → list[int][source]

Generate a list of N random seed integers

Parameters:: n – number of seeds to generate
Returns:: list of integer seeds
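For readers without the library at hand, the behavior can be approximated as follows. This is a stand-in sketch, not the actual implementation; in particular the seed range used here is an assumption.

```python
import random


def gen_seeds_sketch(n: int) -> list:
    """Generate n random integer seeds (stand-in for gen_seeds)."""
    # Assumption: seeds are drawn from the unsigned 32-bit range.
    return [random.randint(0, 2**32 - 1) for _ in range(n)]


seeds = gen_seeds_sketch(5)
```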

dgenerate.renderloop.RenderLoopEvent

Possible events from the event stream created by RenderLoop.events()

dgenerate.renderloop.RenderLoopEventStream

Event stream created by RenderLoop.events()

dgenerate.textprocessing module

Text processing, console text rendering, and parsing utilities. URI parser, and reusable tokenization.

exception dgenerate.textprocessing.ConceptUriParseError[source]

Bases: Exception

Raised by ConceptUriParser.parse() on parsing errors.

exception dgenerate.textprocessing.ShellParseSyntaxError[source]

Bases: Exception

Raised by shell_parse() on syntax errors.

exception dgenerate.textprocessing.TimeDeltaParseError[source]

Bases: Exception

Raised by parse_timedelta() on parse errors.

exception dgenerate.textprocessing.TokenizedSplitSyntaxError[source]

Bases: Exception

Raised by tokenized_split() on syntax errors.

exception dgenerate.textprocessing.UnquoteSyntaxError[source]

Bases: Exception

Raised by unquote() on parsing errors.

class dgenerate.textprocessing.ArgparseParagraphFormatter(*args, **kwargs)[source]

Bases: HelpFormatter

Argparse formatter which preserves paragraphs in help text.

This formatter also underlines option text for better visual segregation.

__init__(*args, **kwargs)[source]

class dgenerate.textprocessing.BasicMaskShape(value)[source]

Bases: Enum

Represents a basic mask shape

ELLIPSE = 1

RECTANGLE = 0

class dgenerate.textprocessing.ConceptUri(concept: str, args: dict[str, str])[source]

Bases: object

Represents a parsed concept URI.

__init__(concept: str, args: dict[str, str])[source]

copy()[source]

args: dict[str, str | list[str]]: Provided keyword arguments with their (string) values.

concept: str: The primary concept mentioned in the URI.

class dgenerate.textprocessing.ConceptUriParser(concept_name: str, known_args: Iterable[str], args_lists: bool | Iterable[str] | None = None, args_raw: bool | Iterable[str] | None = None, delimiter: str = ';')[source]

Bases: object

Parser for dgenerate concept paths with arguments, i.e. concept;arg1="a";arg2="b"

Used for --vae, --loras etc. as well as image processor plugin module arguments.

__init__(concept_name: str, known_args: Iterable[str], args_lists: bool | Iterable[str] | None = None, args_raw: bool | Iterable[str] | None = None, delimiter: str = ';')[source]

Raises:

ValueError – if duplicate argument names are specified.

Parameters:

concept_name – Concept name, used in error messages
known_args – valid arguments for the parser, must be unique

parse(uri: str) → ConceptUri[source]

Parse a string.

Parameters:

uri – the string

Raises:

ConceptUriParseError – on parsing errors
ValueError – if uri is None

Returns:

ConceptUri

args_lists: bool | set[str] | None

True indicates all arguments can accept a comma separated list.

None or False indicates no arguments can accept a comma separated list.

Assigning a set containing argument names indicates only the specified arguments can accept a comma separated list.

When an argument is parsed as a comma separated list, its value/type in ConceptUri.args will be that of a list.

args_raw: bool | set[str] | None

True indicates all argument values are returned without any unquoting or processing into lists.

None or False indicates no argument values skip extended processing.

Assigning a set containing argument names indicates only the specified arguments skip extended processing (unquoting or splitting).

concept_name: str: Name / title string for this concept. Used in parse error exceptions.

delimiter: str: URI argument delimiter, the default is semicolon.

known_args: set[str]: Unique recognized keyword arguments
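The concept;arg1="a";arg2="b" syntax can be illustrated with a deliberately naive stand-alone parser. This sketch is not the ConceptUriParser implementation: it splits on the delimiter without regard to quoting, does not support list or raw arguments, and raises ValueError where the real parser raises ConceptUriParseError.

```python
def parse_concept_uri_sketch(uri: str, known_args: set, delimiter: str = ";"):
    """Naively parse 'concept;name=value;...' into (concept, args)."""
    if uri is None:
        raise ValueError("uri must not be None")
    parts = uri.split(delimiter)
    concept = parts[0].strip()
    args = {}
    for part in parts[1:]:
        name, sep, value = part.partition("=")
        name, value = name.strip(), value.strip()
        if not sep:
            raise ValueError(f"argument {name!r} is missing a value")
        if name not in known_args:
            raise ValueError(f"unknown argument: {name!r}")
        if name in args:
            raise ValueError(f"duplicate argument: {name!r}")
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "\"'":
            value = value[1:-1]
        args[name] = value
    return concept, args


concept, args = parse_concept_uri_sketch('concept;arg1="a";arg2="b"', {"arg1", "arg2"})
```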

dgenerate.textprocessing.contains_space(string: str) → bool[source]

Check if a string contains any whitespace characters including newlines

Parameters:: string – the string
Returns:: bool

dgenerate.textprocessing.dashdown(string: str) → str[source]

Replace ‘-’ with ‘_’

Parameters:: string – the string
Returns:: modified string

dgenerate.textprocessing.dashup(string: str) → str[source]

Replace ‘_’ with ‘-’

Parameters:: string – the string
Returns:: modified string
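Both dashdown() and dashup() amount to a single character replacement, which is useful for converting between command line option names and attribute names. The stand-in functions below illustrate the documented behavior without importing dgenerate.

```python
def dashdown_sketch(string: str) -> str:
    """Replace '-' with '_' (stand-in for dashdown)."""
    return string.replace("-", "_")


def dashup_sketch(string: str) -> str:
    """Replace '_' with '-' (stand-in for dashup)."""
    return string.replace("_", "-")


attribute_name = dashdown_sketch("guidance-scales")
```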

dgenerate.textprocessing.debug_format_args(args_dict: dict[str, Any], value_transformer: Callable[[str, Any], str] | None = None, max_value_len: int = 256, as_kwargs: bool = False)[source]

Format function arguments in a way that can be printed for debug messages.

Parameters:

args_dict – argument dictionary
value_transformer – transform values in the argument dictionary
max_value_len – Max length of a formatted value before it is turned into a class and id string only
as_kwargs – Format the string as python keyword arguments instead of a dictionary

Returns:

formatted string

dgenerate.textprocessing.expand_escape_code(code: str) → str[source]

Expand a single escape code character.

For example: expand_escape_code('n') will return '\n'.

Parameters:: code – The escape character code.
Returns:: Expanded character.

dgenerate.textprocessing.format_dgenerate_config(lines: Iterator[str], indentation=' ') → Iterator[str][source]

A very rudimentary code formatter for dgenerate configuration / script.

Does not handle breaking jinja control blocks on to a new line if multiple start blocks exist on the same line.

Parameters:

lines – iterator over lines
indentation – level of indentation for top level jinja control blocks

Returns:

formatted code

Formats a --image-seeds URI to its shortest possible string form.

Raises:

ValueError – raised in any of the following cases:
if inpaint_image is specified without seed_image
if keyword arguments are present without seed_image or control_images
if frame_start or frame_end are negative values
if frame_start is greater than frame_end
if adapter_images are used with floyd_image
if latents are used with floyd_image
if both control_images and floyd_image are specified
if resize is specified when only latents are provided
if aspect=False is specified when only latents are provided
if frame_start or frame_end is specified when only latents are provided
if too many mask images are provided
if too few mask images are provided
if no arguments are provided

Parameters:

seed_images – Seed image path(s)
mask_images – Inpaint image path(s)
control_images – Control image path(s)
adapter_images – Adapter image path(s)
latents – Raw latent tensor path(s) (.pt, .pth, .safetensors files)
floyd_image – Path to a Floyd image
resize – Optional resize dimension (WxH string)
aspect – Preserve aspect ratio?
frame_start – Optional frame start index
frame_end – Optional frame end index

Returns:

The generated --image-seeds URI string

dgenerate.textprocessing.format_size(size: int | Iterable[int])[source]

Join together an iterable of integers with the character x

Parameters:: size – the iterable
Returns:: formatted string

dgenerate.textprocessing.has_unescaped_quotes(string, double: bool = True, single: bool = True) → bool[source]

Does a string contain unescaped quotes?

Parameters:

string – The string
double – Detect double quotes?
single – Detect single quotes?

Returns:

True or False.
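
The check described above can be pictured as a single left-to-right scan that skips any character following a backslash. The following is only a minimal sketch of that idea, not dgenerate's implementation; the name has_unescaped_quotes_sketch is hypothetical.

```python
def has_unescaped_quotes_sketch(string, double=True, single=True):
    """Hypothetical stand-in: report whether an unescaped quote is present."""
    # Build the set of quote characters that should be detected.
    targets = ('"' if double else '') + ("'" if single else '')
    i = 0
    while i < len(string):
        if string[i] == '\\':
            # A backslash escapes the next character, so skip both.
            i += 2
            continue
        if string[i] in targets:
            return True
        i += 1
    return False
```

Under this sketch, a string such as say "hi" is reported as containing unescaped quotes, while the same string with each quote preceded by a backslash is not.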

dgenerate.textprocessing.indent_text(text, initial_indent: str | None = None, subsequent_indent: str | None = None)[source]

Indent consecutive lines of text.

Parameters:

text – Text to be indented
initial_indent – String of characters to be used for the initial indentation
subsequent_indent – String of characters to be used for the subsequent indentation

Returns:

Indented text

dgenerate.textprocessing.is_escape_code(code: str) → bool[source]

Does a character represent an escape code?

Parameters:: code – The character.
Returns:: True or False.

dgenerate.textprocessing.is_quoted(string: str) → bool[source]

Return True if a string is quoted with an identical starting and end quote.

Parameters:: string – the string
Returns:: True or False

dgenerate.textprocessing.justify_left(string: str)[source]

Justify text to the left.

Parameters:: string – string with text
Returns:: left justified text

dgenerate.textprocessing.long_text_wrap_width() → int[source]

Return the current terminal width or the default value of 150 characters for text-wrapping purposes.

This can be affected by the environmental variable COLUMNS.

Raises:: ValueError – if the environmental variable COLUMNS is not an integer value or is less than 0.
Returns:: int

dgenerate.textprocessing.oxford_comma(elements: Collection[str], conjunction: str) → str[source]

Join a sequence of strings with commas, end with an oxford comma and conjunction if needed.

Parameters:

elements – strings
conjunction – “and”, “or”

Returns:

a joined string
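
As an illustration of the joining rule, here is a small sketch. It assumes the conventional behavior (no comma for two elements, a comma before the conjunction for three or more); the exact output of the real function is not stated in this reference, and oxford_comma_sketch is a hypothetical name.

```python
def oxford_comma_sketch(elements, conjunction):
    """Hypothetical stand-in illustrating oxford comma joining."""
    elements = list(elements)
    if len(elements) <= 1:
        # Zero or one element: nothing to join.
        return ''.join(elements)
    if len(elements) == 2:
        # Two elements: conjunction only, no comma (assumed).
        return f'{elements[0]} {conjunction} {elements[1]}'
    # Three or more: commas, with a comma before the conjunction.
    return ', '.join(elements[:-1]) + f', {conjunction} {elements[-1]}'
```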

dgenerate.textprocessing.parse_basic_mask_shape(string: str) → BasicMaskShape[source]

Parse a basic mask shape from a string.

r -> RECTANGLE
rect -> RECTANGLE
rectangle -> RECTANGLE
c -> ELLIPSE
circle -> ELLIPSE
ellipse -> ELLIPSE

Parameters:: string – mask shape
Returns:: BasicMaskShape

dgenerate.textprocessing.parse_dimensions(string)[source]

Parse a dimensions tuple from a string, integers separated by the character ‘x’

Parameters:: string – the string
Raises:: ValueError – On non integer dimension values.
Returns:: a tuple representing the dimensions

dgenerate.textprocessing.parse_image_size(string)[source]

Parse an image size tuple from a string, 2 integers separated by the character ‘x’, or a single integer specifying both dimensions.

Parameters:: string – the string
Raises:: ValueError – On non integer dimension values, or if more than 2 dimensions are provided, or if the product of the dimensions is 0.
Returns:: a tuple representing the dimensions
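
The rules documented for parse_image_size can be sketched in plain Python as follows. This is an illustrative re-implementation under the stated rules only, not the library's code; parse_image_size_sketch is a hypothetical name.

```python
def parse_image_size_sketch(string):
    """Hypothetical stand-in mirroring the documented parsing rules."""
    parts = string.split('x')
    if len(parts) > 2:
        raise ValueError('more than 2 dimensions were provided')
    try:
        dims = [int(part) for part in parts]
    except ValueError:
        raise ValueError(f'non integer dimension in: {string!r}') from None
    if len(dims) == 1:
        # A single integer specifies both dimensions.
        dims = dims * 2
    if dims[0] * dims[1] == 0:
        raise ValueError('the product of the dimensions is 0')
    return dims[0], dims[1]
```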

dgenerate.textprocessing.parse_timedelta(string: str | None) → timedelta[source]

Parse a datetime.timedelta object from an arguments string.

Passing ‘forever’, an empty string, or None will result in this function returning datetime.timedelta.max

Accepts all named arguments of datetime.timedelta

parse_timedelta('days=1; seconds=30')

Raises:: TimeDeltaParseError – on parse errors
Parameters:: string – the arguments string
Returns:: datetime.timedelta
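
To make the arguments string format concrete, the sketch below parses name=value pairs separated by semicolons into datetime.timedelta keyword arguments. It is an approximation: it raises ValueError rather than TimeDeltaParseError, accepts any numeric value, and the name parse_timedelta_sketch is hypothetical.

```python
import datetime


def parse_timedelta_sketch(string):
    """Hypothetical stand-in for the documented arguments string format."""
    if string is None or not string.strip() or string.strip() == 'forever':
        return datetime.timedelta.max
    kwargs = {}
    for field in string.split(';'):
        if not field.strip():
            continue
        name, sep, value = field.partition('=')
        if not sep:
            raise ValueError(f'expected name=value, got: {field!r}')
        # float() raises ValueError on non-numeric values.
        kwargs[name.strip()] = float(value)
    try:
        return datetime.timedelta(**kwargs)
    except TypeError as e:
        # Unknown argument names are reported as a parse error here.
        raise ValueError(str(e)) from None
```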

dgenerate.textprocessing.parse_version(string: str) → tuple[int, int, int][source]

Parse a SemVer version string into a tuple of 3 integers

Parameters:: string – the version string
Returns:: tuple of three ints
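
A rough sketch of extracting the three numeric components is shown below. How the real function treats pre-release or build suffixes is not documented here; this sketch simply ignores anything after the patch number, which is an assumption, and parse_version_sketch is a hypothetical name.

```python
import re


def parse_version_sketch(string):
    """Hypothetical stand-in: extract (major, minor, patch) as integers."""
    match = re.match(r'(\d+)\.(\d+)\.(\d+)', string.strip())
    if match is None:
        raise ValueError(f'not a SemVer style version: {string!r}')
    # Any suffix after the patch number is ignored (assumption).
    return tuple(int(group) for group in match.groups())
```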

dgenerate.textprocessing.quote(string: str, char='"') → str[source]

Wrap a string with a quote character.

Double quotes by default.

This is not equivalent to shell quoting.

Parameters:

string – the string
char – the quote character to use

Returns:

the quoted string

Quote any str type values containing spaces, or str type values containing spaces within a list, or list of lists/tuples.

The entire content of the data structure is stringified by this process.

This is not equivalent to shell quoting.

Parameters:: value_or_struct – value or (list of values, and or lists/tuples containing values)
Returns:: input data structure with strings quoted if needed

dgenerate.textprocessing.remove_tail_comments(string) → tuple[bool, str][source]

Remove trailing comments from a dgenerate config line

Will not remove a comment if it is the only thing on the line.

Considers strings and comment escape sequences.

Parameters:: string – the string
Returns:: (removed anything?, stripped string)

dgenerate.textprocessing.remove_terminal_escape_sequences(string)[source]

Remove any terminal escape sequences from a string.

Parameters:: string – the string
Returns:: the clean string

dgenerate.textprocessing.shell_expandvars(string: str) → str[source]

Expand shell variables of form $var, ${var}, %var% in a string.

Escaped characters \$ and \% are ignored.

Unknown variables expand to an empty string.

Supported formats: - Unix-style: $VAR or ${VAR} - Windows-style: %VAR%

Parameters:: string – Input string containing variables to expand.
Returns:: String with expanded variables.
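
The expansion rules above can be approximated with a single regular expression pass. The sketch below is not the library's implementation; shell_expandvars_sketch and its env parameter are hypothetical, and escaped \$ and \% sequences are left in the string untouched.

```python
import os
import re

# Matches an escaped $ or %, then ${VAR}, $VAR, or %VAR%.
_VAR_PATTERN = re.compile(r'\\[$%]|\$\{(\w+)\}|\$(\w+)|%(\w+)%')


def shell_expandvars_sketch(string, env=None):
    """Hypothetical stand-in for the documented expansion rules."""
    env = os.environ if env is None else env

    def replace(match):
        if match.group(0).startswith('\\'):
            # Escaped characters are ignored (left as they are).
            return match.group(0)
        name = match.group(1) or match.group(2) or match.group(3)
        # Unknown variables expand to an empty string.
        return env.get(name, '')

    return _VAR_PATTERN.sub(replace, string)
```

Passing a dictionary as env makes the behavior easy to exercise without touching the process environment.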

dgenerate.textprocessing.shell_parse(string, expand_home: bool = True, expand_vars: bool = True, expand_glob: bool = True, expand_vars_func: ~typing.Callable[[str], str] = <function shell_expandvars>, glob_hidden: bool = False, glob_recursive: bool = False) → list[str][source]

Shell command line parsing, implements basic home directory expansion, globbing, and environmental variable expansion.

Globbing and home directory expansion do not occur inside strings.

The shell syntax implemented is compatible with the dgenerate config shell syntax, and is unique to dgenerate.

Only stand-alone string arguments undergo POSIX-like quote removal and escape code evaluation. The characters $, %, \, ', and " are always resolved in double-quoted strings when they are escaped with a backslash, meaning that the backslash used to escape them will be removed. In single-quoted strings they are not resolved, i.e. the backslash will remain in the string. This process occurs only for string tokens surrounded by whitespace.

Strings that are intermixed with other tokens, for example plugin;argument="test", are not processed at all; the intermixed tokens are lexed into a single token with no quote removal or escape code expansion. This allows for internal parsing on complex arguments such as URI values to occur without the loss of quoting and escaping information. dgenerate URIs implement custom escaping and quoting rules.

# basic glob
shell_parse('command *.png')

# recursive glob
shell_parse('command dir/**/*.png')

# home directory
shell_parse('command ~')

# home directory of user test
shell_parse('command ~test')

# everything under home directory
shell_parse('command ~/*')

# append text to every glob result (back expansion)
# the quotes are always removed from
# the outside of the appended text
shell_parse('command *".png"') # -> ['command', 'file.png', 'file2.png', ...]

# append text to every glob result (back expansion)
# the quotes are always removed from
# the outside of the appended text
shell_parse("command *'.png'") # -> ['command', 'file.png', 'file2.png', ...]

# environmental variable syntax 1
shell_parse('command $ENVVAR')

# environmental variable syntax 2
shell_parse('command %ENVVAR%')

Parameters:

string – String to parse
expand_home – Expand ~ ?
expand_vars – Expand unix style $ and windows style % environmental variables?
expand_glob – Expand * glob expressions including recursive globs?
expand_vars_func – This function is used to expand shell variables in a string, analogous to os.path.expandvars
glob_hidden – Should globs include hidden directories?
glob_recursive – Should globs be recursive?

Returns:

shell arguments

dgenerate.textprocessing.shell_quote(string: str, double: bool = False, quotes: bool = True, strict: bool = False)[source]

Shell quote a string, compatible with dgenerate config shell syntax.

Parameters:

string – The input string.
double – Intended to be a double-quoted string?
quotes – Add quotes? if False only add the proper escape sequences and no surrounding quotes.
strict – Setting this to true disallows text from being intermixed next to complete strings, for example: test'string' would be fully quoted, even though the shell can normally parse this as a single token. URI arguments do not allow intermixed strings, so this is useful for quoting URI argument values.

Returns:

The quoted string.

dgenerate.textprocessing.tokenized_split(string: str, separator: str | None, remove_quotes: bool = False, strict: bool = False, escapes_in_unquoted: bool = False, escapes_in_unquoted_is_escape: ~typing.Callable[[str], bool] = <function is_escape_code>, escapes_in_unquoted_handler: ~typing.Callable[[str], str] = <function expand_escape_code>, escapes_in_quoted: bool = False, escapes_in_quoted_is_escape: ~typing.Callable[[str], bool] = <function is_escape_code>, escapes_in_quoted_handler: ~typing.Callable[[str], str] = <function expand_escape_code>, single_quotes_raw: bool = False, double_quotes_raw: bool = False, process_string_token: ~typing.Callable[[str], str] = None, process_intermixed_token: ~typing.Callable[[str], str] = None, string_expander: ~typing.Callable[[str, str], str] = None, text_expander: ~typing.Callable[[str], str | list[str]] = None, remove_stray_separators: bool = False, escapable_separator: bool = False, allow_unterminated_strings: bool = False, first_string_halts: bool = False) → list[str][source]

Split a string by a separator and discard whitespace around tokens, avoid splitting within single or double-quoted strings. Empty fields may be used.

Quotes can always be escaped with a backslash to avoid the creation of a string type token. The backslash will remain in the output if escapes_in_unquoted or escapes_in_quoted are False and the escape occurs in the relevant context.

Raises:

TokenizedSplitSyntaxError – on syntax errors.

Parameters:

string – the string
separator – separator
remove_quotes – remove quotes from quoted string tokens?
strict – Text tokens cannot be intermixed with quoted strings? disallow IE: "text'string'text"
escapes_in_unquoted – evaluate escape sequences in text tokens (unquoted strings)? The slash is retained by default when escaping quotes, this disables that, and also enables handling of the escapes n, r, t, b, f, and \. IE: given separator =";" parse \"token\"; "a b" -> ['"token"', 'a b'], instead of \"token\"; "a b"-> ['\"token\"', 'a b']
escapes_in_unquoted_is_escape – Determine if a character should be expanded with escapes_in_unquoted_handler when escapes_in_unquoted is True.
escapes_in_unquoted_handler – Escape character handler for unquoted tokens when escapes_in_unquoted is True, defaults to expand_escape_code.
escapes_in_quoted – evaluate escape sequences in quoted string tokens? The slash is retained by default when escaping quotes, this disables that, and also enables handling of the escapes n, r, t, b, f, and \. IE given separator = ";" parse token; "a \" b" -> ['token', 'a " b'], instead of token; "a \" b"-> ['token', 'a \" b']
escapes_in_quoted_is_escape – Determine if a character should be expanded with escapes_in_quoted_handler when escapes_in_quoted is True.
escapes_in_quoted_handler – Escape character handler for quoted strings when escapes_in_quoted is True, defaults to expand_escape_code.
single_quotes_raw – Never evaluate escape sequences in single-quoted strings?
double_quotes_raw – Never evaluate escape sequences in double-quoted strings?
process_string_token – post process standalone string tokens. The function should take a string and return a string.
process_intermixed_token – This can be used to run a custom process on intermixed tokens, which consist of text intermixed with quoted strings. The function should take a string and return a string.
string_expander – User post process string expansion hook string_expander(quote_char, string) -> str
text_expander – User post process text token expansion hook text_expander(text_token) -> str | list[str]. Should return a list of new tokens (indicates globbing) or a single token string (indicates no globbing).
remove_stray_separators – Remove consecutive separator characters with no inner content at the end of the string? In effect, do not create entries for empty separators at the end of a string.
escapable_separator – The separator character may be escaped with a backslash where it would otherwise cause a split?
allow_unterminated_strings – Allows the lex to end on an unterminated string without a syntax error being produced. It is necessary to perform lookahead N to determine if a separator is quoted by a string or not. This allows your input to end with an unterminated string and still split correctly; complete strings preceding the unterminated string which contain the separator character will not be split on the separator, because the separator is considered quoted in a string token.
first_string_halts – The first completed string token halts lexing immediately, this is mainly used by the lexer internally for recursion in cases where a lookahead for string termination is required, but may be useful for some external parsing tasks.

Returns:

parsed fields
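
The core idea, splitting on a separator while leaving quoted strings intact, can be sketched in a few lines. The function below is a heavily simplified illustration that covers only the default behavior plus remove_quotes; it is not the library's lexer, it raises ValueError instead of TokenizedSplitSyntaxError, and split_outside_quotes_sketch is a hypothetical name.

```python
def split_outside_quotes_sketch(string, separator, remove_quotes=False):
    """Hypothetical, simplified quote-aware split."""
    fields, current, quote, i = [], [], None, 0
    while i < len(string):
        ch = string[i]
        if ch == '\\' and i + 1 < len(string):
            # Keep the backslash and the escaped character as they are.
            current.append(string[i:i + 2])
            i += 2
            continue
        if quote:
            if ch == quote:
                quote = None  # closing quote
        elif ch in '"\'':
            quote = ch  # opening quote
        elif ch == separator:
            # Separator outside of a string: finish the current field.
            fields.append(''.join(current).strip())
            current = []
            i += 1
            continue
        current.append(ch)
        i += 1
    if quote:
        raise ValueError('unterminated string')
    fields.append(''.join(current).strip())
    if remove_quotes:
        # Only strip quotes that wrap the whole field.
        fields = [f[1:-1] if len(f) >= 2 and f[0] == f[-1] and f[0] in '"\''
                  else f for f in fields]
    return fields
```

Escaped quotes keep their backslash in this sketch, which matches the default described above when escapes_in_unquoted and escapes_in_quoted are False.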

dgenerate.textprocessing.underline(string: str, underline_char: str = '=') → str[source]

Underline a string with the selected character.

Parameters:

string – the string
underline_char – the character to underline with

Returns:

the underlined string

dgenerate.textprocessing.unquote(string: str, escapes_in_unquoted: bool = False, escapes_in_unquoted_is_escape: ~typing.Callable[[str], bool] = <function is_escape_code>, escapes_in_unquoted_handler: ~typing.Callable[[str], str] = <function expand_escape_code>, escapes_in_quoted: bool = False, escapes_in_quoted_is_escape: ~typing.Callable[[str], bool] = <function is_escape_code>, escapes_in_quoted_handler: ~typing.Callable[[str], str] = <function expand_escape_code>, single_quotes_raw: bool = False, double_quotes_raw: bool = False, strict=True) → str[source]

Remove quotes from a string, including single quotes.

Unquoted strings will have leading and trailing whitespace stripped.

Quoted strings will have leading and trailing whitespace stripped up to where the quotes were.

Parameters:

escapes_in_unquoted – evaluate escape sequences in text tokens (unquoted strings)? The slash is retained by default when escaping quotes, this disables that, and also enables handling of the escapes n, r, t, b, f, and \.
escapes_in_unquoted_is_escape – Determine if a character should be expanded with escapes_in_unquoted_handler when escapes_in_unquoted is True.
escapes_in_unquoted_handler – Escape character handler for unquoted tokens when escapes_in_unquoted is True, defaults to expand_escape_code.
escapes_in_quoted – evaluate escape sequences in quoted string tokens? The slash is retained by default when escaping quotes, this disables that, and also enables handling of the escapes n, r, t, b, f, and \.
escapes_in_quoted_is_escape – Determine if a character should be expanded with escapes_in_quoted_handler when escapes_in_quoted is True.
escapes_in_quoted_handler – Escape character handler for quoted strings when escapes_in_quoted is True, defaults to expand_escape_code.
strict – Disallow intermixed tokens?
single_quotes_raw – Never evaluate escape sequences in single-quoted strings?
double_quotes_raw – Never evaluate escape sequences in double-quoted strings?
string – the string

Returns:

The un-quoted string

dgenerate.textprocessing.wrap(text: str, width: int, initial_indent='', subsequent_indent='', break_long_words=False, break_on_hyphens=False, **fill_args)[source]

Wrap text.

Parameters:

text – The prompt text
width – The wrap width
initial_indent – initial indent string
subsequent_indent – subsequent indent string
break_long_words – Break on long words?
break_on_hyphens – Break on hyphens?
fill_args – extra keyword arguments to textwrap.fill() if desired

Returns:

text wrapped string

dgenerate.textprocessing.wrap_paragraphs(text: str, width: int, break_long_words: bool = False, break_on_hyphens: bool = False, clean_lines: bool = True, **fill_args)[source]

Wrap text that may contain paragraphs without removing separating whitespace.

The directive NOWRAP! can be used to start a paragraph block with no word wrapping, which is useful for manually formatting small blocks of text. NOWRAP! should exist on its own line, immediately followed by the block of text which will have wrapping disabled. The NOWRAP! directive line will not exist in the output text.

Parameters:

text – Text containing paragraphs
width – Wrap width in characters
break_long_words – break on long words? default False
break_on_hyphens – break on hyphens? default False
clean_lines – Right strip lines?
fill_args – extra keyword arguments to textwrap.fill() if desired

Returns:

text wrapped string
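
A rough picture of paragraph-preserving wrapping with a NOWRAP! directive, built on textwrap.fill(), is sketched below. It assumes paragraphs are separated by exactly one blank line and is not the library's implementation; wrap_paragraphs_sketch is a hypothetical name.

```python
import textwrap


def wrap_paragraphs_sketch(text, width):
    """Hypothetical stand-in: wrap each paragraph, honoring NOWRAP! blocks."""
    output = []
    for paragraph in text.split('\n\n'):
        lines = paragraph.split('\n')
        if lines[0].strip() == 'NOWRAP!':
            # Drop the directive line and keep the block as written.
            output.append('\n'.join(lines[1:]))
        else:
            joined = ' '.join(line.strip() for line in lines)
            output.append(textwrap.fill(joined, width=width,
                                        break_long_words=False,
                                        break_on_hyphens=False))
    # Paragraph separation is preserved.
    return '\n\n'.join(output)
```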

dgenerate.torchutil module

Commonly used torch utilities.

exception dgenerate.torchutil.InvalidDeviceOrdinalException[source]

Bases: Exception

Device in device specification (cuda:N, xpu:N) does not exist

dgenerate.torchutil.available_device_types() → list[str][source]

Return a list of available torch device type strings.

Such as: cpu, cuda, xpu, mps

Returns:: List of device type strings

dgenerate.torchutil.default_device() → str[source]

Return a string representing the system's default accelerator device.

Possible Values:

"cuda"

"mps"

"xpu"

"cpu"

Returns:: "cuda", "mps", "xpu", etc.

dgenerate.torchutil.devices_equal(device1: device | str, device2: device | str)[source]

Compare if two devices are the same device.

This considers cuda and cuda:{torch.cuda.current_device()} to be the same device, and xpu and xpu:{torch.xpu.current_device()} to be the same device.

Parameters:

device1 – Device 1.
device2 – Device 2.

Returns:

Equality?

dgenerate.torchutil.estimate_module_memory_usage(module: Module) → str[source]

Estimate the static memory use of a torch module.

Parameters:: module – the module
Returns:: static memory use in bytes

dgenerate.torchutil.invalid_device_message(device: device | str, cap: bool = True) → str[source]

Generate a standard invalid device message.

For example: Must be ...., unknown value: (given value)

Or: CUDA device ordinal 2 is invalid, no such device exists.

The content is hardware / platform / selected device specific.

Parameters:

device – The device given that was invalid
cap – The message starts with a capital?

Returns:

Invalid device message string.

dgenerate.torchutil.is_cuda_available() → bool[source]

Check if CUDA is available on this system.

Returns:: True if CUDA is available, False otherwise

dgenerate.torchutil.is_mps_available() → bool[source]

Check if Apple Metal Performance Shaders (MPS) is available on this system.

Returns:: True if MPS is available, False otherwise

dgenerate.torchutil.is_tensor(obj) → bool[source]

Check if an object is a PyTorch tensor.

Parameters:: obj – Object to check
Returns:: True if the object is a torch.Tensor, False otherwise

dgenerate.torchutil.is_valid_device_string(device: str, raise_ordinal=False)[source]

Is a device string valid? including the device ordinal specified?

Other than cuda, “mps” (MacOS metal performance shaders) and “xpu” (Intel) are experimentally supported.

Parameters:

device – device string, such as cpu, or cuda, or cuda:N, or xpu:N
raise_ordinal – Raise InvalidDeviceOrdinalException if a specified CUDA or XPU device ordinal is found to not exist?

Raises:

InvalidDeviceOrdinalException – If raise_ordinal=True and the device ordinal specified in a CUDA or XPU device string does not exist.

Returns:

True or False

dgenerate.torchutil.is_xpu_available() → bool[source]

Check if Intel XPU is available on this system.

Returns:: True if XPU is available, False otherwise

dgenerate.translators module

Translation backends for language translation via local inference.

exception dgenerate.translators.TranslationError[source]

Bases: Exception

Cannot translate text.

exception dgenerate.translators.TranslatorLoadError[source]

Bases: Exception

Cannot load translation models.

class dgenerate.translators.ArgosTranslator(from_lang: str, to_lang: str, local_files_only: bool = False)[source]

Bases: object

Translate languages locally on the CPU using argostranslate models.

Supports automatic pivot language selection.

__init__(from_lang: str, to_lang: str, local_files_only: bool = False)[source]

Parameters:

from_lang – From language code (IETF), or language name.
to_lang – To language code (IETF), or language name.
local_files_only – Only use models that have been previously cached?

Raises:

dgenerate.translators.TranslatorLoadError – If models cannot be loaded / found.

translate(texts: str | list[str]) → list[str][source]

Translate a list of texts.

Parameters:: texts – Texts to translate.
Returns:: Translated texts.

class dgenerate.translators.MarianaTranslator(from_lang: str, to_lang: str, local_files_only: bool = False)[source]

Bases: object

Translate languages locally using Helsinki-NLP opus models on the CPU or GPU.

Supports automatic pivot language selection.

__init__(from_lang: str, to_lang: str, local_files_only: bool = False)[source]

Parameters:

from_lang – From language code (IETF), or language name.
to_lang – To language code (IETF), or language name.
local_files_only – Only use models that have been previously cached?

Raises:

dgenerate.translators.TranslatorLoadError – If models cannot be loaded / found.

to(device: str | device)[source]

Move the model(s) to a specific device.

Parameters:: device – The device
Returns:: self

translate(texts: str | list[str]) → list[str][source]

Translate a list of texts.

Parameters:: texts – Texts to translate.
Returns:: Translated texts.

dgenerate.translators.disable_offline_mode()[source]

Disable offline mode for the translators module.

This will allow network requests to be made again.

dgenerate.translators.enable_offline_mode()[source]

Enable offline mode for the translators module.

This will prevent any network requests from being made.

dgenerate.translators.get_language_code(language: str) → str | None[source]

Return an IETF language code for the given language name or code.

Parameters:: language – Name of the language, or IETF language code.
Returns:: Language code, or None if not found.

dgenerate.translators.is_offline_mode() → bool[source]

Check if the translators module is in offline mode.

Returns:: True if in offline mode, False otherwise.

dgenerate.translators.offline_mode_context(enabled=True)[source]

Context manager to temporarily enable or disable offline mode for the translators module.

Parameters:: enabled – If True, enables offline mode. If False, disables it.
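
The usual pattern behind a context manager of this kind is to save the current flag, set the requested value, and restore the saved value on exit. The sketch below illustrates that pattern with a hypothetical module-level state dictionary; it does not reflect how dgenerate stores its offline flag, and both function names are stand-ins.

```python
import contextlib

# Hypothetical module-level state, for illustration only.
_state = {'offline': False}


def is_offline_mode_sketch():
    return _state['offline']


@contextlib.contextmanager
def offline_mode_context_sketch(enabled=True):
    """Temporarily set the offline flag, restoring the old value on exit."""
    previous = _state['offline']
    _state['offline'] = enabled
    try:
        yield
    finally:
        # Restore even if the body raised an exception.
        _state['offline'] = previous
```

Restoring in a finally clause means the previous mode comes back even when the wrapped code fails.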

dgenerate.types module

Commonly used static type definitions and utilities for introspecting on objects, functions, types, etc.

class dgenerate.types.SetFromMixin[source]

Bases: object

Allows an object to have its attributes set from a dictionary, or from attributes taken from another object with an overlapping set of attribute names.

set_from(obj: Any | dict, missing_value_throws: bool = True)[source]

Set the attributes in this configuration obj from a dictionary or another obj possessing keys / attributes of the same name.

Parameters:

obj – The obj, or dictionary
missing_value_throws – whether to throw ValueError if obj is missing an attribute that exists in this obj

Returns:

self

dgenerate.types.class_and_id_string(obj) → str[source]

Return a string formatted with an object's class name next to its memory ID.

IE: <ClassName: id_integer>

Parameters:: obj – the obj
Returns:: formatted string

dgenerate.types.default(value, default_value)[source]

Return value if value is not None, otherwise default_value

Parameters:

value
default_value

Returns:

value or default_value

dgenerate.types.format_function_signature(func: Callable, alternate_name: str | None = None, omit_params: Container | None = None) → str[source]

Formats the signature of a given function to a string, including default values for arguments.

Parameters:

func – The function
alternate_name – Alternate function display name
omit_params – Omit parameters by name, this should be a container filled with parameter names, such as a set or list.

dgenerate.types.fullname(obj) → str[source]

Get the fully qualified name of an obj or function or type/class

Parameters:: obj – The obj
Returns:: Fully qualified name

dgenerate.types.get_accepted_args_with_defaults(func) → Iterator[tuple[str] | tuple[str, Any]][source]

Get the argument signature of a simple function with any default values present.

Parameters:: func – the function
Returns:: an iterator over tuples of length 1 or 2, length 2 indicates a default argument value is present. (name,) or (name, value)
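
The shape of the yielded tuples can be illustrated with inspect.signature() from the standard library. This is a sketch of the documented result format rather than the library's code; accepted_args_with_defaults_sketch is a hypothetical name.

```python
import inspect


def accepted_args_with_defaults_sketch(func):
    """Hypothetical stand-in: yield (name,) or (name, default) per argument."""
    for name, param in inspect.signature(func).parameters.items():
        if param.default is inspect.Parameter.empty:
            # No default value: tuple of length 1.
            yield (name,)
        else:
            # Default value present: tuple of length 2.
            yield (name, param.default)
```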

dgenerate.types.get_all_base_classes(cls) → list[Type][source]

Get all base classes of a given class recursively.

Returns:: A list of all base classes.

dgenerate.types.get_default_args(func) → Iterator[tuple[str, Any]][source]

Get a list of default arguments from a simple function with their default values.

Parameters:: func – the function
Returns:: iterator over tuples of length 2 (name, value)

dgenerate.types.get_null_attr_name(e) → str | None[source]

If an AttributeError occurred due to accessing an attribute of a NoneType value, get the exact name of the value (variable ID) that was None.

For the most accurate result, sources must be present. If sources are not present the line of code where the issue occurred is returned instead.

Parameters:: e – The exception
Returns:: Name or None if not found.

dgenerate.types.get_null_call_name(e) → str | None[source]

If a TypeError occurred due to calling a NoneType value, get the exact name of the value (variable ID) that was None

For the most accurate result, sources must be present. If sources are not present the line of code where the issue occurred is returned instead.

Parameters:: e – The exception
Returns:: Name or None if not found.

dgenerate.types.get_public_attributes(obj) → dict[str, Any][source]

Get the public attributes (excluding functions) and their values from an obj.

Parameters:: obj – the obj
Returns:: dict of attribute names to values

dgenerate.types.get_public_members(obj) → dict[str, Any][source]

Get the public members (including functions) and their values from an obj.

Parameters:: obj – the obj
Returns:: dict of attribute names to values

dgenerate.types.get_public_properties(obj) → dict[str, Any][source]

Get the public property-decorated attributes and their values from an obj.

Parameters:: obj – the obj
Returns:: dict of property names to values

dgenerate.types.get_type(hinted_type) → type[source]

Get the basic type of hinted type

Parameters:: hinted_type – the type hint
Returns:: the basic type

dgenerate.types.get_type_of_optional(hinted_type: type, get_origin=True) → type[source]

Get the first possible type for a typing.Optional[] type hint

Parameters:

get_origin – Should the returned type be the origin type?
hinted_type – The hinted type to extract from

Returns:

the type, or None

dgenerate.types.is_optional(hinted_type) → bool[source]

Check if a hinted type is typing.Optional[], or an equivalent union.

Parameters:: hinted_type – The hinted type
Returns:: bool

dgenerate.types.is_type(hinted_type, comparison_type) → bool[source]

Check if a hinted type is equal to a comparison type.

Parameters:

hinted_type – The hinted type
comparison_type – The type to check for

Returns:

bool

dgenerate.types.is_type_or_optional(hinted_type, comparison_type) → bool[source]

Check if a hinted type is equal to a comparison type, even if the type hint is typing.Optional[] (compare the inside if necessary).

Parameters:

hinted_type – The hinted type
comparison_type – The type to check for

Returns:

bool

dgenerate.types.is_typing_hint(obj) → bool[source]

Does a type hint object originate from the builtin typing or types module?

Parameters:: obj – type hint
Returns:: True or False

dgenerate.types.is_union(hinted_type) → bool[source]

Check if a hinted type is typing.Union or types.UnionType.

Returns:: True if the type is one of typing.Union or types.UnionType, else False

dgenerate.types.iterate_attribute_combinations(attribute_defs: Iterable[tuple[str, Iterable]], return_type: type) → Iterator[source]

Iterate over every combination of attributes in a given class using a list of tuples mapping attribute names to a list of possible values.

Parameters:

attribute_defs – sequence of tuple (attribute_name, [sequence of values])
return_type – Construct this type and assign attribute values to it

Returns:

an iterator over instances of the type given in the return_type argument
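
Iterating every combination of attribute values is a cartesian product, which itertools.product() provides directly. The sketch below assumes return_type can be constructed with no arguments, which this reference does not state; iterate_attribute_combinations_sketch is a hypothetical name.

```python
import itertools


def iterate_attribute_combinations_sketch(attribute_defs, return_type):
    """Hypothetical stand-in: yield one object per value combination."""
    defs = list(attribute_defs)
    names = [name for name, _ in defs]
    value_lists = [values for _, values in defs]
    for combination in itertools.product(*value_lists):
        # Assumes return_type() needs no constructor arguments.
        obj = return_type()
        for name, value in zip(names, combination):
            setattr(obj, name, value)
        yield obj
```

With two attributes of two values each, four objects are produced, and the last attribute listed varies fastest.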

dgenerate.types.module_all()[source]

Return the name of all public non-module type global objects inside the current module.

Can be used for __all__

Returns:: list of names

dgenerate.types.parse_bool(string_or_bool: str | bool) → bool[source]

Parse a case insensitive boolean value from a string, for example “true” or “false”

Additionally, values that are already bool are passed through.

Raises:: ValueError – on parse failure.
Parameters:: string_or_bool – the string, or a bool value
Returns:: python boolean type equivalent
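
A minimal sketch of the documented behavior follows. Whether the real function accepts spellings other than "true" and "false" is not stated here, so this sketch accepts only those two; parse_bool_sketch is a hypothetical name.

```python
def parse_bool_sketch(string_or_bool):
    """Hypothetical stand-in: case insensitive "true" / "false" parsing."""
    if isinstance(string_or_bool, bool):
        # Values that are already bool are passed through.
        return string_or_bool
    lowered = string_or_bool.lower()
    if lowered == 'true':
        return True
    if lowered == 'false':
        return False
    raise ValueError(f'cannot parse a boolean from: {string_or_bool!r}')
```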

dgenerate.types.partial_deep_copy_container(container: list | tuple | dict | set)[source]

Partially copy nested containers, handles lists, tuples, dicts, and sets.

Parameters:: container – top level container
Returns:: structure with all containers copied
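
"Partial" here means that the containers are copied while the values inside them are shared. One way to picture this is the recursive sketch below, which is an illustration and not the library's implementation; partial_deep_copy_sketch is a hypothetical name.

```python
def partial_deep_copy_sketch(container):
    """Hypothetical stand-in: copy nested containers, share leaf values."""
    if isinstance(container, dict):
        return {key: partial_deep_copy_sketch(value)
                for key, value in container.items()}
    if isinstance(container, (list, tuple, set)):
        # Rebuild the same container type from recursively copied items.
        return type(container)(partial_deep_copy_sketch(item)
                               for item in container)
    # Anything that is not a container is returned as is (not copied).
    return container
```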

dgenerate.types.type_check_struct(obj, attribute_namer: Callable[[str], str] | None = None)[source]

Perform some basic type checks on a struct-like object's attributes using their type hints.

This function can only handle typing.Union and types.UnionType constructs in a very basic capacity.

Raises:

ValueError – on type checking failure

Parameters:

obj – the object
attribute_namer – function which gives attributes an alternate name

dgenerate.webcache module

Single point of access to the global dgenerate web cache.

dgenerate.webcache.create_web_cache_file(url, mime_acceptable_desc: str | None = None, mimetype_is_supported: ~typing.Callable[[str], bool] | None = None, unknown_mimetype_exception: type[Exception] = <class 'ValueError'>, overwrite: bool = False, tqdm_pbar=<class 'tqdm.std.tqdm'>, local_files_only: bool = False) → tuple[str, str][source]

Download a file from a url and add it to dgenerate’s temporary web cache that is available to all concurrent dgenerate processes.

If the file exists in the cache already, return information for the existing file.

Append API tokens if applicable, such as CIVIT_AI_TOKEN from your environment.

Raises:

requests.RequestException – Can raise any exception raised by requests.get for request related errors.
WebFileCacheOfflineModeException – If the web cache is in offline mode and the file is not found in the cache.

Parameters:

url – The url
mime_acceptable_desc – A description of acceptable mimetypes for use in exceptions.
mimetype_is_supported – A function that determines if a mimetype is supported for downloading.
unknown_mimetype_exception – The exception type to raise when an unknown mimetype is encountered.
overwrite – Always overwrite any previously cached file?
tqdm_pbar – tqdm progress bar type, if set to None no progress bar will be used. Defaults to tqdm.tqdm
local_files_only – If True, do not attempt to download files, only check cache.

Returns:

tuple(mimetype_str, filepath)

dgenerate.webcache.disable_offline_mode()[source]

Disable offline mode for the web cache.

This will allow network requests to be made again.

dgenerate.webcache.enable_offline_mode()[source]

Enable offline mode for the web cache.

This will prevent any network requests from being made, and will only use files that are already in the web cache.

dgenerate.webcache.get_web_cache_directory() → str[source]

Get the default web cache directory.

Or the value of the environmental variable DGENERATE_CACHE joined with web.

Returns:: string (directory path)

dgenerate.webcache.is_downloadable_url(string) → bool[source]

Does a string represent a URL that can be downloaded into the web cache?

Parameters:: string – the string
Returns:: True or False

dgenerate.webcache.is_offline_mode() → bool[source]

Check if the web cache is in offline mode.

Returns:: True if the web cache is in offline mode, False otherwise.

dgenerate.webcache.offline_mode_context(enabled=True)[source]

Context manager to temporarily enable or disable offline mode for the web cache.

Parameters:: enabled – If True, enables offline mode. If False, disables it.

dgenerate.webcache.request_mimetype(url, local_files_only: bool = False) → str[source]

Request the mimetype of a file at a URL. If the file exists in the cache, its known mimetype is returned without connecting to the internet. Otherwise, a request is made to retrieve the mimetype; this action does not update the cache.

Parameters:

url – The url
local_files_only – If True, do not make a request, only check the cache.

Raises:

WebFileCacheOfflineModeException – If the web cache is in offline mode and the file data is not found in the cache.

Returns:

mimetype string