Image Generation with llm4s
The llm4s library provides an API for generating images through different providers, such as Stable Diffusion (via a local WebUI) and the Hugging Face Inference API. This guide walks you through setting up and using the image generation capabilities.
Core Concepts
The API is centered around the ImageGenerationClient trait, which defines the common interface for interacting with different image generation services. You can get a pre-configured client using the ImageGeneration factory object.
Key models include:
- ImageGenerationConfig: A trait for provider-specific configurations (StableDiffusionConfig, HuggingFaceConfig).
- ImageGenerationOptions: A class to control generation parameters like image size, seed, and negative prompts.
- GeneratedImage: A case class representing the generated image, including its data and metadata.
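One way to picture the shared-interface design is a small stand-in model. The sketch below does not use the llm4s types: SketchClient, SketchImage, SketchError, and StubClient are hypothetical names invented for illustration, and the real trait and models differ in detail. It only shows why coding against a common trait that returns an Either lets you swap providers, or substitute a stub in tests.

```scala
// Hypothetical stand-ins for illustration only; these are NOT the llm4s definitions.
final case class SketchError(message: String)
final case class SketchImage(prompt: String, data: Array[Byte])

// The common interface: every provider exposes the same method.
trait SketchClient {
  def generateImage(prompt: String): Either[SketchError, SketchImage]
}

// A stub provider: no network access, returns three placeholder bytes.
final class StubClient extends SketchClient {
  def generateImage(prompt: String): Either[SketchError, SketchImage] =
    if (prompt.trim.isEmpty) Left(SketchError("Prompt must not be empty"))
    else Right(SketchImage(prompt, Array[Byte](1, 2, 3)))
}

object SketchDemo {
  // Callers depend only on the trait, so any provider can be passed in.
  def describe(client: SketchClient, prompt: String): String =
    client.generateImage(prompt) match {
      case Right(image) => s"Generated ${image.data.length} bytes for '${image.prompt}'"
      case Left(error)  => s"Error generating image: ${error.message}"
    }
}
```

Because describe accepts any SketchClient, the same calling code would work unchanged with a Stable Diffusion-backed or Hugging Face-backed implementation of the trait.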
Using the Stable Diffusion Client
This client connects to an instance of the AUTOMATIC1111/stable-diffusion-webui. You must have this running locally or on a remote server.
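Before creating a client, it can help to confirm that the WebUI is actually reachable. The following is a minimal sketch using only the JDK HTTP client (Java 11+); WebUiCheck and isReachable are hypothetical helper names, not part of llm4s. Note that AUTOMATIC1111's WebUI only serves its HTTP API when launched with the --api flag, and this check confirms only that the server answers, not that the API is enabled.

```scala
import java.net.URI
import java.net.http.{HttpClient, HttpRequest, HttpResponse}
import java.time.Duration
import scala.util.Try

// Hypothetical helper, not part of llm4s.
object WebUiCheck {
  private val http = HttpClient.newBuilder()
    .connectTimeout(Duration.ofSeconds(3))
    .build()

  /** Returns true if a GET to `baseUrl` yields a response with a status below 500.
    * Connection failures, timeouts, and malformed URLs all yield false.
    */
  def isReachable(baseUrl: String): Boolean =
    Try {
      val request = HttpRequest.newBuilder(URI.create(baseUrl))
        .timeout(Duration.ofSeconds(3))
        .GET()
        .build()
      http.send(request, HttpResponse.BodyHandlers.discarding()).statusCode() < 500
    }.getOrElse(false)
}
```

Calling WebUiCheck.isReachable("http://localhost:7860") before the first generation request gives a clearer failure message than a connection error surfacing mid-generation.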
Setup
First, ensure your build.sbt includes the llm4s dependency.
libraryDependencies += "org.llm4s" %% "llm4s" % "LATEST_VERSION" // Replace with the actual version
Example Usage
Here’s how to create a client and generate an image:
import org.llm4s.imagegeneration._
import java.nio.file.Paths

// 1. Create a client for a local Stable Diffusion server
val sdClient = ImageGeneration.stableDiffusionClient(
  baseUrl = "http://localhost:7860" // Default URL
)

// 2. Define a prompt and generation options
val prompt = "A photorealistic portrait of a majestic lion"
val options = ImageGenerationOptions(
  size = ImageSize.Square512,
  negativePrompt = Some("cartoon, drawing, sketch, blurry")
)

// 3. Generate the image
println("Generating image with Stable Diffusion...")
sdClient.generateImage(prompt, options) match {
  case Right(image) =>
    println("Image generated successfully!")
    // Save the image
    val path = Paths.get("stable_diffusion_lion.png")
    image.saveToFile(path)
    println(s"Image saved to ${path.toAbsolutePath}")
  case Left(error) =>
    println(s"Error generating image: ${error.message}")
}
Using the Hugging Face Client
This client uses the Hugging Face Inference API, which requires an API token.
Setup
You will need a Hugging Face account and an API token with write permissions.
Example Usage
The process mirrors the Stable Diffusion example. The main difference is the client configuration: you pass an API key and a model name instead of a server URL.
import org.llm4s.imagegeneration._
import java.nio.file.Paths

// 1. Get your API token (it's best to use an environment variable)
val hfApiKey = sys.env.getOrElse("HF_API_TOKEN", "your_hf_api_token_here")

if (hfApiKey == "your_hf_api_token_here") {
  println("Please set the HF_API_TOKEN environment variable.")
} else {
  // 2. Create a Hugging Face client
  val hfClient = ImageGeneration.huggingFaceClient(
    apiKey = hfApiKey,
    model = "runwayml/stable-diffusion-v1-5" // You can choose other models
  )

  // 3. Define a prompt
  val prompt = "A cute robot reading a book, sci-fi concept art"

  // 4. Generate the image
  println("Generating image with Hugging Face...")
  hfClient.generateImage(prompt) match {
    case Right(image) =>
      println("Image generated successfully!")
      val path = Paths.get("hugging_face_robot.png")
      image.saveToFile(path)
      println(s"Image saved to ${path.toAbsolutePath}")
    case Left(error) =>
      println(s"Error generating image: ${error.message}")
  }
}
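The example above uses a placeholder string as a sentinel for a missing token. An alternative is to model the lookup as an Either, so a missing or blank token cannot be passed to a client by accident. This is a sketch using only the Scala standard library; HfToken and load are hypothetical names, not part of llm4s.

```scala
// Hypothetical helper, not part of llm4s.
object HfToken {
  /** Looks up `name` in `env`, trims it, and rejects missing or blank values.
    * The environment is a parameter so the lookup can be tested with a plain Map.
    */
  def load(
    env: Map[String, String] = sys.env,
    name: String = "HF_API_TOKEN"
  ): Either[String, String] =
    env.get(name)
      .map(_.trim)
      .filter(_.nonEmpty)
      .toRight(s"Please set the $name environment variable.")
}
```

Matching on HfToken.load() then yields either a usable token in the Right case or a ready-made message in the Left case, in the same style as the generateImage result handling.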
Customizing Generation
The ImageGenerationOptions class allows you to fine-tune the generation process:
- size: The dimensions of the output image (e.g., ImageSize.Square512).
- format: The image format (ImageFormat.PNG or ImageFormat.JPEG).
- seed: A specific seed for reproducible results (Option[Long]).
- guidanceScale: How strictly the model should follow the prompt (Double).
- inferenceSteps: The number of steps in the diffusion process (Int).
- negativePrompt: A prompt describing what you don’t want to see (Option[String]).
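A common use of these parameters is to fix the seed and vary a single setting, so that differences between outputs come from that setting alone. The sketch below uses a stand-in case class rather than the real ImageGenerationOptions: OptionsSketch, its width/height fields, and all default values are assumptions made for illustration, and only the names seed, guidanceScale, inferenceSteps, and negativePrompt come from the list above.

```scala
// Stand-in mirroring some of the fields listed above; NOT the llm4s class.
// All default values here are assumptions chosen for illustration.
final case class OptionsSketch(
  width: Int = 512,
  height: Int = 512,
  seed: Option[Long] = None,
  guidanceScale: Double = 7.5,
  inferenceSteps: Int = 20,
  negativePrompt: Option[String] = None
)

object OptionsSketch {
  /** Builds one options value per guidance scale, all sharing the same seed
    * and every other setting from `base`.
    */
  def guidanceSweep(base: OptionsSketch, seed: Long, scales: Seq[Double]): Seq[OptionsSketch] =
    scales.map(scale => base.copy(seed = Some(seed), guidanceScale = scale))
}
```

Because case classes provide copy, the same pattern should carry over to any options value that is a case class: keep one base configuration and derive variants from it instead of repeating every field.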