ChunkingUtils

org.llm4s.llmconnect.utils.ChunkingUtils
object ChunkingUtils

Attributes

Supertypes
class Object
trait Matchable
class Any


Value members

Concrete methods

def chunkAudio(samples: Array[Float], sampleRate: Int, windowSeconds: Int, overlapRatio: Double, padToWindow: Boolean): Seq[Array[Float]]

Window an audio signal into fixed-length segments with overlap. Optionally right-pad the final window with zeros so all windows have equal length.

Value parameters

overlapRatio

Fraction of each window shared with the next window, in [0, 1). For example, 0.25 means consecutive windows overlap by 25%.

padToWindow

If true, pad the last segment with zeros to full window length.

sampleRate

Samples per second (> 0).

samples

Mono PCM samples in [-1, 1].

windowSeconds

Window length in seconds (> 0).

Attributes

Returns

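Sequence of audio windows. When padToWindow is true, every window is an Array[Float] of length windowSamples (the window length in samples, i.e. sampleRate * windowSeconds); otherwise the final window may be shorter.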
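The windowing described above can be sketched as a sliding window over sample indices. This is a minimal illustration under stated assumptions, not the llm4s source: `ChunkAudioSketch` is a hypothetical stand-in, and the hop between window starts (`windowSamples * (1 - overlapRatio)`, rounded and at least 1) is an assumed reading of "overlap ratio". The sketch also emits a window for every start index inside the signal, which the real method may or may not do.

```scala
object ChunkAudioSketch {
  // Hypothetical stand-in that follows the documented contract; not the library code.
  def chunkAudio(
      samples: Array[Float],
      sampleRate: Int,
      windowSeconds: Int,
      overlapRatio: Double,
      padToWindow: Boolean
  ): Seq[Array[Float]] = {
    require(sampleRate > 0 && windowSeconds > 0, "sampleRate and windowSeconds must be > 0")
    require(overlapRatio >= 0.0 && overlapRatio < 1.0, "overlapRatio must be in [0, 1)")

    // Window length expressed in samples.
    val windowSamples = sampleRate * windowSeconds
    // Assumption: consecutive windows start `step` samples apart.
    val step = math.max(1, math.round(windowSamples * (1.0 - overlapRatio)).toInt)

    (0 until samples.length by step).map { start =>
      // slice clamps the end index, so the last window may be short.
      val window = samples.slice(start, start + windowSamples)
      // padTo appends zeros on the right until the window has full length.
      if (padToWindow) window.padTo(windowSamples, 0.0f) else window
    }
  }
}
```

With ten samples, `sampleRate = 4`, `windowSeconds = 1`, and `overlapRatio = 0.5`, this sketch yields five windows starting at samples 0, 2, 4, 6, and 8. The last one holds two real samples, followed by two zeros when `padToWindow` is true.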

def chunkText(text: String, size: Int, overlap: Int): Seq[String]

Splits text into chunks of at most size characters, with overlap characters shared between consecutive chunks.

Value parameters

overlap

Number of overlapping characters between chunks (0 <= overlap < size).

size

Maximum characters per chunk (> 0).

text

Input string.

Attributes

Returns

Sequence of text chunks.
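To make the size and overlap parameters concrete, here is a minimal sketch of character-based chunking that respects the documented constraints. `ChunkTextSketch` is a hypothetical name and the body is an assumed implementation, not the llm4s source. In particular, advancing each chunk start by `size - overlap` is an assumption about how the overlap is applied.

```scala
object ChunkTextSketch {
  // Hypothetical stand-in that follows the documented contract; not the library code.
  def chunkText(text: String, size: Int, overlap: Int): Seq[String] = {
    require(size > 0, "size must be > 0")
    require(overlap >= 0 && overlap < size, "overlap must satisfy 0 <= overlap < size")

    // Assumption: each chunk starts (size - overlap) characters after the previous one.
    val step = size - overlap

    (0 until text.length by step).map { start =>
      // The final chunk is cut at the end of the text, so it may be shorter than size.
      text.substring(start, math.min(start + size, text.length))
    }
  }
}
```

For the input `"abcdefghijk"` with `size = 4` and `overlap = 1`, this sketch produces `"abcd"`, `"defg"`, `"ghij"`, and `"jk"`: each chunk repeats the last character of the one before it.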

def chunkVideo[T](frames: Seq[T], fps: Int, clipSeconds: Int, overlapRatio: Double): Seq[Seq[T]]

Chunk a sequence of frames into clips of fixed duration with overlap. Generic over frame type T (e.g., BufferedImage).

Value parameters

clipSeconds

Clip duration in seconds (> 0).

fps

Frames per second (> 0).

frames

Sequence of frames.

overlapRatio

Overlap ratio in [0, 1).

Attributes

Returns

Sequence of frame clips (each is a Seq[T]).
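Because the method is generic over the frame type, the same idea can be shown with plain strings standing in for frames. The following is a sketch under assumptions rather than the llm4s source: `ChunkVideoSketch` is a hypothetical name, a clip is taken to be `fps * clipSeconds` frames long, and the hop between clip starts is assumed to be `clipFrames * (1 - overlapRatio)`, rounded and at least 1.

```scala
object ChunkVideoSketch {
  // Hypothetical stand-in that follows the documented contract; not the library code.
  def chunkVideo[T](frames: Seq[T], fps: Int, clipSeconds: Int, overlapRatio: Double): Seq[Seq[T]] = {
    require(fps > 0 && clipSeconds > 0, "fps and clipSeconds must be > 0")
    require(overlapRatio >= 0.0 && overlapRatio < 1.0, "overlapRatio must be in [0, 1)")

    // Clip length expressed in frames.
    val clipFrames = fps * clipSeconds
    // Assumption: consecutive clips start `step` frames apart.
    val step = math.max(1, math.round(clipFrames * (1.0 - overlapRatio)).toInt)

    // slice clamps the end index, so the last clip may hold fewer frames.
    (0 until frames.length by step).map(start => frames.slice(start, start + clipFrames))
  }
}

object ChunkVideoDemo {
  // Frame identifiers stand in for real frame objects such as BufferedImage.
  val frames: Vector[String] = Vector.tabulate(10)(i => s"f$i")
  val clips: Seq[Seq[String]] = ChunkVideoSketch.chunkVideo(frames, fps = 2, clipSeconds = 2, overlapRatio = 0.5)
}
```

Here each clip spans four frames and starts two frames after the previous one, so `clips` contains five clips, the last of which holds only `f8` and `f9`.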