UniversalExtractor
Attributes
- Supertypes: class Object, trait Matchable, class Any
- Self type: UniversalExtractor.type
Members list
Type members
Classlikes
Attributes
- Supertypes: trait Serializable, trait Product, trait Equals, trait Extracted, class Object, trait Matchable, class Any
Attributes
- Supertypes: class Object, trait Matchable, class Any
- Known subtypes
Attributes
- Supertypes: trait Serializable, trait Product, trait Equals, trait Extracted, class Object, trait Matchable, class Any
Attributes
- Supertypes: trait Serializable, trait Product, trait Equals, trait Extracted, class Object, trait Matchable, class Any
Attributes
- Supertypes: trait Serializable, trait Product, trait Equals, trait Extracted, class Object, trait Matchable, class Any
Value members
Concrete methods
Detect MIME type from bytes and filename.
Value parameters
- content: Raw document bytes (the first few KB are sufficient)
- filename: Filename hint for detection
Attributes
- Returns: Detected MIME type string
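One way to combine a byte prefix with a filename hint is sketched below. This is an illustration only, not UniversalExtractor's implementation: the `MimeSketch` object, the `detectMime` name, the signature table, and the `application/octet-stream` fallback are all assumptions made for the example.

```scala
import java.nio.charset.StandardCharsets
import java.util.Arrays

// Hypothetical sketch: NOT the library's detection logic.
object MimeSketch {
  // A few well-known leading byte signatures ("magic numbers").
  private val signatures: Seq[(Array[Byte], String)] = Seq(
    "%PDF-".getBytes(StandardCharsets.US_ASCII) -> "application/pdf",
    Array[Byte](0x50, 0x4B, 0x03, 0x04)         -> "application/zip"
  )

  // Fallback table keyed by lower-cased file extension.
  private val byExtension: Map[String, String] = Map(
    "pdf"  -> "application/pdf",
    "txt"  -> "text/plain",
    "html" -> "text/html",
    "json" -> "application/json"
  )

  // Only the first few bytes are inspected, so a short prefix of the document is enough.
  def detectMime(content: Array[Byte], filename: String): String = {
    val fromBytes = signatures.collectFirst {
      case (sig, mime) if Arrays.equals(content.take(sig.length), sig) => mime
    }
    val dot = filename.lastIndexOf('.')
    val fromName =
      if (dot >= 0 && dot < filename.length - 1)
        byExtension.get(filename.substring(dot + 1).toLowerCase)
      else None
    // Content evidence wins over the filename hint; unknown input gets a generic type.
    fromBytes.orElse(fromName).getOrElse("application/octet-stream")
  }
}
```

In this sketch the byte signature takes priority, so a PDF uploaded as `unknown.bin` is still reported as `application/pdf`.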
Extract text from raw bytes.
This method enables source-agnostic document extraction: the same extraction logic works whether the bytes come from S3, an HTTP response, a database, or any other source.
Value parameters
- content: Raw document bytes
- filename: Filename for MIME type detection (e.g., "report.pdf")
- mimeType: Optional explicit MIME type (skips detection if provided)
Attributes
- Returns: Extracted text content or an error
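The parameter contract above (an optional MIME type that bypasses detection, and a result that is either text or an error) can be sketched as follows. Everything here is hypothetical: `ExtractSketch`, `extractText`, `ExtractionError`, and the trivial `detect` stand-in are invented for illustration, and the real method name, error type, and supported formats may differ.

```scala
import java.nio.charset.StandardCharsets

// Hypothetical sketch of the "bytes in, text or error out" contract.
object ExtractSketch {
  // Assumed error type; the library's own error type is not shown in this page.
  final case class ExtractionError(message: String)

  // Stand-in for MIME detection, reduced to a filename check for brevity.
  private def detect(content: Array[Byte], filename: String): String =
    if (filename.toLowerCase.endsWith(".txt")) "text/plain"
    else "application/octet-stream"

  def extractText(
      content: Array[Byte],
      filename: String,
      mimeType: Option[String] = None
  ): Either[ExtractionError, String] = {
    // An explicit MIME type skips detection entirely.
    val mime = mimeType.getOrElse(detect(content, filename))
    mime match {
      case "text/plain" => Right(new String(content, StandardCharsets.UTF_8))
      case other        => Left(ExtractionError(s"Unsupported MIME type: $other"))
    }
  }
}
```

Because the function takes only bytes and a name, the caller decides where the bytes come from; passing `Some("text/plain")` lets a caller that already knows the type (for example from an HTTP `Content-Type` header) avoid relying on the filename.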
Extract text from an InputStream.
Note: This reads the entire stream into memory for processing. The caller is responsible for closing the stream after this method returns.
Value parameters
- filename: Filename for MIME type detection
- input: InputStream to read from
- mimeType: Optional explicit MIME type
Attributes
- Returns: Extracted text content or an error
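Since the caller owns the stream, a typical call site would wrap it in a resource block. The sketch below assumes Scala 2.13+ (for `scala.util.Using`) and Java 9+ (for `InputStream.readAllBytes`); `StreamSketch` and `extractFromStream` are hypothetical stand-ins that simply decode UTF-8 and are not the library's method.

```scala
import java.io.{ByteArrayInputStream, InputStream}
import java.nio.charset.StandardCharsets
import scala.util.{Try, Using}

// Hypothetical sketch of the stream-handling pattern described above.
object StreamSketch {
  // Reads the whole stream into memory and does NOT close it,
  // mirroring the documented responsibility split.
  def extractFromStream(input: InputStream, filename: String): Try[String] =
    Try(new String(input.readAllBytes(), StandardCharsets.UTF_8))

  // Caller side: Using.resource closes the stream after extraction returns,
  // even if extraction throws.
  def demo(): Try[String] = {
    val source = new ByteArrayInputStream("hello".getBytes(StandardCharsets.UTF_8))
    Using.resource(source) { in =>
      extractFromStream(in, "greeting.txt")
    }
  }
}
```

Because the full content is buffered, memory use grows with document size; for very large inputs the caller may want to check the size before passing the stream in.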