UrlNormalizer
Utility for normalizing URLs to ensure consistent deduplication.
Handles:
- Scheme normalization (lowercase)
- Host normalization (lowercase)
- Path normalization (remove trailing slash, decode/encode consistently)
- Fragment removal
- Optional query parameter handling
Attributes
- Supertypes: class Object, trait Matchable, class Any
- Self type: UrlNormalizer.type
Members list
Value members
Concrete methods
Extract the domain (host) from a URL.
Value parameters
- url: URL to extract the domain from
Attributes
- Returns: Domain string (lowercase), or None if the URL is invalid
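The method's name and signature do not appear in this listing, so the following is only a minimal sketch of the described behavior. It assumes parsing via `java.net.URI`; `DomainSketch` and `extractDomain` are hypothetical names, not the library's API.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object DomainSketch {
  // Hypothetical helper: returns the lowercase host, or None when the
  // input cannot be parsed or carries no host component.
  def extractDomain(url: String): Option[String] =
    Try(new URI(url.trim)).toOption
      .flatMap(uri => Option(uri.getHost)) // getHost is null for relative or opaque URIs
      .map(_.toLowerCase(Locale.ROOT))
}
```

Under these assumptions, an input such as `mailto:a@b.com` yields None because it has no host component.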
Check if a URL belongs to one of the allowed domains.
Value parameters
- allowedDomains: Set of allowed domains
- url: URL to check
Attributes
- Returns: true if the URL's domain matches, or is a subdomain of, an allowed domain
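One plausible way to implement the "matches or is a subdomain" rule is a suffix check on a dot boundary. This is a sketch under that assumption; `AllowedDomainSketch` and `isAllowed` are hypothetical names, and the real method may compare domains differently.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object AllowedDomainSketch {
  // Hypothetical helper; the actual signature is not shown in this document.
  def isAllowed(url: String, allowedDomains: Set[String]): Boolean = {
    val host = Try(new URI(url.trim)).toOption
      .flatMap(uri => Option(uri.getHost))
      .map(_.toLowerCase(Locale.ROOT))
    host.exists { h =>
      allowedDomains.exists { d =>
        val allowed = d.toLowerCase(Locale.ROOT)
        // Exact match, or a subdomain match anchored on a dot so that
        // "notexample.com" is not treated as part of "example.com".
        h == allowed || h.endsWith("." + allowed)
      }
    }
  }
}
```

Anchoring the suffix on a dot is the design choice that matters here: a plain `endsWith(allowed)` would accept unrelated hosts that merely share a suffix.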
Check if a URL is a valid HTTP/HTTPS URL.
Value parameters
- url: URL to validate
Attributes
- Returns: true if the URL is a valid HTTP or HTTPS URL
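A validity check along these lines could require both an `http`/`https` scheme and a host. The sketch below assumes that definition of "valid"; `HttpCheckSketch` and `isHttpUrl` are hypothetical names, not the documented method.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object HttpCheckSketch {
  // Hypothetical helper: true only when the string parses as a URI,
  // has an http or https scheme (case-insensitive), and has a host.
  def isHttpUrl(url: String): Boolean =
    Try(new URI(url.trim)).toOption.exists { uri =>
      val scheme = Option(uri.getScheme).map(_.toLowerCase(Locale.ROOT))
      scheme.exists(s => s == "http" || s == "https") && uri.getHost != null
    }
}
```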
Normalize a URL for comparison and deduplication.
Value parameters
- includeQueryParams: Whether to keep query parameters
- url: URL string to normalize
Attributes
- Returns: Normalized URL string, or the original string if parsing fails
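The normalization steps listed for this object (lowercase scheme and host, trailing-slash removal, fragment removal, optional query handling) can be sketched as follows. This is an illustration, not the library's implementation: `NormalizeSketch`, `normalize`, and the default value of `includeQueryParams` are assumptions. It also leaves percent-encoding untouched and strips the slash from a bare root path.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object NormalizeSketch {
  // Hypothetical stand-in for the documented method; name and default are assumptions.
  def normalize(url: String, includeQueryParams: Boolean = true): String = {
    val normalized = for {
      uri    <- Try(new URI(url.trim)).toOption
      scheme <- Option(uri.getScheme)
      host   <- Option(uri.getHost)
    } yield {
      val port = if (uri.getPort == -1) "" else ":" + uri.getPort
      // Drop a single trailing slash; the raw path keeps its percent-encoding as-is.
      val path = Option(uri.getRawPath).getOrElse("").stripSuffix("/")
      val query =
        if (includeQueryParams) Option(uri.getRawQuery).map("?" + _).getOrElse("")
        else ""
      // The fragment is never appended, which removes it from the result.
      scheme.toLowerCase(Locale.ROOT) + "://" + host.toLowerCase(Locale.ROOT) + port + path + query
    }
    // Fall back to the input when it cannot be parsed or lacks a scheme or host.
    normalized.getOrElse(url)
  }
}
```

With this sketch, `https://example.com/` and `https://example.com` normalize to the same string, which is the property deduplication relies on.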
Resolve a potentially relative URL against a base URL.
Value parameters
- baseUrl: Base URL (page the link was found on)
- href: Link href (may be relative or absolute)
- includeQueryParams: Whether to keep query parameters
Attributes
- Returns: Resolved and normalized absolute URL
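Resolution against a base URL can be sketched with `java.net.URI.resolve`. The names `ResolveSketch` and `resolve`, the `Option[String]` return type, and the default for `includeQueryParams` are assumptions, since the listing does not show the real signature. The sketch only lowercases the scheme and host and drops the fragment; it does not reproduce any further normalization the documented method may apply.

```scala
import java.net.URI
import java.util.Locale
import scala.util.Try

object ResolveSketch {
  // Hypothetical helper: resolves href against baseUrl, returning None when
  // either input is unparsable or the result has no scheme or host.
  def resolve(baseUrl: String, href: String, includeQueryParams: Boolean = true): Option[String] =
    for {
      uri    <- Try(new URI(baseUrl.trim).resolve(href.trim)).toOption
      scheme <- Option(uri.getScheme)
      host   <- Option(uri.getHost)
    } yield {
      val port = if (uri.getPort == -1) "" else ":" + uri.getPort
      val path = Option(uri.getRawPath).getOrElse("")
      val query =
        if (includeQueryParams) Option(uri.getRawQuery).map("?" + _).getOrElse("")
        else ""
      // The fragment is omitted on purpose.
      scheme.toLowerCase(Locale.ROOT) + "://" + host.toLowerCase(Locale.ROOT) + port + path + query
    }
}
```

Note that `java.net.URI` follows the older RFC 2396 resolution rules, so edge cases such as a base URL with an empty path may resolve differently from what a browser would produce.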