The infra-dns container has UDP connectivity issues reaching upstream DNS.
The system resolver works (verified with a wget test).
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- StartDomainCheckLoop: DNS verification for unchecked domains (1000 workers)
- StartFeedCrawlLoop: Feed discovery on DNS-verified domains (100 workers)
This fixes starvation where 104M unchecked domains blocked 1.2M
DNS-verified domains from ever being crawled for feeds.
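
A minimal sketch of why two independent loops avoid the starvation: each
queue gets its own worker pool, so a long backlog in one cannot delay the
other. The helper names (runPool, feed, runBoth) and the counting handlers
are illustrative assumptions, not the crawler's actual code.

```go
package main

import (
	"sync"
	"sync/atomic"
)

// runPool starts n workers that drain jobs and returns once jobs is closed
// and every worker has finished.
func runPool(n int, jobs <-chan string, handle func(string)) {
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for j := range jobs {
				handle(j)
			}
		}()
	}
	wg.Wait()
}

// feed sends every item on ch and then closes it.
func feed(ch chan<- string, items []string) {
	for _, it := range items {
		ch <- it
	}
	close(ch)
}

// runBoth runs a DNS-check pool and a feed-crawl pool side by side.
// The two queues are separate, so unchecked domains never sit in front of
// DNS-verified ones. The handlers only count items (hypothetical stand-ins).
func runBoth(unchecked, verified []string, checkWorkers, crawlWorkers int) (checked, crawled int64) {
	checkJobs := make(chan string)
	crawlJobs := make(chan string)

	var wg sync.WaitGroup
	wg.Add(2)
	go func() {
		defer wg.Done()
		runPool(checkWorkers, checkJobs, func(string) { atomic.AddInt64(&checked, 1) })
	}()
	go func() {
		defer wg.Done()
		runPool(crawlWorkers, crawlJobs, func(string) { atomic.AddInt64(&crawled, 1) })
	}()

	go feed(checkJobs, unchecked)
	go feed(crawlJobs, verified)

	wg.Wait()
	return checked, crawled
}
```

In this sketch the worker counts are parameters; the commit uses 1000 for
domain checks and 100 for feed crawling.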
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Items now have a status column ('pass' or 'fail', default 'pass') to
control publishing eligibility. Includes migration for existing databases.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Publishing functionality has been moved to the standalone publish service.
Removed:
- publisher.go, pds_auth.go, pds_records.go, image.go, handle.go
- StartPublishLoop and related functions from crawler.go
- Publish loop invocation from main.go
Updated CLAUDE.md to reflect the new architecture.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add dpop_authserver_nonce, dpop_pds_nonce, pds_url, authserver_iss columns
- These columns are required by the GetSession query but were missing from the schema
- Add migrations to create columns on existing tables
- Add debug logging for OAuth flow troubleshooting
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Update schema to create oauth_sessions instead of sessions
- Add migration to rename existing sessions table
- Add token_expiry column for OAuth library compatibility
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Remove ID field from Item struct
- Remove ID field from SearchItem struct
- Update all SQL queries to no longer select the id column
- Change MarkItemPublished to use feedURL/guid instead of id
- Update shortener to use item_guid instead of item_id
- Add migration to convert item_id to item_guid in short_urls table
- Update API endpoints to use feedUrl/guid instead of itemId
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace source_host column with proper FK to domains table using
composite key (domain_host, domain_tld). This enables JOIN queries
instead of string concatenation for domain lookups.
Changes:
- Update Feed struct: SourceHost/TLD → DomainHost/DomainTLD
- Update all SQL queries to use domain_host/domain_tld columns
- Add column aliases (as source_host) for API backwards compatibility
- Update trigram index from source_host to domain_host
- Add getDomainHost() helper for extracting host from domain
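
One plausible shape for such a helper, shown only as an illustration. The
body below is an assumption (the real getDomainHost() is not reproduced
here) and it treats the final label as the TLD, so multi-label suffixes
such as "co.uk" are not handled.

```go
package main

import "strings"

// getDomainHost returns everything before the last dot of a domain name.
// Hypothetical body: assumes the TLD is the final label and that a name
// without a dot is already a bare host.
func getDomainHost(domain string) string {
	d := strings.ToLower(strings.TrimSpace(domain))
	d = strings.TrimSuffix(d, ".") // tolerate a fully qualified trailing dot
	i := strings.LastIndex(d, ".")
	if i < 0 {
		return d
	}
	return d[:i]
}
```

Under these assumptions, getDomainHost("npr.org") yields "npr", which pairs
with domain_tld "org" to form the composite key.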
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add shutdownCh channel to signal goroutines to stop
- Check IsShuttingDown() in all main loops
- Wait 2 seconds for goroutines to finish before closing DB
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When searching for "npr.org" and viewing the .org TLD, use the host part
("npr") for matching instead of the full pattern ("npr.org").
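
The rule can be sketched as a small helper. The name hostPatternForTLD and
its exact normalization are assumptions made for illustration.

```go
package main

import "strings"

// hostPatternForTLD returns the pattern to match against hosts while a
// single TLD is being viewed. If the pattern ends in ".<tld>", only the
// host part is kept; otherwise the pattern is returned unchanged.
// Hypothetical helper, not the project's actual function.
func hostPatternForTLD(pattern, tld string) string {
	p := strings.ToLower(strings.TrimSpace(pattern))
	suffix := "." + strings.ToLower(strings.TrimPrefix(tld, "."))
	if len(p) > len(suffix) && strings.HasSuffix(p, suffix) {
		return strings.TrimSuffix(p, suffix)
	}
	return p
}
```

With this sketch, hostPatternForTLD("npr.org", "org") gives "npr", while the
same pattern viewed under "com" stays "npr.org".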
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When searching for patterns like "npr.org", the search now also matches
the exact domain (host=npr, tld=org) in addition to the existing text
search across domain names, feed URLs, titles, and descriptions.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Checks column types before running ALTER TYPE migrations to avoid
slow table scans on every restart. Also guards column renames.
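
A sketch of the guard, assuming PostgreSQL and its information_schema. The
lookup is abstracted behind a function so the decision logic stays
independent of the database driver; columnTypeQuery, columnTypeLookup, and
alterTypeIfNeeded are hypothetical names.

```go
package main

import (
	"fmt"
	"strings"
)

// columnTypeQuery is the kind of catalog query a lookup could run
// (PostgreSQL placeholders; assumed, not taken from the project).
const columnTypeQuery = `SELECT data_type FROM information_schema.columns
WHERE table_name = $1 AND column_name = $2`

// columnTypeLookup reports the current data type of a column, if it exists.
type columnTypeLookup func(table, column string) (dataType string, found bool)

// alterTypeIfNeeded returns an ALTER statement only when the column exists
// and its current type differs from want. Identifiers are interpolated
// directly, so they must come from trusted migration code, never user input.
func alterTypeIfNeeded(lookup columnTypeLookup, table, column, want string) (string, bool) {
	current, found := lookup(table, column)
	if !found || strings.EqualFold(current, want) {
		return "", false
	}
	stmt := fmt.Sprintf("ALTER TABLE %s ALTER COLUMN %s TYPE %s", table, column, want)
	return stmt, true
}
```

When the stored type already matches, no statement is produced, so a restart
costs one cheap catalog query instead of a table rewrite. A column rename
could be guarded the same way by checking whether the old name still exists.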
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>