- Remove ID field from Item struct
- Remove ID field from SearchItem struct
- Update all SQL queries to not select id column
- Change MarkItemPublished to use feedURL/guid instead of id
- Update shortener to use item_guid instead of item_id
- Add migration to convert item_id to item_guid in short_urls table
- Update API endpoints to use feedUrl/guid instead of itemId
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Replace source_host column with proper FK to domains table using
composite key (domain_host, domain_tld). This enables JOIN queries
instead of string concatenation for domain lookups.
Changes:
- Update Feed struct: SourceHost/TLD → DomainHost/DomainTLD
- Update all SQL queries to use domain_host/domain_tld columns
- Add column aliases (as source_host) for API backwards compatibility
- Update trigram index from source_host to domain_host
- Add getDomainHost() helper for extracting host from domain
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add shutdownCh channel to signal goroutines to stop
- Check IsShuttingDown() in all main loops
- Wait 2 seconds for goroutines to finish before closing DB
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Feed details now expand inline instead of navigating to new page
- Add TLD section headers with domains sorted by TLD then name
- Add TLD filter button to show/hide domain sections by TLD
- Feed status behavior: pass creates account, hold crawls only, skip stops, drop cleans up
- Auto-follow new accounts from directory account (1440.news)
- Fix handle derivation (removed duplicate .1440.news suffix)
- Increase domain import batch size to 100k
- Various bug fixes for account creation and profile updates
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Use labeled links (Article · Audio) instead of raw URLs in posts
- Add language filter dropdown to dashboard with toggle selection
- Auto-deny feeds with no language on discovery
- Add deny/undeny buttons for domains to block crawling
- Denying a domain sets its feeds to dead status, preventing future checks
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- New short_urls and clicks tables for URL mapping and analytics
- /r/{code} redirect endpoint with click tracking
- Short URLs use 6-char base64 hash codes (26 chars total)
- Publish loop now shortens article links and enclosure URLs
- Enables podcast audio URLs to fit in posts (139 → 26 chars)
- Tracks: timestamp, referrer, user agent, anonymized IP
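A sketch of how a 6-character code and a 26-character short URL could be derived. The SHA-256 input, the URL-safe base64 alphabet, and the https://1440.news/r/ prefix are assumptions chosen to match the lengths stated above; the names shortCode and shortURL are hypothetical.

```go
package main

import (
	"crypto/sha256"
	"encoding/base64"
	"fmt"
)

// shortBase is an assumed prefix: 20 characters, so prefix + 6-char code = 26.
const shortBase = "https://1440.news/r/"

// shortCode hashes the target URL and keeps the first 6 characters of the
// URL-safe, unpadded base64 encoding (36 bits of the hash).
func shortCode(target string) string {
	sum := sha256.Sum256([]byte(target))
	return base64.RawURLEncoding.EncodeToString(sum[:])[:6]
}

// shortURL builds the full redirect URL for a target.
func shortURL(target string) string {
	return shortBase + shortCode(target)
}

func main() {
	target := "https://cdn.example.com/podcasts/2025/episode-42.mp3"
	fmt.Println(shortURL(target), len(shortURL(target)))
}
```

Because a 6-character code carries only 36 bits, two different targets can map to the same code, so a real table would likely need a uniqueness check on insert.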
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Load enclosure fields in GetAllUnpublishedItems query
- Only include enclosure URL if it fits within post length limit
- Shorter video/audio enclosures will be included when they fit
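The length check could look roughly like this. The 300-character limit and the composePost name are assumptions, and counting runes only approximates the grapheme counting that Bluesky applies.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// maxPostLen is an assumed limit for this sketch.
const maxPostLen = 300

// composePost appends the enclosure URL on its own line only when the
// combined text still fits within limit; otherwise it returns text as is.
func composePost(text, enclosureURL string, limit int) string {
	if enclosureURL == "" {
		return text
	}
	candidate := text + "\n" + enclosureURL
	if utf8.RuneCountInString(candidate) <= limit {
		return candidate
	}
	return text
}

func main() {
	post := composePost("New episode is out", "https://a.example/ep1.mp3", maxPostLen)
	fmt.Println(post)
}
```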
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Add image_urls to GetAllUnpublishedItems query
- Add aspectRatio to image embeds (required by Bluesky)
- Add image decoding to get dimensions (width/height)
- Fix rkey collision by using XOR of multiple hash bytes
The rkey collision was caused by using only 10 bits taken from 2 hash
bytes, which gave a ~0.1% (1 in 1024) collision rate per pair of items
with the same timestamp. The rkey now XORs 8 hash bytes for better
entropy distribution.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Fetches the site's favicon and uses it as the avatar when creating
or updating feed account profiles. Tries common favicon locations
(/favicon.ico, /favicon.png, /apple-touch-icon.png) then falls back
to Google's favicon service.
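The lookup order described above might be expressed as a candidate list like the one below. The faviconCandidates name, the https scheme, and the exact Google favicon service URL and its sz parameter are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"net/url"
)

// faviconCandidates returns the URLs to try, in order, for a site host.
// The last entry is the assumed Google favicon service fallback.
func faviconCandidates(siteHost string) []string {
	base := "https://" + siteHost
	return []string{
		base + "/favicon.ico",
		base + "/favicon.png",
		base + "/apple-touch-icon.png",
		"https://www.google.com/s2/favicons?sz=64&domain=" + url.QueryEscape(siteHost),
	}
}

func main() {
	for _, u := range faviconCandidates("example.com") {
		fmt.Println(u)
	}
}
```

A caller would fetch each candidate in turn and use the first response that decodes as an image.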
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Updates existing account profiles with the feed URL on startup.
This ensures all accounts have the source feed URL in their
profile description.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
When creating new accounts, include the full RSS/Atom feed URL
in the profile description so users can find the original source.
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Publishing:
- Add publisher.go for posting feed items to AT Protocol PDS
- Support deterministic rkeys from SHA256(guid + discoveredAt)
- Handle multiple URLs in posts with facets for each link
- Image embed support (app.bsky.embed.images) for up to 4 images
- External embed with thumbnail fallback
- Podcast/audio enclosure URLs included in post text
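Link facets in AT Protocol posts are indexed by UTF-8 byte offsets, not by characters. A minimal sketch of computing those offsets for several URLs follows; the linkFacet type and linkFacets function are hypothetical, and the sketch assumes the URLs appear in the text in the order given.

```go
package main

import (
	"fmt"
	"strings"
)

// linkFacet holds the byte range of one link inside the post text.
type linkFacet struct {
	ByteStart int
	ByteEnd   int
	URI       string
}

// linkFacets finds each URL in text, in order, and records its byte range.
// strings.Index works on bytes, which matches the facet index semantics.
func linkFacets(text string, urls []string) []linkFacet {
	var facets []linkFacet
	searchFrom := 0
	for _, u := range urls {
		if u == "" {
			continue
		}
		i := strings.Index(text[searchFrom:], u)
		if i < 0 {
			continue // URL not present in the text; no facet for it
		}
		start := searchFrom + i
		end := start + len(u)
		facets = append(facets, linkFacet{ByteStart: start, ByteEnd: end, URI: u})
		searchFrom = end
	}
	return facets
}

func main() {
	text := "Café news https://a.example/1 https://b.example/2.mp3"
	for _, f := range linkFacets(text, []string{"https://a.example/1", "https://b.example/2.mp3"}) {
		fmt.Println(f.ByteStart, f.ByteEnd, f.URI)
	}
}
```

Non-ASCII text before a link (the "é" above) shifts the byte offsets relative to the character position, which is why the offsets are taken from byte indices.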
Media extraction:
- Parse RSS enclosures (audio, video, images)
- Extract Media RSS content and thumbnails
- Extract images from HTML content in descriptions
- Store enclosure and imageUrls in items table
SQLite stability improvements:
- Add synchronous=NORMAL and wal_autocheckpoint pragmas
- Connection pool tuning (idle conns, max lifetime)
- Periodic WAL checkpoint every 5 minutes
- Hourly integrity checks with PRAGMA quick_check
- Daily hot backup via VACUUM INTO
- Docker stop_grace_period: 30s for graceful shutdown
Dashboard:
- Feed publishing UI and API endpoints
- Account creation with invite codes
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
- Split main.go into separate files for better organization:
  crawler.go, domain.go, feed.go, parser.go, html.go, util.go
- Add PebbleDB for persistent storage of feeds and domains
- Store feeds with metadata: title, TTL, update frequency, ETag, etc.
- Track domains with crawl status (uncrawled/crawled/error)
- Normalize URLs by stripping scheme and www. prefix
- Add web dashboard on port 4321 with real-time stats:
  - Crawl progress with completion percentage
  - Feed counts by type (RSS/Atom)
  - Top TLDs and domains by feed count
  - Recent feeds table
- Filter out comment feeds from results
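The URL normalization listed above (stripping the scheme and the www. prefix) can be sketched as follows. The normalizeURL name and the choice to match prefixes case-insensitively while leaving the rest of the URL untouched are assumptions.

```go
package main

import (
	"fmt"
	"strings"
)

// normalizeURL strips a leading http:// or https:// and then a leading
// "www." so that equivalent feed URLs share one storage key.
func normalizeURL(raw string) string {
	u := strings.TrimSpace(raw)
	for _, scheme := range []string{"https://", "http://"} {
		if len(u) >= len(scheme) && strings.EqualFold(u[:len(scheme)], scheme) {
			u = u[len(scheme):]
			break
		}
	}
	if len(u) >= 4 && strings.EqualFold(u[:4], "www.") {
		u = u[4:]
	}
	return u
}

func main() {
	fmt.Println(normalizeURL("https://www.example.com/feed.xml"))
	fmt.Println(normalizeURL("http://example.com/feed.xml"))
}
```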
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>