Commit Graph

  • 6cd7faf35f Use Athens proxy for Docker builds main primal 2026-02-06 21:36:52 -05:00
  • 78e7148e01 Fix saveItems: use individual inserts instead of transaction primal 2026-02-06 01:15:54 -05:00
  • 2615a03ab9 Remove 'feeds' from API subdomain list - fixes feeds.npr.org URL rewriting primal 2026-02-06 01:06:43 -05:00
  • 48f1740a81 Increase crawler DB pool from 10 to 100 primal 2026-02-06 01:01:28 -05:00
  • 23fcff58b4 Skip fetching feeds when work channel is >50% full to prevent DB pool exhaustion primal 2026-02-06 00:58:55 -05:00
  • 0270d72920 Prioritize never-checked feeds in GetFeedsDueForCheck primal 2026-02-06 00:53:15 -05:00
  • 699639f3d6 Delete shortener.go - replaced by rkey-based tracking primal 2026-02-05 22:15:16 -05:00
  • 1f951ed394 Add rkey IS NULL filters to crawler item queries primal 2026-02-05 22:15:00 -05:00
  • 4fdf9c6695 Add rkey/visit_count columns, drop short_urls/clicks tables primal 2026-02-05 22:12:10 -05:00
  • 1f4aea2360 Halve miss_count on hit instead of resetting to zero primal 2026-02-05 21:49:50 -05:00
  • fcdfc03238 Remove publishing columns from items (queue architecture) primal 2026-02-05 12:59:39 -05:00
  • dd8029ab37 Add deterministic TID generation for AT Protocol rkeys primal 2026-02-05 12:48:11 -05:00
  • f48fecfaf0 Optimize feed check ordering for connection reuse primal 2026-02-05 12:15:59 -05:00
  • d4a1928fa6 Increase feed check parallelism for 1Gbps bandwidth primal 2026-02-05 12:11:58 -05:00
  • 1f90b7d6a0 Simplify launch script primal 2026-02-04 23:49:21 -05:00
  • b515976fef Add clean-descriptions migration tool primal 2026-02-04 23:43:58 -05:00
  • 253e04a749 Strip HTML and truncate descriptions on input primal 2026-02-04 22:59:21 -05:00
  • 70828bf05d Remove unused enclosure_length from items table primal 2026-02-04 22:56:45 -05:00
  • 3af9e65937 Remove unused updated_at column from items table primal 2026-02-04 22:44:33 -05:00
  • 018c059924 Remove GUID from items, use Link as primary key primal 2026-02-04 21:34:39 -05:00
  • 6314b934c1 Remove content column from items table - only description used for posts primal 2026-02-04 21:14:29 -05:00
  • 8cf25a55dc Remove item_count column from feeds table - compute dynamically from items primal 2026-02-04 21:00:23 -05:00
  • f4c6a9d814 Remove last_error_at from feeds table and queries primal 2026-02-04 20:53:52 -05:00
  • be969c11db Remove site_url field from feeds primal 2026-02-04 20:44:35 -05:00
  • 288379804d Remove category field from feeds primal 2026-02-04 20:37:26 -05:00
  • 94369232f8 Remove domains table - feeds imported directly from CDX primal 2026-02-04 20:17:52 -05:00
  • 82b40b9155 Remove domain_host/domain_tld columns from feeds table primal 2026-02-04 20:07:03 -05:00
  • 037b453a68 Remove oldest_item_date and newest_item_date columns primal 2026-02-04 19:28:20 -05:00
  • fec53f913c Remove last_build_date column from feeds schema primal 2026-02-04 19:16:30 -05:00
  • 2c3fa5e104 Remove discovered_at column from feeds and items tables primal 2026-02-04 19:07:20 -05:00
  • 0428ff0241 Remove unused source_url column from feeds table primal 2026-02-04 16:16:53 -05:00
  • e56bcd456d Remove next_check_at column from feeds table primal 2026-02-04 16:07:04 -05:00
  • 1b0ff1b507 Add import-tsv tool for bulk importing TSV feed files primal 2026-02-04 15:25:29 -05:00
  • 091fa8490b Filter text/html extraction by feed-like URL patterns primal 2026-02-04 14:30:40 -05:00
  • 61ca7a4c7a Add html feed type for content-sniffed feeds primal 2026-02-04 14:18:26 -05:00
  • 5c43fc693b Add text/html to CDX filter for feeds with wrong MIME type primal 2026-02-04 14:11:57 -05:00
  • 12bd68000d Rename status column to feed_health primal 2026-02-04 10:58:33 -05:00
  • 02378950f4 Add bulk import for CDX feeds primal 2026-02-04 10:42:52 -05:00
  • 3b1b12ff70 Simplify crawler: use CDX for feed discovery, remove unused loops primal 2026-02-03 22:53:23 -05:00
  • 07621a7059 Switch back to infra-dns for DNS lookups primal 2026-02-02 21:02:28 -05:00
  • e6761954c0 Use system DNS resolver instead of custom infra-dns primal 2026-02-02 20:55:57 -05:00
  • f2bb1e72d2 Split domain processing into separate check and crawl loops primal 2026-02-02 20:35:46 -05:00
  • 26de5d3753 Add status column to items table primal 2026-02-02 15:46:33 -05:00
  • 6eaa39f9db Remove publishing code - now handled by publish service primal 2026-02-02 15:40:49 -05:00
  • 7b50f5c008 Update shared references to commons primal 2026-02-02 15:19:48 -05:00
  • bd76ea1108 Trim shortener.go - keep only URL creation, remove click tracking primal 2026-02-02 13:28:10 -05:00
  • aea101a5e7 Update short URLs to use news.1440.news primal 2026-02-02 13:23:24 -05:00
  • ec53ad59db Phase 5: Remove dashboard code from crawler primal 2026-02-02 13:08:48 -05:00
  • fa82d8b765 Move plan to dedicated plans/ directory primal 2026-02-02 12:40:48 -05:00
  • 98bee87c05 Add dashboard separation plan primal 2026-02-02 12:39:08 -05:00
  • bce9369cb8 Fix OAuth session storage - add missing database columns primal 2026-02-02 00:44:19 -05:00
  • 86d669e08e Make oauth_sessions.access_token nullable primal 2026-02-02 00:35:53 -05:00
  • 265975c7c5 Rename sessions table to oauth_sessions for consistency primal 2026-02-02 00:34:13 -05:00
  • 615aa6ef5d Fix TLD sync to use domain_tld column for feeds table primal 2026-02-01 23:52:29 -05:00
  • 3f277ec165 Remove item ID column references - items now use composite PK (guid, feed_url) primal 2026-02-01 23:51:44 -05:00
  • 7ec4207173 Migrate to normalized FK schema (domain_host, domain_tld) primal 2026-02-01 22:36:25 -05:00
  • e7f6be2203 Add internal crawl endpoint without auth primal 2026-02-01 19:59:39 -05:00
  • edf54ca212 Add graceful shutdown for goroutines primal 2026-02-01 19:23:57 -05:00
  • 81146fd572 Fix domain search when pattern looks like domain primal 2026-02-01 19:19:21 -05:00
  • 7011b126fe Fix tld_enum comparison - cast to text instead of LOWER() primal 2026-02-01 19:13:21 -05:00
  • f2978e7ab5 Clean up debug logging primal 2026-02-01 19:11:49 -05:00
  • 8a9001c02c Restore working codebase with all methods primal 2026-02-01 19:08:53 -05:00
  • 211812363a Add TLD sync loop for IANA TLD updates primal 2026-02-01 19:07:43 -05:00
  • d41f9cc7c9 Fix blocking TLD sync loop - add missing go keyword primal 2026-02-01 19:05:50 -05:00
  • c6ec482d1f Add exact domain matching for domain-like search queries primal 2026-02-01 19:00:50 -05:00
  • 71d8ec0a39 Resize small cards to 115px primal 2026-02-01 18:04:01 -05:00
  • 03dcf1cedc Resize small cards to 110px primal 2026-02-01 18:02:28 -05:00
  • a34a284d77 Use Unix timestamp for cache busting, remove version display primal 2026-02-01 18:01:04 -05:00
  • 49c2370d84 Resize small cards to 100px primal 2026-02-01 18:00:00 -05:00
  • 02564bfde7 Fix CSS/JS cache busting - sync versions on launch primal 2026-02-01 17:58:55 -05:00
  • 3a28518366 Resize small cards to 80px primal 2026-02-01 17:56:55 -05:00
  • c50ee3b03e Resize small cards to 100px primal 2026-02-01 17:52:53 -05:00
  • f307e6c845 Add guards to skip migrations if already done primal 2026-02-01 17:44:35 -05:00
  • 58bb560ae6 Resize small cards to 110px primal 2026-02-01 17:27:56 -05:00
  • dd17889695 Rename rate cards: alive/min, crawl/min, check/min primal 2026-02-01 17:26:18 -05:00
  • be595cb403 v100 primal 2026-01-30 22:35:08 -05:00
  • f49fc2f0ad v59: simplify to single feeds view with search primal 2026-01-30 17:16:14 -05:00
  • 9530c2ceab v58: remove all explicit font-sizes, reduce feed indentation primal 2026-01-30 17:11:31 -05:00
  • 3405e31f2c v57: remove font-size from stats to use default primal 2026-01-30 17:07:42 -05:00
  • 3147b4e48a v56: standardize font sizes to match domain name primal 2026-01-30 17:06:24 -05:00
  • c5ad66ee81 v55: fix item_count to query actual DB count primal 2026-01-30 17:05:07 -05:00
  • 406f9397c2 v54: fix d:feeds to load items primal 2026-01-30 17:02:18 -05:00
  • a3d8f4ea8e v53: add feed info and items panels with click toggles primal 2026-01-30 16:59:38 -05:00
  • 442e010672 v52: simplify feed row: status, count, path, title inline primal 2026-01-30 16:49:38 -05:00
  • 6c9702eebc v51: remove debug logging primal 2026-01-30 16:43:23 -05:00
  • 2289d73288 v50: add debug logging for spacer click primal 2026-01-30 16:41:56 -05:00
  • 51d05e18a1 v49: fix spacer click using event delegation primal 2026-01-30 16:39:20 -05:00
  • 57801d0946 v48: domain name links to site, spacer toggles feeds primal 2026-01-30 16:36:14 -05:00
  • 5b3330ba07 v47: Fix d:feeds auto-expand for hidden container primal 2026-01-30 16:31:32 -05:00
  • 97051f3967 v46: Click domain name to toggle feeds div primal 2026-01-30 16:29:05 -05:00
  • cf34db1e6c v45: Auto-expand feed details in d:feeds mode primal 2026-01-30 16:23:03 -05:00
  • f59e7dcbc3 v44: Left-justify TLD footer primal 2026-01-30 16:19:08 -05:00
  • 018f47449f v43: Add TLD footer with collapse button primal 2026-01-30 16:17:59 -05:00
  • cbf16bfbc8 v42: Revert to persistent session cookie (24h) primal 2026-01-30 16:13:24 -05:00
  • aef0826004 v41: Session cookie for browser-close logout primal 2026-01-30 16:12:33 -05:00
  • e0602b0123 v40: Persist OAuth sessions to database primal 2026-01-30 16:09:46 -05:00
  • 31b7b61bb0 v39: Fix session cookie Secure flag for HTTP primal 2026-01-30 16:05:59 -05:00
  • c374260e11 v38: d:feeds only shows feeds with items primal 2026-01-30 16:04:38 -05:00
  • 388e846f18 v37: Add right margin to language column primal 2026-01-30 16:01:47 -05:00
  • 2504927022 v36: Widen language column to 32px primal 2026-01-30 16:00:27 -05:00
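
Two of the commits above describe small rules that are easy to state in code: 23fcff58b4 (skip fetching when the work channel is more than 50% full) and 1f4aea2360 (halve miss_count on a hit instead of resetting it to zero). The following is a minimal Go sketch of those two rules, not the repository's actual code. The function names `shouldSkipFetch` and `nextMissCount`, the `chan string` element type, and the "increment by one on a miss" behavior are assumptions made for illustration; the commit messages specify only the >50% threshold and the halving.

```go
package main

import "fmt"

// shouldSkipFetch reports whether the work channel is more than half full.
// Hypothetical helper: the name and the string element type are assumptions.
func shouldSkipFetch(work chan string) bool {
	// len is the number of queued items, cap is the buffer size.
	// Multiplying instead of dividing avoids integer-division rounding.
	return len(work)*2 > cap(work)
}

// nextMissCount returns the updated miss count after one feed check.
// On a hit the count is halved (integer division) rather than reset to zero.
// Incrementing by one on a miss is an assumption, not stated in the log.
func nextMissCount(missCount int, hit bool) int {
	if hit {
		return missCount / 2
	}
	return missCount + 1
}

func main() {
	work := make(chan string, 4)
	work <- "a"
	work <- "b"
	fmt.Println(shouldSkipFetch(work)) // exactly 50% full, so not skipped

	work <- "c"
	fmt.Println(shouldSkipFetch(work)) // 75% full, so skipped

	fmt.Println(nextMissCount(8, true))  // halved on a hit
	fmt.Println(nextMissCount(8, false)) // incremented on a miss (assumed)
}
```

Halving rather than resetting means a feed with a long history of misses needs several consecutive hits before its count returns to zero, so a single successful check does not erase that history.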