Design proximity location search doc for /authors optimization
Read the production facts (00-PRODUCTION-FACTS.md), the foundation perf/indexing doc (10-perf-and-indexing.md), the LocationSearch Livewire component, the Location model, the RedditUserIndex component, and the City model. Then wrote a detailed design document (20-proximity-search.md). It covers city-centric centers only (state/country centroids rejected) and an opt-in radius (0/10/25/50/100 mi). The nearby set is resolved by a bounding-box filter on the locations (lat, lng) B-tree index, an ST_Distance_Sphere refine, a quality gate (merged_into_id IS NULL AND city_id IS NOT NULL AND visible_author_count >= 5), and a cap at the 64 nearest locations. Users are fetched with a pivot-first IN query deduped by GROUP BY reddit_user_id, using offset pagination with a 1500-user cap. The doc also specifies two caches (candidate-list LRU-128 with a 30 min TTL, page-level with a 15 min TTL), Livewire property and query-string changes for radius and distance, a new visible_author_count column on locations refreshed by a daily cron, and handling of empty and edge cases.
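The nearby-set resolution described above can be sketched outside the database to make the order of operations concrete. This is a minimal Python illustration, not the design doc's SQL: the dict field names mirror the columns named in the summary, while `bounding_box`, `haversine_mi`, and `nearby_location_ids` are hypothetical helpers, and a haversine distance in miles stands in for the ST_Distance_Sphere refine (which returns meters in MySQL). Antimeridian wrap-around is not handled.

```python
import math

EARTH_RADIUS_MI = 3958.8  # mean Earth radius in miles (assumed constant)


def bounding_box(lat, lng, radius_mi):
    """Return (min_lat, max_lat, min_lng, max_lng) enclosing the search circle."""
    ang = radius_mi / EARTH_RADIUS_MI  # angular radius in radians
    dlat = math.degrees(ang)
    # Longitude span widens with latitude; clamp to stay defined near the poles.
    cos_lat = max(math.cos(math.radians(lat)), 1e-6)
    dlng = math.degrees(math.asin(min(1.0, math.sin(ang) / cos_lat)))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng


def haversine_mi(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles between two (lat, lng) points."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lng2 - lng1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_MI * math.asin(min(1.0, math.sqrt(a)))


def nearby_location_ids(center, locations, radius_mi, min_authors=5, cap=64):
    """Quality gate, bounding-box prefilter, exact refine, then keep the nearest `cap`."""
    min_lat, max_lat, min_lng, max_lng = bounding_box(center["lat"], center["lng"], radius_mi)
    hits = []
    for loc in locations:
        # Quality gate: unmerged, attached to a city, enough visible authors.
        if loc["merged_into_id"] is not None or loc["city_id"] is None:
            continue
        if loc["visible_author_count"] < min_authors:
            continue
        # Cheap rectangle test (the part a (lat, lng) index can serve).
        if not (min_lat <= loc["lat"] <= max_lat and min_lng <= loc["lng"] <= max_lng):
            continue
        # Exact distance refine trims the corners of the rectangle.
        dist = haversine_mi(center["lat"], center["lng"], loc["lat"], loc["lng"])
        if dist <= radius_mi:
            hits.append((dist, loc["id"]))
    hits.sort()  # nearest first; ties broken by id
    return [loc_id for _, loc_id in hits[:cap]]
```

In this sketch the gate runs before the distance math only because it is cheap in Python; in SQL the optimizer decides the order.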
outcome
Design doc saved at /Users/mikeferrara/Documents/code/lounge/docs/authors-optimization/20-proximity-search.md with all decisions documented.
next steps
- Implement the design per the doc: migration for visible_author_count, daily cron, Livewire property changes, pivot query rewrite, caching, UI radius control
- Review with team to confirm decisions (especially 1500-user cap, city-only centers, visible_author_count freshness)
- Write tests for proximity query and cache behavior
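For the caching and cache-test items above, the candidate-list cache could be prototyped as below before porting it to the application's own cache layer. This is a sketch under assumptions: `TTLLRUCache` and `location_set_key` are hypothetical names, the 128-entry limit and 30 minute TTL come from the summary, and the injectable `clock` exists only so that expiry can be tested without sleeping.

```python
import hashlib
import time
from collections import OrderedDict


class TTLLRUCache:
    """Bounded cache: evicts the least recently used entry, expires entries after a TTL."""

    def __init__(self, max_entries=128, ttl_seconds=30 * 60, clock=time.monotonic):
        self._max = max_entries
        self._ttl = ttl_seconds
        self._clock = clock
        self._data = OrderedDict()  # key -> (expires_at, value), oldest use first

    def get(self, key):
        entry = self._data.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self._clock() >= expires_at:
            del self._data[key]  # expired: drop it and report a miss
            return None
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._data[key] = (self._clock() + self._ttl, value)
        self._data.move_to_end(key)
        while len(self._data) > self._max:
            self._data.popitem(last=False)  # evict least recently used


def location_set_key(location_ids):
    """Order-insensitive hash of a location set, usable as the cache key."""
    canonical = ",".join(str(i) for i in sorted(set(location_ids)))
    return hashlib.sha1(canonical.encode("ascii")).hexdigest()
```

Sorting and de-duplicating the ids before hashing means the same nearby set always maps to the same key, whatever order the resolver returned it in.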
key decisions
- City-centric centers only (state/country centroids rejected as semantically meaningless)
- Radius opt-in default 0 (exact) with discrete values (10/25/50/100 mi) controlled by query string
- Nearby-set capped at 64 nearest locations after quality gate to keep pivot-first IN query performant
- Offset pagination for the proximity branch; cursor pagination is ruled out because the variable location set makes the ordering unstable
- 1500-user cap (63 pages) to bound worst-case cost for metro areas
- Two caches: candidate-list LRU-128 (keyed by location set hash, TTL 30min) and page-level TTL (15min)
- New column visible_author_count on locations, updated daily via cron, used in quality gate
- ST_Distance_Sphere POINT argument order (lng, lat) documented to avoid foot-gun
- GROUP BY reddit_user_id with MAX() aggregation to dedup users matched through overlapping locations in multiple cities
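The pivot-first IN query, the GROUP BY dedup, and the capped offset pagination from the decisions above fit together roughly as follows. This is a hedged sketch against an in-memory SQLite table, not the production query: the table name `location_reddit_user`, the `score` column fed to MAX(), and the page size of 24 are assumptions (24 is inferred from 63 pages covering 1500 users; the summary does not state it).

```python
import math
import sqlite3

USER_CAP = 1500  # from the summary
PER_PAGE = 24    # assumption: ceil(1500 / 24) == 63 pages


def demo_connection():
    """Tiny in-memory pivot table; user 1 appears under two nearby locations."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE location_reddit_user "
        "(location_id INTEGER, reddit_user_id INTEGER, score INTEGER)"
    )
    conn.executemany(
        "INSERT INTO location_reddit_user VALUES (?, ?, ?)",
        [(10, 1, 5), (11, 1, 9), (10, 2, 7), (12, 3, 1)],
    )
    return conn


def fetch_page(conn, location_ids, page, per_page=PER_PAGE, user_cap=USER_CAP):
    """Return one page of (reddit_user_id, best_score), never reading past user_cap."""
    if page < 1 or not location_ids:
        return []
    offset = (page - 1) * per_page
    limit = min(per_page, user_cap - offset)  # last page may be short
    if limit <= 0:
        return []
    placeholders = ",".join("?" * len(location_ids))
    sql = f"""
        SELECT reddit_user_id, MAX(score) AS best_score
        FROM location_reddit_user
        WHERE location_id IN ({placeholders})
        GROUP BY reddit_user_id              -- one row per user across locations
        ORDER BY best_score DESC, reddit_user_id ASC
        LIMIT ? OFFSET ?
    """
    return conn.execute(sql, [*location_ids, limit, offset]).fetchall()


if __name__ == "__main__":
    conn = demo_connection()
    print(fetch_page(conn, [10, 11], page=1))
    print(math.ceil(USER_CAP / PER_PAGE))
```

The secondary sort on reddit_user_id keeps offsets stable within one cached location set, which is the property offset pagination depends on here.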
open questions
- 1500-user cap may surprise users near huge metros – is this a product call to accept or increase?
- visible_author_count daily freshness: acceptable for a browse gate, or need tighter refresh for correctness?
- City-only centers drop about 11% of quality geocoded locations (2,409 of 21,819 have coords and >=5 authors but no city_id): are those region-level junk as assumed, or valid centers?