Design proximity location search doc for /authors optimization

claude 23 events 1 segments authors-location-optimization-plan

Done

Read the production facts (00-PRODUCTION-FACTS.md), the foundation perf/indexing doc (10-perf-and-indexing.md), the LocationSearch Livewire component, the Location model, the RedditUserIndex component, and the City model. Then wrote a detailed design document (20-proximity-search.md) covering:

  • City-centric centers only (state/country centroids rejected)
  • Radius opt-in (0/10/25/50/100 mi)
  • Nearby-set resolution: bounding box on the locations (lat,lng) B-tree, refined with ST_Distance_Sphere, filtered by a quality gate (merged_into_id IS NULL AND city_id IS NOT NULL AND visible_author_count >= 5), and capped at the 64 nearest locations
  • Pivot-first IN query, deduped by GROUP BY reddit_user_id
  • Offset pagination with a 1500-user cap
  • Two caches: candidate-list LRU-128 (TTL 30min) and page-level (TTL 15min)
  • Livewire property and query-string changes for radius and distance
  • New column visible_author_count on locations, refreshed by a daily cron
  • Handling of empty and edge cases
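The nearby-set resolution described above (bounding-box prefilter, great-circle refine, quality gate, cap at 64) can be sketched roughly as follows. This is an illustration in Python, not the design doc's query: the real design runs in SQL against the locations table, and the haversine function here only stands in for ST_Distance_Sphere. The `Location` dataclass, the function names, and the Earth radius constant are assumptions made for the example; the gate conditions and the cap come from the summary.

```python
import math
from dataclasses import dataclass
from typing import Optional

EARTH_RADIUS_MI = 3958.8   # mean Earth radius in miles (assumed constant)
MAX_NEARBY = 64            # cap on the nearby set, from the summary
MIN_VISIBLE_AUTHORS = 5    # quality-gate threshold, from the summary


@dataclass
class Location:
    # Hypothetical in-memory stand-in for a row of the locations table.
    id: int
    lat: float
    lng: float
    city_id: Optional[int]
    merged_into_id: Optional[int]
    visible_author_count: int


def haversine_miles(lat1, lng1, lat2, lng2):
    """Great-circle distance in miles on a spherical Earth.

    Stand-in for ST_Distance_Sphere. Note this helper takes (lat, lng),
    while the summary documents POINT(lng, lat) order for the SQL call.
    """
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lng2 - lng1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_MI * math.asin(math.sqrt(a))


def bounding_box(lat, lng, radius_mi):
    """Lat/lng window enclosing the radius circle.

    Assumes the center is away from the poles and the antimeridian.
    """
    ang = radius_mi / EARTH_RADIUS_MI  # angular radius in radians
    dlat = math.degrees(ang)
    dlng = math.degrees(math.asin(math.sin(ang) / math.cos(math.radians(lat))))
    return lat - dlat, lat + dlat, lng - dlng, lng + dlng


def nearby_locations(center, candidates, radius_mi):
    """Return up to MAX_NEARBY (distance_mi, location_id) pairs, nearest first."""
    min_lat, max_lat, min_lng, max_lng = bounding_box(center.lat, center.lng, radius_mi)
    hits = []
    for loc in candidates:
        # Quality gate: unmerged, tied to a city, enough visible authors.
        if loc.merged_into_id is not None or loc.city_id is None:
            continue
        if loc.visible_author_count < MIN_VISIBLE_AUTHORS:
            continue
        # Cheap prefilter; in SQL this is the part an index on (lat, lng) can serve.
        if not (min_lat <= loc.lat <= max_lat and min_lng <= loc.lng <= max_lng):
            continue
        # Exact refine: the box corners lie outside the circle.
        distance = haversine_miles(center.lat, center.lng, loc.lat, loc.lng)
        if distance <= radius_mi:
            hits.append((distance, loc.id))
    hits.sort()
    return hits[:MAX_NEARBY]
```

The box is deliberately a superset of the circle, so the refine step is what enforces the radius; the cap is applied last so that the 64 survivors are the nearest ones that passed the gate.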

outcome

Design doc saved at /Users/mikeferrara/Documents/code/lounge/docs/authors-optimization/20-proximity-search.md with all decisions documented.

next steps

  • Implement the design per the doc: migration for visible_author_count, daily cron, Livewire property changes, pivot query rewrite, caching, UI radius control
  • Review with team to confirm decisions (especially 1500-user cap, city-only centers, visible_author_count freshness)
  • Write tests for proximity query and cache behavior
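For the caching step, the candidate-list cache from the summary (LRU with 128 entries, 30-minute TTL, keyed by a hash of the location set) might behave like the sketch below. It is written in Python purely to pin down the intended semantics and to give the cache tests something concrete to assert against. The class name, the SHA-256 key scheme, and the injectable clock are assumptions, and the actual implementation would presumably sit on the application's own cache store.

```python
import hashlib
import time
from collections import OrderedDict


def location_set_key(location_ids):
    """Order-independent cache key for a set of location ids (hypothetical scheme)."""
    canonical = ",".join(str(i) for i in sorted(set(location_ids)))
    return hashlib.sha256(canonical.encode()).hexdigest()


class TTLLRUCache:
    """LRU cache with a per-entry time-to-live.

    Defaults mirror the summary: 128 entries, 30 minutes.
    The clock is injectable so tests can advance time without sleeping.
    """

    def __init__(self, max_entries=128, ttl_seconds=30 * 60, clock=time.monotonic):
        self.max_entries = max_entries
        self.ttl_seconds = ttl_seconds
        self.clock = clock
        self._entries = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        item = self._entries.get(key)
        if item is None:
            return None
        expires_at, value = item
        if self.clock() >= expires_at:
            # Expired entries are dropped lazily on read.
            del self._entries[key]
            return None
        self._entries.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        self._entries[key] = (self.clock() + self.ttl_seconds, value)
        self._entries.move_to_end(key)
        while len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)  # evict least recently used
```

Hashing the sorted, de-duplicated id list means two requests that resolve to the same nearby set share one entry regardless of the order in which the locations were found.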

key decisions

  • City-centric centers only (state/country centroids rejected as semantically meaningless)
  • Radius is opt-in: default 0 (exact match), with discrete values (10/25/50/100 mi) set via the query string
  • Nearby-set capped at 64 nearest locations after quality gate to keep pivot-first IN query performant
  • Offset pagination used for proximity branch (cannot use cursor pagination due to unstable ordering from variable location set)
  • 1500-user cap (63 pages) to bound worst-case cost for metro areas
  • Two caches: candidate-list LRU-128 (keyed by location set hash, TTL 30min) and page-level TTL (15min)
  • New column visible_author_count on locations, updated daily via cron, used in quality gate
  • ST_Distance_Sphere POINT argument order (lng, lat) documented to avoid foot-gun
  • GROUP BY reddit_user_id with MAX() aggregation to dedup users who match through overlapping locations from multiple cities
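The dedup and capped offset pagination decisions above can be illustrated with a small Python sketch. It is not the pivot query itself: the row shape `(reddit_user_id, location_id, sort_value)` and the `sort_value` column are hypothetical, since the summary does not say what MAX() aggregates. The page size of 24 is also an assumption, inferred only from the stated arithmetic (1500 users over 63 pages).

```python
import math

USER_CAP = 1500   # from the summary
PAGE_SIZE = 24    # assumption: ceil(1500 / 24) == 63 matches the stated page count


def dedup_users(pivot_rows):
    """Collapse pivot rows to one entry per user, keeping the max sort value.

    Mirrors GROUP BY reddit_user_id with a MAX() aggregate. Rows are
    (reddit_user_id, location_id, sort_value); sort_value is hypothetical.
    """
    best = {}
    for user_id, _location_id, sort_value in pivot_rows:
        if user_id not in best or sort_value > best[user_id]:
            best[user_id] = sort_value
    # Deterministic order: highest value first, user id as tie-breaker.
    return sorted(best.items(), key=lambda kv: (-kv[1], kv[0]))


def page_count(total_users, page_size=PAGE_SIZE, cap=USER_CAP):
    """Number of pages exposed once the user cap is applied."""
    return math.ceil(min(total_users, cap) / page_size)


def get_page(ranked_users, page_number, page_size=PAGE_SIZE, cap=USER_CAP):
    """Offset pagination over the capped result; page_number starts at 1."""
    capped = ranked_users[:cap]
    start = (page_number - 1) * page_size
    return capped[start:start + page_size]
```

With these assumptions a metro area matching 5,000 users still exposes only 63 pages, and the last page holds the remaining 12 users (1500 minus 62 full pages of 24).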

open questions

  • 1500-user cap may surprise users near huge metros – is this a product call to accept or increase?
  • visible_author_count daily freshness: acceptable for a browse gate, or need tighter refresh for correctness?
  • City-only centers drop ~11% of quality geocoded locations (2,409 of 21,819 have coords and >=5 authors but no city_id) – are those region-level junk as assumed, or valid centers?

6 days ago