
Sharon Chen sychen23

@sychen23
sychen23 / intent_enrichment_query_templates.sql
Created May 20, 2026 17:43
Intent metadata enrichment — query templates (SQL for UNKNOWN rows, disagree rows, per-source flips, v1 vs v2 rates)
-- ============================================================
-- Intent Metadata Enrichment — Query Templates
-- Sample table: scratch.schen.intent_metadata_sample_n8000
-- Result tables: scratch.schen.intent_pred_ft_9qtt_{none,bio,ai_summary,attachments,bio_v2,ai_summary_v2,attachments_v2}_simple_n8000
-- Warehouse: 6144901064699dd6 (aaa_default, PRO)
-- ============================================================
-- ------------------------------------------------------------
-- Template 1: All rows where any of the 7 labels = UNKNOWN
-- (Used to build /tmp/unk_rows2.csv / the "UNKNOWN-bearing" gist)
sychen23 / ai_summary_nonunk_flips.csv
Created May 20, 2026 17:41
Intent metadata enrichment — business_ai_summary non-UNK label flips (n=8000, full text, 2026-05-20)
post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,author_bio,business_ai_summary,attachment,baseline,bio_v1,bio_v2,ai_summary_v1,ai_summary_v2,attachments_v1,attachments_v2
474573865,true,False,Need mulch and make new beds just Text or call 914 2828043,,"A garden bed with shrubs and a tree, partially cleared soil, and surrounding grass in a residential area.",,,"Peña Lawn Maintenance LLC earns strong word-of-mouth for delivering neat, professional results that transform lawns and backyards. Clients highlight that they “did an amazing job cleaning up” outdoor spaces, leaving everything “so neat and professional.” The team is described as very reliable, easy to communicate with, and handling every project with great professionalism, making them a trusted choice for ongoing lawn service. Reviewers emphasize attention to detail and a meticulous approach that “exceeded my expectations,” even from customers who “set the bar high for a job well done.” Several neighbors use Peña Lawn Maintenance LLC a
sychen23 / unk_rows2_full.csv
Created May 19, 2026 23:59
Intent metadata enrichment — all 381 UNKNOWN-bearing rows (full text, n=8000, 2026-05-19)
post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,author_bio,business_ai_summary,attachment,distinct_labels,baseline,bio_v1,bio_v2,ai_summary_v1,ai_summary_v2,attachments_v1,attachments_v2
474500375,true,False,,,,,,,,1,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474501291,false,False,Good night may God bless you.,,,,I'm a single mother and grandmother with a disability that's an activist. I don't want no relationship.,,,2,UNKNOWN_INTENT_TYPE,NEIGHBORHOOD_PRIDE,NEIGHBORHOOD_PRIDE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502036,false,False,Any single ladies in this group?,,,,New to the neighborhood!,,,2,UNKNOWN_INTENT_TYPE,CONNECT_WITH_A_NEIGHBOR,CONNECT_WITH_A_NEIGHBOR,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502461,false,False,Alone and need a boost off quickly?,"Text me at 9012908290 for help, day or night anytim
sychen23 / unk_rows2_disagree_full.csv
Created May 19, 2026 23:59
Intent metadata enrichment — 157 disagreeing rows (full text, n=8000, 2026-05-19)
post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,author_bio,business_ai_summary,attachment,distinct_labels,baseline,bio_v1,bio_v2,ai_summary_v1,ai_summary_v2,attachments_v1,attachments_v2
474501291,false,False,Good night may God bless you.,,,,I'm a single mother and grandmother with a disability that's an activist. I don't want no relationship.,,,2,UNKNOWN_INTENT_TYPE,NEIGHBORHOOD_PRIDE,NEIGHBORHOOD_PRIDE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502036,false,False,Any single ladies in this group?,,,,New to the neighborhood!,,,2,UNKNOWN_INTENT_TYPE,CONNECT_WITH_A_NEIGHBOR,CONNECT_WITH_A_NEIGHBOR,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502795,true,False,Come pick it up,,,TY OF SAN ANTONIO,"Hey! Just been residing in SA for 2yrs. I haven’t gone out much. Need new hotspots! Also looking to buy or sell & trade things! If theres any way we can help each other, Im all for it!",,,3,UNKNOWN_INTENT_TYPE,EXCHANGE_GOOD
sychen23 / unk_rows2_disagree.csv
Created May 19, 2026 23:22
Intent metadata — rows where labels disagree (n8000 sample, 2026-05-19)
post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,author_bio,business_ai_summary,attachment,distinct_labels,baseline,bio_v1,bio_v2,ai_summary_v1,ai_summary_v2,attachments_v1,attachments_v2
474501291,false,,Good night may God bless you.,,,,I'm a single mother and grandmother with a disability that's an activist. I don't want no relationship.,,,2,UNKNOWN_INTENT_TYPE,NEIGHBORHOOD_PRIDE,NEIGHBORHOOD_PRIDE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502036,false,,Any single ladies in this group?,,,,New to the neighborhood!,,,2,UNKNOWN_INTENT_TYPE,CONNECT_WITH_A_NEIGHBOR,CONNECT_WITH_A_NEIGHBOR,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502795,true,,Come pick it up,,,TY OF SAN ANTONIO,Hey! Just been residing in SA for 2yrs. I haven’t gone out much. Need new hotspots! Also looking to buy or sell & trade things! If theres any way we …,,,3,UNKNOWN_INTENT_TYPE,EXCHANGE_GOODS,"L,H",UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNK
sychen23 / unk_rows2.csv
Created May 19, 2026 23:09
Intent metadata enrichment — UNKNOWN-bearing rows (n8000 sample, lag+fsf+intro filtered, 2026-05-19)
post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,author_bio,business_ai_summary,attachment,distinct_labels,baseline,bio_v1,bio_v2,ai_summary_v1,ai_summary_v2,attachments_v1,attachments_v2
474500375,true,,,,,,,,,1,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474501291,false,,Good night may God bless you.,,,,I'm a single mother and grandmother with a disability that's an activist. I don't want no relationship.,,,2,UNKNOWN_INTENT_TYPE,NEIGHBORHOOD_PRIDE,NEIGHBORHOOD_PRIDE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502036,false,,Any single ladies in this group?,,,,New to the neighborhood!,,,2,UNKNOWN_INTENT_TYPE,CONNECT_WITH_A_NEIGHBOR,CONNECT_WITH_A_NEIGHBOR,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE
474502461,false,,Alone and need a boost off quickly?,"Text me at 9012908290 for help, day or night anytime :)",,,,,,1,UNKNOWN
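
The CSV previews above share an 18-column header with seven label columns (`baseline` through `attachments_v2`). The following is a minimal Python sketch of how such rows could be checked, assuming that `distinct_labels` counts the distinct values across those seven columns (this matches the rows shown, but the source does not state it). The helper names `load_rows` and `summarize_row` are hypothetical, and the sample uses two complete rows copied from the `unk_rows2.csv` preview.

```python
import csv
import io

# The seven label columns from the CSV header; the first is the baseline.
LABEL_COLUMNS = [
    "baseline", "bio_v1", "bio_v2",
    "ai_summary_v1", "ai_summary_v2",
    "attachments_v1", "attachments_v2",
]
UNKNOWN = "UNKNOWN_INTENT_TYPE"

HEADER = (
    "post_id,has_photo,is_fsf,subject,body,image_description,ocr_text,"
    "author_bio,business_ai_summary,attachment,distinct_labels,"
    + ",".join(LABEL_COLUMNS)
)

# Two complete rows taken from the unk_rows2.csv preview.
SAMPLE = "\n".join([
    HEADER,
    "474500375,true,,,,,,,,,1," + ",".join([UNKNOWN] * 7),
    "474502036,false,,Any single ladies in this group?,,,,"
    "New to the neighborhood!,,,2,"
    "UNKNOWN_INTENT_TYPE,CONNECT_WITH_A_NEIGHBOR,CONNECT_WITH_A_NEIGHBOR,"
    "UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE,"
    "UNKNOWN_INTENT_TYPE,UNKNOWN_INTENT_TYPE",
])


def load_rows(text):
    """Parse CSV text into dicts; the csv module handles quoted fields."""
    return list(csv.DictReader(io.StringIO(text)))


def summarize_row(row):
    """Recompute per-row label statistics from the seven label columns."""
    labels = [row[col] for col in LABEL_COLUMNS]
    return {
        "post_id": row["post_id"],
        # True when at least one of the seven labels is UNKNOWN.
        "has_unknown": UNKNOWN in labels,
        # Assumed definition of the distinct_labels column.
        "distinct_labels": len(set(labels)),
        # Enriched sources whose label differs from the baseline label.
        "flipped_sources": [
            col for col in LABEL_COLUMNS[1:] if row[col] != row["baseline"]
        ],
    }


if __name__ == "__main__":
    for row in load_rows(SAMPLE):
        print(summarize_row(row))
```

Rows whose `distinct_labels` value is greater than 1 under this definition would correspond to the "disagreeing rows" files, if the assumption holds.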