this post was submitted on 13 Jul 2024
TechTakes
An extra bit of labeling on your training data set really doesn't help you that much. LLMs already make up plausible-looking citations and website links (and other data types) that are complete garbage, even though their training data has valid citations and website links (and other data types). Labeling things as "fact" and forcing the LLM to output stuff with that "fact" label will get you output that looks (in terms of statistical structure) like valid labeled "facts" but has absolutely no guarantee of being true.
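
To make that concrete, here's a toy sketch. Big caveat: this is a word-level bigram model, nowhere near a real LLM, and the two training sentences and the `[FACT]` tag are made up for illustration. The point is only that a model which learns "what token tends to follow what" will happily recombine true, labeled training lines into new lines that carry the same label and the same shape but were never in the data.

```python
from collections import defaultdict

# Hypothetical toy training set: every line is true and carries a "[FACT]" label.
TRAINING = [
    "[FACT] Paris is the capital of France",
    "[FACT] Rome is the capital of Italy",
]
END = "<end>"  # sentinel marking the end of a training line


def train_bigrams(lines):
    """Record which tokens were seen directly after each token."""
    follows = defaultdict(set)
    for line in lines:
        tokens = line.split() + [END]
        for current, nxt in zip(tokens, tokens[1:]):
            follows[current].add(nxt)
    return follows


def enumerate_outputs(follows, start="[FACT]", max_tokens=12):
    """List every sentence the model can emit when forced to start with the label."""
    results = []
    stack = [[start]]
    while stack:
        tokens = stack.pop()
        if len(tokens) > max_tokens:  # guard against loops in larger corpora
            continue
        for nxt in sorted(follows.get(tokens[-1], ())):
            if nxt == END:
                results.append(" ".join(tokens))
            else:
                stack.append(tokens + [nxt])
    return sorted(results)


model = train_bigrams(TRAINING)
outputs = enumerate_outputs(model)
# Sentences the model can produce that never appeared in the training data.
unsupported = [s for s in outputs if s not in TRAINING]

for sentence in outputs:
    status = "in training data" if sentence in TRAINING else "NOT in training data"
    print(f"{sentence}  <- {status}")
```

Every line it can produce starts with `[FACT]` and is built only from transitions seen in true sentences, yet half of them (Paris as the capital of Italy, Rome as the capital of France) are wrong. The label gets copied along with the rest of the structure; nothing checks it against reality.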