The solution to AI lying is… wikipedia but worse, apparently

Jack Riddle[Any/All]@lemmy.dbzer0.com · 2 days ago

what text are you reading that has a 0% error rate?

as I said, the text has a 0% error rate about the contents of the text, which is what the LLM is summarising, and to which it adds it’s own error rate. Then you read that and add your error rate.

the question is can we make a system that has an error rate that is close to or lower than a person’s

can we???

could you read and summarize 75 novels with a 0% error rate?

why… would I want that? I read novels because I like reading novels? I also think that on summaries LLMs are especially bad, since there is no distinction between “important” and “unimportant” in the architecture. The point of a summary is to only get the important points, so it clashes.

provide a page reference to all of the passages written in iambic pentameter?

no LLM can do this. LLMs are notoriously bad at doing any analysis of this kind of style element because of their architecture. why would you pick this example

Meanwhile an LLM could produce a summary, with citations generated and tracked by non-AI systems, with an error rate comparable to a human (assuming the human was given a few months to work on the problem) in seconds.

I still have not seen any evidence for this, and it still does not adress the point that the summary would be pretty much unreadable

Jack Riddle[Any/All]@lemmy.dbzer0.com · 2 days ago

The study of this in academia

you are linking to an arxiv preprint. I do not know these researchers. there is nothing that indicates to me that this source is any more credible than a blog post.

has found that LLM hallucination rate can be dropped to almost nothing

where? It doesn’t seem to be in this preprint, which is mostly a history of RAG and mentions hallucinations only as a problem affecting certain types of RAG more than other types. It makes some relative claims about accuracy that suggest including irrelevant data might make models more accurate. It doesn’t mention anything about “hallucination rate being dropped to almost nothing”.

(less than a human)

you know what has a 0% hallucination rate about the contents of a text? the text

You can see in the images I posted that it both answered the question and also correctly cited the source which was the entire point of contention.

this is anecdotal evidence, and also not the only point of contention. Another point was, for example, that ai text is horrible to read. I don’t think RAG(or any other tacked-on tool they’ve been trying for the past few years) fixes that.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 2 days ago

see, the problem is that I am not going to be reading that text because I know it is unreliable and ai text makes my eyes glaze over, so I will be clicking on all those links until I find something that is reliable. On a search engine I can just click through every link or refine my search with something like site:reddit.com site:wikipedia.org or format:pdf or something similar. With a chatbot, I need to write out the entire question, look at the four or so links it provided and then reprompt it if it doesn’t contain what I’m looking for. I also get a limited amount of searches per day because I am not paying for a chatbot subscription. This is completely pointless to me.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 2 days ago

good basilisk save us all. so they built a script for their chatbot that allows it to purchase more chatbots? seems like a great use of money. Also, what’s with the insanely placed emdashes? did conway write this for him or has his brain been rotted so much that he writes like an LLM? large parts seem human-written at least…

Jack Riddle[Any/All]@lemmy.dbzer0.com · 3 days ago

the screenshot is very very clearly LLM generated right? This is so insanely stupid

Jack Riddle[Any/All]@lemmy.dbzer0.com · 3 days ago

hate it when you are working on a major featue for the next release but tim keeps continvoucly morging

Jack Riddle[Any/All]@lemmy.dbzer0.com · 3 days ago

so I fail to see why I should be using an LLM at all then. If I am going to the webpages anyway, why shouldn’t I just use startpage/searx/yacy/whatever?

Jack Riddle[Any/All]@lemmy.dbzer0.com · 3 days ago

so we are using the “regular search which has always given you garbage” and taking that garbage automatically to get summarised by the hallucinator and we are supposed to trust the output somehow?

Jack Riddle[Any/All]@lemmy.dbzer0.com · 3 days ago

unfortunately, julia has been adding “agentic code” to their codebase for a while now.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 4 days ago

bluemonday1984@awful.systems update: quokka reached out to me and apparently you had been banned on another instance for report abuse and that ban had synchronised to quokk.au. You should be unbanned now which means the next stubsack should federate again.

E: I do not know how tags work E2: why does that format to a mailto link?

Jack Riddle[Any/All]@lemmy.dbzer0.com · 5 days ago

OT: I tried switching away from my instance to quokk.au because it becomes harder and harder to justify being on a pro-ai instance, but it seems that your posts stopped federating recently. Which makes that a bit harder to do. Any clue why?

Jack Riddle[Any/All]@lemmy.dbzer0.com · 5 days ago

The bot shows the user flair of /0 users. The pirate icon means you said you had an interest in piracy during signup

Jack Riddle[Any/All]@lemmy.dbzer0.com · 6 days ago

it is some old formatting getting converted incorrectly I believe, though I don’t remember the details

Jack Riddle[Any/All]@lemmy.dbzer0.com · 7 days ago

microblog site ✅
funny image (meme) ✅

my checks are all green boss

Jack Riddle[Any/All]@lemmy.dbzer0.com · 7 days ago

what about xxy? What about people with female sex characteristics but xy chromosomes? This seems stupid.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 7 days ago

that is what BLM is saying, which is why the response “well, ALM” is bad. This implicitly says that BLM means “BLM more”.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 7 days ago

see, the problem with this is that no person on earth calls it “oat drink” or whatever. Oat milk is the accepted term. The etymology is only there to highlight how ridiculous the entire thing is.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 8 days ago

just don’t ask for gender? let the player either design a sprite or pick one and pick pronouns or just refer to the player with they/them. Sex is only a fact insofar as the collection of bodily features we commonly refer to as sex exist. the reference to it is still a social choice, and we could’ve just as easily picked another group of features, or none at all.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 9 days ago

it does smell really good tho

Jack Riddle[Any/All]@lemmy.dbzer0.com · 10 days ago

yes.

Jack Riddle[Any/All]@lemmy.dbzer0.com · 6 months ago

The solution to AI lying is… wikipedia but worse, apparently

Jack Riddle[Any/All]@lemmy.dbzer0.com · 7 months ago

Found a pretty good blog post on our friends in the wild

Jack Riddle[Any/All]

The solution to AI lying is… wikipedia but worse, apparently

The solution to AI lying is… wikipedia but worse, apparently

Found a pretty good blog post on our friends in the wild

Found a pretty good blog post on our friends in the wild