I have an idea. Have every single article or comment posted by a user scanned by an LLM. Prompt the LLM to identify logical fallacies in the post or comment. Post each user's logical fallacy count on a public scoreboard hosted on each federated instance. Then, each quarter, ban the top 10% of scorers whose fallacy ratio exceeds some reasonable good-faith threshold.
Pros: Everyone is judged by the same impartial standard.
Cons: 1) A fucking LLM has to burn coal for every stupid post we make. 2) The LLM is vulnerable to prompt injection/hijacking.
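
For what it's worth, the quarterly ban step could look roughly like the sketch below. It assumes the LLM scan has already produced per-user counts (the detection itself is not shown), that "fallacy ratio" means flagged fallacies per scanned post, and that 0.5 stands in for the good-faith threshold. `UserStats`, `quarterly_bans`, and both parameter defaults are made-up names and values for illustration, not anything an instance actually ships.

```python
from dataclasses import dataclass


@dataclass
class UserStats:
    name: str
    posts: int      # posts and comments scanned this quarter
    fallacies: int  # fallacies flagged by the LLM (assumed input, not computed here)

    @property
    def ratio(self) -> float:
        # Assumed definition: flagged fallacies per scanned post.
        return self.fallacies / self.posts if self.posts else 0.0


def quarterly_bans(users, top_percent=10, max_ratio=0.5):
    """Names in the top `top_percent` by fallacy ratio that also exceed `max_ratio`."""
    ranked = sorted(users, key=lambda u: u.ratio, reverse=True)
    # Integer ceiling of len * top_percent / 100, avoiding float rounding.
    cutoff = (len(ranked) * top_percent + 99) // 100
    return [u.name for u in ranked[:cutoff] if u.ratio > max_ratio]


# Ten hypothetical users with ratios 0.0, 0.1, ..., 0.9.
scoreboard = [UserStats(f"user{i}", posts=100, fallacies=i * 10) for i in range(10)]
print(quarterly_bans(scoreboard))
```

Both conditions have to hold here: being in the top 10% is not enough if the ratio is still under the threshold, so a quarter where everyone argues in good faith bans nobody.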