Defeating the cancer of imageboards through asynchronous LLM post quality moderation

45 replies
2 attachments
Started >30d ago
[AutoMod] action=keep confidence=0.98 | Meta-discussion about moderation system evaluation, referencing another post as an example for system improvement

>>10685
>Keep the model focused on style and structure rather than ideological content
Currently this is clearly not the case as we can see by https://cy-x.net/topic/reaper-llm-stress-testing-and-abuse-thread/766?page=1#p10694 , which is copy-pasted from your post but with an added mention of niggers in the end, going down from "action=keep confidence=0.95" to "action=queue confidence=0.99 | Contains a toxic comment towards a specific group".
Replies: >>10710 >>11797

[DE] [TOR]
[AutoMod] action=keep confidence=0.98 | Positive feedback on technical improvement with no personal attack or ideological content

>>10696
Looks like it's been fixed. I like it, personally

[US-TX]
[AutoMod] action=keep confidence=0.98 | Technical proposal for improving moderation system with context-specific adjustments

Will be updating the Reaper to consider the home forum that the thread belongs to for a more accurate judgement.
This way the model knows what kind of content belongs where, which helps it judge off-topic posts (food in TECH) without being told to lower its standards for any board.
This is defense-in-depth and will also help to clean up spam because they like posting ads in META whenever they come here
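A board-aware prompt like the one described could be sketched roughly as follows. The board names, descriptions, and the `build_prompt` helper are all illustrative assumptions; the Reaper's actual prompt is not shown in this thread:

```python
# Hypothetical sketch of board-aware prompting. BOARD_CONTEXT entries and the
# prompt wording are made up for illustration, not the Reaper's real config.
BOARD_CONTEXT = {
    "TECH": "Technology discussion: programming, hardware, networking.",
    "META": "Site meta-discussion: moderation, features, policy.",
}

def build_prompt(board: str, post: str) -> str:
    """Prepend the home board's topic so the model can judge off-topic posts."""
    ctx = BOARD_CONTEXT.get(board, "General discussion.")
    return (
        f"You are moderating a post on the {board} board.\n"
        f"Board topic: {ctx}\n"
        "Judge whether the post belongs on this board and adds value.\n"
        f"Post:\n{post}"
    )

prompt = build_prompt("TECH", "check out my new recipe for lasagna")
```

The idea is that a food post scores as off-topic in TECH purely from the injected board description, without lowering the quality bar anywhere.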

[AutoMod] action=queue confidence=0.99 | Direct racial slur without context or discussion

nigger

[PL]
[AutoMod] action=keep confidence=0.98 | Discusses potential moderation system adaptation with constructive feedback on its feasibility and cultural relevance, referencing 4chan's problematic content as a comparative example without targeting individuals

I'm a bit skeptical about the whole thing, simply because I've never seen something like it in action, but considering the issue at hand maybe it could be a nice option, so I welcome the addition. As others have already mentioned, maybe it could be tweaked so it adapts to the type of content, terms and culture of the site. But hey, as long as it does its job properly then I don't really mind that much. Context is important, regardless.

And yeah, I do agree using 4chins /g/ as a template for "bad posts" is a good idea. Technology discussion in that place sucks, you rarely get an actual good thread where people discuss a topic in good faith, most of the time it's just shitposting, off-topic garbage, anons calling each other slurs and spam. So anything that can help us avoid that is welcomed.
Attachments:
1766007793750.png (392.68 KB)

[AutoMod] action=keep confidence=0.98 | Technical question about the AutoMod system's underlying LLM architecture, on-topic and constructive

What LLM/model is the AutoMod system working on? I assume it's something like a 3B model

[RO]
[AutoMod] action=keep confidence=0.98 | Direct technical critique of moderation system's bias, referencing 'Powers That Be' as a meta-commentary on algorithmic fairness

yeah but since its an llm finetuned by the "Powers That Be", it can be overly negative toward otherwise fine and relatively high-quality posts just because they feature the nigger word i think
Replies: >>10746

[AutoMod] action=queue confidence=0.99 | Direct racial slur ('nigger') in a self-proclaimed attempt to provoke AutoMod, without constructive context or discussion

>>10745
Your post contained the nigger word and the AutoMod did not flag it. I don't think that'll be a problem lmao.
Let me see if I can get flagged for saying nigger in an otherwise constructive post. My post is great. It's necessary to verify whether the AI has truly been made to ignore mentions of hating niggers or calling somebody a nigger, as long as the post containing the nigger word is productive and provides some sort of value to the thread it's in.

[AutoMod] action=keep confidence=0.98 | Personal apology and brief expression of opinion within a meta-discussion context

Извините, что я вмешиваюсь, мне тоже хотелось бы высказать своё мнение. ("Sorry for interjecting, I would also like to voice my opinion.")

Replies: >>10861

[RU]
[AutoMod] action=keep confidence=0.98 | Polite interjection in Russian with constructive intent to contribute to the discussion

>>10757
This is an English imageforum.

[AutoMod] action=keep confidence=0.98 | Quoted offensive language from another user for context, not used to attack or provoke

https://cy-x.net/topic/how-much-matters-at-what-college-did-you-study-in/491?page=1#p10890

This post has been incorrectly flagged for queueing for containing the faggot word, which was actually quoted from another user.
Replies: >>10905

[AutoMod] action=keep confidence=0.98 | Offensive term quoted from another post solely to discuss a moderation misjudgment; not an aggressive quotation

>>10901
Resolved.

[AutoMod] action=keep confidence=0.98 | Critiques AutoMod system with technical and ethical concerns, framed as constructive feedback on moderation effectiveness

It's been a bit since this was rolled out, and even though it was temporary, seeing the LLM's response atop each post has been really informative in forming my opinion. I was really looking forward to this, because if it worked out it would've improved the general quality of this forum without harming genuine posters, kicking out any of those "brown hands typed this" fags.

Unfortunately, I don't think this is working well. I see a number of posts queued for things that shouldn't be a problem, from expressions of opinion to simply mentioning one slur in an overall good post. The LLM should be queuing posts that add nothing of value, aren't related to the topic, and are just insulting shitposts. Most of these things can probably be tweaked away, but at what cost? When the LLM does get the power to queue posts, I'd rather scroll through the 10th "brown hands typed this" than get my post queued simply for calling someone a nigger while explaining how what he said was stupid. A poster's way of expression should not be hindered as long as they back it up.

Another thing I have a problem with is that I believe the results are faked. I'm not talking about its responses, but its confidence score. I'm sure many of us have heard the stories of LLMs confidently spitting out completely fabricated data and people just falling for it. They do it all the time. Have you seen a single confidence score below 0.95? Being so confident about EVERY POST is kind of suspicious to me.

In conclusion, I personally think this method just isn't it. Maybe it can be adapted in another way to improve some other aspect of Cyberix, but using it to manage post quality would concern me greatly. I have no doubt we on staff would be able to resolve false flags, but waiting on a staff member to read through your post and approve it is annoying and would most likely push away good posters.

Still a lot better than what those soyjak idiots could've done though.
Replies: >>11215

[AutoMod] action=keep R:8 E:7 N:6 C:10 | Directly builds on the thread’s proposed multi-dimensional scoring system (relevance, effort, civility, novelty) and expands on the suggested approach from Post #10685.

>>11158
I recognize the same issues you do.

>but waiting on a staff member to read through your post and approve it is annoying and would most likely push away good posters.
The general consensus appears to be that the current Phase 1 behavior will likely be the final behavior of the system: it will only keep posts or send them to the report queue. This is a balanced arrangement, and escalating its privileges based on current results would be a dangerous decision.

I'd like to try one last time with this experiment though, based on the approach suggested in >>10685.

The new design is multi-dimensional. Instead of asking the model "how confident are you this is bad" it now rates four concrete observable properties:
- relevance (0-10): Does the post engage with the thread topic or quoted posts?
- effort (0-10): Is there actual substance? Reasoning, experience, information, argument?
- novelty (0-10): Does it add something new or is it pure repetition and bloat?
- civility (0-10): Is it a content-free attack? (Low civility alone never queues a post)

A post calling someone a nigger while making a coherent argument would score something like R:8 E:7 N:5 C:2 and the math keeps it. "Brown hands typed this" would score R:1 E:1 N:1 C:1 and gets queued. Your confidence score concern should disappear entirely because there is no longer a confidence score. There are instead specific dimensions you can read and contest individually.
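As a rough illustration of how the four dimensions might be combined into a keep/queue decision: the averaging and threshold below are guesses made for the sake of the sketch, not the system's actual rule. The one stated constraint (low civility alone never queues a post) is reflected by leaving civility out of the decision entirely:

```python
from dataclasses import dataclass

# Hypothetical aggregation of the four dimension scores. The threshold of 2
# and the simple average are illustrative assumptions, not the real math.

@dataclass
class Scores:
    relevance: int  # 0-10: engages with the thread topic or quoted posts?
    effort: int     # 0-10: actual substance, reasoning, information?
    novelty: int    # 0-10: adds something new, or pure repetition?
    civility: int   # 0-10: reported, but never decisive on its own

def decide(s: Scores) -> str:
    """Return 'keep' or 'queue'. Low civility alone never queues a post."""
    substance = (s.relevance + s.effort + s.novelty) / 3
    if substance < 2:   # content-free: off-topic, zero effort, pure repetition
        return "queue"
    return "keep"

decide(Scores(8, 7, 5, 2))  # coherent argument with a slur -> 'keep'
decide(Scores(1, 1, 1, 1))  # content-free attack          -> 'queue'
```

The point of the structure is that each dimension is individually contestable, unlike a single opaque confidence number.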

This has been deployed already and you will be able to see the new stamp format on new posts. I will make the system re-analyze the last 100 posts for comparison.
Replies: >>11797

[AutoMod] action=keep R:10 E:2 N:1 C:5 | The post directly references the thread's recent context (online casino topic) and ties to the OP's framing of 'spam' as a recurring issue, but lacks substantive content or unique insight.

>>11515
we are now going to blacklist your entire fucking language

[AutoMod] action=keep R:8 E:7 N:7 C:7 | The post directly addresses the thread's focus on moderation and LLM post quality, with a clear statement about using AI to generate content for detection purposes, which is relevant to the discussion of automated moderation systems.

>>10696
>>11215

Sorry to rain on your parades, but I used ChatGPT to write that comment, just to see how it would be detected.
Replies: >>11804

[RS]
[AutoMod] action=keep R:7 E:5 N:3 C:8 | The post directly addresses the thread's focus on detecting and moderating AI-generated content, with a clear statement of intent to test the system. It is not purely an attack, but a neutral observation of a test case.

>>11797
the one case where a synthetic effortpost is helpful

[AutoMod] action=keep R:7 E:6 N:3 C:10 | Critiques the LLM choice but does not directly engage with the thread's focus on asynchronous moderation effectiveness.

Can I ask why you opted for an LLM instead of a more specific ML model? If you want something that detects low quality posts you can train a recurrent neural network or something similarly capable of pattern recognition on text strings, then have it output a 'quality' score directly. That score would be representative of how similar the input is to your training data full of 'brown hands' type posts.

The ability to moderate/flag posts in this way existed long before the current AI hype.
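The classical approach the post describes can be sketched in a few lines. This uses a toy Naive-Bayes-style word-count scorer in pure Python rather than an RNN, for brevity; the training snippets are made up for illustration:

```python
# Minimal sketch of a pre-LLM quality scorer: rate a post by how much its
# vocabulary resembles known-good vs. known-bad posts. Toy data, toy model.
from collections import Counter
import math
import re

def tokens(text):
    return re.findall(r"[a-z']+", text.lower())

good = ["i benchmarked both schedulers and here are the numbers",
        "tested the patch on three kernels, results attached"]
bad  = ["brown hands typed this", "lol trash thread"]

good_counts = Counter(t for p in good for t in tokens(p))
bad_counts  = Counter(t for p in bad  for t in tokens(p))

def quality_score(post):
    """Log-odds that the post resembles the 'good' corpus (Laplace-smoothed)."""
    g_total, b_total = sum(good_counts.values()), sum(bad_counts.values())
    vocab = len(set(good_counts) | set(bad_counts))
    score = 0.0
    for t in tokens(post):
        pg = (good_counts[t] + 1) / (g_total + vocab)
        pb = (bad_counts[t] + 1) / (b_total + vocab)
        score += math.log(pg / pb)
    return score  # > 0 leans quality, < 0 leans low quality

quality_score("here are my benchmark numbers")  # positive
quality_score("brown hands typed this")         # negative
```

Unlike the LLM stamps, a model like this outputs a bare number with no stated reason, which is the trade-off discussed in the replies.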
Replies: >>12384

[DE]
[AutoMod] action=keep R:10 E:9 N:8 C:10 | Directly addresses the OP’s question about feedback mechanisms, offering a thoughtful justification for transparency in moderation scoring.

>>12382
Purely because it provided a reason for giving a post that kind of score.
I have some hope that letting a bad poster know why their post scored poorly and was queued for moderation would help them improve their conduct and make a better post next time. It provides constructive feedback that some people might legitimately benefit from.

[AutoMod] action=keep R:9 E:9 N:8 C:7 | The argument is coherent and directly addresses the thread's theme. The writing is sharp and effective in its critique. It flows well and makes a strong point.

I had the AutoMod earlier call my post 'rambling, but on-topic', which I find funny, but at the same time it goes to show that attempting to distill all posts into what an LLM believes to be quality posting would just result in all posts eventually converging to appeal to the AI, not posters. Which would defeat the point of a site like this one.

[US-FL]
[AutoMod] action=keep R:9 E:9 N:8 C:7 | The post is direct and addresses the core theme of moderation. It effectively argues for trusting AI judgment over arbitrary human moderation. It's a solid contribution to the thread.

Oh and the other thing: absolutely keep the LLM's prediction of what the post contains if you move forward with it, because having an AI judge a post and then delete it for poorly defined reasons is why Youtube is such a hellhole right now.

[US-FL]
