
Re: Maybe pruning?

Posted: Mon May 22, 2023 6:32 pm
by zompist
Done for now.

* Heavy-handed deletion of Ephemera posts on page 5 or more.
* Light deletion of posts past page 2.
* Trimmed down Random and Venting.

Other forums were slim pickings: many, many small-count threads, and probably no less interesting today than when they were made. I did get rid of some Bob threads.

Re: Maybe pruning?

Posted: Tue May 23, 2023 12:49 pm
by Zju
zompist wrote: Mon May 22, 2023 6:32 pm I did get rid of some Bob threads.
Oh no! Now we won't be able to gawk in awe at the World Famous Dinosaur Conlang Pakuni and Atlantean, Sister Language to Clingon TV Show Series Mark Orcand's Mythical Conlang!

(also, did we really switch boards five years ago? I swear it was two!)

Re: Maybe pruning?

Posted: Tue May 23, 2023 2:36 pm
by alice
Zju wrote: Tue May 23, 2023 12:49 pm
zompist wrote: Mon May 22, 2023 6:32 pm I did get rid of some Bob threads.
Oh no! Now we won't be able to gawk in awe at the World Famous Dinosaur Conlang Pakuni and Atlantean, Sister Language to Clingon TV Show Series Mark Orcand's Mythical Conlang!
I agree that the board is somehow indefinably diminished by their absence.

Re: Maybe pruning?

Posted: Wed May 24, 2023 4:01 pm
by Raphael
Serious question: which share of the total number of posts in the database got pruned? Asking because I'm curious about how much good this might do.

Re: Maybe pruning?

Posted: Wed May 24, 2023 4:16 pm
by zompist
Raphael wrote: Wed May 24, 2023 4:01 pm Serious question: which share of the total number of posts in the database got pruned? Asking because I'm curious about how much good this might do.
Something like 13%.

Re: Maybe pruning?

Posted: Mon May 29, 2023 2:20 am
by quinterbeck
The random thread seems to think I haven't read any posts since 2020... is this an effect of pruning??

Re: Maybe pruning?

Posted: Mon May 29, 2023 2:29 am
by bradrn
quinterbeck wrote: Mon May 29, 2023 2:20 am The random thread seems to think I haven't read any posts since 2020... is this an effect of pruning??
I suspect that every time zompist deletes an old post (or set of posts), it sends out an ‘unread’ notification for that time. I’ve been getting some of those notifications too. (I don’t mind, though.)

Re: Maybe pruning?

Posted: Mon May 29, 2023 2:36 am
by zompist
My guess is it stores the number of posts read (which would produce errors when posts are deleted), rather than a post ID (which wouldn't).
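To make the guess above concrete, here is a minimal sketch of the two bookkeeping schemes. It is a toy model only: the function names and post IDs are made up for illustration, and nothing here claims to describe how phpBB actually tracks read status.

```python
def unread_by_count(post_ids, read_count):
    """Position-based bookmark: treat the first `read_count` posts as read."""
    return post_ids[read_count:]


def unread_by_last_id(post_ids, last_read_id):
    """ID-based bookmark: treat every post with a higher ID as unread."""
    return [pid for pid in post_ids if pid > last_read_id]


# Hypothetical thread; the reader has seen the first four posts.
thread = [101, 102, 103, 104, 105, 106]
read_count, last_read_id = 4, 104

# Before pruning, both schemes agree on what is unread.
before_count = unread_by_count(thread, read_count)
before_id = unread_by_last_id(thread, last_read_id)

# Prune two old posts that the reader had already read.
pruned = [pid for pid in thread if pid not in {101, 102}]

# The stored count now points at the wrong position; the stored ID still works.
after_count = unread_by_count(pruned, read_count)
after_id = unread_by_last_id(pruned, last_read_id)

print(before_count, before_id)
print(after_count, after_id)
```

In this toy model the count-based bookmark drifts as soon as earlier posts disappear, while the ID-based one is unaffected. The direction of the drift here need not match what anyone actually saw on the board; the point is only that a stored count goes stale when posts are deleted.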

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 12:18 pm
by Raphael
Despite the pruning, there sometimes still seem to be problems.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 1:48 pm
by masako
I support the pruning.

While I feel like preserving much of the C&C/L&L stuff would be good, I think the majority of Ephemera could be gone and none of us would be really too sad.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 3:10 pm
by rotting bones
Raphael wrote: Thu Jun 29, 2023 12:18 pm Despite the pruning, there sometimes still seem to be problems.
rotting bones wrote: Wed Jun 28, 2023 6:40 pm
Raphael wrote: Mon Jun 26, 2023 12:37 pm 27 guests on the ZBB? Is there some kind of DDOS thing going on?
Maybe make the site inaccessible to robots?

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 4:15 pm
by rotting bones
rotting bones wrote: Thu Jun 29, 2023 4:01 pm
Travis B. wrote: Thu Jun 29, 2023 3:54 pm Malicious robots tend to not pay attention to robots.txt.

Edit: and captchas are annoying as all hell.
Yeah, I was about to suggest IP verification by CAPTCHA for guests before I saw your edit.

I guess there's a tradeoff. If there's a serious threat to the site's stability, an annoying, one-time CAPTCHA for guests is the only alternative I can think of right now. It could be as simple as a one-digit arithmetic problem. If you don't want to deal with that, and you want to stay anonymous, you could log in as a hidden user.
First Google search result: https://www.imperva.com/blog/9-recommen ... r-website/

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 5:15 pm
by zompist
That's addressing business websites, from a company that wants to sell you security products. We need advice specific to phpBB boards.

I don't even know if logs are kept, or for what. If someone wants to research and tell me exactly what to look for, I'll be happy to. I don't think guesses about robots are much better than my guesses about pruning. :(

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 5:23 pm
by rotting bones
zompist wrote: Thu Jun 29, 2023 5:15 pm I don't even know if logs are kept, or for what. If someone wants to research and tell me exactly what to look for, I'll be happy to.
I'll try eventually, but I don't know anything about phpBB. If things get bad, I'm not against pruning, but Ephemera is the entirety of my social life these days. That might not be a good thing for me.
zompist wrote: Thu Jun 29, 2023 5:15 pm I don't think guesses about robots are much better than my guesses about pruning. :(
The number of guests goes up to over 50 and then 100, and the site crashes with an error like "too many requests". Not a certain result, but suggestive.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 9:27 pm
by rotting bones
zompist wrote: Thu Jun 29, 2023 5:15 pm That's addressing business websites, from a company that wants to sell you security products. We need advice specific to phpBB boards.

I don't even know if logs are kept, or for what. If someone wants to research and tell me exactly what to look for, I'll be happy to. I don't think guesses about robots are much better than my guesses about pruning. :(
I asked ChatGPT Code Interpreter what it thinks: https://chat.openai.com/share/5934e9ce- ... 43322930f5

ChatGPT is sometimes muddled on the details, but it often gives a strong starting line of attack. You can also ask follow-up questions.

PS. Also ChatGPT thinks phpBB comes with CAPTCHA plugins: https://chat.openai.com/share/f7015f3f- ... af9ff99560

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 10:29 pm
by zompist
rotting, though I love you like a brother, I am not going to make changes to a live bulletin board based on the output of a glorified Markov chain.

Do you know how LLMs work? They produce plausible-sounding text based on existing web pages. They do not have any idea of verification or accuracy. (They do not have any ideas at all.)

It's likely that it's echoing actual web pages. In which case those web pages might be worth looking at.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 10:44 pm
by rotting bones
zompist wrote: Thu Jun 29, 2023 10:29 pm rotting, though I love you like a brother, I am not going to make changes to a live bulletin board based on the output of a glorified Markov chain.

Do you know how LLMs work? They produce plausible-sounding text based on existing web pages. They do not have any idea of verification or accuracy. (They do not have any ideas at all.)

It's likely that it's echoing actual web pages. In which case those web pages might be worth looking at.
Please don't. What ChatGPT suggests is very close to the KIND OF THING that needs to be done. Based on the direct output, it makes sense to open the control panels it suggests and see what's there. For example, using the last answer in my first link, you can try opening the log. But before making any changes, we should at least Google the specific changes that are suggested. That's what I'm doing now. For example, the "persistent" bit is probably wrong: https://www.phpbb.com/support/docs/en/3 ... gphp-file/ I could try skimming the manual if you tell me the version of phpBB you're using. The utility of ChatGPT here is to direct my attention to the sections of the phpBB manual that are relevant to the problem. As long as you don't treat it as an infallible oracle, ChatGPT works very well. Hell, when you're facing a solid wall of problems and you don't know where to start, even a pair of dice is better than staring at the thing.

PS. If you like, it's analogous to the advice given to writers: It's easier to be an editor than a writer, so write garbage, and then edit it. ChatGPT's output is usually significantly better than garbage.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 10:57 pm
by bradrn
zompist wrote: Thu Jun 29, 2023 5:15 pm I don't even know if logs are kept, or for what.
The hosting company should almost certainly keep logs. I can’t remember which one you’re using now… at least for Dreamhost, viewing them seems to be pretty easy, though they only keep 3 days’ worth: https://help.dreamhost.com/hc/en-us/articles/216512197. Running tracepath brings up the name of Bluehost instead, which also keeps access logs.

That being said, there’s only so much information you can get out of logs. Access logs can tell you where all the visitors are coming from — which might tell you if there’s a particular range of IP addresses you can block, but otherwise isn’t too helpful. Maybe I’ll try installing phpBB again and seeing what options there are around CAPTCHAs and suchlike.
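As a rough illustration of the "particular range of IP addresses" idea, here is a sketch that tallies requests per /24 network from access-log lines. It assumes the host writes logs in the common Apache-style format where each line starts with the client address; that is an assumption, not something confirmed for this board's host. The sample lines, the addresses (reserved documentation ranges), and the function name are all made up.

```python
import ipaddress
import re
from collections import Counter

# Assumption: each log line begins with the client IP, as in Apache's
# Common/Combined Log Format. Other formats would need a different pattern.
LINE_RE = re.compile(r"^(\S+) ")


def requests_per_network(lines, prefix_len=24):
    """Count requests per IPv4 network of the given prefix length."""
    counts = Counter()
    for line in lines:
        match = LINE_RE.match(line)
        if not match:
            continue
        try:
            addr = ipaddress.ip_address(match.group(1))
        except ValueError:
            continue  # hostname or malformed field; skip it
        if addr.version != 4:
            continue  # keep the sketch simple: IPv4 only
        net = ipaddress.ip_network(f"{addr}/{prefix_len}", strict=False)
        counts[str(net)] += 1
    return counts


# Hypothetical sample lines, not real log data.
sample_log = [
    '203.0.113.7 - - [29/Jun/2023:17:00:01 +0000] "GET /viewtopic.php?t=1 HTTP/1.1" 200 5120',
    '203.0.113.9 - - [29/Jun/2023:17:00:01 +0000] "GET /viewtopic.php?t=2 HTTP/1.1" 200 4096',
    '203.0.113.200 - - [29/Jun/2023:17:00:02 +0000] "GET /viewtopic.php?t=3 HTTP/1.1" 200 4096',
    '198.51.100.4 - - [29/Jun/2023:17:00:03 +0000] "GET /index.php HTTP/1.1" 200 2048',
    'garbage line',
]

counts = requests_per_network(sample_log)
for network, hits in counts.most_common():
    print(network, hits)
```

A network that dominates the tally during a guest spike would be a candidate for a closer look. A flat distribution would suggest that blocking by range won't help much, which matches the caveat above about how little logs can tell you.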

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 11:09 pm
by bradrn
bradrn wrote: Thu Jun 29, 2023 10:57 pm Maybe I’ll try installing phpBB again and seeing what options there are around CAPTCHAs and suchlike.
OK, got this working. At a glance, they do have a section on ‘Spambot countermeasures’: they seem to mostly be CAPTCHAs of various types. However, the focus is on blocking spam registrations and logins, rather than on blocking viewers. For the latter, I suspect you’ll need to configure things on the hosting provider.

Re: Maybe pruning?

Posted: Thu Jun 29, 2023 11:25 pm
by rotting bones
bradrn wrote: Thu Jun 29, 2023 10:57 pm The hosting company should almost certainly keep logs. I can’t remember which one you’re using now… at least for Dreamhost, viewing them seems to be pretty easy, though they only keep 3 days’ worth: https://help.dreamhost.com/hc/en-us/articles/216512197. Running tracepath brings up the name of Bluehost instead, which also keeps access logs.
Right, if this site isn't self-hosted, I don't know what will be visible from the outside.
bradrn wrote: Thu Jun 29, 2023 11:09 pm OK, got this working. At a glance, they do have a section on ‘Spambot countermeasures’: they seem to mostly be CAPTCHAs of various types. However, the focus is on blocking spam registrations and logins, rather than on blocking viewers. For the latter, I suspect you’ll need to configure things on the hosting provider.
When I Googled this earlier, I found a dizzying array of CAPTCHA-related extensions. IIRC a lot of them change the CAPTCHA type.