Pabappa wrote: ↑Thu Jan 23, 2020 9:06 pm
I wouldnt think the passwords would be recoverable, since if they were, people who host phpBB boards could use them to hack their own users' accounts on other sites. I havent looked into it, but I do remember one phpBB board admin who hacked the administrator of a rival phpBB board, and to do it he had to set up a false registration process where the password was sent unencrypted. (This worked because the rival board owner just so happened to use the same password for both boards.) I wouldnt think it would be necessary for the perpetrator to do this if it were possible to just unhash the passwords in the database. But again, I havent looked into this.
bradrn wrote: ↑Thu Jan 23, 2020 9:09 pm
Even if the passwords are unrecoverable, I still think that it would be better to obliterate them, just in case there is some method to recover them.
yes, I don't know how phpBB stores its passwords but if they're unsalted you can use rainbow tables. wow that sounds fake
Posts from the old board
-
- Posts: 1660
- Joined: Sun Jul 15, 2018 3:29 am
Re: Posts from the old board
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
Re: Posts from the old board
Pabappa wrote: ↑Thu Jan 23, 2020 9:06 pm
I wouldnt think the passwords would be recoverable, since if they were, people who host phpBB boards could use them to hack their own users' accounts on other sites. I havent looked into it, but I do remember one phpBB board admin who hacked the administrator of a rival phpBB board, and to do it he had to set up a false registration process where the password was sent unencrypted. (This worked because the rival board owner just so happened to use the same password for both boards.) I wouldnt think it would be necessary for the perpetrator to do this if it were possible to just unhash the passwords in the database. But again, I havent looked into this.
bradrn wrote: ↑Thu Jan 23, 2020 9:09 pm
Even if the passwords are unrecoverable, I still think that it would be better to obliterate them, just in case there is some method to recover them.
Nortaneous wrote: ↑Fri Jan 24, 2020 3:38 pm
yes, I don't know how phpBB stores its passwords but if they're unsalted you can use rainbow tables. wow that sounds fake
If they're MD5-hashed you can directly break them with ease, that's how broken MD5 is.
Yaaludinuya siima d'at yiseka wohadetafa gaare.
Ennadinut'a gaare d'ate eetatadi siiman.
T'awraa t'awraa t'awraa t'awraa t'awraa t'awraa t'awraa.
Re: Posts from the old board
only 602 bytes got me a search function in PHP that I'm sure would work as well as anyone would need it to ... it might over-report results, but that's better than the opposite. Unless MySQL is encrypted/compressed and therefore low-level search functions like grep won't work?
or maybe speed is the problem ... as you said, searching a 300 MB database might take a thousand times as long as searching 300K worth of short text files, and even if the traffic to the site was fairly small it could use up a lot of processor time.
-
- Posts: 1660
- Joined: Sun Jul 15, 2018 3:29 am
Re: Posts from the old board
i do not think it is wise to try to search a sql database with grep
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
Re: Posts from the old board
How does one search a SQL database then?
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
-
- Posts: 1660
- Joined: Sun Jul 15, 2018 3:29 am
Re: Posts from the old board
with Structured Query Language
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
Re: Posts from the old board
Good point. I’m horribly unfamiliar with SQL, so I wasn’t sure whether it was possible to use it to do a text search across multiple records.
(Of course, I don’t actually know if phpBB stores posts in a database. It may well turn out that it stores it in a form which is amenable to a simple text search! But I presume that others here know more than me about this topic.)
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
-
- Posts: 1660
- Joined: Sun Jul 15, 2018 3:29 am
Re: Posts from the old board
bradrn wrote: ↑Thu Feb 06, 2020 1:17 am
Good point. I’m horribly unfamiliar with SQL, so I wasn’t sure whether it was possible to use it to do a text search across multiple records.
(Of course, I don’t actually know if phpBB stores posts in a database. It may well turn out that it stores it in a form which is amenable to a simple text search! But I presume that others here know more than me about this topic.)
phpBB uses a SQL database, yes. Some SQL databases provide the tools for relatively simple optimized text search -- see here. For others, you'd have to either use LIKE '%foo%' or build the term vectors yourself.
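For anyone unfamiliar with SQL, the LIKE '%foo%' approach mentioned in this post can be sketched roughly as below. This is only an illustration using Python's built-in sqlite3 module, not phpBB's own code: the `posts` table, its columns, the sample rows and the `search_posts` helper are all made up for the example.

```python
import sqlite3

# Hypothetical schema: one `posts` table standing in for the forum's post storage.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE posts (post_id INTEGER PRIMARY KEY, post_text TEXT)")
conn.executemany(
    "INSERT INTO posts (post_id, post_text) VALUES (?, ?)",
    [
        (1, "Ergativity is fun to explain."),
        (2, "My conlang has seven vowels."),
        (3, "Another conlang sketch, with split ergativity."),
    ],
)


def search_posts(term):
    """Return (post_id, post_text) rows whose text contains `term` as a substring."""
    # The term is bound as a parameter rather than pasted into the SQL string.
    # Caveat: '%' and '_' inside `term` still act as LIKE wildcards in this sketch.
    pattern = f"%{term}%"
    cur = conn.execute(
        "SELECT post_id, post_text FROM posts "
        "WHERE post_text LIKE ? ORDER BY post_id",
        (pattern,),
    )
    return cur.fetchall()


matches = search_posts("ergativity")
```

A query like this has to scan every row, which is why the databases with dedicated full-text search (or a hand-built term index) scale better on a large board.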
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
-
- Posts: 1307
- Joined: Mon Jul 09, 2018 4:19 pm
Re: Posts from the old board
I would like to write this post with more detail, but in the little time I have, I'll say:
1. I'm in favour of publishing the database, with personal information pruned. Then multiple people could try to make working frontends.
2. By personal information I mean emails, passwords, and also very importantly, private messages.
3. Passwords from public database hashes are generally recoverable. A good algorithm (e.g. Bcrypt, which is what I think phpBB uses) with good salting (long salts, and a different salt for each password) simply makes the recovery a lot slower, which is great because it buys you time. I don't know what configuration phpBB 3.0 uses, but a couple of years ago someone who I trust told me a password protected with PHP's classic default Bcrypt setup (good salting with a cost parameter of 10) may take three good Amazon Web Services servers around six months or so to be sure of cracking one password (normally very impractical for a hacker who wants to attack many users, but practical if they're aiming at one specific person; and governments of course can do a lot better).
4. I was once fairly familiar with phpBB's database structure because of a custom modification I once made. I'm happy to say it's not that hard to figure out.
5. I wouldn't call phpBB's search data in the database laughable, precisely because the data provides wonderful savings in processing & time when searching text. It is useful data to keep co$$$ts down while providing fast searches, in ways that MySQL's or Postgres's search functions are not. The data keeps costs down because these days processing is more expensive than storage.
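To make point 3 a bit more concrete, here is a rough sketch of why unsalted hashes are easy to look up and why per-password salts plus a slow algorithm help. It uses only Python's standard library; PBKDF2 stands in for Bcrypt here (an assumption for the sake of a self-contained example), the password list is invented, and the lookup dictionary is a much simpler thing than a real rainbow table.

```python
import hashlib
import os


def unsalted_md5(password):
    """Fast, unsalted hash: the same password always gives the same digest."""
    return hashlib.md5(password.encode("utf-8")).hexdigest()


def salted_hash(password, salt, iterations=100_000):
    """Slow, salted hash. PBKDF2-HMAC-SHA256 is a stand-in for Bcrypt here."""
    return hashlib.pbkdf2_hmac(
        "sha256", password.encode("utf-8"), salt, iterations
    ).hex()


# A precomputed digest -> password table (a tiny stand-in for a rainbow table).
common_passwords = ["123456", "password", "letmein"]
lookup = {unsalted_md5(p): p for p in common_passwords}

# An unsalted hash taken from a leaked database is recovered by a dictionary lookup.
leaked_unsalted = unsalted_md5("letmein")
recovered = lookup.get(leaked_unsalted)

# With a random salt per user, the same password yields different hashes,
# so a single precomputed table no longer covers every account.
salt_a, salt_b = os.urandom(16), os.urandom(16)
hash_a = salted_hash("letmein", salt_a)
hash_b = salted_hash("letmein", salt_b)
```

The salt does not make a single guess any harder; it only forces the attacker to redo the work per account, and the iteration count (like Bcrypt's cost parameter) is what makes each guess slow.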
-
- Posts: 1660
- Joined: Sun Jul 15, 2018 3:29 am
Re: Posts from the old board
If it gets published, better to pick tables to publish (and leave out everything else by default) than to pick tables *not* to publish -- if the table doesn't absolutely *need* to be published, it shouldn't be.
Ser wrote: ↑Thu Feb 06, 2020 9:13 am
5. I wouldn't call phpBB's search data in the database laughable, precisely because the data provides wonderful savings in processing & time when searching text. It is useful data to keep co$$$ts down while providing fast searches, in ways that MySQL's or Postgres's search functions are not. The data keeps costs down because these days processing is more expensive than storage.
Is Postgres's full-text search that bad?
Duaj teibohnggoe kyoe' quaqtoeq lucj lhaj k'yoejdej noeyn tucj.
K'yoejdaq fohm q'ujdoe duaj teibohnggoen dlehq lucj.
Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq. Teijp'vq.
-
- Posts: 1307
- Joined: Mon Jul 09, 2018 4:19 pm
Re: Posts from the old board
Nortaneous wrote: ↑Thu Feb 06, 2020 11:42 am
If it gets published, better to pick tables to publish (and leave out everything else by default) than to pick tables *not* to publish -- if the table doesn't absolutely *need* to be published, it shouldn't be.
Yes, I think you're right about that. In fact, I just remembered that phpBB saves the last ~10 distinct IP addresses or so from which a user has logged in, and that's personal information that should not be published. The tables with users' settings are also very much unnecessary (most people never even change the default settings anyway).
Nortaneous wrote:
Is Postgres's full-text search that bad?
Postgres's full-text search is a very decent thing, but it's weaker when the data is in multiple languages (because word normalization produces lots of false positives, unless carefully configured and applied on well-marked content, which is usually not the case on a phpBB public forum) or contains a lot of non-standard language (because that doesn't get normalized). There is always the option of doing almost no normalization at all (other than turning uppercase to lowercase and the like), but then you're trying to find posts with practically the same procedure that phpBB uses.
On the other hand, Postgres has that ranking thing out of the box which gives you the most important results depending on how many matches a post has, while phpBB just orders everything by the timestamp.
The extensive storage of word locations (for all attested words except the most common ones), which zompist finds a bit laughable, is referred to as a Generalized Inverted Index in Postgres's documentation, and is recommended for often-consulted columns. So both phpBB and Postgres use that, although it wouldn't surprise me if Postgres has a better implementation with less storage size with only a slight penalty in processing.
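For readers who haven't met the idea, an inverted index of the kind described above can be sketched in a few lines. This is a toy illustration, not phpBB's or Postgres's implementation: the function names and sample posts are invented, the tokenizer only lowercases and keeps ASCII letters, digits and apostrophes (so it would mangle most conlang text), and the ranking is a simple occurrence count rather than anything Postgres actually computes.

```python
import re
from collections import defaultdict


def tokenize(text):
    """Lowercase and split into words: the 'almost no normalization' option."""
    return re.findall(r"[a-z0-9']+", text.lower())


def build_index(posts):
    """Map each word to the set of post ids that contain it."""
    index = defaultdict(set)
    for post_id, text in posts.items():
        for word in tokenize(text):
            index[word].add(post_id)
    return index


def search(index, posts, query):
    """Return ids of posts containing every query word, most matches first."""
    words = tokenize(query)
    if not words:
        return []
    # Only the word -> posts mapping is consulted to find candidates.
    candidates = set.intersection(*(index.get(w, set()) for w in words))

    def score(post_id):
        tokens = tokenize(posts[post_id])
        return sum(tokens.count(w) for w in words)

    # Rank by number of occurrences; ties fall back to post id order.
    return sorted(candidates, key=lambda pid: (-score(pid), pid))


posts = {
    1: "Postgres has ranking. Ranking is nice.",
    2: "phpBB orders by timestamp, no ranking.",
    3: "Unrelated post about vowels.",
}
index = build_index(posts)
ranked = search(index, posts, "ranking")
```

The index costs extra storage, but a lookup touches only the entries for the query words instead of scanning every post, which is the processing-versus-storage trade-off discussed above.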
-
- Site Admin
- Posts: 2944
- Joined: Sun Jul 08, 2018 5:46 am
- Location: Right here, probably
Re: Posts from the old board
I mostly said it was laughable because it doesn't work well! That was a longstanding complaint about the old board, in fact. To actually find anything you mostly had to use Google...
-
- Posts: 1307
- Joined: Mon Jul 09, 2018 4:19 pm
Re: Posts from the old board
Oh, yes, I just realized phpBB must decay in long-running forums, because pruning threads doesn't update the search index in the database. lol
So... those 200000 posts throughout the decade and a half all counted to mark words as "too common in the forum". That's terrible. Another thing: it doesn't help much when querying phrases anyway.
Alright guys, I was wrong, and that data is not very useful in a long-running forum...
Re: Posts from the old board
Has there been any progress on this yet? It’s been three weeks since the last post.
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
-
- Posts: 1307
- Joined: Mon Jul 09, 2018 4:19 pm
Re: Posts from the old board
I am definitely very interested in recovering my own posts from the old ZBB, so I'll be spending some time looking into what can be done with a database dump or a rebuilt local copy of such a database. The task is to identify which columns of which tables have content we want to make public. As mentioned before, it is not as straightforward as it sounds, since a lot of the metadata is done through relations between different tables (such as a user ID number for the person who posted a post, and a thread ID number and a forum ID number the post is found in). An ideal proposal would document that.
If anyone else is interested in finding out how, here is a page where you can download phpBB 3.0.12 (the version that the old ZBB was using when the database dump was made):
https://download.phpbb.com/pub/release/3.0/3.0.12/
Then install phpBB locally, make extra subforums as an admin, create various users and make some posts with them. Then examine the database. Again, don't propose grabbing whole tables, as a fair bit of personal information is recorded for users in particular (email, hashed password, last few IPs they connected from, likely a number of other things).
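As a rough idea of what "pick columns, not whole tables" could look like, here is a sketch that joins posts to their author, thread and forum and exports only an explicit list of columns. It runs against a throwaway SQLite database from Python; the table and column names are my assumption of what a phpBB 3.0 schema looks like and should be checked against a real install, and the sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed, simplified schema; verify names against an actual phpBB 3.0.12 database.
conn.executescript("""
CREATE TABLE phpbb_users  (user_id INTEGER PRIMARY KEY, username TEXT,
                           user_email TEXT, user_password TEXT);
CREATE TABLE phpbb_forums (forum_id INTEGER PRIMARY KEY, forum_name TEXT);
CREATE TABLE phpbb_topics (topic_id INTEGER PRIMARY KEY, forum_id INTEGER,
                           topic_title TEXT);
CREATE TABLE phpbb_posts  (post_id INTEGER PRIMARY KEY, topic_id INTEGER,
                           forum_id INTEGER, poster_id INTEGER,
                           post_time INTEGER, post_text TEXT);
INSERT INTO phpbb_users  VALUES (2, 'alice', 'alice@example.com', 'hash-goes-here');
INSERT INTO phpbb_forums VALUES (1, 'Languages');
INSERT INTO phpbb_topics VALUES (10, 1, 'Test thread');
INSERT INTO phpbb_posts  VALUES (100, 10, 1, 2, 1583500000, 'Hello, old board!');
""")

# Allow-list: every exported column is named explicitly, so email addresses,
# password hashes and anything else not listed never leave the database.
PUBLIC_QUERY = """
SELECT p.post_id     AS post_id,
       f.forum_name  AS forum_name,
       t.topic_title AS topic_title,
       u.username    AS username,
       p.post_time   AS post_time,
       p.post_text   AS post_text
FROM phpbb_posts  AS p
JOIN phpbb_users  AS u ON u.user_id  = p.poster_id
JOIN phpbb_topics AS t ON t.topic_id = p.topic_id
JOIN phpbb_forums AS f ON f.forum_id = p.forum_id
ORDER BY p.post_id
"""

cur = conn.execute(PUBLIC_QUERY)
columns = [description[0] for description in cur.description]
rows = cur.fetchall()
```

Writing the export as a SELECT over named columns also doubles as documentation of which relations (user ID, thread ID, forum ID) were needed to rebuild a readable post.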
Re: Posts from the old board
Ser wrote: ↑Thu Mar 05, 2020 6:03 pm
I am definitely very interested in recovering my own posts from the old ZBB, so I'll be spending some time looking into what can be done with a database dump or a rebuilt local copy of such a database. The task is to identify which columns of which tables have content we want to make public. As mentioned before, it is not as straightforward as it sounds, since a lot of the metadata is done through relations between different tables (such as a user ID number for the person who posted a post, and a thread ID number and a forum ID number the post is found in). An ideal proposal would document that.
If anyone else is interested in finding out how, here is a page where you can download phpBB 3.0.12 (the version that the old ZBB was using when the database dump was made):
https://download.phpbb.com/pub/release/3.0/3.0.12/
Then install phpBB locally, make extra subforums as an admin, create various users and make some posts with them. Then examine the database. Again, don't propose grabbing whole tables, as a fair bit of personal information is recorded for users in particular (email, hashed password, last few IPs they connected from, likely a number of other things).
That sounds like something I’d be really interested in investigating as well! I don’t have too much free time right at this moment, but I certainly think I’ll be trying this at some time in the next couple of days.
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
Re: Posts from the old board
Hmm, phpBB is proving much more difficult to install than expected — I can’t get it to find MySQL. But in the process I did find this list of phpBB database tables, which could prove useful.
EDIT: I finally managed to get phpBB working! Like I expected, it was horribly painful (made worse by the fact that I’m using the same outdated version the old ZBB uses); unlike I expected, I actually managed to get it working in one night. I think that now I’ll make a small forum with a couple of subforums, users and posts, and then make a dump and post it here for further analysis by anyone interested.
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
Re: Posts from the old board
Alright, so I’ve managed to set up a small forum and make a backup of it using phpMyAdmin (with the default options)! Here’s a full description of the contents of the forum (collapsed for space):
More: show
Hopefully the above forum includes a fairly comprehensive subset of the phpBB features used in the old ZBB; if it’s missing anything, please let me know, and I would be happy to add a couple more posts which use those features. The backup file is named test-backup.sql.tar (because phpBB won’t let me attach SQL files; just remove the .tar at the end and it should be fine), and should be attached to this post.
- Attachments
-
- test-backup.sql.tar
- Not really a tar file! Please remove the .tar at the end before using, since this is actually a SQL file.
- (223.16 KiB) Downloaded 347 times
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
Re: Posts from the old board
Using the backup file I posted, I think I’ve managed to figure out how the phpBB backup is structured. Here’s a description of the most important parts (again collapsed for brevity):
More: show
__________
On the other hand, I was talking to someone I know who works in IT, and he suggested an alternative approach. As I understand it, the problem with restoring the backup of the old ZBB to this board was that the old version was too out of date (inferred from https://www.verduria.org/viewtopic.php?f=5&t=3); he suggested that zompist could download several versions of phpBB from 3.0.12 up to the current version (e.g. 3.0.12, 3.1.0, 3.1.6, 3.2.0, 3.2.8), and incrementally restore the backup to each one in turn. Then, once the database has been made compatible with the latest version, he could make the board read-only again and upload it; since the board would now be the latest version, this would prevent problems with PHP like the one which stopped the old board from working. Would there be any problems with this approach?
Conlangs: Scratchpad | Texts | antilanguage
Software: See http://bradrn.com/projects.html
Other: Ergativity for Novices
(Why does phpBB not let me add >5 links here?)
-
- Posts: 1307
- Joined: Mon Jul 09, 2018 4:19 pm
Re: Posts from the old board
bradrn wrote: ↑Fri Mar 06, 2020 7:57 pm
On the other hand, I was talking to someone I know who works in IT, and he suggested an alternative approach. As I understand it, the problem with restoring the backup of the old ZBB to this board was that the old version was too out of date (inferred from https://www.verduria.org/viewtopic.php?f=5&t=3);
I mentioned in this thread that the problem is that phpBB 3.0.12 relies on a "modifier" of a PHP function argument that was removed in PHP 7 due to security vulnerabilities (modifier 'e' of preg_replace). (Did you see no errors in your local install while using PHP 7? It'd be amusing if you didn't.) Then, in the post below that, I said that successfully updating phpBB should work, with a tip about mysqli configuration.
bradrn wrote:
he suggested that zompist could download several versions of phpBB from 3.0.12 up to the current version (e.g. 3.0.12, 3.1.0, 3.1.6, 3.2.0, 3.2.8), and incrementally restore the backup to each one in turn. Then, once the database has been made compatible with the latest version, he could make the board read-only again and upload it; since the board would now be the latest version, this would prevent problems with PHP like the one which stopped the old board from working. Would there be any problems with this approach?
I see none, but I don't know how difficult and/or error-prone updating phpBB forums to new versions is.
Thanks for your work so far on the database! I think that we could and should make the limited archive anyway even if incatena.org could be restored to a functional state.