BKC Forum - Death Throes


Re: BKC Forum - Death Throes

Postby Tsurugi » Sat Dec 14, 2013 5:53 pm

SurrealBrain wrote:The bugs aren't so bad to justify a shutdown. Keep it open.

Seconded, let's keep it going.
Tsurugi
BumbleNoble
 
Posts: 1804
Joined: Sun Jan 01, 2012 2:48 pm
Location: Galway

Re: BKC Forum - Death Throes

Postby LBD_Nytetrayn » Sat Dec 14, 2013 5:58 pm

Time will tell, given they seem to be getting progressively worse. The fact Ian is having the troubles he is already seems to be proving this forum unsustainable.

--LBD "Nytetrayn"
LBD_Nytetrayn
BumbleElite
 
Posts: 10987
Joined: Mon Dec 04, 2006 6:19 pm
Location: Balloon Fight Arena

Re: BKC Forum - Death Throes

Postby LilacDownDeep » Sun Dec 15, 2013 4:45 pm

And what will happen when it suddenly collapses? At least we have a warning now and the ability to arrange how to stay in contact with each other.
LilacDownDeep
BumbleNoble
 
Posts: 1675
Joined: Tue Nov 06, 2012 4:41 pm
Location: The Lone Star State

Re: BKC Forum - Death Throes

Postby theJcfreak » Sun Dec 15, 2013 5:26 pm

I'm hoping the forum holds out till after the Secret Santa at least. Also as LilacDownDeep has said, there is a topic to post contact information for when the forum does go down.
theJcfreak
BumbleChosen
 
Posts: 10652
Joined: Mon Apr 20, 2009 10:59 am
Location: ふざけんなよ。

Re: BKC Forum - Death Throes

Postby Scruffy » Sun Dec 15, 2013 10:40 pm

I hope we can keep soldiering on until a replacement forum can be set up.
Scruffy
BumbleClan
 
Posts: 4809
Joined: Mon Jun 27, 2011 2:42 pm
Location: Melbourne, Australia

Re: BKC Forum - Death Throes

Postby linebyline » Mon Dec 16, 2013 5:33 pm

Honestly, hard to say. As long as you're not losing any data, I say leave it open. But shut the sucker down the instant you hear a report that posts or other content have gone missing. Odds are, if something gets deleted, it ain't coming back. Unless you feel like dropping thousands of dollars to have someone run some data forensics and see what they can recover (and even that requires physical access to the hard drive).

I don't think a replacement forum will solve the underlying problem, which is with the MySQL database that powers the forum (or possibly even with the Web server itself, seeing as there are also FTP issues which have nothing to do with MySQL).

I hate to say it, knowing what a huge pain it was last time, but I think a new Web host would be more likely to resolve the issues than switching to new forums on the same host.

If I get time, I'll look into the possibility of putting together a script to copy from MySQL to SQLite.

I managed to find this so far: https://www.phpbb.com/kb/article/transf ... or-domain/ There are instructions for using PhpBB's admin CP to export a backup, but it apparently only works with smaller databases. (PHPMyAdmin has this limitation as well, though perhaps less so). Note that I have never heard of the mentioned BigDump program and can't vouch for it. But you might not need it.

Now that I think of it, depending on your host's settings, you might be able to get into your MySQL from your own PC. The bad news is the mysqldump utility I'd recommend using is a command-line tool, pretty far removed from the hand-holding WYSIWYG kind of program. There's also MySQL Workbench, but I'm not sure offhand what backup capabilities it offers.
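
For anyone who would rather try mysqldump than a GUI, here is a minimal sketch, not a tested recipe for this host. It only assembles the command line in Python. The hostname, username, database name, and output filename are placeholders (assumptions, not values from the forum's configuration). The flags used (`--host`, `--user`, `--password`, `--single-transaction`, `--result-file`) are standard mysqldump options.

```python
from __future__ import annotations

import subprocess
from pathlib import Path


def build_mysqldump_command(host: str, user: str, database: str, outfile: Path) -> list[str]:
    """Return the argument list for a single-database mysqldump run."""
    return [
        "mysqldump",
        f"--host={host}",
        f"--user={user}",
        "--password",            # no value given, so mysqldump prompts for it
        "--single-transaction",  # consistent snapshot; only helps for InnoDB tables
        f"--result-file={outfile}",
        database,
    ]


if __name__ == "__main__":
    # Hypothetical values; substitute the ones from the phpBB settings.
    cmd = build_mysqldump_command(
        "bumbleking.com", "forum_user", "forum_db", Path("bkc-backup.sql")
    )
    print(" ".join(cmd))
    # Uncomment to actually run it (requires mysqldump on the PATH):
    # subprocess.run(cmd, check=True)
```

Leaving `--password` without a value keeps the password out of the shell history, since mysqldump then asks for it interactively. Like Workbench, this only works if the host allows remote MySQL logins.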

I'll do some research and report back what else I find out.

Update: Yes, MySQL Workbench will let you back up the database. This might be the way to go. The program can be downloaded from http://dev.mysql.com/downloads/tools/workbench/ . Instructions are at http://forum.hostek.com/showthread.php? ... L-database but they're out of date. I'll see if I can get the program up and running myself in order to get some more accurate info.

Update some more: If your host will let you log into your MySQL database remotely, which it probably won't (but it's worth a try, right?), then you can use MySQL Workbench. Instructions spoiler-tagged for space reasons.

Spoiler: show
When you first fire up Workbench, you should see the Welcome to MySQL Workbench screen. You should see three columns. You want the one on the right. At the bottom, click New Server Instance. This will open up a wizard-type window. It will ask you for a TCP/IP address. Select "Remote Host" and in the Address field, enter the hostname for your database server.

You can find the hostname based on your PhpBB settings. If PhpBB is using localhost or 127.0.0.1, enter bumbleking.com as the Hostname. Otherwise, enter whatever hostname PhpBB is using. Click Next.

The next page of the wizard should say "Set the Database Connection Values." The first option will be Connection Name and will likely be preset to whatever hostname you entered on the last page. This is for your reference, not actual connection data, so name it something like BumbleDatabase or whatever you want.

You can leave Connection Method alone.

On the Parameters tab, you'll see the hostname again. Leave it and the Port as-is.

In the Username field, enter the username for your database. You can find this in your PhpBB settings. Leave everything else as-is. You don't need to worry about the Advanced tab. (It might not be a bad idea to go to the Advanced tab and select "Use SSL if available" though.) Click Next.

It will prompt you for a password. This can also be found in your PhpBB settings. It may be blanked out, but you can get around that in most browsers by using the developer tools. Let me know if you need instructions for that.

Once you enter your password, Workbench will test out the connection. This is the moment of truth: You'll soon know whether or not you're allowed to log into your MySQL server remotely. (I'm betting on not, but it's worth a try.) If you get an error that says "using password: NO" it just means it's a MySQL permissions issue. It didn't forget to send the password.

If it worked, you should see a bunch of blue check marks. Click next. Select "Do not use remote management." Click next again.

Server Instance Name is for your convenience; you can enter the same thing you used for Connection Name. Click Next. Or Finish. Or whatever. I forgot to write down what the button says. You're a smart guy; you can figure it out.

You should be back at the Welcome screen, and now the big white box in the right column should contain the server instance you just added. Double-click it. It will prompt you for your password again. You know what to do.

In the left-hand column (labeled "Task and Object Browser" because that's helpful) you should see a section called "DATA EXPORT/RESTORE" (BECAUSE ALL CAPS MAKE EVERYTHING BETTER I GUESS). Underneath that, there's an item called Data Export. Click on that to make the Data Export screen appear. Under "Select Database Objects to Export," you should see your database listed. (It will probably be the only one you can see; if not, check your PhpBB settings.) Put a checkmark in the box next to your database.

Toward the bottom of the screen, under Options, select "Export to Self-Contained File." Under that, browse to the path/filename where you want to save your backup. Click the Start Export button at the bottom to get started. You may need to enter your password again. MySQL really likes passwords.

The forum database is pretty big, so the backup will probably take a while.
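
Before walking through the whole wizard, it may be worth checking whether the MySQL port can be reached from outside at all. The sketch below is a rough probe, assuming the server listens on MySQL's default port 3306 (the steps above leave the Port field as-is). The function name is made up for illustration.

```python
import socket


def mysql_port_reachable(host: str, port: int = 3306, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and DNS failures alike.
        return False


# Example call (hostname taken from the instructions above):
# print(mysql_port_reachable("bumbleking.com"))
```

A True result only means the port is open. MySQL can still refuse the login because of its own permissions, which is the "using password: NO" case described above. A False result suggests the host blocks remote access and Workbench will not get through either.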


Of course a backup you didn't test is not much of a backup, but I'll leave it up to you whether you want to set up Apache, MySQL, and PHP on your machine, restore the database from the backup, and browse around to make sure it worked. It's not all that hard, but it takes some time and effort. Alternatively, you can open up the backup file in a text editor (it's just a plain text file) and read through it. This may take a while; it is, after all, every thread, post, and PM in the entire BumbleKingdom. You might want to skim.
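
As a middle ground between a full restore and skimming by eye, a short script can summarize what the backup file contains. This is a sketch under the assumption that the export is a mysqldump-style SQL file, with statements that start with CREATE TABLE or INSERT INTO followed by a backtick-quoted table name. The table names in the comments are only examples.

```python
from __future__ import annotations

import re
import sys
from collections import Counter
from typing import Iterable

CREATE_RE = re.compile(r"^CREATE TABLE `([^`]+)`")
INSERT_RE = re.compile(r"^INSERT INTO `([^`]+)`")


def summarize_dump(lines: Iterable[str]) -> tuple[list[str], Counter]:
    """Return (tables created, number of INSERT statements per table)."""
    tables: list[str] = []
    inserts: Counter = Counter()
    for line in lines:
        match = CREATE_RE.match(line)
        if match:
            tables.append(match.group(1))
            continue
        match = INSERT_RE.match(line)
        if match:
            inserts[match.group(1)] += 1
    return tables, inserts


if __name__ == "__main__":
    # Usage: python summarize_dump.py backup.sql
    if len(sys.argv) > 1:
        with open(sys.argv[1], encoding="utf-8", errors="replace") as handle:
            found_tables, found_inserts = summarize_dump(handle)
        for name in found_tables:
            print(f"{name}: {found_inserts[name]} INSERT statement(s)")
```

One INSERT statement can carry many rows, so the counts are not row counts. A table such as the posts table showing zero INSERT statements would still be a strong hint that the export was cut short.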

As usual, I can't provide a warranty. Follow my instructions at your own risk and all that jazz. I'd be highly surprised if you could hurt anything with this, but with the database acting flaky I can't swear to that.
Last edited by linebyline on Mon Dec 16, 2013 7:12 pm, edited 2 times in total.
linebyline
BumbleKnight
 
Posts: 927
Joined: Wed Dec 17, 2008 10:02 am

Re: BKC Forum - Death Throes

Postby Sunwalker » Mon Dec 16, 2013 6:47 pm

A little update on the backup that I am doing.

I found out that the computer crashed just a couple of hours after I left, thanks to the antivirus (which I have now uninstalled).

The crawling did not go very far, but in the end that turned out to be a good thing. I ended up with some duplicates because phpBB appends a session ID (SID) to the URLs seen by unregistered users, so Wget saved each page both with and without the SID. On top of that, phpBB also appends a SID to the style sheet links, which resulted in a lot of broken style sheet links, since Wget only saved the sheet without any SID (or at least it did not run long enough to save the others, which would have produced lots of duplicates anyway). Without a proper style sheet link, every page becomes a wall of plain, unformatted text on a blank background.

All of this could be avoided if the SID were stripped out of the URLs, but Wget provides no reliable way to do that. So my solution was to crawl the site as a registered user, so that no SID is appended to the URLs. And it is working! The pages look fine now. I also have remote access to the computer in order to check if everything is going smoothly.
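
For reference, stripping the SID is simple outside of Wget. The following is a minimal sketch using Python's standard library, assuming the session ID travels in a query parameter named `sid` (which is what phpBB uses). The example URLs are made up.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit


def strip_sid(url: str) -> str:
    """Return url with any `sid` query parameter removed."""
    parts = urlsplit(url)
    kept = [
        (key, value)
        for key, value in parse_qsl(parts.query, keep_blank_values=True)
        if key != "sid"
    ]
    return urlunsplit(parts._replace(query=urlencode(kept)))


if __name__ == "__main__":
    urls = [
        "http://example.com/viewtopic.php?f=2&t=10&sid=abc123",
        "http://example.com/viewtopic.php?f=2&t=10",
    ]
    # Both URLs collapse to the same address once the SID is gone.
    print({strip_sid(u) for u in urls})
```

A custom crawler could apply this to every link before queueing it. It could also be used after the fact to spot saved pages that differ only by their SID.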

Another thing: it seems I underestimated the time needed to crawl the forum. At the speed I am using now (a random delay of between 0.5 and 1.5 seconds between requests), it should take around five days from now.

I will keep you posted if anything out of the ordinary happens and when it finishes. ;)
Sunwalker
BumbleHonored
 
Posts: 657
Joined: Wed Feb 13, 2013 6:55 pm
Location: Brazil

Re: BKC Forum - Death Throes

Postby linebyline » Mon Dec 16, 2013 7:15 pm

Five days? Hmm. I'm somewhat surprised it isn't longer than that.

Also, 678th post. Because that kind of thing amuses me. :)
linebyline
BumbleKnight
 
Posts: 927
Joined: Wed Dec 17, 2008 10:02 am

Re: BKC Forum - Death Throes

Postby Sunwalker » Mon Dec 16, 2013 7:43 pm

linebyline wrote:Five days? Hmm. I'm somewhat surprised it isn't longer than that.

Also, 678th post. Because that kind of thing amuses me. :)

I estimated it by assuming one second per page. It saves one page per post plus each page of the thread (15 posts each). So I took the current number of posts and added 1/15 of that number to it. Then I multiplied the result by one second and converted it to days (the result was 4.6, but I rounded it up to 5).

Maybe it was not the most accurate of the estimations, but it should be able to give a good idea of the time range since posts and threads together are the majority of the pages. If the result were around five months, then I would be worried ;).

Edit: in the last 10 minutes it saved 225 pages. At this rate, my estimation is now of 13 days.
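
The arithmetic above can be written out as a small sketch. The function names are made up for illustration, and no real post count is plugged in, since the exact figure is not stated here. Only the 225 pages in 10 minutes comes from the edit above.

```python
SECONDS_PER_DAY = 86_400


def estimate_crawl_days(num_posts: int, seconds_per_page: float = 1.0,
                        posts_per_thread_page: int = 15) -> float:
    """One page per post plus one thread page per 15 posts, at a fixed time per page."""
    pages = num_posts + num_posts / posts_per_thread_page
    return pages * seconds_per_page / SECONDS_PER_DAY


def measured_seconds_per_page(pages_saved: int, minutes: float) -> float:
    """Average time per page observed over a sampling window."""
    return minutes * 60 / pages_saved


if __name__ == "__main__":
    observed = measured_seconds_per_page(225, 10)
    print(f"observed: {observed:.2f} seconds per page")
    # Rescale the rounded five-day figure, which assumed 1 second per page.
    print(f"rescaled: {5 * observed:.1f} days")
```

The observed rate works out to roughly 2.7 seconds per page instead of the assumed one second, which is where the jump from 5 days to about 13 days comes from.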
Sunwalker
BumbleHonored
 
Posts: 657
Joined: Wed Feb 13, 2013 6:55 pm
Location: Brazil

Re: BKC Forum - Death Throes

Postby SurrealBrain » Mon Dec 16, 2013 11:48 pm

Sunwalker wrote:Edit: in the last 10 minutes it saved 225 pages. At this rate, my estimation is now of 13 days.

Anything to preserve this old forum, I say. Let's just hope it doesn't come crashing down before you're done. Hopefully the most important things can be rescued, if nothing else.
SurrealBrain
BumbleClan
 
Posts: 3621
Joined: Thu Jan 24, 2013 11:48 pm
Location: Michigan

Re: BKC Forum - Death Throes

Postby Sunwalker » Mon Dec 23, 2013 10:34 am

Updated! (×2) (see edits further down)

New update for the backup (20/12/2013): at the time of this writing, it is going smoothly.

I have switched the program I am using for the backup: instead of Wget I am now using HTTrack. The latter has better filters for what to download, so for instance I can now exclude certain query strings from the URLs (these are the parameters that come after the question mark in the address). This avoids downloading A LOT of duplicates, which significantly reduces the download time. The majority of the duplicates would come from the individual post links that are next to the post dates. Instead of leading to a page with only the post, they lead to the thread page where the post is. As a result, each thread page would end up being downloaded 16 times. So I excluded the individual post links.
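
To illustrate the kind of rule involved, here is a sketch in Python, not the actual HTTrack filter. It assumes that individual post links are `viewtopic.php` URLs carrying a `p` (post ID) parameter, as is usual for phpBB, while ordinary thread pages use only `f`, `t`, and `start`. The function names and URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def is_individual_post_link(url: str) -> bool:
    """True for viewtopic.php links that carry a `p` (post id) parameter."""
    parts = urlsplit(url)
    if not parts.path.endswith("viewtopic.php"):
        return False
    return "p" in parse_qs(parts.query)


def should_download(url: str) -> bool:
    """Skip per-post links; they only duplicate the thread page they point to."""
    return not is_individual_post_link(url)


if __name__ == "__main__":
    for link in (
        "http://example.com/viewtopic.php?f=3&t=42&start=15",
        "http://example.com/viewtopic.php?f=3&t=42&p=1001#p1001",
    ):
        print(link, should_download(link))
```

With 15 posts per thread page, skipping these links removes 15 of the 16 copies of each thread page mentioned above.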

Another advantage of HTTrack is that it can resume the crawl if it is interrupted for whatever reason. The "continue" feature of Wget does not work well for recursive downloads of dynamic pages (or rather, it practically does not work: it ends up downloading all the dynamic pages again anyway).

I am also downloading all the image files needed to display the pages, even those hosted outside bumbleking.com. The reason is that the BumbleKreative Korner is an important part of the community and there is a lot of interesting artwork there. When the backup is finished, I will create two compressed files: one with everything saved (so, if needed, it will be possible to upload it somewhere and have a 1:1 copy of the site) and another without external media (in order to have a smaller file too, which will be easier to share if needed).

Just a remark: I am only backing up the content that was publicly posted. No user pages, no private messages etc. The content is no different from what a regular user could see by just browsing the forum.

Edit (23/12/2013): Instead of double posting, I am updating this post.

The crawling is mostly done! The good news is that nearly all forum content has been crawled. However, I ran into some issues while getting the images. I have already solved them, but let me tell you what happened:

When HTTrack is told to get the external content needed to display a page, and an image link returns an HTML page instead of an image, it crawls that external page. The classic example is an external image that was removed, so the image host returns an error page, or a poster who mistakenly used the wrong link to insert the image (i.e. the image page instead of the direct link). If HTTrack only fetched that one external page it would not be a big issue, but the problem is that it starts to crawl the entire site! As a result it ended up crawling the whole of DeviantArt, Gifsoup, ImageShack, several Wikia wikis, etc. I could not find a way to stop it from fetching external HTML pages while still getting external images, so my solution was to make it accept only external links that end with image extensions.

In the vast majority of cases, this stopped it from crawling external pages. But I still got two nasty loops:
1) Wiki image pages have URLs that end with an image extension even though they are HTML pages. So HTTrack, once on an image page, started crawling all the other image pages in the wiki.
2) I came across two misconfigured websites that respond to a missing www.example.com/images/file.jpg with a page that links to www.example.com/images/images/file.jpg. That page in turn links to www.example.com/images/images/images/file.jpg, and so on.

The solution for (1) was adding filters to exclude wiki image pages. It was easy because fortunately they always follow the same pattern (a URL ending with File:somename.extension or Image:somename.extension), while the images themselves follow a different pattern, so a correctly inserted image would not be excluded.

The solution to (2) was to filter out the offending domains. It was only two domains and they were used very few times, so not much was lost.
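
Put together, the rules described above amount to something like the following sketch. It is written in Python rather than HTTrack's filter syntax, and the extension list, the blocked domain, and the function names are all placeholders. The repeated-path-segment check is an extra idea of this sketch and is not what was actually done for (2), where the two domains were simply excluded.

```python
import re
from urllib.parse import unquote, urlsplit

IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp")  # assumed list
WIKI_IMAGE_PAGE = re.compile(r"(?:^|/)(?:File|Image):[^/]+$")
BLOCKED_DOMAINS = {"www.example.com"}  # stand-in for the two misconfigured sites


def has_repeated_segment(path: str) -> bool:
    """True if the same path segment occurs twice in a row (e.g. /images/images/)."""
    segments = [segment for segment in path.split("/") if segment]
    return any(a == b for a, b in zip(segments, segments[1:]))


def accept_external_link(url: str) -> bool:
    """Decide whether an external link should be fetched as an image."""
    parts = urlsplit(url)
    path = unquote(parts.path)
    if parts.hostname in BLOCKED_DOMAINS:
        return False
    if not path.lower().endswith(IMAGE_EXTENSIONS):
        return False  # only direct image links are accepted
    if WIKI_IMAGE_PAGE.search(path):
        return False  # wiki image *pages* are HTML despite the extension
    if has_repeated_segment(path):
        return False  # guards against the /images/images/... loop
    return True
```

The order matters little here, since every check only rejects. A link is fetched only if it survives all of them.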

Now the backup is running fine again and it should end very soon :)

Edit (24/12/2013):

The backup has been successfully completed! I am getting in touch with Ian.
Sunwalker
BumbleHonored
 
Posts: 657
Joined: Wed Feb 13, 2013 6:55 pm
Location: Brazil

Re: BKC Forum - Death Throes

Postby linebyline » Tue Dec 24, 2013 1:21 pm

I don't wear a hat but if I did it would be off to you! Thanks so much.
linebyline
BumbleKnight
 
Posts: 927
Joined: Wed Dec 17, 2008 10:02 am

Re: BKC Forum - Death Throes

Postby Ian Flynn » Tue Dec 24, 2013 10:55 pm

For all of his dedicated work to archiving the forum, Sunwalker has been gifted with the unique rank of "BumbleHonored."

Thanks again for all your help, Sunwalker!
Ian Flynn
BumbleKing
 
Posts: 5837
Joined: Thu Aug 04, 2005 12:11 pm

Re: BKC Forum - Death Throes

Postby NiTROACTiVE » Tue Dec 24, 2013 11:48 pm

Ian Flynn wrote:For all of his dedicated work to archiving the forum, Sunwalker has been gifted with the unique rank of "BumbleHonored."

Thanks again for all your help, Sunwalker!

It is indeed great that Sunwalker archived all of the forum posts, and he does deserve that unique rank you gave him.

However, it's still sad that all our post counts are most likely going to get nuked due to this forum's problems, because some of us came a long way to get to big post counts, especially JC.

Sorry if I'm going overboard with the post count issue, I'm just saying. That's all.
NiTROACTiVE
BumbleClan
 
Posts: 3712
Joined: Sun Mar 27, 2011 4:53 pm
Location: United States

Re: BKC Forum - Death Throes

Postby Sunwalker » Wed Dec 25, 2013 12:46 am

You are welcome! I am flattered :)

I already gave Ian the link to the files, and he said it is OK to post them here too.

Here is the link to the files:
https://www.dropbox.com/sh/6lh1atam3ble290/cmRqXKbv3Z
(you can also transfer the files to your Dropbox account, if you have one. I suggest bookmarking the link, since I intend to keep the files hosted indefinitely)

There are two files: one with only the content of bumbleking.com and the other with that content plus the external images posted. The backup is browsable (once uncompressed). If you are browsing the backup without external images, you will need an Internet connection in order to see those images; the file with the images, on the other hand, can be browsed without an Internet connection.

I used the 7z format to compress the files because it gives better compression ratios than zip or rar. Most file compression programs should be able to handle the 7z format, but if you are unable to open it you can download 7-Zip (free).

Some data on the files:
BKC - No external images.7z
Download size: 390 MB (compressed)
Uncompressed size: 2.52 GB
Number of files: 38,028

BKC - Full.7z
Download size: 4.28 GB (compressed)
Uncompressed size: 6.62 GB
Number of files: 62,076


How to browse: Decompressing a file creates a BKC folder. To browse the backup, just open the index.html file inside this folder. Note that at the top of each forum page there is the "current time". In the backup, just look at this time if you want to know when a certain page was saved. If you want to know the original address of a page, look at its source (Ctrl+U) and see the comment on the fourth line.
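
If you need the original address of many pages, reading the source by hand gets tedious. The sketch below pulls it out automatically. It assumes the comment has the usual HTTrack form, "Mirrored from ADDRESS by HTTrack Website Copier ...", so check one page by hand first. The function name is made up.

```python
import re
from pathlib import Path
from typing import Optional

MIRROR_COMMENT = re.compile(r"<!--\s*Mirrored from (\S+) by HTTrack")


def original_address(html: str) -> Optional[str]:
    """Return the address recorded in the mirror comment, or None if absent."""
    match = MIRROR_COMMENT.search(html)
    return match.group(1) if match else None


if __name__ == "__main__":
    page = Path("BKC") / "index.html"  # folder name as described above
    if page.exists():
        text = page.read_text(encoding="utf-8", errors="replace")
        print(original_address(text))
```

The function searches the whole page rather than only the fourth line, so it still works if the comment sits elsewhere in a given file.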

One more thing: I plan to update the backup every six months to a year.

Now I leave you guys with a very thought provoking image from the backed up content:
Spoiler: show
hc099.jpg

Best regards!

Edit (April 12th, 2015): There is an updated version of the backup, which can be downloaded from http://1drv.ms/1DAsNQq.
Last edited by Sunwalker on Sun Apr 12, 2015 8:10 pm, edited 1 time in total.
Sunwalker
BumbleHonored
 
Posts: 657
Joined: Wed Feb 13, 2013 6:55 pm
Location: Brazil
