I recently stepped down from my voluntary role as one of the Administrators for a popular skateboarding web forum (for old gits), due to time commitments (i.e. I couldn’t commit any). I thought I would share a few tips I picked up from that and other forums I’ve implemented. The forum in question uses phpBB, but I think some of my observations will be useful for users of other forum software.
Choose your hosting carefully
A high volume web forum requires more availability than your average website, as regular users accessing it around the clock can get pretty upset when the forum is running slow or timing out. Physical location of the hosting can make a difference too – the host of this particular forum seemed to be having routing problems, with the forum unavailable for days at a time for some people while fine for others. Users complained a lot, and people complained that they couldn’t get to the forum to complain, and that sometimes the forum timed out while they were complaining, resulting in them posting their complaints multiple times!
Over xmas, the (Windows) host was infected by a virus, which resulted in the server being rolled back several days, with no warning, to the point in time before it got infected. This meant that the forum lost all content posted in the meantime. I had my own nightly off-site backups (I’ll touch on this later), but because this happened over xmas and I wasn’t keeping an eye on the forum, people came onto the forum and posted before I had a chance to restore it from one of my backups. There wasn’t really much I could do without a lot of gruntwork to merge it all back together, as there would have been problems with duplicate IDs.
I don’t want this to be a “Linux vs Windows” argument, because a Linux server could also be hit by a virus and/or hacked, but I suspect – wrongly or rightly – that it’s much more likely to happen on a Windows host. Apparently this one was infected by a customer with FTP access.
Server resources are also important for a forum, particularly if the server is on shared hosting. The phpBB based forum I administered would frequently grind to a halt, with no real way of telling what the problem was without direct access to the server (as it is on shared hosting). The search index (MySQL) table for the forum is massive (i.e. hundreds of thousands of rows) and this is searched every time anyone hits the site, not just by people making searches, but to display things like “posts since last visit”. This also says something about the efficiency of the phpBB software – it would be interesting to compare the efficiency with other systems once the search index has reached this sort of size.
I didn’t choose the hosting for the forum in question; it was chosen by someone (non-techie) on the basis that they host several other (static) sites with the company and never had any problems. Running a web forum is a different ballgame to hosting static sites. People hardly ever complain when they can’t reach a static site, and a static site is less likely to go down because there are fewer failure points, and fewer resources needed.
Forums often have email notification functionality, i.e. people can choose to be notified when something is updated – this is also another point of failure on an inadequate host, such as this one.
I wasn’t going to name and shame the host, but they haven’t been too helpful in resolving the problems so I think in the interests of the public I would advise people not to use this company for a high volume web forum.
If your host turns out not to be suitable, you will need to move the forum somewhere else, which can upset forum users even more, as it inevitably takes a few days for DNS servers to settle down.
If it’s not essential, block the forum from search spiders
This won’t be suitable for every forum, but if you don’t want or need the forum content to be indexed by search engines, use robots.txt to exclude the forum directory. I did this after being hit by the infamous phpBB virus which used Google to search for phpBB based sites to take down. I reasoned that this alone would help hide it from repeat attacks. It has also proven fairly useful for hiding from spambots that use search engines to find victims, and makes the forum less attractive to those seeking to improve their PageRank by having links on the forum.
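As a rough sketch of what such an exclusion looks like, the snippet below holds a two-line robots.txt and checks it with Python’s standard `urllib.robotparser`. The `/forum/` directory, the `example.com` URLs and the `is_allowed` helper are my own assumptions for illustration, not details of the forum in question. Bear in mind that robots.txt only keeps out crawlers that choose to honour it.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: "/forum/" is an assumed directory name.
ROBOTS_TXT = """\
User-agent: *
Disallow: /forum/
"""


def is_allowed(robots_txt: str, url: str, agent: str = "Googlebot") -> bool:
    """Return True if a well-behaved crawler may fetch the URL."""
    parser = RobotFileParser()
    # parse() takes the file's lines, so no network request is made here.
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    # A forum page is excluded, the rest of the site is still crawlable.
    print(is_allowed(ROBOTS_TXT, "http://example.com/forum/viewtopic.php?t=1"))
    print(is_allowed(ROBOTS_TXT, "http://example.com/index.html"))
```

Checking the rules like this before uploading the file is a cheap way to make sure you haven’t hidden the whole site by accident.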
Remote backups
I mentioned before that the host server got hit by a virus resulting in the machine being rolled back several days – database backups stored on that machine would have been lost too. I used a MySQL client and a cron job running on a Linux machine to keep 30 days’ worth of remote backups in the form of date stamped SQL dump files. If I had managed to disable posting on the forum as soon as it had been rolled back, I could have restored remotely from one of the backups and only lost a few hours’ worth of content.
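For anyone wanting to set up something similar, here is a minimal sketch of the naming and 30 day pruning logic, which is not the script I actually ran. The `forum-` file prefix, the function names and the host and database arguments are all hypothetical, and the dump command assumes credentials are kept in a MySQL option file rather than on the command line.

```python
from datetime import date, timedelta

KEEP_DAYS = 30      # retention window: 30 days' worth of dumps
PREFIX = "forum-"   # hypothetical naming scheme for the dump files
SUFFIX = ".sql"


def dump_name(day: date) -> str:
    """Date stamped file name, e.g. forum-2007-08-07.sql."""
    return f"{PREFIX}{day.isoformat()}{SUFFIX}"


def dump_command(host: str, db_name: str) -> list:
    """Argument list for a remote dump (credentials assumed to be in an option file)."""
    return ["mysqldump", f"--host={host}", db_name]


def expired_dumps(names, today: date, keep_days: int = KEEP_DAYS) -> list:
    """Return the dump files that are keep_days old or older."""
    expired = []
    for name in names:
        if not (name.startswith(PREFIX) and name.endswith(SUFFIX)):
            continue  # not one of our dump files, leave it alone
        stamp = name[len(PREFIX):-len(SUFFIX)]
        try:
            day = date.fromisoformat(stamp)
        except ValueError:
            continue  # unexpected file name, leave it alone
        if today - day >= timedelta(days=keep_days):
            expired.append(name)
    return sorted(expired)


if __name__ == "__main__":
    files = [dump_name(date(2007, 8, 7)), dump_name(date(2007, 7, 8)), "notes.txt"]
    print(expired_dumps(files, today=date(2007, 8, 7)))
```

A nightly cron job would run the dump command, write the output to `dump_name(date.today())`, then delete whatever `expired_dumps` returns. Putting the date in the file name means the pruning doesn’t depend on file modification times surviving a copy.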
Don’t make people think
As the majority of the users of the forum are non-techies, they can have trouble with certain features such as posting images and links etc. The more features you give them, the more requests for help you will get. Only enable what you don’t mind supporting. A good example of this is enabling HTML in posts, then requiring people to select “disable HTML” for a post where having HTML could cause problems (e.g. the built in phpBB code that is usually used for formatting).
Avatar Size
One modification I did make was to use CSS to limit the area shown of an avatar. This was in response to people linking huge avatars and distorting the page, then either complaining, or not understanding when they were asked to limit the width of their linked (i.e. not hosted on the server) avatars. Since I made this very simple hack, people tend to understand what they need to do to make their avatar display properly. On systems where avatars are uploaded to the host and resized, this isn’t a problem.
Modification hinders upgrades
The more you modify a piece of forum software, the more difficult it is to upgrade and apply security patches.
Single sign-on and spam handling
On a separate forum where we (we = netsight) have integrated the sign-on (and sign-up) for an Invision Power Board (IPB) and a Plone site, we have just started having problems with spam, because the default Plone sign-on allows people/spambots to bypass the anti-spam measures and the user management features built into IPB. I should point out that anti-spambot mods are available for Plone.
Beware banning IP addresses, and especially ranges of IP addresses
Some users share blocks of IP addresses, e.g. people using a particular ISP. Banning an IP address or range of IP addresses can have the knock-on effect of blocking a whole load of innocent users. As people without fixed IP addresses get a different IP address virtually every time they log on, and spammers are well versed in spoofing and changing IP addresses, using IP addresses to block individuals is largely pointless, unless you can identify a persistent spammer who happens to have a fixed IP address.
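To get a feel for the collateral damage before banning a range, something like the following sketch can help. It uses Python’s standard `ipaddress` module; the `ban_footprint` helper and the addresses (taken from the ranges reserved for documentation) are made up for illustration.

```python
import ipaddress


def ban_footprint(ban: str, known_user_ips: list) -> tuple:
    """Return (number of addresses covered by the ban, known users caught by it)."""
    # strict=False accepts a host address with a mask, e.g. 203.0.113.7/24.
    network = ipaddress.ip_network(ban, strict=False)
    caught = [ip for ip in known_user_ips if ipaddress.ip_address(ip) in network]
    return network.num_addresses, caught


if __name__ == "__main__":
    # Hypothetical recent IPs of regular users, e.g. pulled from the forum logs.
    users = ["203.0.113.7", "203.0.113.200", "198.51.100.4"]
    print(ban_footprint("203.0.113.0/24", users))  # whole range
    print(ban_footprint("203.0.113.7", users))     # single address
```

Even a fairly small range covers hundreds of addresses, so it is worth checking which of your regulars sit inside it first.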
Spam, spam, spam
Spam is the biggest challenge in running a public web forum, and it is getting more difficult. Multiple guards (Captchas, email verification, JavaScript foo, concealed weapons and lie detectors) should be used where possible. The more popular your web forum software is, the more it will be targeted (but patches and mods will also appear quicker).
Anyway, just a few notes on my experiences. I’m keen to hear other tips on this subject and to hear tips from people using a recent version of PloneBoard, which I am about to start using for a commercial project.
(Or… er… work near Blagdon.)
Fintan 2007-08-07 10:53:37