
Fark hardware, for the morbidly curious
Posted by Mike at 2008-01-16 1:43:23 PM, edited 2008-01-16 2:36:06 PM (8 comments) | Permalink

I thought I'd throw a little bit of info out about what we do hardware-wise. Software's for later.

The history of the hardware we've used over the years is mostly boring and a little scary, so let's just skip all that and focus on what our setup looks like in January 2008:

[image from too old to be available]

OK, there was a burned out lighting fixture in the room, but wow that's a shiatty-ass photo -- now you know why I don't enter Farktography contests :)

Near the middle we have five identical web servers:

* 1U rackmount Supermicro 5015M-MT+ with PDSMI+ motherboard
* Core 2 Quad Q6600 (Kentsfield)
* 6 GB of ECC memory
* mirrored pair of hot-swappable SATA disks

One of these also does email, two of them also do primary DNS. Secondary DNS is offsite. Two Foundry Serverirons sit in front of these to do load balancing.

One development server:

* 1U rackmount Intel SR1530AHLX with S3000AHLX motherboard
* Core 2 Duo E6600 (Conroe)
* 6 GB of ECC memory
* mirrored pair of hot-swappable SATA disks

(this machine was donated to us -- very cool)

One database server:

* homebuilt in an Antec 4U rackmount case
* Supermicro PDSMA+ motherboard
* one Xeon X3220 (Kentsfield) (ok, this is really another Core 2 Quad Q6600, but, hey, same thing) :)
* 6 GB of ECC memory
* Adaptec SCSI RAID card
* five hot-swappable SCSI disks -- four in RAID 10, one hot spare

You might notice some trends there... chipset, clock speed, etc, disks/memory are all the same brands too... makes replacing dead parts easier :)

They all run 64-bit FreeBSD/amd64.

There are plans to switch the database box over to SAS disks soon.

The other gear in the rack is mostly Cisco and APC stuff, so I can whack sick servers into behaving from my house 70 miles away -- serial consoles, remote control power strips, etc.

All this crap takes up about half a rack, fed by two 120V 20A circuits and one fast ethernet into our firewall.
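
For the morbidly curious, here's the rough math on that power feed as a quick Python sketch. The 80% derating figure is an assumption (the usual rule of thumb for continuous loads), not a measurement of this rack, and the function names are made up for illustration.

```python
# Rough power budget for a rack fed by two 120V 20A circuits.
# Integer math only, so the results are exact.

CIRCUITS = 2
VOLTS = 120
AMPS = 20
DERATE_PERCENT = 80  # assumption: keep continuous load at 80% of breaker rating


def circuit_capacity_va(volts, amps):
    """Nameplate capacity of one circuit in volt-amps."""
    return volts * amps


def usable_budget_va(circuits, volts, amps, derate_percent):
    """Total capacity across all circuits after derating."""
    return circuits * circuit_capacity_va(volts, amps) * derate_percent // 100


total_va = CIRCUITS * circuit_capacity_va(VOLTS, AMPS)
usable_va = usable_budget_va(CIRCUITS, VOLTS, AMPS, DERATE_PERCENT)
print(total_va)   # 4800
print(usable_va)  # 3840
```

So under that assumption there's a bit under 4 kVA to play with across both circuits.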

Backups and monitoring are offsite.

Some of this might not match everyone's particular hardware fanboy religion, but, really, who cares, choice is a good thing, and this works pretty well for us, and this is turning into a hell of a run-on sentence... so yeah, there are probably better ways to do it, but there's no point in throwing it all out and buying all new stuff either. Besides, some of the gear we used to run with even as recently as a year ago was much scarier... :)

The one really big hardware issue is the database is a single point of failure, but a) our cost of downtime is very low and b) fixing that is much more of a software problem than a hardware problem (async replication = easy, sync replication = hard) and that's for a future blog entry...
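
To show why async is the easy one, here's a toy Python model. It is not how MySQL (or anything we run) actually does replication; the `Primary` and `Replica` classes are hypothetical and only illustrate when a write gets acknowledged relative to when the replica sees it.

```python
class Replica:
    """Toy replica: just remembers the rows it has been sent."""

    def __init__(self):
        self.rows = []

    def apply(self, row):
        self.rows.append(row)


class Primary:
    """Toy primary that acknowledges writes either before or after
    the replica has them, depending on the mode."""

    def __init__(self, replica, synchronous):
        self.rows = []
        self.replica = replica
        self.synchronous = synchronous
        self.pending = []  # async mode only: acknowledged but not yet shipped

    def write(self, row):
        self.rows.append(row)
        if self.synchronous:
            # Sync: the replica must have the row before we acknowledge,
            # so every write waits on the replica.
            self.replica.apply(row)
        else:
            # Async: acknowledge right away, ship the row later.
            self.pending.append(row)
        return "ack"

    def ship_pending(self):
        """Async mode: push queued rows to the replica."""
        while self.pending:
            self.replica.apply(self.pending.pop(0))


primary = Primary(Replica(), synchronous=False)
primary.write("comment 1")
primary.write("comment 2")
print(len(primary.replica.rows))  # 0: acknowledged, but the replica has nothing yet
primary.ship_pending()
print(len(primary.replica.rows))  # 2
```

If the async primary dies before `ship_pending()` runs, those acknowledged writes exist nowhere else. The sync version closes that window, but then every write is only as fast and as reliable as the replica, which is where the hard part starts.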
· · ·

Two brief scheduled outages this Tuesday
Posted by Mike at 2008-01-13 12:40:30 PM, edited 2008-01-16 2:36:57 AM (0 comments) | Permalink

We'll have two brief 5-10 min outages this Tuesday evening to install two new UPS's (one on each 20A circuit)

...assuming that they arrive on time and aren't DOA, of course. :)

[EDIT: All done. They work. The boxes they came in smelled like something died in them though.]
· · ·

Some tweaks and bugs tonight, plus new PDA stuff
Posted by Mike at 2008-01-07 10:28:38 PM, edited 2008-01-07 11:10:18 PM (3 comments) | Permalink

I'm making some changes in cookie and clickthru handling tonight. For a few minutes you might see, or have already seen, some weird behavior, like main page links going right back to Fark instead of to the intended site...

One of the points is to try to reduce the number of awkward double-logins required on Totalfark; if you go to the main page of Totalfark and then click comments, you have to log in a second time. Normally you only have to do the second login once (unless your browser's cookie handling is broken). One change going in is an attempt to avoid that second login by setting the cookie as you hit the main page of Totalfark. Because the main page is static and not a script, it can't do its own database lookups (or cookie checks), so, this was a little tricky.

(The main page being static is why we can't do per-user main page customizations -- for example, displaying counts of the number of new comments per link. The upside is there's a huge performance benefit to having our most-frequently-hit pages being static instead of dynamic scripts.)

Also, in an unrelated change, some of you may notice that we automatically direct you to the PDA version of Fark if you visit from a PDA or smartphone; you don't have to go directly to the /pda.html URL anymore. Also, the PDA version now allows posting. iPhones are exempt from the redirect as they can render the full site just fine.
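
A minimal sketch of that kind of redirect decision, in Python. This is not the actual Fark code: the token lists and the `wants_pda_version` name are assumptions for illustration, and real device detection needs a much longer list.

```python
# Hypothetical User-Agent substrings; a real list would be far longer.
MOBILE_TOKENS = ("windows ce", "palm", "blackberry", "symbian", "opera mini")
EXEMPT_TOKENS = ("iphone",)  # can render the full site, so no redirect


def wants_pda_version(user_agent):
    """Return True if this User-Agent should be sent to /pda.html."""
    ua = user_agent.lower()
    if any(token in ua for token in EXEMPT_TOKENS):
        return False
    return any(token in ua for token in MOBILE_TOKENS)


print(wants_pda_version("BlackBerry8100/4.2.0 Profile/MIDP-2.0"))  # True
print(wants_pda_version("Mozilla/5.0 (iPhone; U; CPU like Mac OS X; en)"))  # False
```

Checking the exempt list first matters: a device that matches both lists should still get the full site.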
· · ·

Latest on db issues
Posted by Mike at 2007-12-12 1:27:01 AM (1 comment) | Permalink

I did some emergency unannounced database maintenance just now to have another go at fixing some random table corruption that's been causing MySQL to crash randomly.

It's strange because the actual table files aren't getting corrupted (they're read-only tables and md5's of the files aren't changing!) so barring hardware problems, which I've mostly ruled out, it's gotta be the main innodb files getting hosed -- so I converted all tables to MyISAM, blew out the innodb files, and converted them back. To do that means running without transactions, which means weird things can happen (like comments getting the wrong person's name stuck on them) so I shut the site down as a precaution.
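
Roughly what that round trip looks like, sketched as a Python helper that just builds the SQL. `ALTER TABLE ... ENGINE=...` is standard MySQL syntax; the table names and the `engine_round_trip` function are hypothetical, and this isn't the exact procedure I ran.

```python
def engine_round_trip(tables):
    """Build the two batches of statements for an InnoDB -> MyISAM -> InnoDB
    conversion. Nothing is executed here; this only generates SQL text."""
    to_myisam = [f"ALTER TABLE `{t}` ENGINE=MyISAM;" for t in tables]
    to_innodb = [f"ALTER TABLE `{t}` ENGINE=InnoDB;" for t in tables]
    return to_myisam, to_innodb


# Hypothetical table names, for illustration only.
phase1, phase2 = engine_round_trip(["comments", "links"])

for stmt in phase1:
    print(stmt)
# Between the two phases: stop MySQL, move the old InnoDB data and log
# files out of the way, and start it again so it creates fresh ones.
for stmt in phase2:
    print(stmt)
```

The window between the two batches is the part that runs without transactions, which is why the site was shut down for it.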

If this doesn't fix it then I may have to take another look at hardware. However, I've already checked it out pretty thoroughly (no ECC problems with the memory, CPU temperature's fine, power supply seems OK, no apparent problems with the RAID controller or disks...), so after that it's looking for bugs in the OS, I guess.

New comment counters being wrong are an unrelated problem; I'm working on that as a separate issue. But the red line should always appear in the correct place and clicking the (sometimes wrong) number will always jump to the correct new comment anyway.
· · ·

New post counting may be off a bit
Posted by Mike at 2007-12-05 8:12:29 PM (7 comments) | Permalink

The red line marking the first new post in a thread may show up in the wrong place for a little bit. This should only happen to each user once per thread; once you go back in a second time (after more posts appear) it should then be in the right place. By later tonight, hopefully, it won't even happen the first time.
· · ·

Posted by Mike at 2007-12-04 10:17:37 PM (7 comments) | Permalink

Well, I did say "expected" outages and "foreseeable" future...

InnoDB decided to take a dump on itself earlier tonight, so I had to recover the entire database from a backup. Fortunately we have a replica set up for exactly this situation, so we didn't lose anything. Sucks being down for 3 hours but it could have been much, much worse.

Archives are still restoring; that should be done by midnight. Til then, any links older than 60 days probably won't work.
· · ·

One last bit of database hardware maintenance tonight
Posted by Mike at 2007-12-03 5:29:06 PM, edited 2007-12-03 5:31:03 PM (23 comments) | Permalink

There will be one last round of database downtime tonight at 9 pm Eastern Time. This time we'll be replacing the temporary motherboard we put in Friday with a permanent replacement.

It'll take about 30 minutes -- or probably less. Anyone who's built PCs knows how that goes. :) I've already got the CPU, fan, and RAM installed on the board, so it's just a matter of pulling the machine out of the rack, swapping boards, hooking SCSI/PATA/USB/power LED/power switch/reset switch cables back up, going through and changing all the BIOS settings, etc. Nothing major. The OS boot scripts have already been edited.

It's at 9 pm so I'm not driving 80 miles back home at the wee hours of the morning.

This is the last expected downtime for the foreseeable future. Last week's performance problems have been fixed.
· · ·

Db maintenance overnight
Posted by Mike at 2007-12-02 12:29:53 AM (20 comments) | Permalink

I'm upgrading the database server from a 32-bit OS to a 64-bit OS tonight starting at 2 am, so there'll be a few 5 to 15 minute outages during the necessary reboots.
· · ·

Performance problems
Posted by Mike at 2007-11-27 1:29:47 AM, edited 2007-11-30 4:29:22 PM (31 comments) | Permalink

Well, this hasn't been the best couple of days. Performance has been t3h suck today, due to a couple of separate issues combining and then causing others.

The short version is the images web servers got hammered by all the new images in user profiles and the submit script got hammered by a spammer all at the same time, which ran the boxes out of memory, which made them accept data from the database too slowly, which then ran the database box out of connection handles... so a nice little domino effect there.
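
The last domino is easy to see with some back-of-the-envelope math (Little's law: connections in use = request rate x time each connection is held). Here's a small Python sketch; every number in it is made up for illustration and none of it is measured Fark traffic.

```python
def connections_in_use(requests_per_sec, held_ms):
    """Average number of database connections open at once."""
    return requests_per_sec * held_ms / 1000


def is_exhausted(requests_per_sec, held_ms, max_connections):
    """True if steady-state demand exceeds the connection limit."""
    return connections_in_use(requests_per_sec, held_ms) > max_connections


MAX_CONNECTIONS = 150  # hypothetical limit on the database box

# Healthy web servers: each request holds its connection for 50 ms.
print(is_exhausted(200, 50, MAX_CONNECTIONS))    # False

# Swapping web servers: same traffic, but reading results takes 2 seconds.
print(is_exhausted(200, 2000, MAX_CONNECTIONS))  # True
```

Same request rate, same database, but a web server that reads its results 40 times slower needs 40 times as many connections open at once.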

Fixes so far:

- ordered more RAM, should be here by the end of the week

- removed the site logo images from the user profiles (left the topic tags in though)

- switched from Apache to lighttpd

- raised # of database connection handles; hopefully we'll be getting a new database box in January as well (after taxes)

- added about a half-dozen anti-spam features that I won't get into in case the spammers are reading this. More are forthcoming. (And if the stupid pigfarkers are reading this: submitting to Fark does nothing for your Google PageRank and will get you zero traffic, because it's not like we're going to greenlight it and it's not like Totalfarkers will click it, so why bother)

- fixed several slow SQL queries that were deadlocking other queries, and redesigned some table schemas to try to fix others

- made a config tweak on the load balancers

- recompiled database to be sure it's using the correct threading library

- replaced database server's motherboard with a slightly faster temporary one (not as fast as I'd like but it's all I had sitting around); this one's capable of 64-bit. An even faster 64-bit board has been ordered.

- fixed several more bad SQL queries, two of which I think were the root cause of all this crap. Not that all the other issues didn't also need fixing, they just weren't as apparent til now...

- still in progress: upgrading database server to a 64-bit OS and (once it arrives) upgrading its motherboard again

For more updates see the comments for this blog entry.


Side note: if you're using the FarkIt extension and you haven't upgraded to at least version 2.3f here, do it ASAP or you're likely to hit one of the anti-spam features.
· · ·

Spurious "are you a bot" messages?
Posted by Mike at 2007-11-26 12:42:24 AM (1 comment) | Permalink

If you're getting "are you a bot?" errors when trying to post or submit... just wait about 5-10 seconds and just click submit a second time and it should go through. But you shouldn't get this very often, if at all, anymore. It's a new anti-spam thingie and I'm still tweaking the values a bit.
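
I'm not going to describe how the real check works, but for illustration, here's one generic approach that behaves the way described above (rejected once, accepted a few seconds later): a minimum interval between accepted submissions per user. This Python sketch is an assumption, not the actual Fark code, and the class name and the 5 second value are hypothetical.

```python
class MinIntervalCheck:
    """Reject a submission if the same user had one accepted too recently."""

    def __init__(self, min_seconds):
        self.min_seconds = min_seconds
        self.last_accepted = {}  # user -> time of last accepted submission

    def allow(self, user, now):
        last = self.last_accepted.get(user)
        if last is not None and now - last < self.min_seconds:
            # Too soon. The timer is not reset, so simply waiting
            # and resubmitting will get through.
            return False
        self.last_accepted[user] = now
        return True


check = MinIntervalCheck(min_seconds=5)  # hypothetical value
print(check.allow("some_farker", now=0))   # True
print(check.allow("some_farker", now=3))   # False
print(check.allow("some_farker", now=10))  # True
```

Tweaking the values, in a scheme like this, just means adjusting `min_seconds` until real users rarely hit it.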
· · ·
