Hosting Provider Nightmare
Date Friday, March 29, 2024 - 06:26 AM PST
Topic This Website


I usually like to only post stories that everyone can relate to. I'm sure not a lot of site members have had to deal with this, but you might still be interested in how this stuff works, and how much shmeng you can get from dealing with Web Hosting Providers. Read on for the story of why this site was just down for almost a week.

The server that shmeng.com and darkness-embraced.com run on went down at 10pm last Friday night. I submitted a trouble ticket to the hosting provider to have them go reboot the box.  The response I got back said that the hard drive had crashed and that the machine would not boot.  Not what I wanted to hear.

I wrote back to them asking if they could ship the machine to me that afternoon, since Monday was a holiday and otherwise they probably wouldn't be able to send it until Tuesday. That wouldn't give me enough time to replace the drive and configure the box in time to send it back by the following weekend.

They wrote back with good news though. They said they could replace the drive for me either Saturday night or Sunday morning and they wouldn't need to ship it back.  Obviously this would be much quicker than shipping the server back and forth.  I agreed and wrote up some detailed instructions on how to transfer an entire UNIX system from one disk to another without having to reinstall and reconfigure the box.

I didn't hear back Saturday, but Sunday morning, I got a call from the tech. He seemed to understand my instructions so I thought it would go well.  I clarified a few things for him, and he went off to fix it.

I got a call back from him a few hours later saying that he couldn't copy the partitions from the old drive and asking what he should do.  I told him that if he couldn't copy the whole disk, or the partitions, he could just copy the files.  I told him the command to do it, and he went off to fix it.

I got a call back from him a few hours later saying that he couldn't copy the files, but that he had put the old drive back in my server and it booted up. I connected and poked around a little. It wasn't good enough to bring the websites back up, but I could at least run some backups.  I told him I was going to grab what I could off of it and that I'd get back in touch to let him know if he needed to do a fresh Linux install on the new drive.

After grabbing the really important files, I decided to try to make a disk image of the system partition. I figured since it had failed for him, it would probably fail for me, but it was worth a try.  It worked fine for me, with no errors or anything. I quickly transferred the disk image to another machine and submitted a trouble ticket explaining how to grab the image from the other machine and restore it to the new drive. Then I waited.
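For anyone curious what "making a disk image" amounts to, here is a minimal sketch in Python. It is only an illustration of the idea, not the command used in this story. The function name and the paths in the usage note are made up for the example, and it assumes the partition can be read from start to finish without errors.

```python
import hashlib


def image_partition(source_path, image_path, chunk_size=1024 * 1024):
    """Copy source_path to image_path byte for byte.

    Returns (bytes_copied, sha256_hex) so the image can be checked
    after it has been transferred to another machine.
    Hypothetical helper: assumes every block of the source is readable.
    """
    digest = hashlib.sha256()
    copied = 0
    with open(source_path, "rb") as src, open(image_path, "wb") as dst:
        while True:
            chunk = src.read(chunk_size)
            if not chunk:
                break  # end of the partition
            dst.write(chunk)
            digest.update(chunk)
            copied += len(chunk)
    return copied, digest.hexdigest()
```

On a Linux box this would be called with something like image_partition("/dev/sda1", "/backup/system.img") as root, where both paths are assumptions. Keeping the checksum around makes it easy to tell whether the image survived the trip to the other machine.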

Monday was a holiday so I waited some more (they weren't answering phones).

Tuesday morning, I waited.

I got sick of waiting before too long and called, since they weren't answering my trouble ticket.  I told the phone goon that my server had been down since Saturday and asked if someone was going to look at it.  He told me there was nobody qualified to do it who was working right then and that it would be 24-48 hours before someone could get to it.  24-48 hours is still faster than shipping stuff, so I figured I'd just wait.

Wednesday I waited.  At this point though, I started worrying a bit.

Wednesday night, I pulled apart one of my machines.  I hooked up an empty hard drive and restored the system partition image from the server. The command took 3 1/2 minutes to finish.  I installed a bootloader, and booted the machine from the new drive. It said it was the shmeng/darkness-embraced server and wanted to know where the website files were.  I humored it and restored the backups of the website and email partitions.  After restoring those, and restarting all of the services, it told me everything was working fine, but that it appeared to be on the wrong network.
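The restore step is just the same copy in the other direction. A rough sketch in Python, again purely illustrative and not the actual command: it writes an image onto an existing target and then reads the target back to confirm it matches. The function names are hypothetical, and it assumes the target is at least as large as the image.

```python
import hashlib
import os


def sha256_prefix(path, length, chunk_size=1024 * 1024):
    """Hash the first `length` bytes of the file or device at path."""
    digest = hashlib.sha256()
    remaining = length
    with open(path, "rb") as f:
        while remaining > 0:
            chunk = f.read(min(chunk_size, remaining))
            if not chunk:
                break  # target is shorter than expected
            digest.update(chunk)
            remaining -= len(chunk)
    return digest.hexdigest()


def restore_image(image_path, target_path, chunk_size=1024 * 1024):
    """Write image_path over the start of target_path.

    Returns (bytes_written, verified). The target must already exist,
    as a partition device does. The read-back may be served from the
    OS cache, so this is a sanity check, not proof the disk is good.
    """
    written = 0
    with open(image_path, "rb") as img, open(target_path, "r+b") as dst:
        while True:
            chunk = img.read(chunk_size)
            if not chunk:
                break
            dst.write(chunk)
            written += len(chunk)
        dst.flush()
        os.fsync(dst.fileno())  # push the data out before reading it back
    verified = sha256_prefix(image_path, written) == sha256_prefix(target_path, written)
    return written, verified
```

Note that a partition image restored this way contains the filesystem only. The bootloader lives elsewhere on the disk and has to be installed as a separate step.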

So in about 20 minutes, I managed to make a perfectly working copy of the server that was sane enough to wonder where it was when it woke up in a strange place. I managed to do all of this with the original crashed disk a few hundred miles away. This makes me wonder: why couldn't they do it with the original disk right in front of them?

So if I could get this disk into the server, it would just start right up and everything would work with no configuration or anything.  Unfortunately, the server is still a few hundred miles away, so the quickest option is still for the hosting people to just spend the time to restore the system partition (the 3 1/2 minute part) from the image file I made, and boot the machine from a bootdisk.  I can install the bootloader and restore all of the web and mail files from here.

Thursday morning their 24-48 hours were up, so I emailed them (not very nicely) asking if they were going to fix it or if I should just ship the configured drive. I got a note back asking how I wanted the new drive partitioned.  I replied, but by noon, nothing had happened.  I called and was told someone would be fixing it that day.

So I waited.

By 4pm, I still hadn't heard anything so I called.  Finally I was told that someone was downloading the disk image and was about to restore it.

At 4:30, I got a call saying that the 3 1/2 minute part was done, but that the disk would not boot - something must be wrong with the disk image.

"Did you install the bootloader?" I asked.

"I shouldn't have to, should I?" he replied.

"How do you expect it to boot then?" I asked.

So off he went to install the bootloader.  Half an hour later the machine was online and I could connect to it.  It took about 3 minutes to restore the mail and web partitions, plus a few minutes of poking around and making sure everything was OK, but it's finally working. No data was lost as far as I can tell.
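The reason a restored partition won't boot by itself is that, on a classic PC disk layout, the boot code sits in the first sector of the disk, outside any partition. Here is a small Python sketch of a sanity check for that. It assumes an old-style MBR disk with 512-byte sectors, the function name is made up, and an all-zero boot code area is only a hint that no bootloader was installed, not a definitive test.

```python
def inspect_mbr(disk_path):
    """Look at the first sector of an MBR-style disk.

    Returns (has_signature, has_boot_code):
      has_signature  - bytes 510-511 are 0x55 0xAA, the MBR boot signature
      has_boot_code  - the 446-byte boot code area is not all zeros
    Assumes a traditional MBR layout; other layouts need different checks.
    """
    with open(disk_path, "rb") as disk:
        sector = disk.read(512)
    if len(sector) < 512:
        raise ValueError("could not read a full 512-byte sector")
    has_signature = sector[510:512] == b"\x55\xaa"
    has_boot_code = any(sector[:446])
    return has_signature, has_boot_code
```

A freshly partitioned drive can have a valid signature and still have no boot code, which is one way to end up with a disk that holds a perfectly good system and refuses to start.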

The nastiest part of all of this? After teaching their tech 3 different ways to replace a hard drive without reinstalling the operating system or losing data, and after doing all of the hard work myself, they're still going to be charging ME money for all of this.  Probably quite a bit.



This article comes from Shmeng
http://www.shmeng.com/

The URL for this story is:
http://www.shmeng.com/modules.php?op=modload&name=News&file=article&sid=572