Confessions of a CPU and memory hog


I have heard of diners who eat so much at all-you-can-eat restaurants that the management throws them out. All you can eat apparently means as much as the restaurant thinks you should eat. Eat too much and they are losing money. Therefore, certain patrons find their butts unexpectedly on the cement sidewalk in front of the restaurant.

While I have never been thrown out of a restaurant for consuming too much at the buffet bar, I have been driving my web host a bit batty lately. This is because my blog is getting more popular. I have noticed unexpected downtimes. In particular, my MySQL database (which drives all the content you see here) has been churning through CPU cycles on the server, even though my tables are not huge and my content is well optimized. At least that is what my web host is claiming. They sent me this ominous email a week or so back:

Recently your account potomactavern.org has been causing high load on the server. This is a serious problem as it degrades server performance for all of our clients who share and are hosted on the same server. High loads contribute to problems such as delayed e-mail and slow site loading times; you may have periodically experienced some of these issues.

This blog is hosted under my potomactavern.org web space. Anyhow, I was pointed to this web page that told me how to go on a resource diet. So I have been using its guidance and have been playing with Apache web server and MySQL database server settings trying to minimize usage. Yet, this seemed to cause more downtime. Reducing the number of processes that MySQL can use, for example, makes response time slower. It also seemed to cause MySQL to crash, which generated more support tickets.

Nonetheless, I thought I had taken appropriate actions. I even emailed the person who sent me the warning, asking him if what I had done was okay. I never heard from him so I assumed my usage was now acceptable. Over the course of several days, I noticed that the number of hits I was taking was declining precipitously. Instead of 500 page views a day, I was getting 150. Lately, it has been more like 100.

When this happened a year ago, it was because Google had found too many broken links, so it dropped me from its search index. This time, though, when I went into my Google AdSense account to see what was up, it reported an HTTP 403 error. This means that my server was refusing to serve any content to Google. This seemed very odd, so I filed a support ticket. They told me to fix my robots.txt file. This file tells search engines what they may access on the site. Only I did not have a robots.txt file for this blog. So what was going on?

More emails later, I learned that since this blog exists in a directory under my potomactavern.org domain, any rules affecting the potomactavern.org domain would affect it too. Moreover, under the potomactavern.org domain there was an .htaccess file. This is a hidden file used by the Apache web server to say who is authorized to access the site. The file contained a statement that told Google (and only Google) to go away. This was confirmed by looking at my SiteMeter referrals log. I still saw referrals from Google, but they were rare. I do not know who added this statement to the file, but I have no memory of adding it. I quickly removed the command and Google reported it could read my blog again. However, it will likely be some time before it fully indexes this blog again. In addition, there is no way to know whether the site will get as much traffic as it has gotten recently.
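For anyone who wants to check for the same kind of surprise, here is a minimal Python sketch of the idea: start in the blog's directory, walk up to the document root, read every .htaccess file along the way, and flag any line that mentions a crawler's user agent. The function name find_agent_rules, the directory layout, and the choice of "googlebot" as the search term are assumptions made for illustration. The sketch only does a plain text match and does not interpret Apache's rules, so a hit is a line worth reading, not proof of a block.

```python
from pathlib import Path


def find_agent_rules(start_dir, doc_root, agent="googlebot"):
    """Return (file, line number, text) for every .htaccess line that
    mentions `agent`, checking start_dir and each parent up to doc_root."""
    start = Path(start_dir).resolve()
    root = Path(doc_root).resolve()
    if start != root and root not in start.parents:
        raise ValueError("start_dir must be inside doc_root")

    hits = []
    current = start
    while True:
        htaccess = current / ".htaccess"
        if htaccess.is_file():
            text = htaccess.read_text(errors="replace")
            for number, line in enumerate(text.splitlines(), start=1):
                stripped = line.strip()
                # Skip blank lines and comments; match case-insensitively.
                if (stripped and not stripped.startswith("#")
                        and agent.lower() in stripped.lower()):
                    hits.append((str(htaccess), number, stripped))
        if current == root:
            break
        current = current.parent  # rules in parent directories apply here too
    return hits
```

Calling find_agent_rules("/path/to/site/blog", "/path/to/site") (both paths hypothetical) would list any matching lines in the blog's own .htaccess and in the one belonging to the parent domain.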

So apparently, I have sinned, but it was a sin of omission. If this was their way to limit my CPU usage, then I wish they had the courtesy to tell me.

I have learned there are some drawbacks to becoming more popular. I am learning a little lesson in the realities of web hosting that most people do not know. The amount of bandwidth and disk space you are given is merely marketing. In most cases, they mean nothing. In my case, I can store up to 2.5 gigabytes of data on the server. In addition, every month I can use half a terabyte of bandwidth. Even with the extra traffic, it is a rare month that I use 5% of my bandwidth. So apparently, what really matters to a web host is how much of the server’s CPU and memory you are using. If you are using “too much”, you are abusing the server. Note that this web host, like most, sets the criteria for what they consider to be a “reasonable” number of domains that can exist on one server. The exact criteria though tend to be obfuscated. We can assume though that they want to put a lot of domains on the same server. This way they have to buy and maintain fewer servers. They expect your usage will be minimal. You, of course, are thinking, “Gosh, what a deal! $5.95 a month for hosting and I have terabytes of bandwidth! I better sign up!”

Since I have a virtual private server, I share this server with others but I also control exactly which applications are installed and how they are used. If I want to (and generally I do not) I can tweak settings in MySQL and the Apache web server to give myself more memory and CPU. In my case, I used the default settings. The defaults were apparently set too high for the number of domains actually placed on this server. For this, I pay $16.95 a month. If I want, I can elect to pay about $60 a month to be on one of their servers with no more than twenty domains.

I certainly understand that if I truly am a CPU hog, I should pay for a higher class of service. Still, I am puzzled by how I could be one. 500 browser page views a day, even if you add the usage consumed by search engines and feeds, should not cause that much of a performance problem. (I also now cache my blog entries so static content is served unless the last request was more than an hour ago.) While I have a couple of domains other than this blog, they do not get nearly as much traffic. Yes, I know that serving the graphics and such on web pages also consumes CPU and bandwidth. In addition, generating content from MySQL on the fly uses CPU cycles too. While I do not run a hosting center for a living, I still find it puzzling that my traffic would use that many resources. With microprocessors on the servers capable of hundreds of millions of instructions per second, I should be a blip on their radar. Yet I am not.
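The caching mentioned in the parenthesis above can be sketched in a few lines of Python. This is only an illustration of the general idea, read here as "serve a stored copy until it is more than an hour old, then rebuild it", and not the code the blog software actually uses. The class name PageCache, the render callback and the injectable clock are all hypothetical.

```python
import time


class PageCache:
    """Serve a stored page until it is older than ttl_seconds, then rebuild."""

    def __init__(self, render, ttl_seconds=3600, clock=time.time):
        self.render = render            # hypothetical callback that builds a page
        self.ttl_seconds = ttl_seconds  # one hour by default
        self.clock = clock              # injectable so the behaviour can be tested
        self._store = {}                # url -> (time built, page text)

    def get(self, url):
        now = self.clock()
        cached = self._store.get(url)
        if cached is not None and now - cached[0] <= self.ttl_seconds:
            return cached[1]            # fresh enough: no database work
        page = self.render(url)         # stale or missing: rebuild once
        self._store[url] = (now, page)
        return page
```

With a cache like this, a page that is requested many times in an hour costs one trip to the database rather than one trip per request, which is why caching should lower the load that dynamic pages put on MySQL.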

Naturally, this problem manifested itself as soon as I sent in my money to renew my web hosting for another year. This means I can cross my fingers, upgrade my service (hoping that I will not meet the murky criteria for being a CPU hog again) or find another host. Regardless, I am unlikely to get my money back. To compensate I will move my forum to my friend Jim Goldbloom’s web space. This way, these issues will not impact the small number of regular users on my forum.

What I find annoying is that web hosts generally provide no clear and up front criteria for what high usage looks like. In my naiveté, I think that CPU and memory usage should be metered. If I am one of twenty hosts on a machine, then logically I should be able to claim 5% of the memory, 5% of the CPU utilization and 5% of the disk space at any one time. Actually, if I am not using the remainder, I am fine with someone else using it, but I sure want my 5% when I need it. It seems reasonable and fair to me. Moreover, I should have a tool that shows me my usage and compares it to the total available and the actual number of other domains the machine is hosting. As best I can tell, there are no such tools and I doubt this is accidental.
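The arithmetic behind that 5% is simple enough to write down. The Python sketch below computes an equal share for a given number of domains and compares a measured figure against it. The function names are hypothetical and the usage numbers are invented for illustration; they are not measurements from any real server.

```python
def fair_share(domains_on_server):
    """Equal share of a resource, as a fraction, for one domain."""
    if domains_on_server < 1:
        raise ValueError("a server hosts at least one domain")
    return 1.0 / domains_on_server


def usage_report(used_fraction, domains_on_server):
    """Compare measured usage (0.0 to 1.0 of the whole machine) to the equal share."""
    share = fair_share(domains_on_server)
    return {
        "share": share,
        "used": used_fraction,
        "over_share": used_fraction > share,
    }


# Invented numbers: twenty domains, one of them using 8% of the CPU.
report = usage_report(0.08, 20)
print(f"share {report['share']:.0%}, used {report['used']:.0%}, over: {report['over_share']}")
# prints: share 5%, used 8%, over: True
```

A host that published even this much, the number of domains per machine and each customer's measured fraction, would give customers the clear criterion this post is asking for.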

Web hosts, of course, need to make a profit. They do so, I suspect, in part through obfuscation of these sorts of details. As with the all-you-can-eat buffet, there is only so much CPU and memory available; they are finite resources. You are entitled to your share of it, but they will not tell you what the share is, only when you have used too much of it. As a result, people like me are left wondering what the heck they are supposed to do. How am I supposed to know if any web host can support my traffic for X dollars per month?

If anyone knows of a web host that guarantees a percentage of the CPU for virtual private server hosting please leave me a comment. Even a chat with a technician at LiquidWeb, which hosts my friend Jim Goldbloom’s web space, got me nowhere. They tell me that information is proprietary.

Meanwhile, I am feeling very paranoid. I am monitoring my resources, checking my web logs and wondering if I am being good or bad, but having no way of knowing. I am also wondering whether my content will be denied to Google or other search engines again without my knowledge or consent. Whatever the case, web hosting strikes me as a lot of smoke and mirrors. Customers deserve clear criteria for acceptable usage.
