

Cloud hosting and CDNs

by Chris Woodford. Last updated: August 13, 2016.

Cloud this, cloud that—it seems the world of computing has suddenly become very overcast! If you've read our general cloud computing introduction, you'll know that there are lots of different ways you can take advantage of Internet-based computing. If you run your own website, there are two very specific forms of cloud computing you might want to investigate: cloud servers and content delivery networks (CDNs). What are they, how do they work, and how do they compare with existing shared and managed web hosting? What benefits do they bring—and what are the drawbacks?

Photo: Cloud computing question marks: Moving your hosting to the "cloud" has advantages, but it can bring security and management problems too.

What is cloud hosting?

If you're a webmaster, two kinds of cloud hosting will be of interest to you: cloud servers (which do the same job as a managed or shared server, but in a more dynamic and scalable way) and CDNs (content delivery networks, which distribute "static" files, such as images and CSS, at multiple locations around the world so they load faster for a worldwide audience). Let's take a look at them both, in turn.

Cloud servers

Traditionally, web hosting came in two flavors: high-cost managed hosting, in which you have your own private server (an actual computer!) dedicated to running only your website and its applications, and low-cost shared hosting, where your site and apps run on a large server with a number of other sites run by other people. Now there's a third option, widely marketed as cloud hosting, in which your site runs on a virtual server somewhere up in the cloud. Depending on how it's set up, a cloud server might be an actual computer, but it's just as likely to be a chunk of a much bigger machine; as with other kinds of cloud computing, the point is that it shouldn't matter either way to you as an end user. Rackspace's Cloud Servers, Liquid Web's Storm on Demand, and Amazon's Elastic Compute Cloud (EC2) are three examples of this kind of cloud hosting, and there are many more.

An example cloud server

So what's a cloud server like in practice? It's relatively easy to sign up to cloud services and see for yourself. With Storm on Demand, one of the cloud services I've used, you simply create a billing account and then tick the kind of server you want from a list of common examples (running from 1GB memory and 1 CPU up to 96GB memory and 32 CPUs). Then you tick the "server image" (essentially the software you want on the server at startup, including the operating system) and specify whether you want a managed server (where the Storm guys sort out operating system patches and so on) or a self-managed server (where you do these things yourself). Finally, you specify whether you want backups of your data and how you'll pay for bandwidth (either in large, specified blocks of GB or per GB used). When that's all done, you click to create the server and it's all "built" for you, on the fly, in a matter of minutes.

Once the server's created, you can configure it in the usual way (just like a physical server) with software like WHM and cPanel—or however you wish. If you decide you no longer want your server you can destroy it just as easily, and you simply pay for what you've used (an hourly rate for the server and a per GB rate for the bandwidth). It's extremely easy to use. Even with only previous experience of shared hosting and no experience at all of setting up standalone servers, I had this website up and running on a Storm cloud server in a couple of hours.

The brilliant thing about a cloud server like Storm on Demand is that you can scale it up or down at any time simply by revisiting the control panel and changing what you need to. Suddenly find you need more CPUs or more memory? No problem! Just tick the boxes and your server is automatically reconfigured and working at its new spec in a few minutes. That makes cloud servers a great choice for people who need supreme flexibility or whose computing needs are steadily changing. For example, if you're a fashion store, and you have a time of peak demand coming up—an end-of-season sale, perhaps—you could double or triple the power of your machine for a week or two before scaling back down again when traffic returns to normal. You can increase the power of your server at the click of a mouse but, because you're billed on a pay-as-you-go basis, you'll only pay for the more powerful "machine" for the period when you actually use it. Compare that to dedicated hosting: to get the same results, you'd need to anticipate every increase in server power you might need, invest in a more powerful machine in advance, allow time to get it set up and tested, and keep that powerful new machine running (at considerably greater cost) even if your traffic returns to lower levels again in future.
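To make the pay-as-you-go argument concrete, here is a rough Python sketch comparing a cloud server that is scaled up for a two-week sale with a machine sized for peak demand all the time. The hourly rates and the 90-day period are invented for illustration and are not any provider's real prices; treating a dedicated machine as an hourly cost is also a simplification.

```python
HOURS_PER_DAY = 24

def cloud_cost(base_rate, peak_rate, total_days, peak_days):
    """Pay-as-you-go: the higher hourly rate applies only on the peak days."""
    normal_hours = (total_days - peak_days) * HOURS_PER_DAY
    peak_hours = peak_days * HOURS_PER_DAY
    return normal_hours * base_rate + peak_hours * peak_rate

def always_peak_cost(peak_rate, total_days):
    """Peak-sized machine kept running for the whole period."""
    return total_days * HOURS_PER_DAY * peak_rate

# Hypothetical prices, for illustration only.
BASE_RATE = 0.05   # dollars per hour for the everyday server
PEAK_RATE = 0.15   # dollars per hour for a server three times as powerful

cloud = cloud_cost(BASE_RATE, PEAK_RATE, total_days=90, peak_days=14)
dedicated = always_peak_cost(PEAK_RATE, total_days=90)
print(f"Cloud, scaled up for 14 of 90 days: ${cloud:.2f}")
print(f"Peak-sized machine for all 90 days: ${dedicated:.2f}")
```

With these made-up numbers the scaled-up cloud server costs well under half as much, but the gap depends entirely on how short the peak is and how the two rates compare.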

What are the drawbacks? If you run a small-scale website that has very steady or totally predictable traffic, you might find a cloud server is too expensive and demanding. If you don't need the flexibility, you're not worried about sharing a server with other people's websites, and you don't want to waste time monitoring the performance of your server, shared hosting will probably be a better solution for you than cloud hosting. The important thing to remember is that "cloud server" is essentially a marketing term and not a technical description or explanation; well-managed, traditional shared hosting can give you many of the benefits of cloud hosting, though without the flexibility or independence.


Photos: Liquid Web's Storm on Demand allows you to set up a cloud server in a matter of minutes, simply by ticking a few boxes. Every aspect of the service is pay-as-you-go. It's easy to use even if you have little or no experience of setting up or managing dedicated servers.

Cloud servers or virtual servers?

How do cloud servers work? Hosting products described as "cloud servers" are generally virtual slices of large, physical servers running what's called virtualization software (the most common types being VMware® and Xen® hypervisor for Linux and Microsoft® Hyper-V™ for Windows). In other words, they are effectively "virtual servers" (entirely independent virtual machines) running on a real, physical server. How is that different from shared hosting? The virtual servers are essentially independent of one another (though they do use the same processors and memory), so you're not at risk from other people's applications or websites. You have full root access to your virtual server (unlike on shared hosting, where different users' files are simply subdirectories of a single server running a single operating system) and your own unique IP address (so, unlike with shared hosting, there is no risk to your site if other people host "dodgy" websites on the same machine), and you can reboot or reimage, as you wish—you can even run entirely different operating systems on the same physical server. From the viewpoint of the hosting company, the main benefit of using virtualization is reducing the number of physical servers they have to buy and manage: it's a much more efficient use of resources. However, that doesn't necessarily translate into the cost savings you might expect because support costs may be higher and you may still need multiple software licenses for each virtual server.

Cloud-based content delivery networks (CDNs)

Your website can benefit hugely from cloud computing even if you don't want to migrate it to a cloud server. Information-rich sites like this one, with a lot of static content, typically use over 90 percent of their bandwidth serving up images (and other media) and CSS files that probably don't change from one month to the next. With traffic split equally between Europe, America, and Asia, there's no easy way to decide where to locate your main server: wherever you choose, some users will benefit and others will lose out. But putting the static content on a content delivery network (CDN), dispersed across the cloud, will benefit everyone. Simply speaking, a CDN makes multiple copies of your static files and stores them at many different places around the world (called edge locations) so that different users in different continents receive whichever files are nearest (and therefore quickest to download).

How do you set up a CDN in practice?

Suppose you want to speed up your website by moving all your images on to a CDN. You can sign up for a pay-as-you-go CDN in a matter of minutes (Amazon's Cloudfront and Rackspace Cloud Files are two popular, instant options, but there are plenty of others). Once you've sorted out the billing, you simply upload your files (in a similar way to using FTP) and you'll be allocated a web address (such as abcdefg123456789.cloudservice.whatever) that you can use to link to them. You can either use this address explicitly (referring to it directly in your IMG tags) or (more sensibly) refer to it through a CNAME (effectively a DNS alias) based on your own domain name. When people download your web pages, the images are no longer pulled from your main server but from one of the edge locations around the world, ideally one that's geographically close to where they happen to be.

How does it work behind the scenes? It's easy to see if you do a DNS lookup for whatever domain name you're using for your CDN. Instead of a single IP address, you'll find the name resolves to different IP addresses in different parts of the world. In other words, the files resolve to a different IP address depending on where the end user happens to be. So for a person on the West Coast of the United States, abcdefg123456789.cloudservice.whatever might resolve to a server in Mountain View, California, while for a user in Europe, the same domain might resolve to a server physically located in Paris, France or London, England.
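If you'd rather script the lookup than use a DNS tool, something like the following Python sketch lists the addresses a hostname resolves to from wherever you run it. The function name resolve_ips and its lookup parameter are my own inventions for illustration; socket.getaddrinfo is the standard-library call doing the real work.

```python
import socket

def resolve_ips(hostname, lookup=socket.getaddrinfo):
    """Return the sorted, de-duplicated IP addresses a hostname resolves to."""
    results = lookup(hostname, 80, proto=socket.IPPROTO_TCP)
    # Each result is (family, type, proto, canonname, sockaddr); the IP
    # address is the first item of sockaddr.
    return sorted({sockaddr[0] for _, _, _, _, sockaddr in results})
```

Calling resolve_ips with your own CDN hostname shows only what your local resolver sees. To observe the geographic differences described above, you would need to run it from machines in different parts of the world and compare the answers.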

Pros and cons? There is almost always a significant performance boost from moving to a CDN, but if you're paying a fixed price for your web hosting (or server) bandwidth, using a CDN is going to work out as an extra cost. CDNs rely on your files being copied, periodically, from the central server where you upload them to the edge locations around the world where they're served to users and typically cached for anything from a few days to several weeks or more (you can generally specify the cache expiry time), so file management and updating can sometimes be a problem. For example, suppose you set a 30-day cache on your main CSS file but suddenly want to change the way some aspect of your site is presented. You can either upload a new CSS file and wait up to 30 days for all the edge locations to reflect the change, or rename your CSS file, update every page that references it, and then upload a completely new version of your entire website. Either way, you lose a certain amount of flexibility in file management and it's important to remember that different users in different locations may see different versions of the same file for a period of time. That's why CDNs work best for static (rarely changing) content.
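The stale-file problem is easier to see in a toy model. This Python sketch is not how any real CDN is implemented; the EdgeCache class, the day numbers, and the two edge names are all made up to show why two locations with the same 30-day expiry can serve different versions of one file.

```python
class EdgeCache:
    """A toy edge location that re-fetches a file only after its copy expires."""

    def __init__(self, ttl_days):
        self.ttl_days = ttl_days
        self.copy = None          # cached file contents
        self.fetched_on = None    # day number when the copy was fetched

    def serve(self, origin, today):
        expired = self.fetched_on is None or today - self.fetched_on >= self.ttl_days
        if expired:
            self.copy = origin["style.css"]
            self.fetched_on = today
        return self.copy

origin = {"style.css": "v1"}
paris = EdgeCache(ttl_days=30)
california = EdgeCache(ttl_days=30)

paris.serve(origin, today=0)        # Paris caches v1 on day 0
california.serve(origin, today=20)  # California caches v1 on day 20

origin["style.css"] = "v2"          # a new stylesheet is uploaded on day 25

day_31 = (paris.serve(origin, today=31), california.serve(origin, today=31))
day_50 = (paris.serve(origin, today=50), california.serve(origin, today=50))
print("Day 31:", day_31)
print("Day 50:", day_50)
```

On day 31 the Paris copy has expired and been refreshed, while California is still serving the old stylesheet; only by day 50 do both edges agree.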

Worth a go?

One of the best things about cloud services is that they're generally pay-as-you-go—so it's very easy to try them out, at relatively little cost, and see what difference they make.

Stopping hotlinks to your CDN files

You've been bitten by the cloud bug! Excited by the possibility of supercharging your website, you've shifted your static content (your images, your CSS, and whatever else doesn't change that often) onto a content delivery network (CDN), such as Amazon Cloudfront or Rackspace Cloud Files. Everything goes well for the first few months and then, a little while later, a dreadful realization slaps you round the face: your monthly bill is increasing more quickly than your monthly traffic! You're the person "paying-as-you-go" for every single hit on your CDN, but you're not the only person generating those hits. Other people are happily hotlinking your files, duplicating your content on search engines, and running up your bill. The absolute, ultimate, nightmare scenario? Someone embeds or links to one of your really big files in a viral link; days or weeks later, you discover millions of downloads and an unexpectedly massive credit-card bill.

Blocking hotlinks on a standalone server is easy enough (a two-minute edit of your Apache htaccess file), but most CDNs don't offer the same kind of HTTP referrer checking. Of course, you could simply rename the files people are hotlinking, but if there are hundreds or thousands of them (I have at least 3000) that's not an option. So what to do? Here's a quick and easy solution. It's not a permanent fix, but it certainly solves the problem for a time.

Using a CDN with a CNAME

For the sake of illustration, let's suppose I'm using the fictitious "CloudExample," example.com, as my CDN. The same method works fine with any CDN that lets you refer to your files using a CNAME (including Cloudfront, Cloud Files, and others).

When you set up a CloudExample distribution, you can either reference your files directly from your distribution's URL (typically something like d1234567890.example.com) or, more sensibly, set up a DNS CNAME (essentially a kind of alias) based on your own domain name. So if I wanted to serve files from CloudExample based on this website's domain name (explainthatstuff.com) I could set up a CNAME something like fred.explainthatstuff.com and point that to my CloudExample distribution, d1234567890.example.com, then simply make any links to files on CloudExample by linking to fred.explainthatstuff.com. Only the terminally curious—DNS anoraks who go to sleep reading the output from "Live HTTP headers"—will ever discover that my files are actually coming from CloudExample.

A CNAME looks much more professional, but the other great advantage is that you can change it very easily: you can point it to another CDN, point it back to your original server, or delete it entirely. That's exactly how we can tackle CloudExample hotlinking. If you're not currently using a CNAME (if you're linking directly to your CloudExample distribution with raw URLs like d1234567890.example.com), you're not going to be able to take advantage of the trick I'm about to share unless you set up a new distribution, copy your files across, and link to them with a CNAME.

Changing your CNAME

If you've reached the end of your tether with the hotlinkers, it's time to change your CNAME! And it's really easy to do, with (hopefully) no downtime to your website and complete transparency to search engines. Here's what you do.

1. Set up a new DNS CNAME

It doesn't matter what name you choose. If my first CNAME was fred.explainthatstuff.com, I could set up a new CNAME fred2.explainthatstuff.com and point it to the same CloudExample distribution. Remember to allow up to 48 hours for the new record to propagate across the Internet, though if you're lucky it'll be live much more quickly.

2. Go into CloudExample and add the new CNAME to your existing distribution

Depending on the CDN you're using, you may need to associate your new CNAME with your distribution in your CDN control panel. Do that now. CloudExample will take a little while to propagate that information through its edge servers. Don't sit there waiting for it; there's something else you can be doing in the meantime...

3. Now change your website files

I have maybe 10,000 or more IMG tags in about 500 HTML files pointing to maybe 2500 images on my CDN, so changing my files by hand isn't an option! Fortunately, I set them up a while ago so all the IMG tags are explicitly referencing full URLs on my CDN, such as fred.explainthatstuff.com/myimage.jpg.

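Using Perl, it's a cinch to change all those URLs so they point to fred2. With all your HTML files in the same directory, you need a quick line of Perl something like this (the backslashes make the dots match literally, the \b stops it matching inside a longer name, and the g flag changes every occurrence on a line, not just the first):

perl -pi -e 's%\bfred\.explainthatstuff\.com%fred2.explainthatstuff.com%g' *.html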

That takes no more than a second.
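If you don't have Perl to hand, the same search-and-replace can be sketched in Python. This is only an illustrative equivalent, not part of the original recipe: the function names are mine, and it assumes your HTML files are UTF-8 encoded and all sit in one directory.

```python
import re
from pathlib import Path

OLD_HOST = "fred.explainthatstuff.com"
NEW_HOST = "fred2.explainthatstuff.com"

# \b stops the pattern matching inside a longer name; re.escape makes the
# dots in the hostname literal.
OLD_HOST_PATTERN = re.compile(r"\b" + re.escape(OLD_HOST))

def switch_host(html):
    """Replace every occurrence of the old CDN hostname in a string."""
    return OLD_HOST_PATTERN.sub(NEW_HOST, html)

def rewrite_directory(directory):
    """Rewrite all .html files in a directory; return the names that changed."""
    changed = []
    for path in sorted(Path(directory).glob("*.html")):
        original = path.read_text(encoding="utf-8")
        updated = switch_host(original)
        if updated != original:
            path.write_text(updated, encoding="utf-8")
            changed.append(path.name)
    return changed
```

Running rewrite_directory(".") a second time changes nothing, because fred2.explainthatstuff.com no longer matches the old pattern. As with the Perl version, it's wise to work on a copy of your files first.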

4. Test that your new CNAME is working

Once your CloudExample distribution is up and running, try referencing some of your files using the new CNAME to make sure it's working. If it is, you can safely upload your HTML files. Once you've done that, you've effectively switched over to your new CNAME. Use that from now on.
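Checking a handful of files by hand is fine, but a short script can test a longer list. The Python sketch below is one possible way to do it, with names of my own choosing; it assumes your CDN answers plain HTTP HEAD requests, which may not be true of every provider.

```python
import urllib.error
import urllib.request

def http_status(url):
    """Return the HTTP status code for a HEAD request, or None if unreachable."""
    request = urllib.request.Request(url, method="HEAD")
    try:
        with urllib.request.urlopen(request, timeout=10) as response:
            return response.status
    except urllib.error.HTTPError as err:
        return err.code          # the server answered, but with an error code
    except (urllib.error.URLError, OSError):
        return None              # DNS failure, timeout, refused connection...

def failing_files(host, paths, get_status=http_status):
    """Return the paths that do not come back with HTTP 200 from the host."""
    return [p for p in paths if get_status(f"http://{host}/{p}") != 200]
```

For example, failing_files("fred2.explainthatstuff.com", ["myimage.jpg"]) returning an empty list suggests the new CNAME is serving that file; anything it does return is worth checking before you upload your edited HTML.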

5. Switch off the old CNAME

So now you're no longer using the old CNAME, you have two options: you can either just "switch it off" or you can 301 redirect it for a time to help search engines find where your files have gone to, and then switch it off later. If you don't care about the search engines, simply go into your CloudExample control panel and edit your distribution again, deleting the CNAME you no longer require (fred.explainthatstuff.com) and leaving the new CNAME (fred2.explainthatstuff.com) as the one and only active CNAME. You'll find CloudExample quite rapidly denies any links to fred.explainthatstuff.com. Next, edit your DNS settings and delete your old, unwanted CNAME (fred.explainthatstuff.com). In 48 hours or less, fred.explainthatstuff.com will be history: any files that still hotlink to it will be rendered useless by a DNS name that doesn't resolve. It's a great (and instantly satisfying) solution: that unwanted traffic will be stopped dead in its tracks before it goes anywhere near your server, slows it down, or costs you a dime.

6. Redirect the old CNAME

Instantly killing the old CNAME is a bit crude and drastic—because you're potentially also going to lose any useful links to your images on search engines and from other sites. Fortunately, there's a solution to that too. Once you've deleted the old CNAME, you can set up a new A record with exactly the same domain name and point it to the same place as your ordinary server (www.explainthatstuff.com). Now all attempts to reference fred.explainthatstuff.com (formerly at CloudExample) will go to your main server instead and you can handle them, however you want to, with rewrites and redirects from an htaccess file (if you're using cPanel, use its built-in redirects—which simply edit your htaccess file behind the scenes). You can create a wildcard, 301 redirect to point any legitimate links from search engines to files like fred.explainthatstuff.com/myimage.jpg so they go to fred2.explainthatstuff.com/myimage.jpg and neatly return the 301: Moved permanently code. You'll need something like this:

# mod_rewrite must be switched on once per htaccess file
RewriteEngine On
RewriteCond %{HTTP_HOST} ^fred\.explainthatstuff\.com$ [NC]
RewriteRule ^(.*)$ http://fred2.explainthatstuff.com/$1 [R=301,L]

That's purely for the benefit of the search engines and traffic you want to retain. At the same time, you can block links from anywhere else so they return a custom "Hotlinks blocked" image, much as you'd do if you were blocking hotlinks on an ordinary server in the absence of a CDN (I'm not going to go into how you do that here; there are plenty of other pages telling you how to block hotlinks with htaccess). Once you're confident the search engines have got the message, you can delete the redirect code and the old DNS entry (originally a CNAME, now an A record). (I found Google had reindexed all my images at the new CNAME within a few days, though you might want to wait longer. In my experience, Bing takes longer to catch up.)
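The decision your server ends up making for each request to the old hostname can be sketched in a few lines of Python. This is a model of the logic only, not a replacement for the htaccess rules: the allow-list, the blocked-image address, and the choice to let empty referrers through are all assumptions you'd adapt to your own site.

```python
from urllib.parse import urlsplit

OLD_HOST = "fred.explainthatstuff.com"
NEW_HOST = "fred2.explainthatstuff.com"

# Hypothetical values, for illustration only.
BLOCKED_IMAGE = "http://www.explainthatstuff.com/hotlinks-blocked.png"
ALLOWED_REFERRER_DOMAINS = ("explainthatstuff.com", "google.com", "bing.com")

def referrer_allowed(referrer):
    """True for an empty referrer or one from an allowed domain or subdomain."""
    if not referrer:
        return True   # assumption: requests with no referrer are let through
    host = (urlsplit(referrer).hostname or "").lower()
    return any(host == d or host.endswith("." + d) for d in ALLOWED_REFERRER_DOMAINS)

def decide(host, path, referrer=""):
    """Return (action, target) for a request; path starts with a slash."""
    if host.lower() != OLD_HOST:
        return ("serve", path)                             # some other hostname
    if referrer_allowed(referrer):
        return ("redirect-301", f"http://{NEW_HOST}{path}")  # like the rewrite rule
    return ("blocked-image", BLOCKED_IMAGE)
```

Matching on the whole domain or a dot-prefixed subdomain, instead of a simple substring test, avoids waving through a lookalike such as notgoogle.com.evil.example.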

7. Monitor the redirected traffic

If you're redirecting this way, it's very wise to monitor incoming traffic at the place you're redirecting to (for a few days at least). Using a stats package (such as Awstats), or even simply inspecting raw access logs, you'll get a list of people who were previously hotlinking your CloudExample files. Some of them will be innocent links (maybe text links in an HTML page rather than actual image hotlinks); most will be exactly the kind of hotlinks you're trying to knock out; and a few will be scraped copies of your entire web pages (text and images) that you can pursue with polite emails or takedown notices, as you wish. But some may also be links from your own website (or other websites you operate) that you'll need to change over to avoid your own web pages getting one of those "Hotlinks blocked" messages. It's also worth watching out for blocked links coming from search engines that you haven't correctly exempted in your htaccess file. For example, I was using some old code and discovered that I was allowing links from live.com but blocking its new incarnation bing.com, which I quickly addressed.

Inspecting your logs will also give you a sense of how many people were hotlinking, how much bandwidth you were losing, and how much it was costing you; it can be quite hard to establish that from the raw data CloudExample provides (not least because logging is not enabled by default and, even when it is, you don't automatically get meaningful statistics). In my case, after about 9–10 months of CloudExample use, I was getting about 500 hotlinked image hits a day, making my total bandwidth loss maybe 300–500MB per month or so—nothing too serious compared to my own monthly bandwidth use of about 50GB, but still not trivial. If I'd left it another 9–10 months, I could easily have been draining 1GB a month.
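If you'd rather not install a stats package, a short script can pull the referring sites out of a raw access log. The Python sketch below assumes Apache's "combined" log format and treats explainthatstuff.com as your own domain. Because the redirected requests logged on your main server no longer carry the image bytes, it estimates bandwidth from hit counts and an average image size, which is a figure you'd have to supply yourself.

```python
import re
from collections import Counter
from urllib.parse import urlsplit

# Apache "combined" format: host ident user [time] "request" status size "referrer" "agent"
LOG_LINE = re.compile(
    r'\S+ \S+ \S+ \[[^\]]+\] "\S+ (?P<path>\S+) [^"]*" '
    r'\d{3} (?:\d+|-) "(?P<referrer>[^"]*)" "[^"]*"'
)

def hotlink_hits(lines, own_domain="explainthatstuff.com"):
    """Count requests per referring host, ignoring your own site and empty referrers."""
    hits = Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if not match:
            continue                      # skip lines in some other format
        host = (urlsplit(match["referrer"]).hostname or "").lower()
        if not host or host == own_domain or host.endswith("." + own_domain):
            continue
        hits[host] += 1
    return hits

def estimated_monthly_mb(hits_per_day, avg_image_kb, days=30):
    """Rough bandwidth estimate: hits times an assumed average image size."""
    return hits_per_day * days * avg_image_kb / 1000
```

Feeding the lines of your access log to hotlink_hits and calling most_common(20) on the result gives a quick league table of the sites linking to your files. As a sanity check on the figures above, 500 hits a day at a hypothetical 25KB per image works out at 375MB a month, which sits inside the 300–500MB range I saw.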

Photo: Goodbye hotlinks—at least for now. Left: Before: One of the many web pages I've found ripping off my words and photos and hotlinking to my CDN; Right: After: Moments later, with my CNAME switched over, this page is now trying to get its images from my ordinary server and being redirected to blocked image files. I could leave that there indefinitely to advertise my site but after a month or two, once the search engines have indexed my images again, I'll probably delete the DNS entry I was previously using for my CDN (the original CNAME and now an A record) so those linking images break completely.

8. Until next time

Of course this is only a temporary solution to the CDN hotlinking problem, but it is a solution. Most of your hotlinkers probably won't even spot that their links are broken, but there are always new ones waiting to take their place. There's nothing to stop you changing your CNAME regularly if hotlinking is driving you crazy or costing you money. Maybe more of the CDN providers will implement some sort of decent HTTP referrer checking on incoming requests, but don't hold your breath. I understand Amazon Cloudfront does now support referrer checking (though it took them several years and lots of requests from customers to make it happen), but many other CDN providers still don't.


Please do NOT copy our articles onto blogs and other websites

Text copyright © Chris Woodford 2010, 2016. All rights reserved. Full copyright notice and terms of use.

Amazon Web Services, AWS, Amazon Cloudfront, and Cloudfront are trademarks or registered trademarks of Amazon Web Services LLC in the United States and/or other countries.


Cite this page

Woodford, Chris. (2010/2016) Cloud hosting. Retrieved from http://www.explainthatstuff.com/cloud-hosting.html. [Accessed (Insert date here)]
