How to not get hotlinked on S3


I am too poor to get hotlinked, and I figure you are too.

Wikipedia defines hotlinking as:

Hotlinking is a term used on the Internet that refers to the practice of displaying an image on a website by linking to the same image on another website, rather than saving a copy of it on the website on which the image will be shown. So, instead of loading picture.jpg on to their own website, a website owner uses a link to the picture as http://example.com/picture.jpg. When the hotlinking website is loaded, the image is loaded from the other website, which uses its bandwidth, costing the hotlinked website’s owners money.

Definitely sounds like a problem.

Hosting a static site on S3

Every static site I host currently lives on Amazon Web Services’ (AWS) Simple Storage Service (S3), all because it is:

  1. Easy to use
  2. Easy to integrate with AWS CloudFront for a CDN
  3. Priced by bandwidth (and none of my sites are popular), therefore…
  4. Cheap

Since I don’t make new sites regularly, I always screw something up when I try to do it from memory. I usually just follow this guide word for word until everything is working.

By default, files on S3 are set to private. This is a problem for website hosting, since you usually want other people to be able to see your site. To fix this problem, the guide suggests this bucket (AWS for file directory) policy to make all of the objects (AWS for files) in the bucket public:

{
  "Version":"2012-10-17",
  "Statement": [{
    "Sid": "Allow Public Access to All Objects",
    "Effect": "Allow",
    "Principal": "*",
    "Action": "s3:GetObject",
    "Resource": "arn:aws:s3:::example.com/*"
  }
 ]
}
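The one part of that policy you have to change is the bucket name inside the Resource ARN. As a rough sketch (the helper name `public_read_policy` is my own invention, not anything from AWS), you could generate the same document for any bucket with a few lines of Python:

```python
import json


def public_read_policy(bucket_name):
    """Return the public-read bucket policy for bucket_name as a JSON string."""
    policy = {
        "Version": "2012-10-17",
        "Statement": [{
            "Sid": "Allow Public Access to All Objects",
            "Effect": "Allow",
            "Principal": "*",
            "Action": "s3:GetObject",
            # Every object in the bucket: arn:aws:s3:::<bucket>/*
            "Resource": f"arn:aws:s3:::{bucket_name}/*",
        }],
    }
    return json.dumps(policy, indent=2)


# "example.com" is the placeholder bucket name used throughout this post.
print(public_read_policy("example.com"))
```

The output can be pasted into the bucket policy editor as-is.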

And this policy will work great… until you host a hilarious meme and someone copies the image link and shares it everywhere. Now you are paying for the bandwidth of all those views without getting the benefit of any ads being loaded or written content being viewed.

Luckily, I’m speaking from a point of pessimism, not experience.

Trying to stop hotlinking

If you read the whole Wikipedia article I quoted above, you are probably thinking: “What an idiot. This is a non-issue. Just use the .htaccess solution that the article YOU linked suggested.” But that’s where you would be wrong. S3 is not a real webserver, and therefore does not care at all about .htaccess files. Time to find a different solution.

After a Google search and a little bit of research, you will probably come across this “Bucket policy examples” article in the AWS docs, which will suggest this bucket policy:

{
   "Version": "2012-10-17",
   "Id": "http referer policy example",
   "Statement": [
     {
       "Sid": "Allow get requests referred by www.example.com and example.com.",
       "Effect": "Allow",
       "Principal": "*",
       "Action": "s3:GetObject",
       "Resource": "arn:aws:s3:::examplebucket/*",
       "Condition": {
         "StringLike": {"aws:Referer": ["http://www.example.com/*","http://example.com/*"]}
       }
     },
      {
        "Sid": "Explicit deny to ensure requests are allowed only from specific referer.",
        "Effect": "Deny",
        "Principal": "*",
        "Action": "s3:*",
        "Resource": "arn:aws:s3:::examplebucket/*",
        "Condition": {
          "StringNotLike": {"aws:Referer": ["http://www.example.com/*","http://example.com/*"]}
        }
      }
   ]
}

Your immediate reaction will be to criticize Amazon for using a 2 space indent scheme in the “Hosting a website on Amazon Web Services” guide but some indecipherable indent scheme here. Do they not appreciate consistency?! I mean what’s next? TABS!

[Image: Scene from Silicon Valley where Richard gets mad at his girlfriend for using spaces instead of tabs]

Richard knows what I'm talking about

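Then you will look at the policy, and it will seem to make sense, so you figure why not try it out. So you do. Then you’ll get a 403 Forbidden error, because your browser won’t be able to access index.html or whatever you specified as your root document when you set up your bucket for website hosting. When you type your site’s address straight into the browser, the request carries no Referer header at all, so the explicit deny kicks in and locks you out of your own front page.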
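To see why the front page gets locked out, here is a toy model of how that policy treats the Referer header. It is only a sketch: `fnmatchcase` stands in for the StringLike wildcard matching (it is not AWS’s actual matcher), the function names are made up, and the forum URL is hypothetical. The one rule it leans on is that a negated operator like StringNotLike counts as true when the header is missing entirely.

```python
from fnmatch import fnmatchcase

# Referer patterns copied from the AWS docs policy above.
ALLOWED_REFERERS = ["http://www.example.com/*", "http://example.com/*"]


def referer_matches(referer):
    """StringLike: a missing Referer header matches nothing."""
    return referer is not None and any(
        fnmatchcase(referer, pattern) for pattern in ALLOWED_REFERERS
    )


def is_allowed(referer):
    allow = referer_matches(referer)     # first statement (StringLike)
    deny = not referer_matches(referer)  # second statement (StringNotLike)
    return allow and not deny            # an explicit deny always wins


# None models typing the URL directly: the browser sends no Referer.
for referer in (None, "http://example.com/about.html", "http://someforum.example/thread/1"):
    print(referer, "->", "200 OK" if is_allowed(referer) else "403 Forbidden")
```

The `None` case is the one that bites: nobody referred you to your own front page, so the deny applies before you ever get to a page that could act as a valid referer.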

Actually stopping hotlinking

Thinking that I only really needed to be worried about hotlinking on images (at least for now), I decided to take a swing at writing my own policy to just block requests to my images directory… by basically just merging the two we saw above together and getting:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Allow Public Access to All Objects",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example.com/*"
    },
    {
      "Sid": "Allow images requests referred by www.example.com and example.com.",
      "Effect": "Allow",
      "Principal": "*",
      "Action": "s3:GetObject",
      "Resource": "arn:aws:s3:::example.com/images/*",
      "Condition": {
        "StringLike": {
          "aws:Referer": [
            "http://www.example.com/*",
            "http://example.com/*"
          ]
        }
      }
    },
    {
      "Sid": "Explicit deny to ensure images requests are allowed only from specific referer.",
      "Effect": "Deny",
      "Principal": "*",
      "Action": "s3:*",
      "Resource": "arn:aws:s3:::example.com/images/*",
      "Condition": {
        "StringNotLike": {
          "aws:Referer": [
            "http://www.example.com/*",
            "http://example.com/*"
          ]
        }
      }
    }
  ]
}

And it seems to work.

  • The fact that you can see this page is proof enough that anyone can access non-images.
  • The fact that you get an error if you open a new tab and try to go to this url of an image from a previous post is proof enough that access to images is actually restricted by referer.
    • Try it: http://petetalksweb.com/images/post-images/1/AlansBar.jpg
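If you would rather reason about the merged policy than take my word for it, here is a small simulation of its three statements. It is a simplified model under my own assumptions, not the real IAM evaluation engine: `fnmatchcase` approximates the wildcard matching, the ARN prefix is dropped, and `STATEMENTS`, `condition_holds`, and `is_allowed` are names I made up for the sketch.

```python
from fnmatch import fnmatchcase

REFERERS = ["http://www.example.com/*", "http://example.com/*"]

# Trimmed-down copy of the merged policy: only the fields this sketch inspects.
STATEMENTS = [
    {"Effect": "Allow", "Resource": "example.com/*", "Condition": None},
    {"Effect": "Allow", "Resource": "example.com/images/*",
     "Condition": ("StringLike", REFERERS)},
    {"Effect": "Deny", "Resource": "example.com/images/*",
     "Condition": ("StringNotLike", REFERERS)},
]


def condition_holds(condition, referer):
    """Evaluate a statement's Referer condition; no condition always holds."""
    if condition is None:
        return True
    operator, patterns = condition
    matched = referer is not None and any(fnmatchcase(referer, p) for p in patterns)
    # StringNotLike is the negation, so it also holds when the header is absent.
    return matched if operator == "StringLike" else not matched


def is_allowed(bucket, key, referer=None):
    """Collect the effects of every applicable statement; any Deny wins."""
    resource = f"{bucket}/{key}"
    effects = [
        s["Effect"] for s in STATEMENTS
        if fnmatchcase(resource, s["Resource"]) and condition_holds(s["Condition"], referer)
    ]
    return "Deny" not in effects and "Allow" in effects


print(is_allowed("example.com", "index.html"))                                  # page, no referer
print(is_allowed("example.com", "images/cat.jpg"))                              # image, no referer
print(is_allowed("example.com", "images/cat.jpg", "http://example.com/post/"))  # image via the site
```

In this model the middle statement never changes the outcome, since the first statement already allows every object. The deny on images/ is what does the real work.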

Should be enough to stop most hotlinking and image sharing problems.

Unfortunately, the bucket policy works by looking at “aws:Referer”, which is just the HTTP Referer header on the request, and a malicious user could spoof that header to name your domain. Luckily for us, AWS CloudFront solves this problem, but that’s a problem for a different day, or at least a different post.
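To show how little effort the spoofing takes, here is a sketch using Python’s standard library. The function name `forged_request` and the image path are made up for illustration, and the request is only built, never sent.

```python
from urllib.request import Request


def forged_request(image_url, pretend_page):
    """Build (but do not send) a GET request whose Referer header is faked."""
    return Request(image_url, headers={"Referer": pretend_page})


# Hypothetical image path on the placeholder domain used in this post.
req = forged_request("http://example.com/images/meme.jpg", "http://example.com/")
print(req.get_header("Referer"))
```

Passing `req` to `urllib.request.urlopen` would send it with the forged header, and the bucket policy has no way to tell it apart from a request made by a real page on your site. So the referer check stops casual hotlinking, not a determined scraper.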
