archive of pdf uploads safe from bots

6 posts by 4 authors in: Forums > CMS Builder
Last Post: March 5, 2012

By markr - February 28, 2012

What's the best way to secure an archive of PDF uploads from unwanted web viewing using CMS Builder?

If you use a custom upload directory to hide them, aren't they still just waiting for a clever bot or patient crawler to see them?

Re: [Damon] archive of pdf uploads safe from bots

By markr - February 28, 2012

I was hoping to do it without an htaccess login, e.g. perhaps using the membership plugin.

And even if I drop a blank index.php file in the directory, a browser can still load the PDF if it lands on the right URL. Say the client names the PDF something simple like a.pdf: a bot generating names at random would hit that one pretty quickly. A longer name would only delay the beast, no?

When you say "protected by username/password", are you referring to the htaccess edit?

I was wondering if maybe the PDF could be stored in a non-public directory and displayed on a secure (members only) HTML page using some kind of embed. In that scenario, can cmsb upload to a non-public area of the server?

Re: [markr] archive of pdf uploads safe from bots

By sublmnl - March 4, 2012 - edited: March 4, 2012

Either move the directory somewhere only FTP can reach, outside the public or www folder.

Or, if you can't move it above/outside the www folder, put a robots.txt file in your site root and disallow the PDF directory you are talking about. Also put an index.php file in the root of the PDF folder that makes the listing die or redirect, and a .htaccess file in the PDF folder that disables directory listing for that folder.
Or do like he said and put a login on the PDF folder.
You could also apply document security to the PDFs themselves, so that a password is required to open them. You can do all of the above as good practice if needed.
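
As a rough illustration of the robots.txt and .htaccess suggestions above, the two files might look something like this. The /pdf-archive/ path is a placeholder, not a CMS Builder default, and the .htaccess line assumes an Apache server that permits Options overrides in that folder.

```text
# --- robots.txt (in the site root) ---
# /pdf-archive/ is a hypothetical folder name
User-agent: *
Disallow: /pdf-archive/

# --- .htaccess (inside the PDF folder, Apache only) ---
# Turns off the automatic directory listing for this folder
Options -Indexes
```

Keep in mind that robots.txt is only a request to well-behaved crawlers and is itself publicly readable, so it discourages indexing rather than blocking access.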

We have done this with a client and moved an entire section of their site 'behind the wall'. It was all learning module content and run from the CMS, as was the content outside the login wall. Win-win for us. It may be a win-win for your client as well.

Re: [markr] archive of pdf uploads safe from bots

By Dave - March 4, 2012

Hi Markr,

>can cmsb upload to a non-public area of the server?

Yes, you can set custom upload dirs in the fields editor for upload fields.

We've dealt with document security a number of times and there are a few common issues that come up.

- Bot Security: Generally, bots won't find your upload directory unless it's linked from somewhere, and if they do it's not usually a problem as long as the directory doesn't list all the files (a blank index.html/php will hide directory listings). It's true they could guess at filenames, but passwords can be guessed too; what matters is how many guesses it takes. Assuming a-z is 26 chars and 0-9 is another 10, each filename char has 36 possibilities, so a 3 char filename could take up to 46,656 guesses (36*36*36), and every extra char multiplies that by 36. It's usually not a problem unless your filenames follow a pattern, e.g. 1001.pdf, 1002.pdf, or match something else on your site (product SKUs, etc).

- User Security: The next concern is limiting downloads to logged-in users, since once someone has a direct link they can share it and anyone can access the file. The easiest way to do this is to create a custom wrapper script that requires users to be logged in and then outputs the PDF. A link such as memberPdfDownload.php?table=products&num=123 would let them download the PDF, but only while logged in, so sending that link to others wouldn't help.

- Home PC Security: Of course, nothing prevents a user from saving the file to their computer and emailing it around as an attachment. Even complicated systems that don't let a user download a file are still susceptible to someone photographing their screen. Basically, there's no way to prevent a user from copying the data once they have it, just lots of ways of making it more difficult.
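
To put rough numbers on the filename-guessing arithmetic in the Bot Security point, here is a small Python sketch. It assumes the same 36-character alphabet (a-z plus 0-9); the function names and the `random_name` helper are made up for illustration and are not part of CMS Builder.

```python
import secrets
import string

# Alphabet assumed in the post: a-z (26) plus 0-9 (10) = 36 characters.
ALPHABET = string.ascii_lowercase + string.digits


def filename_keyspace(length: int, alphabet_size: int = len(ALPHABET)) -> int:
    """Number of distinct names that are exactly `length` characters long."""
    return alphabet_size ** length


def average_guesses(length: int, alphabet_size: int = len(ALPHABET)) -> float:
    """Average guesses needed to hit one specific name, trying each
    candidate once in random order."""
    return (filename_keyspace(length, alphabet_size) + 1) / 2


def random_name(length: int = 16) -> str:
    """Hypothetical helper: an unpredictable base name with no pattern to follow."""
    return "".join(secrets.choice(ALPHABET) for _ in range(length))


if __name__ == "__main__":
    for n in (1, 3, 8, 16):
        print(f"{n:>2} chars: {filename_keyspace(n):,} possible names")
    print("example name:", random_name() + ".pdf")
```

Each extra character multiplies the search space by 36, so a longer name does more than just delay guessing, provided the name is random and not a pattern like 1001.pdf, 1002.pdf.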
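
The wrapper script idea from the User Security point could be sketched roughly as follows. This is Python for illustration only, not the actual memberPdfDownload.php, and it does not use any CMS Builder API: the `UPLOADS` lookup, the `PRIVATE_DIR` path and the `resolve_download` function are all assumptions standing in for a real login check and database query.

```python
from pathlib import Path
from typing import Optional, Tuple

# Hypothetical folder outside the web root, so no URL maps to it directly.
PRIVATE_DIR = Path("/home/example/private_uploads")

# Stand-in for a database lookup keyed by (table, record number).
UPLOADS = {("products", 123): "manual-123.pdf"}


def resolve_download(logged_in: bool, table: str, num: int,
                     base_dir: Path = PRIVATE_DIR) -> Tuple[int, Optional[Path]]:
    """Decide what a request like ?table=products&num=123 should get.

    Returns an HTTP-style status code and, on success, the file to send.
    """
    if not logged_in:
        return 403, None  # shared links are useless without a login

    filename = UPLOADS.get((table, num))
    if filename is None:
        return 404, None  # no such record

    # Refuse anything that would resolve outside the private folder.
    base = base_dir.resolve()
    path = (base / filename).resolve()
    if base not in path.parents:
        return 404, None

    return 200, path


if __name__ == "__main__":
    print(resolve_download(False, "products", 123))
    print(resolve_download(True, "products", 123))
```

On a 200 result the real script would read the file and send it with a PDF content type. The point is that the visitor only ever sees the wrapper URL, never the file's location on disk.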

Hope that helps, let me know any questions. Thanks!
Dave Edis - Senior Developer
interactivetools.com

Re: [Dave] archive of pdf uploads safe from bots

By markr - March 5, 2012

Great info. Thanks all.