I am attempting to modify my .htaccess file within a specific directory. If a web user requests any file in this directory whose name looks like one of the following, I want them to be redirected back to the home page. Below are some example file names.

  • /cat_1234.pdf
  • /cat_blahbla.doc
  • /cat_$9989&428.jpg
  • /cat_-309bn-020n.webp

How can I tell my RewriteCond to look out for these patterns? Here is my best attempt, which I thought would work, but it doesn't:

<IfModule mod_rewrite.c>
RewriteCond %{REQUEST_URI} ^cat_([0-9a-zA-Z_]+)\.(pdf|doc|jpg|webp) [NC]
RewriteRule . /index.php [R=302,L]
</IfModule>

What am I missing?

  • Please question the <IfModule mod_rewrite> in this. It's often recommended because it means "a missing module shouldn't cause the server to not work at all", but it also means that if you deploy your software on a server that doesn't have mod_rewrite enabled, your cat_* files are completely unprotected.
    Commented Mar 29 at 6:51
  • For starters, it doesn't look like that regex will match the hyphen in your fourth example. On the other hand, it will match an underscore that you don't appear to need.
    – MikeB
    Commented Mar 29 at 8:30
  • Examples don't make a pattern. Describe the patterns more specifically. Is any character allowed between cat_ and the extension? Or only specific ones? Do they depend on the extension? Does it matter if the regular expression matches anything starting with cat_? Does the extension actually matter?
    – jcaron
    Commented Mar 29 at 14:48

1 Answer

You've not stated the "specific directory" in which the .htaccess file and the files you are protecting are located. (Although that shouldn't matter if we rework the rule.)

RewriteCond %{REQUEST_URI} ^cat_([0-9a-zA-Z_]+)\.(pdf|doc|jpg|webp) [NC]
RewriteRule . /index.php [R=302,L]

The REQUEST_URI server variable contains the full URL-path (including the slash prefix), so this would normally need to include the "specific directory", not just the filename (unless you adjust the regex). Your regex has a start-of-string anchor (although you have omitted the end-of-string anchor), and since REQUEST_URI always begins with a slash, ^cat_ can never match. As a result, this condition (RewriteCond directive) will never be successful.
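For illustration only, if you did want to keep the condition, the regex would need to account for the full URL-path. A minimal sketch, assuming the files live in a directory called /docs/ (a hypothetical name, since the question does not give the real one) and that RewriteEngine On is already set earlier in the same .htaccess file:

```apache
# Sketch only: /docs/ is a hypothetical directory name, substitute your own.
# Assumes "RewriteEngine On" appears earlier in this .htaccess file.
# The pattern starts with the slash prefix and the directory, and is anchored at both ends.
RewriteCond %{REQUEST_URI} ^/docs/cat_[^/]+\.(pdf|doc|jpg|webp)$ [NC]
RewriteRule ^ /index.php [R=302,L]
```

This hard-codes the directory into the rule, so it would need updating if the directory were ever renamed or moved.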

Your regex would also fail to match your 3rd and 4th examples because your regex character class ([0-9a-zA-Z_]) omits the special characters $, & and - that are present in these filenames. Although I would surmise you do not need to be so specific and catching cat_<anything>.pdf (for example) would be OK.

However, you do not need a separate condition here. It is easier and more efficient to just use the RewriteRule pattern, which matches relative to the directory that contains the .htaccess file (and excludes the slash prefix), so you do not need to worry about the rest of the URL-path.

I also doubt that you should be redirecting to /index.php. Should this not simply be / (the root directory), allowing the directory index (i.e. index.php) to be served by mod_dir? Is that not your canonical URL?

Try the following instead, in the .htaccess file in the directory you are protecting.

RewriteRule ^cat_[^/]+\.(pdf|doc|jpg|webp)$ / [R=302,L]

This regex is perhaps slightly broader than it needs to be, but that also makes it simpler: [^/] matches any character that is not a / (path separator).

And no need for the <IfModule> wrapper, unless this rule is entirely optional.


However, instead of redirecting to the homepage (which is confusing for users and unnecessary for bots) I would simply block (with a 403 Forbidden) such requests instead. For example:

<FilesMatch "^cat_[^/]+\.(pdf|doc|jpg|webp)$">
    Require all denied
</FilesMatch>
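
If you would rather keep everything in mod_rewrite, the F flag should give a similar 403 response. This is a sketch rather than a drop-in replacement, and it assumes mod_rewrite is available and RewriteEngine On is already set in this .htaccess file:

```apache
# Respond with 403 Forbidden; "-" means no substitution is performed.
# The F flag implies L, so no further rules are processed for the request.
# NC is carried over from the question's attempt and makes the match case-insensitive.
RewriteRule ^cat_[^/]+\.(pdf|doc|jpg|webp)$ - [F,NC]
```

Note that this pattern only matches files directly in the directory containing the .htaccess file, whereas <FilesMatch> tests the filename alone and so also applies in subdirectories.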
  • MrWhite, thank you for the learning curve on this process. This was a major help for me.
    – klewis
    Commented Mar 28 at 17:53
  • @klewis If things like cat_.docx are also restricted then consider using this regex: ^cat_[^/]*\.(pdf|docx?|jpe?g|webp)$
    – MonkeyZeus
    Commented Mar 29 at 10:42
  • @klewis Also, I prefer one-liners and think a 404 would be better than a 403: RewriteRule ^cat_[^/]*\.(pdf|docx?|jpe?g|webp)$ - [R=404,NC,L]. A redirect or a forbidden response reveals that the asset exists; a 404 at least masks its existence.
    – MonkeyZeus
    Commented Mar 29 at 10:47
  • Does it actually make sense to allow other extensions? Otherwise just ^cat_ (without anything after) would be enough.
    – Didier L
    Commented Mar 29 at 12:21
  • @DidierL ^cat_[^/]*$
    – MonkeyZeus
    Commented Mar 29 at 15:12
