@bennadel
Created March 24, 2014 23:40
Handling 404 Errors in ColdFusion (via IIS Throwing 404)
<!---
For this example, let's assume that the URL was:
http://www.bennadel.com/go/prettyurl/
This will be used in my comments below.
--->
<!---
Check to see if this was due to a 404 error. We might be accessing
this page via the Application.cfc onError event.
--->
<cfif Find( "404;", CGI.query_string )>
<!---
This is a 404 error. Now we have to figure out what the original
request was intended to reach. Right now, the error string is in the
form of:
404;http://www.bennadel.com:80/go/prettyurl/
--->
<!---
Get the incorrect URL from the query string (which IIS has
thrown). This should start with "404;". It might also contain
the port number (ex. :80) after the domain extension. We want
to strip those out. We also want to strip out the "www."
since it might not be there.
--->
<cfset strTargetUrl = LCase(
REReplace(
CGI.query_string,
"404;|:80|www\.",
"",
"ALL"
)
) />
<!---
ASSERT: strTargetUrl should now be in the form of:
http://bennadel.com/go/prettyurl/
--->
<!---
Get the site url. We want to strip out any www from it. This
way the site url *should* be part of the string we found above
(where we also stripped out "www").
NOTE: My url is stored in a config object... but you can get that
value from anywhere (or even hard code it right here). It is
http://www.bennadel.com/
--->
<cfset strSiteUrl = LCase(
Replace(
APPLICATION.ServiceFactory.GetConfig().GetUrl(),
"www.",
"",
"ALL"
)
) />
<!---
ASSERT: strSiteUrl should now be in the form of:
http://bennadel.com/
--->
<!---
Now that we have the target url and the site url, we want to
remove the site url from the target url so that we can isolate
the script name that was being accessed.
--->
<cfset strTargetUrl = Replace(
strTargetUrl,
strSiteUrl,
"",
"ONE"
) />
<!---
ASSERT: At this point, the strTargetUrl should hold the suffix url that
was trying to be called. That is, the url of the page minus the site domain:
go/prettyurl/
CAUTION: At this point, the page may contain query params (..?foo=bar).
--->
<!---
Check to see if we have any query params. Since the 404 error
passes the entire script name AND query string into the CGI
query_string, we have to manually pull out the query string
values ourselves.
--->
<cfif Find( "?", strTargetUrl )>
<!--- We have query string values. Get the query params. --->
<cfset strTargetQueryParams = ListRest(
strTargetUrl,
"?"
) />
<!---
Now that we have the target query params, we can remove
them from the target page.
--->
<cfset strTargetUrl = ListGetAt( strTargetUrl, 1, "?" ) />
<cfelse>
<!--- There are no query params. Set a blank value. --->
<cfset strTargetQueryParams = "" />
</cfif>
<!---
Make sure all the slashes are web slashes. This should already
be the case, but this is a safe-guard.
--->
<cfset strTargetUrl = REReplace(
strTargetUrl,
"[\\/]+",
"/",
"ALL"
) />
<!--- Strip out trailing or leading slashes. --->
<cfset strTargetUrl = REReplace(
strTargetUrl,
"^[\\/]+|[\\/]+$",
"",
"ALL"
) />
<!---
We need to get the target directory. Check to see if we are
attempting to hit a file or a directory in the target url.
--->
<cfif REFind( "\.[\w]+$", strTargetUrl )>
<!---
The target item ends in a file ext. This must be a file.
Get the base directory from the file name and remove the
ending slash.
--->
<cfset strTargetDirectory = REReplace(
GetDirectoryFromPath( strTargetUrl ),
"[\\/]+$",
"",
"ONE"
) />
<!---
Get the target script name to be the target url. This
will have the directory AND file.
--->
<cfset strTargetScriptName = strTargetUrl />
<cfelse>
<!---
We are not attempting to access any file, just a directory.
Grab that directory as the target directory.
--->
<cfset strTargetDirectory = strTargetUrl />
<!---
Since we are pointing to a directory, just grab that as
the script name as well.
--->
<cfset strTargetScriptName = strTargetUrl />
</cfif>
<!---
ASSERT: At this point, we have the:
- target url
- target directory
- target query params
that the request was trying to reach. The target url does
NOT have any leading or trailing slashes, but it might
have a file name.
--->
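<!---
EXAMPLE (an illustrative trace, not part of the original handler; the
sample request below is an assumption): suppose the browser asked for
http://www.bennadel.com/go/prettyurl/2006/?foo=bar
IIS would hand us a query string along the lines of:
404;http://www.bennadel.com:80/go/prettyurl/2006/?foo=bar
Following the steps above, the values should work out to:
- strTargetUrl (site url removed): go/prettyurl/2006/?foo=bar
- strTargetQueryParams: foo=bar
- strTargetUrl (params and outer slashes removed): go/prettyurl/2006
- strTargetDirectory: go/prettyurl/2006
- strTargetScriptName: go/prettyurl/2006
--->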
<!---
Now that we have all that stuff, we have to figure out
what it all means to us on the LOCAL setup, i.e. what
the fake url maps to in our framework. Let's test the
target url against some regular expressions.
--->
<cfsavecontent variable="strXmlRedirectExpressions">
<!---
In order to narrow down the regular expression that
we have to run, I am checking the first item in the
target url.
--->
<cfswitch expression="#LCase( ListFirst( strTargetUrl, '/' ) )#">
<!---
FOR THIS DEMO I am putting the XML here. In
reality, I am pulling in an xml file from
each section so that each section can fine
tune its own redirection.
ex:
<cfinclude
template="content/go/_url_redirect.xml.cfm"
/>
For the demo, I have included it inline in the matching case.
--->
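<cfcase value="go">
<redirect
in="^go/ben-?nadel\b.*$"
out="go.bennadel"
/>
<!---
Notice in this one how I am using a reg-exp
group reference. This rule has to come BEFORE the
more general pretty url rule below, since the first
match wins. The target url has already had its
trailing slash stripped, so the year is not followed
by a required slash. The ampersand is escaped as
&amp; so that the XML parses.
--->
<redirect
in="^go/pretty-?url/([0-9]{4})\b.*$"
out="go.demo404&amp;search_year=\1"
/>
<redirect
in="^go/pretty-?url\b.*$"
out="go.demo404"
/>
</cfcase>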
</cfswitch>
<!---
After the individual cases, I include a global 404
handler in case none of the others match.
--->
<redirect
in=".+"
out="home.display"
/>
</cfsavecontent>
<!--- Trim the value on the XML. --->
<cfset strXmlRedirectExpressions = Trim(
strXmlRedirectExpressions
) />
<!---
Check to see if there is a pretty url redirect expression
list that we can use to test the target url.
--->
<cfif Len( strXmlRedirectExpressions )>
<!--- Parse the expressions into an xml document. --->
<cfset xmlRedirectExpressions = XmlParse(
"<redirects>" &
strXmlRedirectExpressions &
"</redirects>"
) />
<!--- Get the redirect nodes (the children of the root node). --->
<cfset xmlChildren = xmlRedirectExpressions.XmlRoot.XmlChildren />
<!--- Loop through expressions to see if any match. --->
<cfloop index="intChild" from="1" to="#ArrayLen( xmlChildren )#" step="1">
<!--- Get reference to this child's attributes. --->
<cfset objXmlAttributes = xmlChildren[ intChild ].XmlAttributes />
<!---
Check to see if we found a match. Use the regular
expression in our redirects XML and test it against
the target URL.
--->
<cfif REFind( objXmlAttributes.In, strTargetUrl )>
<!--- Get the mapped action (the OUT xml attribute). --->
<cfset strTargetAction = REReplace(
strTargetUrl,
objXmlAttributes.In,
objXmlAttributes.Out,
"ONE"
) />
<!---
Check to see if we have any query params as part of
the target action string.
--->
<cfif Find( "&", strTargetAction )>
<!---
Add the query params to the target query params that
we got from the original 404 error url.
--->
<cfset strTargetQueryParams = ListAppend(
strTargetQueryParams,
ListRest( strTargetAction, "&" ),
"&"
) />
<!---
Get rid of the query string part of the
target action since we just copied it
over to the target query params.
--->
<cfset strTargetAction = ListFirst(
strTargetAction,
"&"
) />
</cfif>
<!---
We found a regular expression match to the
target URL. We don't need to keep searching
so break out of the loop.
--->
<cfbreak />
</cfif>
</cfloop>
<!---
ASSERT: At this point, we have the:
- target url
- the target query params
- mapped action (based on the reg-exp)
--->
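<!---
EXAMPLE (a sketch for illustration; the sample url and pattern are
assumptions): the mapping above relies on REReplace() substituting
captured groups into the OUT attribute. For instance, a call such as:
REReplace(
"go/prettyurl/2006",
"^go/pretty-?url/([0-9]{4})\b.*$",
"go.demo404&search_year=\1",
"ONE"
)
should yield "go.demo404&search_year=2006", which the code above then
splits into the action (go.demo404) and an extra query param
(search_year=2006).
--->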
<!---
Update script name based on the error. Since we cannot
update the CGI.script_name value directly, I am storing
the target "script name" in a custom variable.
I keep a struct called Environment (CFC), but this could
be any variable that you reference in the page processing.
--->
<cfset REQUEST.Environment.OverrideScriptName(
GetDirectoryFromPath( CGI.script_name ) &
strTargetScriptName
) />
<!---
Remove the 404 error from the attributes. This is a
custom struct in my framework that combines the
URL and FORM variables.
--->
<cfloop item="strKey" collection="#REQUEST.Attributes#">
<cfif NOT Compare( "404;", Left( strKey, 4 ) )>
<cfset StructDelete( REQUEST.Attributes, strKey ) />
</cfif>
</cfloop>
<!---
Now, we need to move any target query params into my
framework's attributes scope. Since I never reference
URL or FORM directly, I do NOT bother updating them
at this point, but you could certainly set URL values
here.
--->
<!--- Update the attribute values. Get the array of params. --->
<cfset arrQueryParams = ListToArray(
strTargetQueryParams,
"&"
) />
<!---
Loop over the query param pairs and add them to the
request attributes scope.
--->
<cfloop index="intPair" from="1" to="#ArrayLen( arrQueryParams )#" step="1">
<!--- Get the pair. --->
<cfset arrPair = ListToArray( arrQueryParams[ intPair ], "=" ) />
<!--- Make sure we have a value. A param with no value gets a blank one. --->
<cfif (ArrayLen( arrPair ) LT 2)>
<cfset arrPair[2] = "" />
</cfif>
<!--- Set the attributes value. --->
<cfset REQUEST.Attributes[ arrPair[1] ] = arrPair[2] />
</cfloop>
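<!---
ALTERNATIVE (a hedged sketch, not the original approach): the loop
above splits each pair on every "=" and does not url-decode anything.
If values may contain "=" or encoded characters, something along these
lines might be safer. The names strPair, strName, and strValue are
hypothetical, and a pair with an empty name (ex. "=x") is not handled:
<cfloop index="strPair" list="#strTargetQueryParams#" delimiters="&">
<cfset strName = URLDecode( ListFirst( strPair, "=" ) ) />
<cfset strValue = URLDecode( ListRest( strPair, "=" ) ) />
<cfset REQUEST.Attributes[ strName ] = strValue />
</cfloop>
--->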
<!---
THIS NEXT IF STATEMENT IS PART OF MY FRAMEWORK. I DO NOT USE
ABSOLUTE URLS IN MY APP. ALL MY URLS ARE RELATIVE (IE. ../../../).
BECAUSE OF THIS, I NEED TO UPDATE WHAT THE SERVER THINKS THE
WEB BROWSER IS SEEING. SINCE THE SERVER IS IN THE ROOT AT
THIS PAGE (site_error.cfm) AND THE WEB BROWSER IS IN A SUB
DIRECTORY, THE TWO PATHS DO NOT LINE UP.
HOWEVER, DUE TO THE WAY MY 404 HANDLER WORKS ON DEV, I HAVE TO
DO THIS DIFFERENTLY ON THE DEV AND LIVE SERVERS.
--->
<!---
Check to see how we got to the site_error.cfm page. If we were
thrown directly to it (probably on the development server),
use the default web root (which would be ""). However, if we
got sent here from another page (probably on the live server),
we have to work out the web root from the depth of the
target directory.
--->
<cfif APPLICATION.ServiceFactory.GetConfig().GetIsLive()>
<!--- We are live, get the webroot based on the query string. --->
<cfset REQUEST.Environment.Web.Root = RepeatString(
"../",
ListLen( strTargetDirectory, "/" )
) />
</cfif>
<!--- We have a mapped action, so send a 200 OK status rather than a 404. --->
<cfheader
statuscode="200"
statustext="OK"
/>
<!--- Store the target action. --->
<cfset REQUEST.TargetAction = strTargetAction />
<!--- Include the index file. --->
<cfinclude template="index.cfm" />
<!---
We have just included the main site controller (index.cfm).
We DO NOT WANT the rest of this template to execute.
--->
<cfexit />
<!---
There was no redirect expression list to test this
url against. Therefore, we are going to state that this page
was reached in error.
--->
<cfelse>
<!--- If we are live, send an email to alert us to the error. --->
<cfif APPLICATION.ServiceFactory.GetConfig().GetIsLive()>
<cfmail
to=""
from=""
subject="Error Page Reached"
type="HTML">
#CGI.script_name#<br />
#CGI.query_string#<br />
<br />
<cfdump var="#CGI#" />
<cfdump var="#REQUEST#" />
</cfmail>
</cfif>
</cfif>
</cfif>
<!---
ASSERT: This page was reached in error. No 404 error was
mapped. Either someone has a bad link or they are
trying to hack my site!
--->
<!--- DISPLAY STANDARD HTML PAGE HERE. --->
<cfabort />