@bennadel
Created March 24, 2014 23:40

Revisions

  1. bennadel created this gist Mar 24, 2014.
    513 changes: 513 additions & 0 deletions code-1.cfm
    @@ -0,0 +1,513 @@
    <!---
    For this example, let's assume that the URL was:
    http://www.bennadel.com/go/prettyurl/
    This will be used in my comments below.
    --->



    <!---
    Check to see if this was due to a 404 error. We might be accessing
    this page via the Application.cfc onError event.
    --->
    <cfif Find( "404;", CGI.query_string )>

    <!---
    This is a 404 error. Now we have to go about figuring out just
    what the error was intending. Right now, the error string is in the
    form of:
    404;http://www.bennadel.com:80/go/prettyurl/
    --->

    <!---
    Get the incorrect URL from the query string (which IIS has
    thrown). This should start with "404;". It might also contain
    the port number (ex. :80) after the domain extension. We want
    to strip those out. We also want to strip out the "www."
    since it might not be there.
    --->
    <cfset strTargetUrl = LCase(
    REReplace(
    CGI.query_string,
    "404;|:80|www\.",
    "",
    "ALL"
    )
    ) />

    <!---
    ASSERT: strTargetUrl should now be in the form of:
    http://bennadel.com/go/prettyurl/
    --->


    <!---
    Get the site url. We want to strip out any www from it. This
    way the site url *should* be part of the string we found above
    (where we also stripped out "www").
    NOTE: My url is stored in a config object... but you can get that
    value from anywhere (or even hard code it right here). It is
    http://www.bennadel.com/
    --->
    <cfset strSiteUrl = LCase(
    Replace(
    APPLICATION.ServiceFactory.GetConfig().GetUrl(),
    "www.",
    "",
    "ALL"
    )
    ) />

    <!---
    ASSERT: strSiteUrl should now be in the form of:
    http://bennadel.com/
    --->


    <!---
    Now that we have the target url and the site url, we want to
    remove the site url from the target url so that we can isolate
    the script name that was being accessed.
    --->
    <cfset strTargetUrl = Replace(
    strTargetUrl,
    strSiteUrl,
    "",
    "ONE"
    ) />


    <!---
    ASSERT: At this point, strTargetUrl should hold the part of the
    requested url that follows the site domain:
    go/prettyurl/
    CAUTION: At this point, the value may still contain query params (..?foo=bar).
    --->


    <!---
    Check to see if we have any query params. Since the 404 error
    passes the entire script name AND query string into the CGI
    query_string, we have to manually pull out the query string
    values ourselves.
    --->
    <cfif Find( "?", strTargetUrl )>

    <!--- We have query string values. Get the query params. --->
    <cfset strTargetQueryParams = ListRest(
    strTargetUrl,
    "?"
    ) />

    <!---
    Now that we have the target query params, we can remove
    them from the target page.
    --->
    <cfset strTargetUrl = ListGetAt( strTargetUrl, 1, "?" ) />

    <cfelse>

    <!--- There are no query params. Set a blank value. --->
    <cfset strTargetQueryParams = "" />

    </cfif>


    <!---
    Make sure all the slashes are web slashes. This should already
    be the case, but this is a safeguard.
    --->
    <cfset strTargetUrl = REReplace(
    strTargetUrl,
    "[\\/]+",
    "/",
    "ALL"
    ) />

    <!--- Strip out any leading or trailing slashes. --->
    <cfset strTargetUrl = REReplace(
    strTargetUrl,
    "^[\\/]+|[\\/]+$",
    "",
    "ALL"
    ) />


    <!---
    We need to get the target directory. Check to see if we are
    attempting to hit a file or a directory in the target url.
    --->
    <cfif REFind( "\.[\w]+$", strTargetUrl )>

    <!---
    The target item ends in a file ext. This must be a file.
    Get the base directory from the file name and remove the
    ending slash.
    --->
    <cfset strTargetDirectory = REReplace(
    GetDirectoryFromPath( strTargetUrl ),
    "[\\/]+$",
    "",
    "ONE"
    ) />

    <!---
    Set the target script name to the target url. This
    will have the directory AND file.
    --->
    <cfset strTargetScriptName = strTargetUrl />

    <cfelse>

    <!---
    We are not attempting to access any file, just a directory.
    Grab that directory as the target directory.
    --->
    <cfset strTargetDirectory = strTargetUrl />

    <!---
    Since we are pointing to a directory, just grab that as
    the script name as well.
    --->
    <cfset strTargetScriptName = strTargetUrl />

    </cfif>


    <!---
    ASSERT: At this point, we have the:
    - target url
    - target directory
    - target query params
    that the request was attempting to reach. The target url does
    NOT have any leading or trailing slashes, but it might
    have a file name.
    --->
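
    <!---
    WORKED EXAMPLE (a sketch, not part of the original template):
    assuming a hypothetical request for /go/prettyurl/?foo=bar and
    the site url noted above (http://www.bennadel.com/), the steps
    so far should leave us with:
    CGI.query_string = 404;http://www.bennadel.com:80/go/prettyurl/?foo=bar
    strTargetUrl = go/prettyurl
    strTargetDirectory = go/prettyurl
    strTargetScriptName = go/prettyurl
    strTargetQueryParams = foo=bar
    --->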


    <!---
    Now that we have all that stuff, we have to figure out
    what it means to us on the LOCAL setup, i.e. what the
    fake url maps to in our framework. Let's test the
    target url against some regular expressions.
    --->
    <cfsavecontent variable="strXmlRedirectExpressions">

    <!---
    In order to narrow down the regular expression that
    we have to run, I am checking the first item in the
    target url.
    --->
    <cfswitch expression="#LCase( ListFirst( strTargetUrl, '/' ) )#">

    <!---
    FOR THIS DEMO I am putting the XML here. In
    reality, I am pulling in an xml file from
    each section so that each section can fine
    tune its own redirection.
    ex:
    <cfinclude
    template="content/go/_url_redirect.xml.cfm"
    />
    For the demo, I have included it in the proper case.
    --->

    <cfcase value="go">

    <redirect
    in="^go/ben-?nadel\b.*$"
    out="go.bennadel"
    />

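    <!---
    Notice in this one how I am using a reg-exp
    group reference. This more specific pattern has to
    come before the general pretty-url pattern below
    since the first matching expression wins. Trailing
    slashes were stripped from the target url above, so
    the year may be the last segment. The ampersand is
    escaped (&amp;) to keep the XML valid for XmlParse().
    --->
    <redirect
    in="^go/pretty-?url/([0-9]{4})\b.*$"
    out="go.demo404&amp;search_year=\1"
    />

    <redirect
    in="^go/pretty-?url\b.*$"
    out="go.demo404"
    />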

    </cfcase>

    </cfswitch>


    <!---
    After the individual cases, I include a global 404
    handler in case none of the others match.
    --->
    <redirect
    in=".+"
    out="home.display"
    />

    </cfsavecontent>


    <!--- Trim the value on the XML. --->
    <cfset strXmlRedirectExpressions = Trim(
    strXmlRedirectExpressions
    ) />


    <!---
    Check to see if there is a pretty url redirect expression
    list that we can use to test the target url.
    --->
    <cfif Len( strXmlRedirectExpressions )>

    <!--- Parse the expressions into an xml document. --->
    <cfset xmlRedirectExpressions = XmlParse(
    "<redirects>" &
    strXmlRedirectExpressions &
    "</redirects>"
    ) />

    <!--- Get the redirect nodes (children of the root node). --->
    <cfset xmlChildren = xmlRedirectExpressions.XmlRoot.XmlChildren />

    <!--- Loop through expressions to see if any match. --->
    <cfloop index="intChild" from="1" to="#ArrayLen( xmlChildren )#" step="1">

    <!--- Get reference to this child's attributes. --->
    <cfset objXmlAttributes = xmlChildren[ intChild ].XmlAttributes />

    <!---
    Check to see if we found a match. Use the regular
    expression in our redirects XML and test it against
    the target URL.
    --->
    <cfif REFind( objXmlAttributes.In, strTargetUrl )>

    <!--- Get the mapped action (the OUT xml attribute). --->
    <cfset strTargetAction = REReplace(
    strTargetUrl,
    objXmlAttributes.In,
    objXmlAttributes.Out,
    "ONE"
    ) />

    <!---
    Check to see if we have any query params as part of
    the target action string.
    --->
    <cfif Find( "&", strTargetAction )>

    <!---
    Add the query params to the target query params that
    we got from the original 404 error url.
    --->
    <cfset strTargetQueryParams = ListAppend(
    strTargetQueryParams,
    ListRest( strTargetAction, "&" ),
    "&"
    ) />

    <!---
    Get rid of the query string part of the
    target action since we just copied it
    over to the target query params.
    --->
    <cfset strTargetAction = ListFirst(
    strTargetAction,
    "&"
    ) />

    </cfif>

    <!---
    We found a regular expression match to the
    target URL. We don't need to keep searching
    so break out of the loop.
    --->
    <cfbreak />

    </cfif>

    </cfloop>


    <!---
    ASSERT: At this point, we have the:
    - target url
    - the target query params
    - mapped action (based on the reg-exp)
    --->


    <!---
    Update script name based on the error. Since we cannot
    update the CGI.script_name value directly, I am storing
    the target "script name" in a custom variable.
    I keep a struct called Environment (CFC), but this could
    be any variable that you reference in the page processing.
    --->
    <cfset REQUEST.Environment.OverrideScriptName(
    GetDirectoryFromPath( CGI.script_name ) &
    strTargetScriptName
    ) />


    <!---
    Remove the 404 error from the attributes. This is a
    custom struct in my framework that combines the
    URL and FORM variables.
    --->
    <cfloop item="strKey" collection="#REQUEST.Attributes#">

    <cfif NOT Compare( "404;", Left( strKey, 4 ) )>
    <cfset StructDelete( REQUEST.Attributes, strKey ) />
    </cfif>

    </cfloop>


    <!---
    Now, we need to move any target query params into my
    framework's attributes scope. Since I never reference
    URL or FORM directly, I do NOT bother updating them
    at this point, but you could certainly set URL values
    here.
    --->

    <!--- Update the attribute values. Get the array of params. --->
    <cfset arrQueryParams = ListToArray(
    strTargetQueryParams,
    "&"
    ) />

    <!---
    Loop over the query param pairs and add them to the
    request attributes scope.
    --->
    <cfloop index="intPair" from="1" to="#ArrayLen( arrQueryParams )#" step="1">

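    <!---
    Split the pair on the first "=" only so that values
    containing "=" are kept intact. A missing value
    results in an empty string.
    --->
    <cfset strParamName = ListFirst( arrQueryParams[ intPair ], "=" ) />
    <cfset strParamValue = ListRest( arrQueryParams[ intPair ], "=" ) />

    <!--- Set the attributes value (skip pairs that have no name). --->
    <cfif Len( strParamName )>
    <cfset REQUEST.Attributes[ strParamName ] = strParamValue />
    </cfif>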

    </cfloop>



    <!---
    THIS NEXT IF STATEMENT IS PART OF MY FRAMEWORK. I DO NOT USE
    ABSOLUTE URLS IN MY APP. ALL MY URLS ARE RELATIVE (IE. ../../../).
    BECAUSE OF THIS, I NEED TO UPDATE WHAT THE SERVER THINKS THE
    WEB BROWSER IS SEEING. SINCE THE SERVER IS IN THE ROOT AT
    THIS PAGE (site_error.cfm) AND THE WEB BROWSER IS IN A SUB
    DIRECTORY, THE TWO PATHS DO NOT LINE UP.
    HOWEVER, DUE TO THE WAY MY 404 HANDLER WORKS ON DEV, I HAVE TO
    DO THIS DIFFERENTLY ON THE DEV AND LIVE SERVERS.
    --->

    <!---
    Check to see if we are on the live server. On the
    developmental server, we were probably thrown directly to
    site_error.cfm, so the default web root (which would be "")
    is already appropriate. On the live server, we probably got
    sent here from another page, so the web root has to be built
    from the depth of the target directory.
    --->
    <cfif APPLICATION.ServiceFactory.GetConfig().GetIsLive()>

    <!--- We are live, get the webroot based on the query string. --->
    <cfset REQUEST.Environment.Web.Root = RepeatString(
    "../",
    ListLen( strTargetDirectory, "/" )
    ) />

    </cfif>


    <!--- We found a mapping, so replace the 404 with a proper 200 status code. --->
    <cfheader
    statuscode="200"
    statustext="OK"
    />

    <!--- Store the target action. --->
    <cfset REQUEST.TargetAction = strTargetAction />

    <!--- Include the index file. --->
    <cfinclude template="index.cfm" />


    <!---
    We have just included the main site controller (index.cfm).
    We DO NOT WANT the rest of this template to execute.
    --->
    <cfexit />


    <!---
    There were no redirect expressions to test this url
    against. Therefore, we are going to state that this page
    was reached in error.
    --->
    <cfelse>


    <!--- If we are live. Send an email to alert error. --->
    <cfif APPLICATION.ServiceFactory.GetConfig().GetIsLive()>

    <cfmail
    to=""
    from=""
    subject="Error Page Reached"
    type="HTML">

    #CGI.script_name#<br />
    #CGI.query_string#<br />
    <br />

    <cfdump var="#CGI#" />
    <cfdump var="#REQUEST#" />
    </cfmail>

    </cfif>


    </cfif>


    </cfif>


    <!---
    ASSERT: This page was reached in error. No 404 error was
    mapped. Either someone has a bad link or they are
    trying to hack my site!
    --->


    <!--- DISPLAY STANDARD HTML PAGE HERE. --->

    <cfabort />