Created
February 21, 2021 12:01
Replacing Blank Lines Using Multiline Mode RegEx Patterns In POSIX And Java In Lucee CFML 5.3.7.47
<cfsavecontent variable="patternText"
>(?mx)
^
<!---
	By wrapping the "blank line" in a repeating capture group, we use the
	repeating nature of the pattern to replace adjacent lines rather than leaning
	entirely on the "all" behavior of the reReplace() function.
--->
(
	[\x20\x09]*
	\n
)+
</cfsavecontent>
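For comparison, here is a minimal Java sketch that runs the grouped pattern above, alongside an ungrouped single-line variant, directly through `java.util.regex`. The class name `BlankLineGroupDemo`, its method names, and the sample input are hypothetical and not part of the gist. It only shows how Java's engine treats the two patterns and makes no claim about how reReplace() behaves.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative, not from the gist.
final class BlankLineGroupDemo {

	// Ungrouped variant: each match consumes exactly one blank line.
	static final Pattern SINGLE = Pattern.compile( "(?mx) ^ [\\x20\\x09]* \\n" );

	// Grouped variant: each match consumes a whole run of adjacent blank lines.
	static final Pattern GROUPED = Pattern.compile( "(?mx) ^ ( [\\x20\\x09]* \\n )+" );

	// Removes everything the pattern matches.
	static String strip( Pattern pattern, String content ) {
		return pattern.matcher( content ).replaceAll( "" );
	}

	// Counts how many separate matches the pattern finds in the content.
	static int countMatches( Pattern pattern, String content ) {
		Matcher matcher = pattern.matcher( content );
		int count = 0;
		while ( matcher.find() ) {
			count++;
		}
		return count;
	}

	public static void main( String[] args ) {
		// Three adjacent blank lines between two lines of text.
		String content = "AAAAA\n\n \t \n\nBBBBB";

		System.out.println( countMatches( SINGLE, content ) );  // 3
		System.out.println( countMatches( GROUPED, content ) ); // 1
		System.out.println( strip( SINGLE, content ).equals( strip( GROUPED, content ) ) ); // true
	}
}
```

In Java, both variants produce the same stripped text; the grouped form simply does it in one match per run of blank lines instead of one match per line, so the result does not depend on how the engine resumes scanning after each replacement.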
<cfscript>
	// We're building content that has multiple "blank lines" next to each other.
	content = arrayToList(
		[
			"AAAAA",
			"BBBBB",
			"", // Blank line.
			" ", // Blank line.
			" #chr( 9 )# ", // Blank line.
			"", // Blank line.
			"CCCCC",
			" ", // Blank line.
			"", // Blank line.
			"DDDDD"
		],
		chr( 10 )
	);
	// In order to make the Regular Expression (RegEx) pattern easier to read, I am
	// running it in VERBOSE mode (?x). This ignores incidental whitespace and requires
	// all whitespace characters to be explicitly provided. As such, I am using the
	// following HEX codes:
	// --
	// \x20 => Space
	// \x09 => Tab
	// --
	// This Regular Expression pattern is attempting to match "blank lines" (ie, lines
	// that have nothing but whitespace) so that I can strip those lines out in the
	// replacement operation.
	```
	<cfsavecontent variable="patternText"
		>(?mx)         <!--- Multi-Line + Verbose mode enabled. --->
		^              <!--- Match at START OF LINE. --->
		[\x20\x09]*    <!--- Leading Space or Tab characters. --->
		\n             <!--- Match line-break at end of line. --->
	</cfsavecontent>
	```
	// Note that we are using the SAME PATTERN TEXT to apply the changes using the
	// default ColdFusion Regular Expression engine (POSIX) and the lower-level Java
	// Regular Expression engine.
	cfResult = content.reReplace( patternText, "", "all" );
	javaResult = javaCast( "string", content ).replaceAll( patternText, "" );
	echo( "<h3> POSIX (CFML) Result - reReplace() </h3>" );
	echo( "<pre>#encodeForHtml( cfResult )#</pre>" );
	echo( "<h3> Java Result - .replaceAll() </h3>" );
	echo( "<pre>#encodeForHtml( javaResult )#</pre>" );
</cfscript>
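The comments in the script above explain why the pattern spells out `\x20` and `\x09`: verbose mode discards literal whitespace in the pattern text. A small Java sketch of that behavior follows, using `java.util.regex` only. The class name `VerboseModeDemo` and its helper methods are hypothetical and exist purely for illustration.

```java
import java.util.regex.Pattern;

// Hypothetical demo class; the names here are illustrative, not from the gist.
final class VerboseModeDemo {

	// In verbose mode (?x), the literal spaces in the pattern are ignored,
	// so this pattern is equivalent to "ab".
	static boolean literalSpaceMatches( String input ) {
		return Pattern.compile( "(?x) a b" ).matcher( input ).matches();
	}

	// The hex escape \x20 still stands for a real space in verbose mode,
	// so this pattern is equivalent to "a b" in normal mode.
	static boolean hexSpaceMatches( String input ) {
		return Pattern.compile( "(?x) a \\x20 b" ).matcher( input ).matches();
	}

	public static void main( String[] args ) {
		System.out.println( literalSpaceMatches( "ab" ) );  // true
		System.out.println( literalSpaceMatches( "a b" ) ); // false
		System.out.println( hexSpaceMatches( "a b" ) );     // true
		System.out.println( hexSpaceMatches( "ab" ) );      // false
	}
}
```

This is why a blank-line pattern written in verbose mode needs explicit escapes for the space and tab characters it is meant to match.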