solrClean(input)
Last updated October 02, 2012
Version: 2 | Requires: CF9 | Library: UtilityLib
Description:
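Like VerityClean, cleans up search text so it is safe to pass to Solr: it replaces commas with OR, strips the Lucene query-syntax special characters, and collapses runs of whitespace. NOTE: requires the uCaseWordsForSolr UDF.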
Return Values:
Returns a string.
Example:
<cfset cleanSolrSearchText = solrClean(userSearchText) />
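A slightly fuller sketch of how the cleaned text might then be used. The form field `form.searchText`, the collection name `myCollection`, and the result variable `searchResults` are hypothetical names chosen for illustration; only `solrClean` comes from this page.

```cfml
<!--- Hypothetical form field and collection name, for illustration only --->
<cfset userSearchText = form.searchText />
<cfset cleanSolrSearchText = solrClean(userSearchText) />

<!--- Skip the search if cleaning left nothing to search for --->
<cfif len(cleanSolrSearchText)>
    <cfsearch collection="myCollection"
              criteria="#cleanSolrSearchText#"
              name="searchResults" />
    <cfoutput>#searchResults.recordCount# result(s) found.</cfoutput>
</cfif>
```

Checking the length first avoids sending an empty criteria string when the input consisted only of stripped characters.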
Parameters:
Name | Description | Required |
---|---|---|
input | String to run against | Yes |
Full UDF Source:
/**
 * Like VerityClean, massages text input to make it Solr compatible.
 * v1.0 by Sami Hoda
 * v2.0 by Daria Norris to deal with wildcard characters used as the first letter of the search
 * v2.1 by Paul Alkema - updated list of characters to escape
 * v2.2 by Adam Cameron - Merge Paul's & Daria's versions of the function, improve some regexes, fix logic error with input argument (was both required and had a default), converted wholly to script
 *
 * @param input String to run against (Required)
 * @return Returns a string.
 * @author Sami Hoda (sami@bytestopshere.com)
 * @version 2.2, October 2, 2012
 */
string function solrClean(required string input){
	var cleanText = trim(arguments.input);
	// List of bad characters: "+ - && || ! ( ) { } [ ] ^ " ~ * ? : \"
	// http://lucene.apache.org/core/3_6_0/queryparsersyntax.html#Escaping Special Characters
	var reBadChars = "\+|-|&&|\|\||!|\(|\)|{|}|\[|\]|\^|\""|\~|\*|\?|\:|\\";
	// Replace commas with OR (inserted in lower case here; case is normalized by uCaseWordsForSolr below)
	cleanText = replace(cleanText, "," , " or " , "all");
	// Replace each bad character with a space
	cleanText = reReplace(cleanText, reBadChars, " ", "all");
	// Collapse runs of whitespace into a single space
	cleanText = reReplace(cleanText, "\s+", " ", "all");
	// Remove wildcard characters from the start of the string
	// (* and ? are already replaced by the strip step above, so this acts only as a safeguard)
	cleanText = reReplace(cleanText, "(^[\*\?]{1,})", "");
	// uCaseWords - and=AND, etc - lcase rest. If a keyword is mixed case, Solr treats it as case-sensitive!
	cleanText = uCaseWordsForSolr(cleanText);
	return trim(cleanText);
}
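solrClean will not run without uCaseWordsForSolr, which is a separate UDF and is not reproduced on this page. For local experimentation only, here is a minimal stand-in based on the comment above ("and=AND, etc - lcase rest"). It is not the CFLib implementation: the name `uCaseWordsForSolrSketch` is hypothetical, and the keyword list (and, or, not) is an assumption.

```cfml
<cfscript>
// Hypothetical stand-in, NOT the real uCaseWordsForSolr UDF.
// Assumption: only the boolean keywords and/or/not need upper-casing.
string function uCaseWordsForSolrSketch(required string input){
	// Lower-case everything first so mixed-case keywords are normalized
	var result = lCase(arguments.input);
	// Upper-case whole-word keywords; \U...\E upper-cases the backreference
	result = reReplace(result, "\b(and|or|not)\b", "\U\1\E", "all");
	return result;
}
</cfscript>
```

To try solrClean with this sketch, either rename the stand-in to uCaseWordsForSolr or change the call inside solrClean. For production use, install the real UDF.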