@jsbonline2006
Last active August 29, 2015 14:13
Unable to index an XML file in Cloudera Solr using Flume
# Specify server locations in a SOLR_LOCATOR variable; used later in variable substitutions:
SOLR_LOCATOR : {
  # Name of solr collection
  collection : SampleSearch
  # ZooKeeper ensemble
  zkHost : "host:port/solr"
}

morphlines : [
  {
    id : morphline1
    importCommands : ["org.kitesdk.**", "com.cloudera.**", "org.apache.solr.**"]
    commands : [
      {
        generateUUID { field : id }
      }
      {
        xquery {
          fragments : [
            {
              fragmentPath : "/"
              queryString : """ for $doc in /add/doc return <add><doc>{$doc/field[@name="id"]}
{$doc/field[@name="title"]}{$doc/field[@name="content"]}{$doc/field[@name="locale"]}
{$doc/field[@name="createDate"]}{$doc/field[@name="publishDate"]}</doc></add>"""
            }
          ]
        }
      }
      {
        sanitizeUnknownSolrFields {
          # Location from which to fetch Solr schema
          solrLocator : ${SOLR_LOCATOR}
        }
      }
      # log the record at DEBUG level to SLF4J
      { logDebug { format : "output record: {}", args : ["@{}"] } }
      # load the record into a SolrServer or MapReduce SolrOutputFormat.
      {
        loadSolr {
          solrLocator : ${SOLR_LOCATOR}
        }
      }
    ]
  }
]
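
One possible reason nothing useful reaches Solr is the shape of the element returned by the query. The sketch below is an untested variant of the xquery command, written under the assumption (not confirmed anywhere in this gist) that the command turns each direct child element of a returned item into a record field named after that element. Under that assumption, the nested <add><doc><field name="..."> result above would produce a single field called doc, which sanitizeUnknownSolrFields would then drop if the schema has no such field. The wrapper name record is a hypothetical placeholder, and the child element names are assumed to match fields in the SampleSearch schema.

```
{
  xquery {
    fragments : [
      {
        fragmentPath : "/"
        # Assumption: each child of <record> becomes a record field with the same name.
        # <record> itself is an arbitrary, hypothetical wrapper name.
        queryString : """
          for $doc in /add/doc
          return
          <record>
            <id>{$doc/field[@name="id"]/text()}</id>
            <title>{$doc/field[@name="title"]/text()}</title>
            <content>{$doc/field[@name="content"]/text()}</content>
            <locale>{$doc/field[@name="locale"]/text()}</locale>
            <createDate>{$doc/field[@name="createDate"]/text()}</createDate>
            <publishDate>{$doc/field[@name="publishDate"]/text()}</publishDate>
          </record>
        """
      }
    ]
  }
}
```

If this variant is used, note that the generateUUID step earlier in the morphline also writes to id. Keeping both would likely leave two values in that field, so one of the two sources of id would probably need to go. The logDebug command already in the morphline should show which fields each record actually carries before loadSolr runs.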