Introduction
The XML Query Parser (XmlQueryParser) supports a wider range of Apache Solr
search queries than any other query parser that ships with Solr.
This article examines the breadth of that support as released with Solr 6.0.0.
I will be adding separate articles (and linking to them) for the different types of queries so that
more detail can be devoted to each without overwhelming this main article.
De-Facto
Example
<BooleanQuery fieldName="description">
  <Clause occurs="must">
    <TermQuery>shirt</TermQuery>
  </Clause>
  <Clause occurs="mustnot">
    <TermQuery>plain</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>cotton</TermQuery>
  </Clause>
  <Clause occurs="must">
    <BooleanQuery fieldName="size">
      <Clause occurs="should">
        <TermsQuery>S M L</TermsQuery>
      </Clause>
    </BooleanQuery>
  </Clause>
</BooleanQuery>
Difficulties
- How do I get highlighting to work?
Top-Level
- BooleanQuery
  - disableCoord (optional, false)
  - minimumNumberShouldMatch (optional, 0)
  - boost (optional, 1.0)
  - Value
    - Clause
      - occurs: should | must | mustNot | filter
      - Value (Note: Many of the following can also have children, explained later)
        - TermQuery
        - TermsQuery
        - MatchAllDocsQuery
        - BooleanQuery
        - LegacyNumericRangeQuery (deprecated)
        - PointRangeQuery
        - DisjunctionMaxQuery
        - UserQuery
        - ConstantScoreQuery
        - SpanNear
        - BoostingTermQuery
        - SpanTerm
        - SpanOr
        - SpanOrTerms
        - SpanFirst
        - SpanNot
        - NOTE: Only the first child of a Clause is recognized; any others are silently ignored!
    - Any other element type at this level is ignored, i.e. only Clause is recognized, and no exception is thrown if something else is found
- MatchAllDocsQuery - Matches all documents in an index
- TermQuery
- TermsQuery
- [Legacy]NumericRangeQuery (deprecated around Lucene 6.0.0)
  - Not supported as of Solr 6 (Solr doesn't support point types yet)
- PointRangeQuery (new around 6.0)
  - Not supported as of Solr 6 (Solr doesn't support point types yet)
- RangeQuery
- DisjunctionMaxQuery
  - tieBreaker (optional, 0.0)
  - boost (optional, 1.0)
  - Value
    - May contain multiple queries of any type defined in this list (e.g. DisjunctionMaxQuery, RangeQuery, …)
- UserQuery
  - fieldName (optional, defaults to defaultField)
  - Value
    - Text is passed into QueryParser.parse
    - This appears to support the classic query syntax
  - NOTE: Wraps the query into a BoostQuery
- ConstantScoreQuery
  - boost (optional, 1.0)
  - Value
    - Only gets the first child
    - Child may be any query in this list
- SpanNear
  - boost (optional, 1.0)
  - slop
  - inOrder (optional, false)
  - Value
    - A collection of various types of SpanQuery
- BoostingTermQuery
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: the term to match in fieldName
- SpanTerm
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: the term to match in fieldName
- SpanOr
  - boost (optional, 1.0)
  - Value: a collection of various types of SpanQuery
- SpanOrTerms
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: terms, commonly separated by a space
  - Wraps the terms in a SpanOr query
- SpanFirst
  - This limits span matches to the first N positions (N is specified by the end parameter below)
    - More specifically, it matches spans in the subquery whose end position is less than or equal to end.
  - boost (optional, 1.0)
  - end (optional, 1, integer)
  - Value:
    - Gets the first child, which must be a SpanQuery
    - All other children are ignored
- SpanNot
  - boost (optional, 1.0)
  - Include - The first child element named Include must contain a SpanQuery
  - Exclude - The first child element named Exclude must contain a SpanQuery
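MatchAllDocsQuery and RangeQuery appear in the list without an example, so here is a minimal sketch combining the two. The attribute names on RangeQuery (lowerTerm, upperTerm, includeLower, includeUpper) are my assumption based on Lucene's RangeQueryBuilder and are not covered by the list above, so verify them against your Lucene version. The headline field and the bounds are only illustrative, and I have not run this against a live index. As with the other examples, it would be sent with the {!xmlparser} prefix.

```xml
<BooleanQuery>
  <!-- start from every document in the index -->
  <Clause occurs="must">
    <MatchAllDocsQuery/>
  </Clause>
  <!-- assumed attributes: keep headlines whose terms fall in the string range [a, m) -->
  <Clause occurs="filter">
    <RangeQuery fieldName="headline" lowerTerm="a" upperTerm="m"
                includeLower="true" includeUpper="false"/>
  </Clause>
</BooleanQuery>
```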
BooleanQuery
TermQuery
{!xmlparser}
<BooleanQuery fieldName="headline">
  <Clause occurs="must">
    <TermQuery>york</TermQuery>
  </Clause>
</BooleanQuery>
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <TermQuery fieldName="headline">york</TermQuery>
  </Clause>
</BooleanQuery>
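The minimumNumberShouldMatch attribute from the Top-Level list is not exercised by the examples so far. A minimal sketch, assuming the description field and terms from the first example in this article: at least two of the three should clauses have to match for a document to be returned. It would be sent with the {!xmlparser} prefix like the queries above; I have not verified the results against real data.

```xml
<BooleanQuery fieldName="description" minimumNumberShouldMatch="2">
  <!-- each Clause holds exactly one query element -->
  <Clause occurs="should">
    <TermQuery>shirt</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>cotton</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>plain</TermQuery>
  </Clause>
</BooleanQuery>
```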
SpanNear
// Headline: new pre/3 york
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanTerm>york</SpanTerm>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (york or car)
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>york</SpanTerm>
        <SpanTerm>car</SpanTerm>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (york or (car w/3 arrives))
// Match: "headline":"New York. Hongkong. Wunsiedel"
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>york</SpanTerm>
        <SpanNear slop="3" inOrder="false">
          <SpanTerm>car</SpanTerm>
          <SpanTerm>arrives</SpanTerm>
        </SpanNear>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (daybook or (employee w/3 onboarding))
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>daybook</SpanTerm>
        <SpanNear slop="3" inOrder="false">
          <SpanTerm>employee</SpanTerm>
          <SpanTerm>onboarding</SpanTerm>
        </SpanNear>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
DisjunctionMaxQuery
{!xmlparser}
<DisjunctionMaxQuery tieBreaker="1" boost="2">
  <UserQuery fieldName="headline">uber</UserQuery>
  <TermsQuery fieldName="headline">new york times</TermsQuery>
</DisjunctionMaxQuery>
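A note on tieBreaker: as I understand Lucene's DisjunctionMaxQuery, a document's score is the best sub-query score plus tieBreaker times the scores of the other matching sub-queries, so tieBreaker="1" effectively sums them. The sketch below uses a small tieBreaker so the best-matching field dominates. It assumes the headline and description fields used elsewhere in this article live in the same index, which may not hold for yours, and it would be sent with the {!xmlparser} prefix.

```xml
<DisjunctionMaxQuery tieBreaker="0.1">
  <!-- the higher-scoring field wins; the other adds only 10% of its score -->
  <TermQuery fieldName="headline">tesla</TermQuery>
  <TermQuery fieldName="description">tesla</TermQuery>
</DisjunctionMaxQuery>
```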
UserQuery
{!xmlparser}
<UserQuery fieldName="headline">
  "new computer*"~15
</UserQuery>
ConstantScoreQuery
{!xmlparser}
<ConstantScoreQuery boost="1.0">
  <UserQuery fieldName="headline">tesla</UserQuery>
</ConstantScoreQuery>
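Because ConstantScoreQuery only reads its first child, giving several alternatives a constant score seems to require wrapping them in a single BooleanQuery. A sketch under that assumption, reusing the headline field; the second term is only illustrative, and the query would be sent with the {!xmlparser} prefix as above.

```xml
<ConstantScoreQuery boost="1.0">
  <!-- exactly one child: the BooleanQuery groups the alternatives -->
  <BooleanQuery fieldName="headline">
    <Clause occurs="should">
      <TermQuery>tesla</TermQuery>
    </Clause>
    <Clause occurs="should">
      <TermQuery>uber</TermQuery>
    </Clause>
  </BooleanQuery>
</ConstantScoreQuery>
```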
SpanNear
{!xmlparser}
<SpanNear fieldName="headline" slop="3" inOrder="true">
  <SpanTerm>new</SpanTerm>
  <SpanTerm>computer</SpanTerm>
</SpanNear>
BoostingTermQuery
{!xmlparser}
<BoostingTermQuery fieldName="headline" boost="1.2">
  tesla
</BoostingTermQuery>
SpanTerm
{!xmlparser}
<SpanTerm fieldName="headline" boost="1.2">
  tesla
</SpanTerm>
SpanOr
{!xmlparser}
<SpanOr fieldName="headline" boost="1.2">
  <SpanTerm>pizza</SpanTerm>
  <SpanTerm>milk</SpanTerm>
</SpanOr>
SpanOrTerms
{!xmlparser}
<SpanOrTerms fieldName="headline" boost="1.2">
  pizza milk
</SpanOrTerms>
SpanFirst
{!xmlparser}
<SpanFirst fieldName="headline" end="1" boost="1.2">
  <SpanTerm>tesla</SpanTerm>
</SpanFirst>
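Since SpanFirst takes a single SpanQuery child, matching any of several terms near the start of a field means putting the alternatives inside one SpanOrTerms (or SpanOr). A sketch, assuming fieldName is inherited from the SpanFirst parent: it should match headlines where pizza or milk occurs within the first three positions. The terms are reused from the SpanOr example, and the query would be sent with the {!xmlparser} prefix.

```xml
<SpanFirst fieldName="headline" end="3">
  <!-- only the first child element is used, so both terms go into one SpanOrTerms -->
  <SpanOrTerms>pizza milk</SpanOrTerms>
</SpanFirst>
```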
SpanNot
-- TODO: Redo this--I'm getting some headlines with york in them
{!xmlparser}
<SpanNot fieldName="headline">
  <Include>
    <SpanTerm>new</SpanTerm>
  </Include>
  <Exclude>
    <SpanTerm>york</SpanTerm>
  </Exclude>
</SpanNot>
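A possible explanation for the stray york headlines, as far as I understand Lucene's SpanNotQuery: it only drops Include spans that overlap an Exclude span, and a single-term span for new never occupies the same position as one for york. One variant worth trying is to exclude the phrase itself, so that a new directly followed by york is removed. This is an untested sketch, not a confirmed fix; a headline containing another standalone new would still match.

```xml
<SpanNot fieldName="headline">
  <Include>
    <SpanTerm>new</SpanTerm>
  </Include>
  <Exclude>
    <!-- the span "new york" overlaps the "new" it starts with, so that "new" is dropped -->
    <SpanNear slop="0" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanTerm>york</SpanTerm>
    </SpanNear>
  </Exclude>
</SpanNot>
```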