Introduction
The XML Query Parser (XmlQueryParser) supports a wider range of Apache Solr search queries
than any other query parser that ships with Solr.
This article examines the breadth of what it supports as released with Solr 6.0.0.
I will be adding separate articles (and linking to them) for the different types of queries so that
more detail can be devoted to each one without overwhelming this main article.
De-Facto Example
<BooleanQuery fieldName="description">
  <Clause occurs="must">
    <TermQuery>shirt</TermQuery>
  </Clause>
  <Clause occurs="mustnot">
    <TermQuery>plain</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>cotton</TermQuery>
  </Clause>
  <Clause occurs="must">
    <BooleanQuery fieldName="size">
      <Clause occurs="should">
        <TermsQuery>S M L</TermsQuery>
      </Clause>
    </BooleanQuery>
  </Clause>
</BooleanQuery>
Difficulties
- How do I get highlighting to work?
Top-Level
- BooleanQuery
  - disableCoord (optional, false)
  - minimumNumberShouldMatch (optional, 0)
  - boost (optional, 1.0)
  - Value
    - Clause
      - occurs: should | must | mustNot | filter
      - Value (Note: Many of the following can also have children, explained later)
        - TermQuery
        - TermsQuery
        - MatchAllDocsQuery
        - BooleanQuery
        - LegacyNumericRangeQuery (deprecated)
        - PointRangeQuery
        - DisjunctionMaxQuery
        - UserQuery
        - ConstantScoreQuery
        - SpanNear
        - BoostingTermQuery
        - SpanTerm
        - SpanOr
        - SpanOrTerms
        - SpanFirst
        - SpanNot
        - NOTE: Only the first child element of each Clause is used; any others are silently ignored!
    - Any other element type at this level is ignored: only Clause is recognized, and no exception is thrown if something else is found
- MatchAllDocsQuery - Matches all documents in an index
- TermQuery
- TermsQuery
- [Legacy]NumericRangeQuery (deprecated in Lucene 6.0.0ish)
  - Not supported as of Solr 6 (Solr doesn't support point types yet)
- PointRangeQuery (new in 6.0ish)
  - Not supported as of Solr 6 (Solr doesn't support point types yet)
- RangeQuery
- DisjunctionMaxQuery
  - tieBreaker (optional, 0.0)
  - boost (optional, 1.0)
  - Value
    - May contain multiple queries of any type defined in this list (e.g. DisjunctionMaxQuery, RangeQuery, …)
- UserQuery
  - fieldName (optional, defaults to defaultField)
  - Value
    - Text is passed into QueryParser.parse
    - This appears to support the classic query syntax
  - NOTE: Wraps the query into a BoostQuery
- ConstantScoreQuery
  - boost (optional, 1.0)
  - Value
    - Only gets the first child
    - Child may be any query in this list
- SpanNear
  - boost (optional, 1.0)
  - slop
  - inOrder (optional, false)
  - Value
    - A collection of various types of SpanQuery
- BoostingTermQuery
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: the term to match in fieldName
- SpanTerm
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: the term to match in fieldName
- SpanOr
  - boost (optional, 1.0)
  - Value: a collection of various types of SpanQuery
- SpanOrTerms
  - fieldName (required either here or in a parent)
  - boost (optional, 1.0)
  - Value: terms, commonly separated by a space
    - Wraps the terms in a SpanOr query
- SpanFirst
  - This limits span matches to the first N (specified by the end parameter below) positions
    - More specifically, it matches spans in the subquery whose end position is less than or equal to end.
  - boost (optional, 1.0)
  - end (optional, 1, integer)
  - Value:
    - Gets the first child, which must be a SpanQuery
    - All other children are ignored
- SpanNot
  - boost (optional, 1.0)
  - Include - First child element called Include must contain a SpanQuery
  - Exclude - First child element called Exclude must contain a SpanQuery
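MatchAllDocsQuery, RangeQuery, and the filter value of occurs are listed above but have no example of their own. The following is a minimal sketch, not a tested query. It assumes RangeQuery takes the lowerTerm, upperTerm, includeLower, and includeUpper attributes used by Lucene's XML query parser (those attribute names do not come from this article), and the term bounds are made up for illustration. As with the other examples, it would be prefixed with {!xmlparser} when sent to Solr.

```xml
<BooleanQuery>
  <!-- Start from every document in the index -->
  <Clause occurs="must">
    <MatchAllDocsQuery/>
  </Clause>
  <!-- Keep only documents with a headline term in the range [a, m); filter clauses do not affect scoring -->
  <Clause occurs="filter">
    <RangeQuery fieldName="headline" lowerTerm="a" upperTerm="m"
                includeLower="true" includeUpper="false"/>
  </Clause>
</BooleanQuery>
```

The range is compared against indexed terms as strings, so this kind of query is most predictable on a field that is not heavily analyzed.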
BooleanQuery
TermQuery
{!xmlparser}
<BooleanQuery fieldName="headline">
  <Clause occurs="must">
    <TermQuery>york</TermQuery>
  </Clause>
</BooleanQuery>
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <TermQuery fieldName="headline">york</TermQuery>
  </Clause>
</BooleanQuery>
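The minimumNumberShouldMatch attribute from the Top-Level list pairs naturally with should clauses. A small sketch with made-up terms: at least two of the three optional terms must appear in headline for a document to match. The fieldName is set once on the BooleanQuery and inherited by each TermQuery, as in the first example above.

```xml
<BooleanQuery fieldName="headline" minimumNumberShouldMatch="2">
  <!-- Three optional clauses; a document must satisfy at least two of them -->
  <Clause occurs="should">
    <TermQuery>new</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>york</TermQuery>
  </Clause>
  <Clause occurs="should">
    <TermQuery>times</TermQuery>
  </Clause>
</BooleanQuery>
```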
SpanNear
// Headline: new pre/3 york
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanTerm>york</SpanTerm>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (york or car)
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>york</SpanTerm>
        <SpanTerm>car</SpanTerm>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (york or (car w/3 arrives))
// Match: "headline":"New York. Hongkong. Wunsiedel"
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>york</SpanTerm>
        <SpanNear slop="3" inOrder="false">
          <SpanTerm>car</SpanTerm>
          <SpanTerm>arrives</SpanTerm>
        </SpanNear>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
// Headline: new pre/3 (daybook or (employee w/3 onboarding))
{!xmlparser}
<BooleanQuery>
  <Clause occurs="must">
    <SpanNear fieldName="headline" slop="3" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanOr>
        <SpanTerm>daybook</SpanTerm>
        <SpanNear slop="3" inOrder="false">
          <SpanTerm>employee</SpanTerm>
          <SpanTerm>onboarding</SpanTerm>
        </SpanNear>
      </SpanOr>
    </SpanNear>
  </Clause>
</BooleanQuery>
DisjunctionMaxQuery
{!xmlparser}
<DisjunctionMaxQuery tieBreaker="1" boost="2">
  <UserQuery fieldName="headline">uber</UserQuery>
  <TermsQuery fieldName="headline">new york times</TermsQuery>
</DisjunctionMaxQuery>
UserQuery
{!xmlparser}
<UserQuery fieldName="headline">"new computer*"~15</UserQuery>
ConstantScoreQuery
{!xmlparser}
<ConstantScoreQuery boost="1.0">
  <UserQuery fieldName="headline">tesla</UserQuery>
</ConstantScoreQuery>
SpanNear
{!xmlparser}
<SpanNear fieldName="headline" slop="3" inOrder="true">
  <SpanTerm>new</SpanTerm>
  <SpanTerm>computer</SpanTerm>
</SpanNear>
BoostingTermQuery
{!xmlparser}
<BoostingTermQuery fieldName="headline" boost="1.2">tesla</BoostingTermQuery>
SpanTerm
{!xmlparser}
<SpanTerm fieldName="headline" boost="1.2">tesla</SpanTerm>
SpanOr
{!xmlparser}
<SpanOr fieldName="headline" boost="1.2">
  <SpanTerm>pizza</SpanTerm>
  <SpanTerm>milk</SpanTerm>
</SpanOr>
SpanOrTerms
{!xmlparser}
<SpanOrTerms fieldName="headline" boost="1.2">pizza milk</SpanOrTerms>
SpanFirst
{!xmlparser}
<SpanFirst fieldName="headline" end="1" boost="1.2">
  <SpanTerm>tesla</SpanTerm>
</SpanFirst>
SpanNot
TODO: Redo this. I'm still getting some headlines with york in them.
{!xmlparser}
<SpanNot fieldName="headline">
  <Include>
    <SpanTerm>new</SpanTerm>
  </Include>
  <Exclude>
    <SpanTerm>york</SpanTerm>
  </Exclude>
</SpanNot>
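One possible explanation for the TODO above, offered as a sketch that has not been run against a real index: SpanNot drops an Include span only when it overlaps an Exclude span. The single-term spans for new and york sit at different positions, so they never overlap and nothing is excluded. Making the Exclude span cover the position of new, by wrapping the adjacent pair in a SpanNear, should remove the occurrences of new that are immediately followed by york.

```xml
<SpanNot fieldName="headline">
  <Include>
    <SpanTerm>new</SpanTerm>
  </Include>
  <Exclude>
    <!-- The span for the exact phrase "new york" overlaps the "new" span it contains -->
    <SpanNear slop="0" inOrder="true">
      <SpanTerm>new</SpanTerm>
      <SpanTerm>york</SpanTerm>
    </SpanNear>
  </Exclude>
</SpanNot>
```

A headline that contains "new york" plus another standalone "new" would still match, because only the overlapping occurrence is removed.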