Google Developer Relations
Introduction
This lesson teaches you how to sort the results of a search query in the order you want:
- You can sort on a field value or on expressions that include it.
- You can sort on a score based on term frequency in the document, or you can use that score in a more complex ranking expression.
- Sorts can be multidimensional, sorting primarily on one field expression, then secondarily on another, and so on.
The example application lets a user sort on the relevance measure or perform two-dimensional sorts on document fields.
Objectives
Learn how to sort the results of a Search API search query
Prerequisites
The precursor to this class, Getting Started with the Python Search API
You should also have:
- Python 2.7 and the Google App Engine SDK for Python
- Familiarity with Python and the basics of App Engine applications
The SortOptions class
The SortOptions class provides the basis for defining query result sort orders.
Once defined, a SortOptions object is used as the sort_options parameter of a
QueryOptions object:
from google.appengine.api import search
search.QueryOptions(
    sort_options=search.SortOptions(...),
    ...)
Sort expressions
You can pass the SortOptions constructor an expressions argument, which is an
iterable of SortExpression objects.
This lets you perform a multidimensional sort based on document field values.
In the example application's docs.Product class, you'll see a list like this:
_SORT_OPTIONS = [
    [AVG_RATING, 'average rating', search.SortExpression(
        expression=AVG_RATING,
        direction=search.SortExpression.DESCENDING, default_value=0)],
    [PRICE, 'price', search.SortExpression(
        expression=PRICE,
        direction=search.SortExpression.ASCENDING, default_value=9999)],
    [UPDATED, 'modified', search.SortExpression(
        expression=UPDATED,
        direction=search.SortExpression.DESCENDING, default_value=1)],
    [CATEGORY, 'category', search.SortExpression(
        expression=CATEGORY,
        direction=search.SortExpression.ASCENDING, default_value='')],
    [PRODUCT_NAME, 'product name', search.SortExpression(
        expression=PRODUCT_NAME,
        direction=search.SortExpression.ASCENDING, default_value='zzz')]
]
This defines one SortExpression object for each of a subset of product document fields.
In each definition, the expression parameter is simply the field name. Later
you will see some other expression variants. Notice that each sort expression
defines a sort direction (ASCENDING or DESCENDING) and a default value, used
if a document doesn't include the given field in the expression.
In handlers.py, the ProductSearchHandler.doProductSearch() method uses these
definitions to define a two-dimensional sort: it builds an ordered list of
SortExpression objects and passes this list as the expressions parameter of
the SortOptions constructor. If the user requests a sort by average rating, the
secondary sort is by price:
sortopts = search.SortOptions(expressions=[
    search.SortExpression(
        expression=docs.Product.AVG_RATING,
        direction='DESCENDING', default_value=0),
    search.SortExpression(
        expression=docs.Product.PRICE,
        direction='ASCENDING', default_value=9999)])
Otherwise (for example, if the user requests a sort by price), the secondary sort is by average rating:
sortopts = search.SortOptions(expressions=[
    search.SortExpression(
        expression=docs.Product.PRICE,
        direction='ASCENDING', default_value=9999),
    search.SortExpression(
        expression=docs.Product.AVG_RATING,
        direction='DESCENDING', default_value=0)])
Then, as before, you pass this object to the QueryOptions constructor:
search.QueryOptions(
    sort_options=sortopts,
    ...)
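To make the effect of the ordered expressions list concrete, here is a minimal pure-Python sketch of what a two-dimensional sort with default values amounts to. It does not call the Search API, which performs the sort on the server; the sort_products helper, the sample products list, and the dictionary keys are hypothetical and only illustrate the ordering described above.

```python
# Illustration only: emulates "average rating descending, then price
# ascending" (or the reverse priority) on plain dictionaries.

def sort_products(products, primary='avg_rating'):
    """Return products in two-dimensional sort order.

    Missing fields fall back to the defaults used in the lesson:
    0 for the rating and 9999 for the price.
    """
    def rating_key(p):
        # Negate so that a higher rating sorts first (descending).
        return -p.get('avg_rating', 0)

    def price_key(p):
        # A lower price sorts first (ascending).
        return p.get('price', 9999)

    if primary == 'avg_rating':
        keys = (rating_key, price_key)
    else:
        keys = (price_key, rating_key)
    # Tuples compare element by element, so the first key dominates
    # and the second key only breaks ties.
    return sorted(products, key=lambda p: tuple(k(p) for k in keys))


products = [
    {'name': 'A', 'avg_rating': 4.0, 'price': 30},
    {'name': 'B', 'avg_rating': 4.0, 'price': 20},
    {'name': 'C', 'price': 5},         # no rating: treated as 0
    {'name': 'D', 'avg_rating': 5.0},  # no price: treated as 9999
]

by_rating = [p['name'] for p in sort_products(products)]
by_price = [p['name'] for p in sort_products(products, primary='price')]
```

In the rating-first ordering, A and B tie on rating, so the cheaper B comes before A; the product without a rating sorts last because its default is 0.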
Using expression functions in a sort expression
The expression property in a SortExpression object can be more than just a
field name: you can also use the expression functions (such as max) that were
mentioned in the previous lesson. For example, the following sort expression
sorts by the value of the price field, adjusted to include sales tax:
search.SortExpression(
    expression='price * 1.08',
    direction=search.SortExpression.ASCENDING, default_value=9999)
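If the tax rate is not fixed, the expression string can be built at request time. The sketch below uses a hypothetical helper, taxed_price_expression, which is not part of the Search API and only formats the string; the result would then be passed as the expression argument of search.SortExpression.

```python
def taxed_price_expression(tax_rate, field='price'):
    """Build a sort expression string such as 'price * 1.08'.

    tax_rate is a fraction (0.08 means 8%). Hypothetical helper for
    illustration; it only produces the expression text.
    """
    if tax_rate < 0:
        raise ValueError('tax_rate must be non-negative')
    # Round the multiplier to avoid floating-point noise in the string.
    multiplier = round(1 + tax_rate, 4)
    return '%s * %s' % (field, multiplier)


expr = taxed_price_expression(0.08)
```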
Sorting GeoPoint fields by distance
Finding all stores within a given distance is nice, but it would be even better
to sort them by how far away they are. This can be done by adding a sort
expression using the built-in distance function, which measures distance in
meters. The following query finds all stores within 45 kilometers (45,000
meters) of the user's location and sorts them by ascending distance, from
nearest to farthest:
index = search.Index(config.STORE_INDEX_NAME)
user_location = (-33.857, 151.215)
query = "distance(store_location, geopoint(%f, %f)) < %f" % (
    user_location[0], user_location[1], 45000)
loc_expr = "distance(store_location, geopoint(%f, %f))" % (
    user_location[0], user_location[1])
sortexpr = search.SortExpression(
    expression=loc_expr,
    direction=search.SortExpression.ASCENDING, default_value=45001)
search_query = search.Query(
    query_string=query,
    options=search.QueryOptions(
        sort_options=search.SortOptions(expressions=[sortexpr])))
results = index.search(search_query)
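For intuition about what this query does, the following pure-Python sketch filters and sorts a small list of stores by distance. It is an approximation, not the Search API's implementation: it assumes a spherical Earth and uses the haversine formula, and the helper names (haversine_m, nearby_stores) and the sample stores are hypothetical.

```python
import math

EARTH_RADIUS_M = 6371000.0  # mean Earth radius in meters (approximation)


def haversine_m(a, b):
    """Great-circle distance in meters between two (lat, lon) pairs."""
    lat1, lon1 = math.radians(a[0]), math.radians(a[1])
    lat2, lon2 = math.radians(b[0]), math.radians(b[1])
    dlat, dlon = lat2 - lat1, lon2 - lon1
    h = (math.sin(dlat / 2) ** 2 +
         math.cos(lat1) * math.cos(lat2) * math.sin(dlon / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def nearby_stores(stores, user_location, max_distance_m):
    """Keep stores closer than max_distance_m, ordered nearest first."""
    with_distance = [(haversine_m(user_location, s['location']), s)
                     for s in stores]
    in_range = [pair for pair in with_distance if pair[0] < max_distance_m]
    in_range.sort(key=lambda pair: pair[0])
    return [s for _, s in in_range]


user_location = (-33.857, 151.215)
stores = [
    {'name': 'C', 'location': (-37.81, 144.96)},  # hundreds of km away
    {'name': 'B', 'location': (-33.95, 151.25)},  # roughly 11 km away
    {'name': 'A', 'location': (-33.86, 151.21)},  # under 1 km away
]
nearest_first = [s['name'] for s in nearby_stores(stores, user_location, 45000)]
```

The threshold plays the role of the query string's distance filter, and the sort key plays the role of the ascending SortExpression.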
Match scoring
The MatchScorer class is used to sort documents in descending order of score,
using a score based on term frequency. A MatchScorer object is passed as the
match_scorer parameter of the SortOptions constructor:
sortopts = search.SortOptions(match_scorer=search.MatchScorer())
search.QueryOptions(
    sort_options=sortopts,
    ...)
The example application uses this sort when the user selects relevance from the query sort menu in the UI.
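This lesson does not describe the exact formula MatchScorer uses. As a rough intuition for sorting by a term-frequency score, the toy sketch below counts occurrences of the query terms in each document and ranks the documents from highest to lowest count. The function names and sample documents are hypothetical, and the scoring is deliberately simplistic.

```python
def term_frequency_score(text, query_terms):
    """Count how often the query terms occur in text (case-insensitive)."""
    words = text.lower().split()
    return sum(words.count(term.lower()) for term in query_terms)


def rank_by_relevance(docs, query_terms):
    """Return doc ids ordered from highest to lowest toy score."""
    scored = [(term_frequency_score(text, query_terms), doc_id)
              for doc_id, text in docs.items()]
    # Descending score; ties fall back to doc id for a deterministic order.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [doc_id for _, doc_id in scored]


docs = {
    'd1': 'red stapler',
    'd2': 'red red stapler with red trim',
    'd3': 'blue pen',
}
ranking = rank_by_relevance(docs, ['red'])
```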
You can also access a document's score in a sort expression, using the special
field name _score. In this case, your SortOptions object should include a
scorer: for example,
search.SortOptions(
    match_scorer=search.MatchScorer(),
    expressions=[search.SortExpression(...), ...])
Summary and review
In this lesson, you learned how to define multidimensional sorts on your search
results by specifying SortExpression and MatchScorer objects and using them
to build the SortOptions object passed to the QueryOptions constructor.
Try experimenting with sort expressions a bit more, to check that you understand
how to use them. For instance, try changing the sort dimensions defined by the
SortOptions objects in ProductSearchHandler.doProductSearch(), by performing
a secondary sort on a different dimension. Or try changing the sort direction
for one of the sort dimensions.
In the next lesson, you'll learn how to retrieve, delete, and re-index Search API Documents.