Order

In my discussion of RethinkDB, I mentioned that ReQL is inspired by functional languages.

What I neglected to mention is that this will serve as an initial stumbling block for some developers. Being functional, it makes use of chaining, and results from one function are fed into the next. This means that order matters in ReQL.

This does open up some power you don’t have in SQL (and some non-SQL query languages too), but simultaneously introduces a potential for error.

Let’s take a look at a simple use case: We want the results ordered, but we only want 5 of them.

In standard SQL, this is unambiguous. There’s only one allowable way to write the query. If you try to LIMIT before ORDERing, your DB (PostgreSQL, in this case) will drop the hammer:

SELECT avg_max_f FROM heat_index_data WHERE state = 'Arizona' LIMIT 5 ORDER BY date DESC;

ERROR:  syntax error at or near "ORDER"
LINE 1: ...M heat_index_data WHERE state = 'Arizona' LIMIT 5 ORDER BY d...

Write it correctly, with LIMIT at the end, and we’re good:

SELECT avg_max_f FROM heat_index_data WHERE state = 'Arizona' ORDER BY date DESC LIMIT 5;

weather_dev=# SELECT avg_max_f FROM heat_index_data WHERE state = 'Arizona' ORDER BY date DESC LIMIT 5;
 avg_max_f
-----------
     59.16
     60.57
     58.51
     54.11
      51.8
(5 rows)

But ReQL is functional. It doesn’t care where you stick your limits or orderBys and it doesn’t care how many of them you have. You can orderBy(r.asc(date)).orderBy(r.desc(date)) for all it cares. Data comes out of one function and is fed into the next.

So we can limit before orderBy:

r.getAll('Arizona', {index: 'state'}).limit(5).orderBy('date').pluck('avg_max_f')

[
  {
    "avg_max_f": 48.81
  } ,
  {
    "avg_max_f": 44.35
  } ,
  {
    "avg_max_f": 56.4
  } ,
  {
    "avg_max_f": 66.15
  } ,
  {
    "avg_max_f": 57.04
  }
]

And we can orderBy before limit:

r.getAll('Arizona', {index: 'state'}).orderBy('date').limit(5).pluck('avg_max_f')

[
  {
    "avg_max_f": 48.81
  } ,
  {
    "avg_max_f": 41.85
  } ,
  {
    "avg_max_f": 40.26
  } ,
  {
    "avg_max_f": 45.54
  } ,
  {
    "avg_max_f": 48.79
  }
]

…and as is plainly visible here, you get different results because they’re different queries.

Wanting to limit before ordering is certainly an edge case (which is why SQL doesn’t allow for it), but it — along with many other unconventional queries — is perfectly possible with the limited constraints of ReQL. If you run into unexpected results working with RethinkDB, take a step back and ask yourself: “What am I really asking for here? Does it make sense?”