Blog

active-record hash versus array syntax

While migrating a ruby app using active-record from using character columns in our postgres database to uuids I learned about a difference in using the hash syntax versus the array syntax in where clauses.

TLDR; Prefer the hash syntax over array syntax. When using data from an untrusted source like the web in a where clause for non string columns you will need to validate it like you’d validate your models using a save if you want predictable behaviour.

Consider the following table and test data

create_table :players do |t|
  t.uuid :player_id
  t.number_of_games :number_of_games
end

class Player < ActiveRecord::Base
end

Player.create!(player_id: SecureRandom.uuid, number_of_games: 1)
Player.create!(player_id: SecureRandom.uuid, number_of_games: 0)

It is not uncommon for applications to use inputs from the web in queries that are directly taken from an url. For instance consider the following Sinatra code.

get '/player/:id' do
  Player.where(player_id: params[:id])
end

Unlike when saving ActiveRecord::Base objects, where you typically validate before saving, validation is usually skipped when querying. Since active-record takes case of building the query in order to avoid injection attacks this is considered safe.

Now let’s do some querying to see the difference between the hash and array syntax:

# The hash syntax
[8] pry(main)> Player.where(player_id: 'foo').first
DEBUG, 20161118 13:49:14-363:   Player Load (0.2ms)  SELECT  "players".* FROM "players" WHERE "players"."player_id" = $1  ORDER BY "players"."id" ASC LIMIT 1  [["player_id", nil]]
=> nil

Nothing is returned. This makes sense since 'foo' isn’t a valid uuid. As you can see active-record converts this to player_id = NULL which always yields to false since NULL = NULL is never true (unlike NULL is NULL).

What about the array syntax?

# the array syntax
[9] pry(main)> Player.where('player_id = ?', 'foo').first
DEBUG, 20161118 13:58:36-126:   Player Load (8.1ms)  SELECT  "players".* FROM "players" WHERE (player_id = 'foo')  ORDER BY "players"."id" ASC LIMIT 1
ActiveRecord::StatementInvalid: PG::InvalidTextRepresentation: ERROR:  invalid input syntax for uuid: "foo"
LINE 1: ...CT  "players".* FROM "players" WHERE (player_id = 'foo') ORDER BY "players"."id" ASC LIMIT 1

Unlike the hash syntax this fails with a ActiveRecord::StatementInvalid. I was never aware of this difference until we migrated to actual uuid‘s in the database. Some specs failed that were using invalid uuid’s as test data. We fixed the code by always using the hash syntax (since that is the behaviour we wanted)

Then we also checked how this would work for integer columns…

First let’s check the array syntax:

# the array syntax
[19] pry(main)> Player.where('number_of_games = ?', 'foo').first
DEBUG, 20161118 14:14:30-711:   Player Load (1.3ms)  SELECT  "players".* FROM "players" WHERE (number_of_games = 'foo')  ORDER BY "players"."id" ASC LIMIT 1
ActiveRecord::StatementInvalid: PG::InvalidTextRepresentation: ERROR:  invalid input syntax for integer: "foo"
LINE 1: ...layers.* FROM "players" WHERE (number_of_games = 'foo') ORDER BY "players"."id" ASC LIMIT 1

This is consistent with the uuid example, both fail with a ActiveRecord::StatementInvalid.

How about the hash syntax?

# The hash syntax
[18] pry(main)> Player.where(number_of_games: 'foo').first
DEBUG, 20161118 14:08:09-310:   Player Load (0.2ms)  SELECT  "players".* FROM "players" WHERE "players"."number_of_games" = $1  ORDER BY "players"."id" ASC LIMIT 1  [["number_of_games", 0]]
=> #<Player:0x007fed85cfeef8 id: 2, player_id: "0728be62-e766-4c29-ad52-be4a43dc7f9f", number_of_games: 0>

Wait what? This actually returns our second player because the value 'foo' is transformed into 0. Although this is reported as bug a couple of times, it was dismissed because it is considered consistent with a create or update operation. So I guess we have to live with this difference for now. If you were not aware of this, it leads to unexpected behaviour when using it like:

get '/players/number_of_games/:count' do
  Player.where(number_of_games: params[:count])
end

As we saw this will return all players with 0 number_of_games when querying it like ‘/players/number_of_games/foobar’

So to have the same behaviour of uuid's (and predictable behaviour) the above code can to be written as

get '/players/number_of_games/:count' do
  valid_integer?(params[:count]) ? 
    Player.where(number_of_games: params[:count]) : 
    Player.none
end

helpers do
  def valid_integer?(string)
    Integer(string)
  rescue
    false
  end
end