We were unable to load Disqus. If you are a moderator please see our troubleshooting guide.
Thanks!
Thanks for the detailed performance comparation. Except the app layer type convert using HSTORE, how do you feel the performance against jsonb?
I didn't measure this use case, but if the JSON structure is as flat as the hstore, I don't think the performance will be any different, because both column types can benefit from GIN/GiST indexes.
Thanks for the article! Is there any size difference between data stored in json and jsonb columns? If you had e.g. 10 million users, would there be any noticeable difference?
relation | size
--------------------+-------
public.users | 372 MB
public.customers | 372 MB
The users table has the jsonb column, while customers table has the json column. I don't know if there are other implications, but looks like there's no difference in terms of storage usage.
Thanks for checking it out
Nice :), sharing
I'm glad you liked! ;)
Hello, how does the custom serializer load method works if hash is getting into this method as a string?
With the default column value '{}' I get back a string and with_indifferent_access obviously won't work, resulting in an undefined method `with_indifferent_access' for "{}":String when I try to instantiate a record.
Generally I learned that if want to use a serialized field I should leave text as datatype and let the coder interface do the rest. I had some luck with hstore, though.
Does it still makes sense to use json/jsonb as data type instead of a text/hstore and let rails/your custom serializer do the rest?
Personally I hacked this class to allow for some options and Oj: https://gist.github.com/mas...
Thanks for the writeup.
how does the custom serializer load method works if hash is getting into this method as a string?
I don't think that's happening at all. The serializer will always receive the object after the type cast is performed. That means that you always receive an object or nil (so that's why I have a guard for nils). Here's a script that shows that.
If you're having different results, could you create a sample demonstrating the behavior you're describing?
your example code is giving me two errors, the first is that probably you meant `store` instead of `store_accessor`? afaik it's `store` that accepts a coder.
Having fixed that I get this using the provided repro script:
https://gist.githubusercont...
which is the same error as in my app.
Postgres 9.4.1
I changed a thing on the article. Since HashWithIndifferentAccess is a loaded constant (ActiveSupport), I don't think your serializer with the same name is going to be loaded, unless you explicitly do it so. I renamed it to HashSerializer instead.
That wouldn't justify the error (about string not having with_indifferent_access method), but still...
Here's a sample app with tests included.
Yeah I noticed app/serializers, at first I thought it to be a new autoloaded rails dir (I recall something about files being automatically preloaded when in /app -- and I just upgraded to 4.2), however I quickly noticed it wasn't the case and did a require_dependency and it started to take into account the custom code (but not automatically reloaded); I also checked for good measure if there was already a dump/load method implementation in the original class but I wasn't able to find it.
I'm redoing some of my migrations, it's a complex project with a huge codebase and I'm trying to archive something into hstore or json datatypes, so it's quite a slow process, but perhaps I noticed that some columns still were json and not jsonb, I'm not sure yet if that will make a difference or not when all the involved columns will be jsonb. Will keep you posted, thanks for the sample code and clarifications.
I love you.
One note, Rails 4.1 supports json but not jsonb, you need to add
serialize :col, JSON
to make work jsonb in rails 4.1
When upgrading to Rails 4.2, will I need to remove all occurrences of `serialize :col, JSON`? That is, what does this do in 4.2, assuming `:col` is a `jsonb` column?
Yes, you have remove the serialize method from the model
Thanks for the great post. There is a small error in your Migration: The executed SQL command in the change method will yield a ActiveRecord::IrreversibleMigration on a rake db:rollback. This is due to the fact that rails doesn't know how to revert what happens inside of the SQL command. An easy fix for this is to move the adding of the index to the up method and the removal of the index to the down method like so:
class AddIndexToPreferencesPathOnUsers < ActiveRecord::Migration
def up
execute <<-SQL
CREATE INDEX user_prefs_newsletter_index ON users ((preferences->>'newsletter'))
SQL
end
def down
remove_index :users, name: :user_prefs_newsletter_index
end
end
hey coffeejunk long time no see and now finding you on the comments #rgffm :)
😇
Awesome writeup, thanks for sharing!
This small patch adds support for defining expression indexes without dumping to SQL: https://gist.github.com/shu.... Hopefully support gets added to Rails soon, but this has been working well for me in the meantime.
I have read this article several times but I get this error when I try to add a jsonb column PG::UndefinedObject: ERROR: type "jsonb" does not exist !!! my psql --version is 9.4.1 and rails -v is 4.2, any help please
Try re-installing the pg gem (in order to re-build the native extensions) - but first make sure you're really using postgresql 9.4 and rails 4.2
Great write-up in general.
Two comments:
1. `app/serializers` dir isn't auto-loaded, so you'll need to add it to autoload paths in your `application.rb` like so:
config.autoload_paths += %W(#{config.root}/app/serializers)
2. With your `HashSerializer`, I get a "ActiveRecord::StatementInvalid: PG::NotNullViolation: ERROR: null value in column "fields" violates not-null constraint" when trying to save an empty hash `{}`. This seems to get serialized to JSON `null`, even though `{}.to_json` is "{}". Any hints? How did you work around this? I don't have this issue when not using the custom serializer.
1. This is not true for a while. All directories inside app are automatically added to the load path, as per https://github.com/rails/ra....
2. Can you upload a sample app replicating this issue somewhere?
1. Hmm, I just had this issue though – Rails 4.2.0.
2. If I get the chance, sure. A little strapped for time currently.
Well, if you created the app/serializers directory you have to restart your application the first time, since the directory list isn't refreshed on every request. That's probably what happened here.
2. I came across that same issue and figured out why this happens (at least in my case): ActiveRecord::Type::Serialized#type_cast_for_database checks if the value is #default_value?. If so, it returns nothing which leads to nil as type casted value that is supposed to be put in the database colum.
This can be fixed by inserting the following to the beginning of HashSerializer.load:
return nil if hash.nil?Thus #default_value? will be false, since coder.load(nil) does not equal {}. Not sure if this is an elegant solution but it works and does not cause any side effects on my test cases.
But I don't really understand why this is happening... for what reason default values are dropped (happens on an update action). Allthough it seems that only json columns are affected.
I ran into this problem as well. Your fix worked for me. it is a mystery.
I am also using json columns (not jsonb) and that may be related to it.
json is stored as a string, so it may be a casting issue. I've seen some other comment on the web about people needing to aggressively cast the {} as a default value in their migrations
It worked for me. I was having the same problem when trying to save twice the same object. Ex:
payment.save
payment.save
I got this error at the second time calling 'save'. I will try to find out why. I saw that it calls 'load' in the serializer many times when trying to save the object the second time.
Just verified another thing:
to load the object, it called 'load' 4 times.(rails c, Payment.last)
to save the object the first time, it called 'load' 2 times. (payment.save)
to save the object the second time, it called 'load' 13 times. (payment.save)
Weird. =/
I am using jsonb.
For a majority of things the content you would store in a json or jsonb column wouldn't need translation such as a user preference. Also this type isn't a one type fits all. Choose what your app needs not what is easiest or coolest at the time
Very helpful, Nando! Muito top!
And if you want to add a lot of store accessors by using a method, do something like this:
after_initialize :add_preferences_accessors
...
def preferences_keys
[ :twitter, :github, :facebook, :blog, :language ]
end
def add_preferences_accessors
self.class.store_accessor :preferences, *preferences_keys
end
This allows you to have default preferences properties and values, like so:
after_initialize :add_preferences_accessors
...
def preferences_defaults
{ language: "en", font_size: "16", color: "black", background: "white" }
end
def add_preferences_accessors
self.class.store_accessor :preferences, *(preferences_defaults.keys)
end
This one actually works better than the serializer's use shown
So... when using JSONB + Rails4.2, do I still need to use the serialize :column_name, JSON in the model?
No you don't need to add serialize for 4.2
Great post, thank you!
nice :)
So happy to find this article. Thank you
Just a quick note, I am using Rails API only mode. It took me some time to notice it, but the HashSerializer obviously has also to inherit from ActiveModel::Serializer in order to use jsonb symbol querying from models.
Thanks for the post. Helped a lot.
Thank you, so much, for this.
this post being highly informative for many I guess, have to say that, as of today, with latest rails version, the hash_serializer.rb thing shown here does not seem useful - returns an error
O.k. even though this is about a year old it's the best explanation of how to use jsonb with rails.
In fact it's better than that. It's awesome.
I think the best way to use it (at the moment) to is add one column - for example "metadata" - to a traditional table and add all attributes to metadata that you want.
What's amazing is the getting and setting of attributes is so easy and they can be nested.
thing.metadata["sgt_pepper"] = "Lonely Hearts Club Band"
thing.save
<%=thing.metadata["sgt_pepper"]%>
I really wonder what mysql is going to do if postgres keeps innovating so quickly, and defining functionality first, and frameworks like rails cater to that specific database functionality.
Anyway this post is killer, thanks so much
Great post but why I don't see how to push key-value to an existing jsonb field ?
How can I setup the if twitter was a hash twitter{user1:"",user2:""}
This is pure gold, thank you!!
If the jsonb is a array like: [{"from": "contact@test.com", "subject": "SUBJECT", "reply_to": "contact@test.com", "transaction_id": "ce0d9434"}, {"from": "new_contact@test.com", "subject": "NEW SUBJECT", "reply_to": "contact@test.com", "transaction_id": "ce0d945f"}]
how is the better way to find or update a object in array by transaction_id ?
O.k. I'm WAY late to the party here but this is the best article I've seen on indexing a jsonb column for postgres 9.5 and rails 4.2.
As I wrote earlier (see above) I'm just use a "metadata" jsonb column...but it's nested and it looks something like this in the user table in the db:
{"something": "here", "subscribed_ids": [49, 56, 57, 102], "unsubscribed_ids": [1], "more":"stuff...etc.}
Now I think I know the answer to this question but would adding a GIN index on the metadata column need an expression for subscribed_ids and unsubscribed_ids? Or is the index enough? And would it speed up a query like the following (I'm obviously writing this by hand at the moment - so here's the query in rails):
sql = "SELECT count(*) as readers FROM users WHERE banned = false AND metadata @> \'{\"subscribed_ids\": [" + id.to_s + "]}\'"
There are a lot of moving parts here and I don't quite understand if just a GIN index is what I want or if I need an expression. I'm also assuming I'll need the active_record.rb addition and the execute...I think.
Basically I'm just trying to add an index on the two arrays to speed things up.
Any help would be greatly appreciated.
I can't tell you how many times I've referenced this article over the past few years. Thank you so much.
really nice post!