Find/Replace on a JSON Object Graph

Today I had cause to implement a method for finding and replacing a value that appears at the end of a certain JSON path in an object graph. I couldn't find a preexisting tool to the dirty work so I wrote it myself and then this article. :)

Here's a concrete example. Imagine you have the following JSON:

Now imagine that you want to make the shirt color of every employee with a status of awesome orange. Why? No clue. Work with me here. How would you accomplish that?

After looking for someone else having already done this, I set out to do it myself and was surprised at how simple this was in Ruby. The technique I thought of was to search through every  node in the object graph and call a special replace function on each one. If the given node matched the criteria, then it or its children would be updated accordingly.

The following code amounts to a depth first search of the object graph

Really all of the "magic" is in the search method which really just knows how to enumerate either a Hash or an Array, call the replace method and then recursively search its children. If the child object fails some aspect of the replace criteria, nothing happens, we just move on searching through all of its children's Hash or Array children and so on until no other options exist.

What's most surprising to me is the simplicity and elegance of the solution. I probably spent more time looking for an alternate than it took me to write that code.

Now before you think that this will only work in the simplest of cases, here is the actual replacement code I needed for my real world scenario:

5 responses
this is interesting (also, btw, you're on hn). i've been working on something similar in python, but which uses "types" to help target changes. i have no idea if ruby has a concept of types (i imagine they're similar to classes), but you might consider a similar idea.

for example, you might define a routine that says "replace Cls(X) with Y" where Cls(X) indicates the class of X. that's trivial, of course, but you could also extend it (if Ruby has a framework for types, or you add one yourself, which is what I was doing in Python) to something more like "replace Map(a=int, b=Seq(Opt(int)) with Y" where Map(...) is a dictionary that contains a name a (whose value is an interger) and a name b (whose value is optional ints - in Python that means an int or None).

python (like ruby, i think) isn't "(statically) typed", but that doesn't mean the ideas of types can't be useful in some situations...

I'm sure the quick coding will be off by an index somewhere, but I don't see why this has to be complex?

Straghtforward Python is:

company_data = json.loads(JSON_STR)

for e in company_data['Empoyees']:
if "Awesome" in e['Status']:
e['ShirtColor'] = 'Purple'

I'm sure it's even easier with a map and a lambda

Obviously mine's situational specific but walking the dictionary tree should be straightforward enough.

Jay:
That's a great question (Good eye!). My original use case for this is over a very large file of JSON that has the given path I'm looking for nested at multiple unknown levels. I've updated the sample in this article to reflect a slightly more complicated version that illustrates my point better (and that's also a bad example of how to model data in JSON ;) )

That's why I can't simply enumerate and why I need to check for Hash or Array which is the complexity I believe you're referring to.

Andrew:
I don't quite get what you're gunning for. Do you have a link to somewhere I can read more about it or see a full concrete example in Python or Ruby?

Instead of:

node.class() == Hash

You can write:

node.class == Hash

But that will fail for HashWithIndifferentAccess, so use:

node.is_a? Hash

Which is (typically) equivalent to:

Hash === node

Which is how case statements work:

case node
when Hash
...
when Array
...
end

I guess an alternative approach would be to make it into a string and do one or a few regexps, should be fairly simple. I have no clue which one would be faster.