Nested queries in MongoDB shell (almost joins)

I’m going to show you how to easily query data across multiple collections in MongoDB shell. Which is almost equal to SQL’s JOIN but not ;)

Assume we have two collections users

> db.users.insert({_id: "usr-1", name: "Tom"})
> db.users.insert({_id: "usr-2", name: "John"})
> db.users.insert({_id: "usr-3", name: "Kate"})

> db.users.find()

{ "_id" : "usr-1", "name" : "Tom" }
{ "_id" : "usr-2", "name" : "John" }
{ "_id" : "usr-3", "name" : "Kate" }

and posts

> db.posts.insert({_id: "pst-1", user_id: "usr-1", body: "body 1", tags: ["apples", "oranges"]})
> db.posts.insert({_id: "pst-2", user_id: "usr-1", body: "body 2", tags: ["apples"]})
> db.posts.insert({_id: "pst-3", user_id: "usr-2", body: "body 3", tags: ["oranges"]})
> db.posts.insert({_id: "pst-4", user_id: "usr-1", body: "body 4", tags: []})
> db.posts.insert({_id: "pst-5", user_id: "usr-3", body: "body 5", tags: ["apples"]})

> db.posts.find()

{ "_id" : "pst-1", "user_id" : "usr-1", "body" : "body 1", "tags" : [ "apples", "oranges" ] }
{ "_id" : "pst-2", "user_id" : "usr-1", "body" : "body 2", "tags" : [ "apples" ] }
{ "_id" : "pst-3", "user_id" : "usr-2", "body" : "body 3", "tags" : [ "oranges" ] }
{ "_id" : "pst-4", "user_id" : "usr-1", "body" : "body 4", "tags" : [ ] }
{ "_id" : "pst-5", "user_id" : "usr-3", "body" : "body 5", "tags" : [ "apples" ] }

…and let’s say we need to find all the users who have at least one post tagged with oranges.

There are quite a few options to do it in half manual way or with MongoDB v3.2 and new $lookup operator from aggregation framework, but here is a pretty simple solution which is easy to remember and it’s a one liner:

db.users.find({_id: {$in: db.posts.find({tags: {$in: ["oranges"]}}).map(function(post){ return post.user_id })}})

{ "_id" : "usr-1", "name" : "Tom" }
{ "_id" : "usr-2", "name" : "John" }

Let’s break it down:

  • First part is just a standard finding users with _id matching values $in a given array db.users.find({_id: {$in:...})

  • Second part is a sub query which uses JavaScript map() function on the query cursor and maps results into array of users’ _id (as we know mongodb query does not return an array of objects, but cursor)

  • Inside the map() functions we specify a callback functions and what field we want to return from every post object. If we run just the second query we would get an array of users’ _id

> db.posts.find({tags: {$in: ["oranges"]}}).map(function(post){ return post.user_id })

[ "usr-1", "usr-2" ]
  • So basically under the hood our original query becomes this
> db.users.find({_id: {$in: [ "usr-1", "usr-2" ]}})

Another trick is if we have a case when second query users .FindOne() which returns a singular object instead of cursor, we do not need to use map() function and our query becomes a bit simpler

> db.users.find({_id: db.posts.findOne({_id: "pst-5"}).user_id})

{ "_id" : "usr-3", "name" : "Kate" }

Hope you enjoyed it! And keep hacking!


Related Posts

How to remove multiple Docker images

A simple shell command will help you to remove multiple docker images.

Good to see you automation for meetups

Simple JavaScript functions which does say 'Good to see you' to everybody in one action.