Using MongoDB in Clojure

I’ve already shown you in previous posts how to do simple CRUD queries and how to run aggregate functions in the Mongo console. I’ve also shown you how to use MongoDB with Kotlin. Today we’ll look at how to do these things from Clojure using the Monger library.

Monger is a wrapper around the MongoDB Java driver that exposes its functionality in an idiomatic Clojure way. As in my previous posts, we’ll again use the JSON dump of the CIA factbook.

First we’ll create a project using Leiningen:

lein new mongo-factbook

This will create a directory mongo-factbook. Enter the directory and open the file project.clj. Then add the following dependency:

 [com.novemberain/monger "3.5.0"]

Note that I had some trouble when using Monger in combination with ClojureScript, as it uses an older Guava version than ClojureScript does. So if you use Monger in a project that also uses ClojureScript and you get errors during cljs compilation, exclude Guava:

 [com.novemberain/monger "3.5.0" :exclusions [com.google.guava/guava]]
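For orientation, the dependency goes into the :dependencies vector of project.clj. The following is only a sketch of what the file might look like; the project version, the description, and the Clojure version are assumptions based on a typical Leiningen template and are not taken from my actual project:

```clojure
(defproject mongo-factbook "0.1.0-SNAPSHOT"
  :description "Querying the CIA factbook in MongoDB from Clojure"
  ;; The Clojure version is an assumption; keep whatever Leiningen generated.
  :dependencies [[org.clojure/clojure "1.10.0"]
                 ;; Append :exclusions [com.google.guava/guava] to this entry
                 ;; if you run into the ClojureScript conflict described above.
                 [com.novemberain/monger "3.5.0"]])
```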

Now you are ready to use Monger in your project. To connect to a MongoDB server, you have to configure your connection. Add the following lines to the file src/mongo_factbook/core.clj:

(ns mongo-factbook.core
  (:require [monger.core :as mg]
            [monger.collection :as mc])
  (:import org.bson.types.ObjectId))

(def conn (mg/connect))

(def db (mg/get-db conn "factbook"))

Now you can use the db object to query the factbook database of the MongoDB server running on localhost at the default port. It doesn’t matter whether the database already exists, as Mongo automatically creates a database when you insert something into one of its collections.

The monger.collection namespace and the ObjectId type will be used in later steps.

You can of course connect to any other server by providing additional parameters:

(def conn (mg/connect {:host "some.other.server" :port 7878}))

There are of course more advanced options, which you can look up in the Monger documentation.
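One such option is connecting with a MongoDB connection URI through mg/connect-via-uri, which returns a map containing both the connection and the database. Here is a minimal sketch, assuming the placeholder server from above and the factbook database; the helper name with-factbook-db is made up for illustration:

```clojure
(ns mongo-factbook.uri-example
  (:require [monger.core :as mg]))

;; Host and port are placeholders; replace them with your own server.
(def uri "mongodb://some.other.server:7878/factbook")

(defn with-factbook-db
  "Connects via the URI, calls f with the database, and always disconnects.
  Hypothetical helper, not part of Monger."
  [f]
  (let [{:keys [conn db]} (mg/connect-via-uri uri)]
    (try
      ;; f should fully realize lazy results (e.g. with doall),
      ;; because the connection is closed as soon as f returns.
      (f db)
      (finally
        (mg/disconnect conn)))))
```

Wrapping the connection like this keeps short scripts from leaking connections; in a long-running application you would more likely keep one connection open, as in the def above.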

Now that we have a connection object, let’s do the basic CRUD (create, read, update, delete) operations. We start with an insert:

  (mc/insert db "countries" {:name "Germany" :official-language "German" :population 82000000})

This will insert a new document into the “countries” collection of the “factbook” database.

To get documents from Mongo, you can query them with find (returns a DBCursor object) or with find-maps (returns a sequence of Clojure maps in which property names are converted to keywords).

  (mc/find db "countries" {:name "Germany"})
  (mc/find-maps db "countries" {:name "Germany"})

Next we will insert data from a JSON file into the database. We’ll use the CIA world factbook data. Download the file and put it into the “resources” directory of your project.

As we want a clean database before we start, we first clear our countries collection by calling remove without a filter:

  (mc/remove db "countries")

If we only wanted to remove some elements, we could have added a filter as the third parameter, e.g.

  (mc/remove db "countries" {:name "Germany"})
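The examples so far cover create, read, and delete. For the update step, Monger offers mc/update, which takes a filter and an update document. Here is a minimal sketch, assuming the Germany document inserted earlier; the function name set-population! and the new population value are made up for illustration:

```clojure
(ns mongo-factbook.update-example
  (:require [monger.collection :as mc]
            [monger.operators :refer [$set]]))

(defn set-population!
  "Sets the :population field of the country named country-name.
  db is a database handle as returned by mg/get-db."
  [db country-name population]
  ;; $set changes only the given field; without it, the matched
  ;; document would be replaced by the second map as a whole.
  (mc/update db "countries"
             {:name country-name}
             {$set {:population population}}))

(comment
  ;; Example calls at the REPL, with an illustrative value:
  (set-population! db "Germany" 83000000)
  (mc/find-one-as-map db "countries" {:name "Germany"}))
```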

Now let’s read the file and transform it into a Clojure map using the Cheshire library. We first need to add Cheshire as a dependency:

 [cheshire "5.8.1"]

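And then we can require Cheshire and execute the following code:

;; parse-string comes from Cheshire and has to be required first
(require '[cheshire.core :refer [parse-string]])

(def factbook-as-map (-> "factbook.json"
    clojure.java.io/resource
    slurp
    parse-string
    (get "countries")))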

We now have the factbook stored in a form like this:

{"germany" {"data" {"name" "Germany", ...} "metadata" {...}} ...}

Depending on your preference regarding ids for MongoDB documents, we can now either use generated ObjectIds for each country document or use the key provided by the document. I personally prefer generated ObjectIds, as I can still use the “name” attribute to look up a country by its name. Here is the code for it:

(doseq [[_ document] factbook-as-map]
  (mc/insert db "countries" (document "data")))
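Inserting the documents one by one issues a separate write per country. As an alternative, the "data" parts can be collected first and written in a single call with mc/insert-batch. The sketch below uses a tiny stand-in map instead of the real file; the sample entries and the helper name country-documents are assumptions for illustration:

```clojure
;; A stand-in with the same shape as factbook-as-map (sample entries only).
(def sample-factbook
  {"germany" {"data" {"name" "Germany"} "metadata" {"source" "example"}}
   "france"  {"data" {"name" "France"}  "metadata" {"source" "example"}}})

(defn country-documents
  "Returns the \"data\" part of every entry, dropping the keys and metadata."
  [factbook]
  (mapv #(get % "data") (vals factbook)))

;; Yields a vector with the two "data" maps, one per country.
(country-documents sample-factbook)

(comment
  ;; With Monger loaded and db defined as above, one call inserts everything:
  (mc/insert-batch db "countries" (country-documents factbook-as-map)))
```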

Now that our data is in the database, we can start querying it. Let’s say we want to get the information about the population of Germany. Then we can execute the following query:

  (mc/find-maps db
                "countries"
                {:name "Germany"}
                {:_id 0 :people.population 1 :name 1})

The first map specifies a filter and the second a projection. The :_id property is included implicitly unless you exclude it, so if you don’t want it, you have to set it to 0 in the projection map, as I did here. All other properties are omitted and don’t need to be excluded explicitly.

The last thing we will do is retrieve a list of all countries speaking a specific language (e.g. French) using Mongo aggregation pipelines.
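Because the projection uses a dotted path, the result still nests the selected value under :people, and find-maps hands back a sequence even when only one document matches. Here is a small sketch of reading the value back out; the shape of sample-result is an assumption about what the query returns, the :total value is a placeholder rather than a factbook figure, and population-of is a made-up helper:

```clojure
;; Assumed result shape; the number is a placeholder, not real data.
(def sample-result
  [{:name "Germany" :people {:population {:total 12345}}}])

(defn population-of
  "Returns the nested :people :population value of the first match, or nil."
  [results]
  (get-in (first results) [:people :population]))

(population-of sample-result)
;; => {:total 12345}
```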

First we will require Monger operators, so we can use them in our query:

  [monger.operators :refer :all]

Then we will define a function to get all countries speaking a language:

(defn countries-by-language [lang]
  (mc/aggregate db "countries"
                [{$match {:people.languages.language {$elemMatch {:name lang}}}}
                 {$unwind :$people.languages.language}
                 {$match {:people.languages.language.name lang}}
                 {"$addFields"
                  {:country :$name
                   :speakersPercentage :$people.languages.language.percent}}
                 {$project {:country 1 :speakersPercentage 1 :_id 0}}
                 {$sort {:speakersPercentage -1}}]))

Let’s recap, what we just did:

- We used $elemMatch to select all documents that have an element in the array “people.languages.language” whose “name” property equals the value of the variable lang. For example, the document {"people": {"languages": {"language": [{"name": "French"}, {"name": "German"}]}}} would match for lang="French".
- We unwind the documents that matched, so that for every entry in the “people.languages.language” array we get a separate document in which all other properties are the same. So our document from above turns into two documents:
  - {"people": {"languages": {"language": {"name": "French"}}}}
  - {"people": {"languages": {"language": {"name": "German"}}}}
- Next we filter these documents to keep only the ones that match our language.
- With “$addFields” we create the additional properties country (derived from name) and speakersPercentage (derived from people.languages.language.percent). The $addFields operator is not defined by the Monger version used here, so we had to write it as a string. You can do this for all operators, if you prefer.
- We then project to these two fields, as they are all we want to return.
- We sort by speakersPercentage, with -1 meaning descending order.
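To make the individual stages more tangible, here is a rough in-memory imitation of the same pipeline in plain Clojure. It is only a sketch for building intuition and does not talk to MongoDB; the sample documents, their percentages, and the function name countries-by-language-in-memory are invented for illustration:

```clojure
;; Invented sample documents in the shape used by the pipeline.
(def sample-countries
  [{:name "Country A"
    :people {:languages {:language [{:name "French" :percent 60}
                                    {:name "German" :percent 40}]}}}
   {:name "Country B"
    :people {:languages {:language [{:name "French" :percent 90}]}}}
   {:name "Country C"
    :people {:languages {:language [{:name "Spanish" :percent 100}]}}}])

(def language-path [:people :languages :language])

(defn countries-by-language-in-memory
  "Imitates the aggregation stages on a collection of Clojure maps."
  [countries lang]
  (->> countries
       ;; first $match with $elemMatch: keep countries listing the language
       (filter (fn [country]
                 (some #(= lang (:name %)) (get-in country language-path))))
       ;; $unwind: one document per entry of the language array
       (mapcat (fn [country]
                 (for [language (get-in country language-path)]
                   (assoc-in country language-path language))))
       ;; second $match: keep only the entries for the requested language
       (filter #(= lang (get-in % (conj language-path :name))))
       ;; $addFields and $project: keep just the two derived fields
       (map (fn [doc]
              {:country (:name doc)
               :speakersPercentage (get-in doc (conj language-path :percent))}))
       ;; $sort with -1: descending by percentage
       (sort-by :speakersPercentage >)))

(countries-by-language-in-memory sample-countries "French")
;; => ({:country "Country B", :speakersPercentage 90}
;;     {:country "Country A", :speakersPercentage 60})
```

Against the real database, the equivalent call is simply (countries-by-language "French").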

As you can see, it is pretty straightforward to use MongoDB with Clojure. You can more or less type the queries like in the Mongo console, but with a more Clojure-like syntax.