Debugging MongoDB’s 2dsphere geoIndex

When I first learned about MongoDB’s geospatial capabilities similar to PostGIS, I was beyond excited because it meant I would not have to write helper functions to find data along geohash borders.  Currently my data is indexed by geohash, which has great ordering properties, but is a little unwieldy for the “find me x near y”-type queries.

Unfortunately, it seems as though the 2dsphere index doesn’t cover geoWithin queries.  As my collection of geoJSON documents grew from GBs to TBs,  I noticed my tools taking extraordinary amounts of time to perform find queries, which caused me to investigate what was happening under the hood.

First, I double-checked that my geospatial index was present:

mongos> db.my_collection.getIndexes()
[
  {
    "v" : 1,
    "key" : {
      "_id" : 1
    },
    "name" : "_id_",
    "ns" : "telemetry.********"
  },
  {
    "v" : 1,
    "key" : {
      "value.avg_location" : "2dsphere"
    },
    "name" : "geoindex",
    "ns" : "telemetry.*******",
    "2dsphereIndexVersion" : 2
  }
]

So, the index was clearly there.  I also verified that its size was appropriate — it was around 50mb, which seemed fine for this amount of test data.

Next, I ran .explain() on a query to see if it was hitting the index:


var query = {
    "value.avg_location": {
        "$geoWithin": {
            "$polygon": [
                [-122.07496047019957,
                    37.65882781135178
                ],
                [-122.07496047019957,
                    37.663906822384874
                ],
                [-122.0871913433075,
                    37.663906822384874
                ],
                [-122.0871913433075,
                    37.65882781135178
                ],
                [-122.07496047019957,
                    37.65882781135178
                ]
            ]
        }
    }
};
db.my_collection.find(query).explain("executionStats");

Nope:
"winningPlan" : {
  "stage" : "COLLSCAN",
  "filter" : {
    "value.avg_location" : {
      "$geoWithin" : {
        "$polygon" : [
          [
            -122.07496047019957,
            37.65882781135178
          ],
          [
            -122.07496047019957,
            37.663906822384874
          ],
          [
            -122.0871913433075,
            37.663906822384874
          ],
          [
            -122.0871913433075,
            37.65882781135178
          ],
          [
            -122.07496047019957,
            37.65882781135178
          ]
        ]
      }
    }
  },
  "direction" : "forward"
},
"rejectedPlans" : [ ]

"executionStats" : {
  "nReturned" : 164,
  "executionTimeMillis" : 341340,
  "totalKeysExamined" : 0,
  "totalDocsExamined" : 10760184,
  "executionStages" : {
    "stage" : "SINGLE_SHARD",
    "nReturned" : 164,
    "executionTimeMillis" : 341340,
    "totalKeysExamined" : 0,
    "totalDocsExamined" : 10760184,
    "totalChildMillis" : NumberLong(341334),
    "shards" : [
      {
        "shardName" : "rs0",
        "executionSuccess" : true,
        "executionStages" : {
          "stage" : "COLLSCAN",
          "filter" : {
                                           ....
          },
          "nReturned" : 164,
          "executionTimeMillisEstimate" : 334786,
          "works" : 10760186,
          "advanced" : 164,
          "needTime" : 10760021,
          "needFetch" : 0,
          "saveState" : 84076,
          "restoreState" : 84076,
          "isEOF" : 1,
          "invalidates" : 0,
          "direction" : "forward",
          "docsExamined" : 10760184
        }
      }
    ]
  }
},
"ok" : 1

So, from the “stage” param COLLSCAN, we can clearly see it’s doing a collection scan to find the documents.  We want it to say something like “IXSCAN” to indicate that it’s using the index.

I tried deleting and rebuilding the index:

<span class="s1">mongos&gt; db.mycoll.dropIndex("geoindex")
true
mongos&gt; db.mycoll.createIndex({"value.avg_location" : "2dsphere" }, {name: "geoindex"});
mongos&gt; 
</span>

Still, no dice!

Okay, well Mongo’s docs says that the most efficient queries will query on AND return only the indexed fields.  So, I tried limiting (“projecting” in Mongo-parlance) the query to just return the “value.avg_location” field, but we ended up still doing a “COLLSCAN.”

UPDATE: I ended up fixing this by re-structuring the query.  It turns out the original syntax I used for the polygon query is the older syntax, which doesn’t hit the index.  The newer syntax looks like this:

{
  "location": {
    $geoWithin: {
      $geometry: {
        type: "Polygon",
        coordinates: [ [ &lt;the coords&gt; ] ]
      }
    }
  }
}

2 thoughts on “Debugging MongoDB’s 2dsphere geoIndex”

Leave a Reply

Your email address will not be published. Required fields are marked *