Sequencing MongoDB
Many of us who have moved from traditional RDBMS-based applications would be in for a rude shock to hear that auto-incrementing integer fields are not the norm in MongoDB, the distributed NoSQL database. Why?
Although traditional databases often use increasing sequence numbers for primary keys. In MongoDB, the preferred approach is to use Object IDs instead. The concept is that in a very large cluster of machines, it is easier to create an object ID than have global, uniformly increasing sequence numbers.
The above is taken from the MongoEngine documentation. MongoDB has a good enough document that explains how to create an auto-incrementing field. But MongoEngine, the Python ODM for MongoDB, provides a special field called SequenceField to get around the pain of creating the sequence generator by hand.
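To make the quoted point a little more concrete, here is a rough Python 3 sketch of why an ObjectId-style identifier is cheap to create: each process can build one locally from a timestamp, a per-process random value, and a local counter, without asking any other machine. The name make_object_id_like is hypothetical, and the layout used here (4-byte timestamp, 5-byte random value, 3-byte counter) is a simplified imitation of the ObjectId format, not the driver's implementation. In real code you would use bson.ObjectId.

```python
import os
import struct
import threading
import time

# Assumption for illustration: a per-process random value plus a randomly
# seeded counter, imitating the 4 + 5 + 3 byte ObjectId layout.
_PROCESS_RANDOM = os.urandom(5)
_counter = int.from_bytes(os.urandom(3), "big")
_counter_lock = threading.Lock()


def make_object_id_like():
    """Build a 12-byte identifier locally and return it as 24 hex characters."""
    global _counter
    with _counter_lock:
        _counter = (_counter + 1) % 0x1000000  # wrap around at 3 bytes
        count = _counter
    timestamp = struct.pack(">I", int(time.time()))  # 4 bytes, big-endian
    return (timestamp + _PROCESS_RANDOM + count.to_bytes(3, "big")).hex()


# No shared state outside this process is touched while generating these.
ids = [make_object_id_like() for _ in range(1000)]
print(len(ids[0]), len(set(ids)))  # → 24 1000
```

A global, uniformly increasing sequence, by contrast, needs one shared counter that every writer has to update, which is exactly what the rest of this post looks at.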
Setup Sequence
It is fairly interesting to see how SequenceField works: it simply implements the approach described in the MongoDB document on auto-incrementing fields. Let's first set up an example that uses sequence fields.
from mongoengine import connect
from mongoengine import Document, SequenceField, StringField
from mongoengine.connection import get_db
connect("sample_db", host="mongodb://127.0.0.1:49153")
db = get_db()
Let's create a document class called Person, with one SequenceField (seq) and one StringField (name).
class Person(Document):
    seq = SequenceField()
    name = StringField()
Let's create a few documents in the Person collection.
for x in range(10):
    Person(name="Person %s" % x).save()
The above code creates 10 Person documents. Now let's see what seq holds.
p = Person.objects(name="Person 0").first()
print(p.seq)
q = Person.objects(name="Person 9").first()
print(q.seq)
Note that p.seq is 1 and q.seq is 10. What exactly is happening and how does the seq field get incremented?
Dissecting the Sequence
As mentioned earlier, and as the documentation describes, the sequence field works by setting up a collection called mongoengine.counters and creating a document in it whose _id is <collection>.<fieldname> (in our case, person.seq) and whose next field holds the current sequence number.
f = db['mongoengine.counters'].find_one({'_id': 'person.seq'})
print(f['next'])
f['next'] will hold the value 10. Now suppose you have another SequenceField in your document definition, like this:
class Person(Document):
    seq = SequenceField()
    counter = SequenceField()
    name = StringField()
This would create another document in the mongoengine.counters collection, with an _id of person.counter and its own next field.
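The bookkeeping described so far can be modelled without a database at all. The sketch below is a simplified in-memory stand-in, not MongoEngine's code: the class InMemoryCounters and its method next_value are hypothetical names, a plain dict plays the role of the mongoengine.counters collection, and a lock stands in for the atomic update the database would perform.

```python
import threading


class InMemoryCounters(object):
    """Toy stand-in for the mongoengine.counters collection (hypothetical)."""

    def __init__(self):
        self._docs = {}  # maps _id -> {"_id": ..., "next": ...}
        self._lock = threading.Lock()

    def next_value(self, collection, field):
        # Counter documents are keyed as "<collection>.<fieldname>".
        counter_id = "%s.%s" % (collection, field)
        with self._lock:
            # Create the counter document on first use, then bump "next".
            doc = self._docs.setdefault(counter_id, {"_id": counter_id, "next": 0})
            doc["next"] += 1
            return doc["next"]

    def find_one(self, counter_id):
        # Look up a counter document by its _id, or None if it does not exist.
        return self._docs.get(counter_id)


counters = InMemoryCounters()

# Ten saves of Person, each asking for the next value of person.seq.
seqs = [counters.next_value("person", "seq") for _ in range(10)]
# A second sequence field gets its own, independent counter document.
first_counter = counters.next_value("person", "counter")

print(seqs[0], seqs[-1])                            # → 1 10
print(counters.find_one("person.seq")["next"])      # → 10
print(counters.find_one("person.counter")["next"])  # → 1
```

The point of the model is that each sequence field owns exactly one counter document, so two fields on the same document class never interfere with each other.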
SequenceField has a parameter called collection_name, which tells MongoEngine to keep the counters in a separate collection with the given name. For example:
class Place(Document):
    seq = SequenceField(collection_name="place.counters")
    name = StringField()
for x in range(10):
    Place(name="Place %s" % x).save()
This creates a collection named place.counters. To confirm it, list the collections in the database.
db.collection_names()
[u'mongoengine.counters',
u'system.indexes',
u'sample',
u'person',
u'place.counters',
u'place']
n = db['place.counters'].find_one({'_id': 'place.seq'})
print(n['next'])
This gives an output of 10. If there is anything more, I will write a sequel to this.