These are some notes from my recent MongoDB training session with 10gen’s Richard Kreuter.
Lack of Schema
MongoDB does not impose or require any schema for the documents you store in a collection. The application must enforce any schema requirements itself: for instance, when inserting a user record you may require the presence of at least a username field, and the application must validate this before writing the record. It's possible to add and remove elements from a document as the application's needs change over time. This is both good and bad. Compared to an RDBMS, adding columns is a snap, but your application must be able to deal with getting documents in different formats from the same collection.

An excellent way to handle this is a version number field in your documents. The application knows which document version it's dealing with, and you can add routines to upgrade any documents you encounter while doing other work, such as reads and updates. Or you could just have a little worker program that upgrades documents to the newest spec in the background. The point is that you have lots of flexibility, but it comes with some management overhead.
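Here's a minimal sketch of the version-number strategy, using plain Python dicts in place of MongoDB documents. The field name `schema_version` and the individual upgrade steps are illustrative assumptions, not something prescribed by MongoDB itself:

```python
# Sketch of upgrade-on-read versioning. The "schema_version" field name
# and the upgrade steps below are made up for illustration.

UPGRADES = {
    # from_version -> function that upgrades the doc by one version
    1: lambda doc: {**doc, "email": None, "schema_version": 2},
    2: lambda doc: {**doc, "roles": doc.get("roles", []), "schema_version": 3},
}

CURRENT_VERSION = 3

def upgrade(doc):
    """Bring a document up to CURRENT_VERSION, one step at a time."""
    while doc.get("schema_version", 1) < CURRENT_VERSION:
        step = UPGRADES[doc.get("schema_version", 1)]
        doc = step(doc)
    return doc

# A v1 document read during normal work gets upgraded on the fly:
old = {"username": "rich", "schema_version": 1}
new = upgrade(old)
# new now carries schema_version 3 plus the fields added along the way
```

The same `upgrade` routine could just as easily be driven by a background worker iterating over the collection.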
Document padding is the idea of adding extra blank data to your documents. For instance, suppose you have a User document that contains an array of the Roles the User is assigned to. When you first store the document the User may have only one Role, so you store an array of one item. As the User gets more Roles, the document's array grows. The problem is that documents are stored on disk: if a document grows beyond the space allocated for it, it must be moved to a larger space, and that is an expensive operation. A good strategy is to pad the Roles array to a reasonable size with empty values to reduce the chance of the record being relocated on disk.
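A quick sketch of that padding idea, again with a plain dict standing in for the document; the pad size and the sentinel value are arbitrary choices for the example:

```python
# Illustrative document padding: reserve slots in the Roles array up
# front so later growth doesn't force an on-disk move. PAD_TO and the
# EMPTY sentinel are arbitrary example values.

PAD_TO = 8          # a guess at a reasonable maximum number of roles
EMPTY = None        # sentinel marking an unused slot

def padded_roles(roles):
    """Return the roles list padded with empty slots up to PAD_TO."""
    return roles + [EMPTY] * (PAD_TO - len(roles))

user = {"username": "rich", "roles": padded_roles(["admin"])}
# The document is written at (roughly) its eventual size; adding a
# role later fills an EMPTY slot instead of growing the array.

def add_role(user, role):
    i = user["roles"].index(EMPTY)   # first free slot
    user["roles"][i] = role

add_role(user, "editor")
```

The trade-off is wasted space for users who never accumulate many roles, so the pad size should track what you actually observe in production.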
To Join or not to Join
In an RDBMS that has been normalized to contain no duplicate data, you join your tables together in order to get a document, which means you don't have to worry about maintaining duplicate values. In MongoDB you generally want to store documents as you would want to see them, which means you may (probably will) have duplicate values. Take, for instance, a collection of IP addresses that contains domain information and a collection of domains that contains IP information. You would keep both collections because you want to index on different fields and quickly fetch data from both points of view, but you would also have duplicate data. There is also the possibility that, in between writing your domain info and writing your IP info, the DB dies in some ugly way, leaving you in the bad position of having out-of-sync data.

A good strategy for handling this is to create a third collection that holds a sort of "transaction". It's not a transaction in the RDBMS sense, but I don't have a better term at the moment because my coffee has not kicked in yet. In this third collection you store a document that describes the action you want to perform (add, update, or delete), the data you want to perform it with, and a timestamp. Your application can then attempt to "play" the action as many times as it has to, hopefully once, to fulfill the original request. Once it completes successfully you remove the record from the "transaction" collection.
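A rough sketch of the "transaction" collection pattern, with in-memory lists standing in for the three MongoDB collections. The collection names and the `action`/`payload`/`ts` fields are made up for the example:

```python
# Sketch of the intent-record ("transaction") pattern. Lists stand in
# for collections; field names are illustrative, not a MongoDB API.
import time

domains, ips, pending = [], [], []

def request_insert(domain_doc, ip_doc):
    # 1. Record the intent first, so a crash can't silently lose half
    #    the work.
    txn = {"action": "insert", "payload": (domain_doc, ip_doc),
           "ts": time.time()}
    pending.append(txn)
    play(txn)

def play(txn):
    # 2. "Play" the action; written to be safe to retry until both
    #    writes have succeeded.
    domain_doc, ip_doc = txn["payload"]
    if domain_doc not in domains:
        domains.append(domain_doc)
    if ip_doc not in ips:
        ips.append(ip_doc)
    # 3. Only once both writes are in place is the intent removed.
    pending.remove(txn)

request_insert({"domain": "example.com", "ip": "10.0.0.1"},
               {"ip": "10.0.0.1", "domain": "example.com"})
# On restart, anything still sitting in `pending` gets replayed.
```

The key property is that `play` is idempotent, so replaying a half-finished action after a crash converges on the same end state.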
Replication and Sharding
To be clear, a replica set is a group of mongod processes that all contain the same document data, and whose purpose is to prevent data loss. Exactly one of the processes acts as the writeable server at any one time and is known as the primary node. In general you would run each replica on a separate piece of hardware. There's not much point in having them reside on the same box except in a development deployment, and even then VMs would be a better option. But failing that, each replica must have its own data directory, log file, and unique TCP port to communicate on. Any modifications to the replica set configuration must be made on the primary node.
Every data manipulation operation sent to a mongod process is recorded in the op log. Once you have set up your replica sets you will see a collection in the local database called oplog. By default the oplog is 5% of the available disk space, but you can change this when starting mongod like so:
mongod --oplogSize 200   # size in MB
So why would you want to do this? The oplog is a capped collection, which makes it work like a queue: once the collection reaches its cap, the oldest records are removed one by one as new records come in. Depending on the volume of write operations your database is handling and the speed at which the replicas can pick up those changes, it's possible, although unlikely, that your oplog would roll off data before the replicas had a chance to write it, and the replicas would no longer be able to sync after that point. A more likely scenario is that your replicas lose connectivity for a period of time greater than the window recorded in the oplog; when a replica comes back online it will not be able to sync. So it's better to err on the side of a bigger oplog, within reason.
(that joke is not as funny as you think)
Sharding is used to spread data and write operations around. It's that simple. Well, OK, not really. A sharded cluster consists of one or more shards, each of which is a replica set. I'll say it again: a single shard is a replica set. There are some rules about shards that may or may not be intuitive given what you know about Mongo.
- In a sharded system every document in a collection must contain the shard key fields.
- You are required to have exactly 3 config servers, which store the metadata and routing information for the cluster.
- You must have at least one, but usually more, mongos processes running. MongoS is the name of the binary, not a plural of mongo. This is what your application connects to, and it does not have to reside on the database server; in fact I think it's better if it resides on the application server.
- A shard key's value is immutable! The only way to change the value is to pull the document, make the change, insert it as a new document, and delete the original.
- Shard key values don't have to be unique in a collection. This means that if you want to update a specific document you must include some other identifying field in addition to the shard key.
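The "immutable shard key" rule above boils down to a delete-and-reinsert dance. Here's a tiny sketch of it, with a plain list standing in for the collection and made-up field names:

```python
# Sketch of the only way to "change" a shard key value: pull the
# document, rewrite the key, and insert it as a new document. A list
# stands in for the collection; field names are illustrative.

collection = [{"_id": 1, "shard_key": "east", "name": "server-a"}]

def change_shard_key(coll, _id, new_key):
    # Pull the document out (this is the delete)...
    doc = next(d for d in coll if d["_id"] == _id)
    coll.remove(doc)
    # ...make the change, and insert it as a new document.
    doc = {**doc, "shard_key": new_key}
    coll.append(doc)
    return doc

change_shard_key(collection, 1, "west")
```

In a real cluster the delete and insert are two separate operations, so the application has to tolerate the brief window where the document is absent.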
|Shard 1||Shard 2||Shard 3|
Following the replica set rules, each member of each replica set should ideally reside on separate hardware. So in the example above, with three shards each being a three-member replica set, you would have nine physical pieces of hardware.
Picking a shard key is something you must consider carefully. While it's possible to change a shard key in a running database, it's an incredibly painful process that effectively requires shutting down your database, which is not easy in a production situation.
One reasonable strategy is to generate a hash and store that in the document as your shard key. Something else to watch out for is keys that are monotonic in nature. Because shards are split into ranges based on the values of the shard keys, a monotonic shard key means you will always be writing to one shard, and you lose write balancing in your cluster.
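To see why hashing helps, here's a small sketch using the standard library's md5 purely for illustration (MongoDB's hashed keys use their own hash internally):

```python
# Hashing a monotonic value gives a well-scattered shard key.
# md5 is used here only for illustration, not because MongoDB uses it.
import hashlib

def hashed_key(value):
    return hashlib.md5(str(value).encode()).hexdigest()

# Monotonic inputs (1, 2, 3, ...) produce non-monotonic keys:
keys = [hashed_key(i) for i in range(1, 6)]
assert keys != sorted(keys)   # no longer in increasing order
```

Because consecutive inputs hash to wildly different values, inserts scatter across the key ranges instead of piling onto one shard. The cost is that range queries on the original value no longer map to a contiguous chunk.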
Assume your shard key is an integer and you have 3 shards. The shards will balance the number of records by splitting the data over the shards based upon the values of the key.
|Shard 1        |Shard 2         |Shard 3               |
|key range: 1-10|key range: 11-20|key range: 21-infinity|
So one of the shards will always contain the highest range of keys, and because your key values are always increasing, mongos will always write to that shard. At some point the data will split and balance, but you will still always be writing to one shard. So beware of monotonic keys.
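The hot-shard effect can be sketched with a toy range router. The ranges mirror the example table, with the last shard owning the open-ended top range:

```python
# Toy router for range-based sharding: each shard owns a key range,
# and the last shard owns the open-ended top range. With a key that
# only ever increases, every new write goes to that last shard.

RANGES = [            # (shard, low, high) — high exclusive; None = infinity
    ("shard1", 1, 11),
    ("shard2", 11, 21),
    ("shard3", 21, None),
]

def route(key):
    for shard, low, high in RANGES:
        if key >= low and (high is None or key < high):
            return shard
    raise ValueError("key below all ranges")

# A monotonic key (say, an auto-increment id) that has passed 21:
targets = {route(k) for k in range(21, 100)}
# Every single one of those writes lands on shard3 until the balancer
# splits and migrates the chunk.
```

This is exactly the behavior the hashed-key strategy above is designed to avoid.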