Indexing
Saving a document
Indexing takes place after the document has been saved to MongoDB and is a deferred process. You can detect when indexing has finished by listening for the es-indexed event.
doc.save(function(err) {
  if (err) throw err;
  /* Document indexing in progress */
  doc.on('es-indexed', function(err, res) {
    if (err) throw err;
    /* Document is indexed */
  });
});
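For promise-based code, the event can be wrapped in a small helper. This is a minimal sketch rather than part of mongoosastic: waitForEsEvent is a hypothetical name, and it assumes only what the example above shows, namely that the document emits the event with (err, res).

```javascript
// Hypothetical helper (not part of mongoosastic): resolves once `doc`
// emits `eventName` with the (err, res) signature shown above.
function waitForEsEvent(doc, eventName) {
  return new Promise(function(resolve, reject) {
    doc.on(eventName, function(err, res) {
      if (err) reject(err);
      else resolve(res);
    });
  });
}

// Usage sketch, inside an async function:
//   const indexed = waitForEsEvent(doc, 'es-indexed'); // start listening first
//   await doc.save();
//   await indexed;
```

Registering the listener before calling save() avoids any chance of missing the event; the same helper should work for es-removed.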
Removing a document
Unindexing takes place when a document is removed by calling .remove() on a mongoose Document instance.
You can detect when unindexing has finished by listening for the es-removed event.
doc.remove(function(err) {
  if (err) throw err;
  /* Document unindexing in the background */
  doc.on('es-removed', function(err, res) {
    if (err) throw err;
    /* Document is unindexed */
  });
});
Note that Model.remove does not involve mongoose documents, as outlined in the mongoose documentation. Therefore, the following will not unindex the document.
MyModel.remove({ _id: doc.id }, function(err) {
  /* doc remains in Elasticsearch cluster */
});
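One possible workaround is to load the matching documents and remove them one at a time, so that document.remove() runs for each. The sketch below assumes a mongoose version in which document.remove() returns a promise when called without a callback; removeAndUnindex is a hypothetical helper name, not a mongoosastic API.

```javascript
// Hypothetical helper: removes matching documents through document.remove(),
// so the plugin's remove handling runs for each one.
async function removeAndUnindex(Model, conditions) {
  const docs = await Model.find(conditions);
  for (const doc of docs) {
    await doc.remove(); // unindexing still happens in the background
  }
  return docs.length; // number of documents removed
}

// Usage sketch, inside an async function:
//   await removeAndUnindex(MyModel, { _id: doc.id });
```

This trades speed for consistency: each document costs a separate remove call, which may be slow for large result sets.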
Indexing Nested Models
To index nested models, refer to the following example.
var Comment = new Schema({
  title: String,
  body: String,
  author: String
})
var User = new Schema({
  name: { type: String, es_indexed: true },
  email: String,
  city: String,
  comments: { type: [Comment], es_indexed: true }
})
User.plugin(mongoosastic)
Elasticsearch Nested datatype
By default, Elasticsearch flattens arrays of objects, which makes it hard to write queries
that need to preserve the relationships between the fields of each object in the array.
To change this behavior, change the Elasticsearch type from object
(the mongoosastic default) to nested:
var Comment = new Schema({
  title: String,
  body: String,
  author: String
})
var User = new Schema({
  name: { type: String, es_indexed: true },
  email: String,
  city: String,
  comments: {
    type: [Comment],
    es_indexed: true,
    es_type: 'nested',
    es_include_in_parent: true
  }
})
User.plugin(mongoosastic)
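To see why flattening loses relationships, the following plain JavaScript sketch imitates the effect. It is a conceptual model only, not Elasticsearch or mongoosastic code; flattenObjectArray and the sample comments are made up for illustration.

```javascript
// Conceptual model of how an array of objects ends up as one
// multi-value list per field when it is indexed as type "object".
function flattenObjectArray(field, items) {
  const flat = {};
  for (const item of items) {
    for (const [key, value] of Object.entries(item)) {
      const path = field + '.' + key;
      (flat[path] = flat[path] || []).push(value);
    }
  }
  return flat;
}

const comments = [
  { title: 'First', author: 'alice' },
  { title: 'Second', author: 'bob' }
];
const flat = flattenObjectArray('comments', comments);

// Query: a comment titled 'First' written by 'bob'.
// Flattened view: both values exist somewhere, so the document matches.
const matchesFlattened =
  flat['comments.title'].includes('First') &&
  flat['comments.author'].includes('bob');
// Per-object view (what "nested" preserves): no single comment has both.
const matchesNested = comments.some(function(c) {
  return c.title === 'First' && c.author === 'bob';
});

console.log(matchesFlattened, matchesNested); // true false
```

With es_type: 'nested', each comment is kept as its own unit, so a query of this kind only matches when a single comment satisfies all conditions.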
Indexing Mongoose References
To index mongoose references, refer to the following example.
var Comment = new Schema({
  title: String,
  body: String,
  author: String
});
var User = new Schema({
  name: { type: String, es_indexed: true },
  email: String,
  city: String,
  comments: {
    type: Schema.Types.ObjectId,
    ref: 'Comment',
    es_schema: Comment,
    es_indexed: true,
    es_select: 'title body'
  }
})
User.plugin(mongoosastic, {
  populate: [
    { path: 'comments', select: 'title body' }
  ]
})
The es_schema field is the referenced schema. By default, every field of the referenced schema is mapped. Use the es_select field to pick only specific fields.
populate is an array of options objects you normally pass to Model.populate.
Indexing An Existing Collection
Already have a mongodb collection that you'd like to index using this plugin? No problem! Simply call the synchronize method on your model to open a mongoose stream and start indexing documents individually.
var BookSchema = new Schema({
  title: String
});
BookSchema.plugin(mongoosastic);
const Book = mongoose.model('Book', BookSchema)
const stream = Book.synchronize();
let count = 0; // must be reassignable, it is incremented below
stream.on('data', function(err, doc) {
  count++;
});
stream.on('close', function() {
  console.log('indexed ' + count + ' documents!');
});
stream.on('error', function(err) {
  console.log(err);
});
You can also synchronize a subset of documents based on a query!
var stream = Book.synchronize({ author: 'Arthur C. Clarke' })
You can also specify synchronization options:
var stream = Book.synchronize({}, { saveOnSynchronize: true })
Options are:
saveOnSynchronize - triggers the Mongoose save (and pre-save) method when synchronizing a collection/index. Defaults to the global saveOnSynchronize option.
Bulk Indexing
You can also specify bulk options when registering the plugin, which makes it use Elasticsearch's bulk indexing API. This causes the synchronize function to use bulk indexing as well.
Mongoosastic buffers documents until it has 1000 docs (or the specified size) or until 1 second (or the specified delay) has passed, and then performs bulk indexing.
BookSchema.plugin(mongoosastic, {
  bulk: {
    size: 10, // preferred number of docs to bulk index
    delay: 100 // milliseconds to wait for enough docs to meet size constraint
  }
});
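The interplay of size and delay can be pictured with a small buffer. This is a simplified sketch of the batching idea only, not mongoosastic's actual implementation; createBulkBuffer and its flush callback are hypothetical names.

```javascript
// Simplified illustration of size/delay batching (not the plugin's code).
// `flush` receives each batch; in a real setup it would issue a bulk request.
function createBulkBuffer(options, flush) {
  const size = options.size || 1000;   // docs per batch
  const delay = options.delay || 1000; // max wait in milliseconds
  let buffer = [];
  let timer = null;

  function flushNow() {
    if (timer) {
      clearTimeout(timer);
      timer = null;
    }
    if (buffer.length === 0) return;
    const batch = buffer;
    buffer = [];
    flush(batch);
  }

  return {
    add: function(doc) {
      buffer.push(doc);
      if (buffer.length >= size) flushNow();                // size reached
      else if (!timer) timer = setTimeout(flushNow, delay); // or wait for delay
    },
    flush: flushNow
  };
}
```

A small size keeps memory use low at the cost of more requests, while a longer delay gives slow producers more time to fill a batch.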
Filtered Indexing
You can specify a filter function to index a model to Elasticsearch based on some specific conditions.
The filter function must return true for documents that should not be indexed to Elasticsearch.
var MovieSchema = new Schema({
  title: { type: String },
  genre: { type: String, enum: ['horror', 'action', 'adventure', 'other'] }
});
MovieSchema.plugin(mongoosastic, {
  filter: function(doc) {
    return doc.genre === 'action';
  }
});
Instances of the Movie model with 'action' as their genre will not be indexed to Elasticsearch.
Indexing On Demand
You can index a document on demand using the index function:
const dude = await Dude.findOne({ name:'Jeffrey Lebowski' });
dude.awesome = true;
await dude.index();
The index method takes as arguments:
options
(optional) - { index: string } - the index to publish to. Defaults to the standard index that the model was set up with.
Note that indexing a document does not mean it will be persisted to MongoDB. Use save() for that.
Unindexing on demand
You can remove a document from the Elasticsearch cluster by using the unIndex
function.
await doc.unIndex();
Truncating an index
The static method esTruncate deletes all documents from the associated index. Combined with synchronize(), it can be useful in integration tests, for example when each test case needs a clean index in Elasticsearch.
await GarbageModel.esTruncate();
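For a test setup, the two calls can be combined into one step. The following is a minimal sketch, assuming that esTruncate() returns a promise and that synchronize() returns a stream emitting data, close and error, as in the examples in this document; rebuildIndex is a hypothetical helper name.

```javascript
// Hypothetical test helper: empties the index, then re-indexes the
// collection and resolves with the number of documents synchronized.
async function rebuildIndex(Model, query) {
  await Model.esTruncate();
  return new Promise(function(resolve, reject) {
    const stream = Model.synchronize(query || {});
    let count = 0;
    stream.on('data', function() { count++; });
    stream.on('close', function() { resolve(count); });
    stream.on('error', reject);
  });
}

// Usage sketch, e.g. in a beforeEach hook:
//   await rebuildIndex(GarbageModel);
```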
Restrictions
Auto indexing
Mongoosastic tries to auto-index documents by using mongoose's middleware feature.
Mongoosastic will auto index when:
document.save
Model.findOneAndUpdate
Model.insertMany
document.remove
Model.findOneAndRemove
Model.remove and Model.update are not covered.
When calling findOneAndUpdate, pass the new: true option so that mongoosastic can get the new values in the post hook.
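To avoid forgetting the option, updates can go through a thin wrapper. This is only a sketch; findOneAndUpdateIndexed is a hypothetical name, and it relies on nothing beyond mongoose's Model.findOneAndUpdate(filter, update, options).

```javascript
// Hypothetical wrapper: always passes { new: true } so the post hook
// sees the updated document instead of the previous version.
function findOneAndUpdateIndexed(Model, filter, update, options) {
  const merged = Object.assign({}, options, { new: true });
  return Model.findOneAndUpdate(filter, update, merged);
}

// Usage sketch, inside an async function:
//   await findOneAndUpdateIndexed(Dude, { name: 'Jeffrey Lebowski' }, { awesome: true });
```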
Search immediately after es-indexed event
By default, Elasticsearch refreshes each shard every 1s, so a document becomes searchable up to about 1s after it is indexed.
The es-indexed event means that Elasticsearch has received the index request. If you want to search for the document, try again after 1s. See Document not found immediately after it is saved.
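In tests, a fixed sleep can be replaced by polling until the document shows up. The sketch below is generic and not tied to any mongoosastic API; retryUntil, fn and isReady are hypothetical names, and fn stands for whatever search call you use.

```javascript
// Hypothetical polling helper: calls `fn` until `isReady(result)` is true
// or the attempts run out, pausing `interval` milliseconds between calls.
function sleep(ms) {
  return new Promise(function(resolve) { setTimeout(resolve, ms); });
}

async function retryUntil(fn, isReady, options) {
  const attempts = (options && options.attempts) || 5;
  const interval = (options && options.interval) || 1000;
  for (let i = 0; i < attempts; i++) {
    const result = await fn();
    if (isReady(result)) return result;
    if (i < attempts - 1) await sleep(interval);
  }
  throw new Error('condition not met after ' + attempts + ' attempts');
}
```

The default interval of 1000 ms mirrors the default refresh interval mentioned above; a shorter interval only makes sense if the index is refreshed more often.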