[syslog-ng] [RFC]: MongoDB destination plans (for 3.4 and beyond)

From: Gergely Nagy <algernon_at_nospam>
Date: Tue Jun 21 2011 - 14:08:21 GMT
To: Syslog-ng users' and developers' mailing list <>


I've been working on a few mongodb destination related features
recently, and I thought I'd ask for comments here before I proceed
further, to see if my ideas can be improved, and if there's anyone
actually interested in the stuff I play with (mostly out of sheer
curiosity; I'll scratch my own itches even if no one else has similar
needs :P).

Before I go further, let me introduce the current mongodb destination
features, available in the syslog-ng 3.3 branch:

* We can connect to a single MongoDB server per destination, which we
  expect to be the master (or a standalone server).
* We can send logs to it, structured in various interesting ways via

...aaand that's about it. Pretty bare bones, but it's enough for a lot
of stuff.

Now, I plan to extend this in two ways: first, it will be possible to
connect to a ReplicaSet, which is basically a set of MongoDB servers
that replicate from a master. The advantage of this is that if the
master goes down, one of the secondaries will automatically take over,
and the mongodb driver will reconnect to it automatically as well. In
case that fails too, the destination driver will fall back to queuing
within syslog-ng, and retry after a configured interval.
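
To make the intended behaviour a bit more concrete, here is a rough
Python sketch of the failover-then-queue logic. It is only an
illustration, not the driver code: the Destination class, pick_server
and the is_reachable callback are made-up names, and the deque merely
stands in for syslog-ng's internal queue.

```python
from collections import deque


def pick_server(servers, is_reachable):
    """Return the first reachable server in list order, or None."""
    for server in servers:
        if is_reachable(server):
            return server
    return None


class Destination:
    """Hypothetical model of a destination with failover and queuing."""

    def __init__(self, servers, is_reachable):
        self.servers = servers
        self.is_reachable = is_reachable
        self.queue = deque()   # stands in for syslog-ng's internal queue
        self.delivered = []    # (server, message) pairs, for illustration

    def send(self, message):
        self.queue.append(message)
        self.flush()

    def flush(self):
        """Run on every send, and again once the retry interval elapses."""
        server = pick_server(self.servers, self.is_reachable)
        if server is None:
            return False       # everything is down: keep messages queued
        while self.queue:
            self.delivered.append((server, self.queue.popleft()))
        return True
```

Calling flush() again later plays the role of the "retry after a
configured interval" step; nothing is lost while all servers are down,
the messages just stay in the queue.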

Another advantage will be the use of "safe mode", which, when turned
on, will verify that the message could be inserted into a MongoDB
collection; the message will not be ACKed on the syslog-ng side until
it is. After a number of retries, the message can be dropped along
with an appropriate log message, so that failing messages won't fill
up the queue. (This, of course, would be optional, with never dropping
messages by default, unless the internal queue is full.)
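
The ACK/retry/drop decision described above could look roughly like
the Python sketch below. The function name and the max_retries
parameter are hypothetical (no such option exists yet); None models
the proposed default of never dropping a message.

```python
def handle_insert_result(confirmed, failures, max_retries=None):
    """Decide what to do with a message after one insert attempt.

    confirmed   -- True if MongoDB verified the insert (safe mode)
    failures    -- number of failed attempts so far for this message
    max_retries -- hypothetical limit; None means never drop

    Returns (action, failures), where action is "ack", "retry" or "drop".
    """
    if confirmed:
        return "ack", 0
    failures += 1
    if max_retries is not None and failures > max_retries:
        # The initial attempt plus max_retries retries have all failed.
        return "drop", failures
    return "retry", failures
```

With max_retries=2 a message is dropped after the initial attempt and
two retries have failed; with the default it is retried indefinitely
and only the queue-full condition can force a drop.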

So we could have a destination configured like this:

d_mongo {
    servers("" "" "")

This would connect to the first listed server on port 27017 by
default, and if that becomes inaccessible, it'd retry with .2 and .3,
in that order. If any of them listed other servers as part of their
replicaset, the driver would retry with those as well. It would also
turn on safe-mode, which is a set of extra checks to ensure that data
arrived in MongoDB safe and sound.
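
As a small illustration of the retry order, the list of candidates
could be built like this in Python. This is just a sketch of the idea:
candidate_order is a made-up helper, and the host names in the usage
note are placeholders, not real addresses.

```python
def candidate_order(seeds, discovered):
    """Configured servers first, in order, then newly discovered members.

    seeds      -- servers listed in the configuration, in order
    discovered -- servers reported as replica set members
    """
    seen = set()
    order = []
    for host in list(seeds) + list(discovered):
        if host not in seen:   # skip members we already know about
            seen.add(host)
            order.append(host)
    return order
```

For example, with seeds ["s1", "s2"] and discovered members
["s2", "s3"], the driver would try s1, s2 and s3, in that order.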

We'd get more reliability this way, and with safe-mode, better data
safety as well.

Now, the question is: is there anything else that may be worth adding?
If anyone here used MongoDB, is there something else you'd like to see
added to the mongodb destination?

(I could add GridFS support as well, but so far I haven't found an
acceptable use-case for it.)

