An aggregation pipeline transforms a stream of documents through an ordered list of stages. In OMGDB a pipeline is a JSON array of stage documents — each stage is an object with exactly one $-prefixed key — passed to the CLI as a single JSON-string argument:
omgdb aggregate app.omgdb <collection> '[<stage>, <stage>, ...]'
The pipeline runs against the documents scanned from <collection>, then each stage feeds its output documents into the next. Stages, group accumulators, and expression operators are each dispatched by a single flat match on the operator name in the omgdb-agg crate. The design is deliberately additive (supporting a new operator means adding one match arm), but it is not a runtime registry: there is no registration API, no trait objects, and no plugin mechanism, and an unrecognized operator returns an “unknown aggregation operator” error.
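The pipeline shape and the flat dispatch described above can be modelled in a few lines of Python. This is only an illustrative sketch, not the omgdb-agg code: the exception names and the `run_stage`/`run_pipeline` functions are hypothetical, and only three simple stages are modelled.

```python
import json

class MalformedError(Exception):
    """Hypothetical stand-in for the document's "malformed" error."""

class UnknownOperatorError(Exception):
    """Hypothetical stand-in for the "unknown aggregation operator" error."""

def run_stage(name, arg, docs):
    # One flat branch per stage name, mirroring "one match arm per operator".
    if name == "$limit":
        return docs[:arg]
    elif name == "$skip":
        return docs[arg:]
    elif name == "$count":
        return [{arg: len(docs)}]
    # No registry to consult: anything not listed above is unknown.
    raise UnknownOperatorError(f"unknown aggregation operator: {name}")

def run_pipeline(pipeline_json, docs):
    pipeline = json.loads(pipeline_json)
    if not isinstance(pipeline, list):
        raise MalformedError("pipeline must be a JSON array")
    for stage in pipeline:
        # Each stage is an object with exactly one $-prefixed key.
        if not isinstance(stage, dict) or len(stage) != 1:
            raise MalformedError("each stage must have exactly one key")
        (name, arg), = stage.items()
        if not name.startswith("$"):
            raise MalformedError("stage key must start with $")
        # The output of one stage is the input of the next.
        docs = run_stage(name, arg, docs)
    return docs
```

Supporting another stage in this model means adding one more branch to `run_stage`, which is the additive property the text describes.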
Expressions are evaluated by a recursive engine. A string beginning with $ (e.g. "$a.b") is a field path resolved against the current document (missing paths yield null); any other string is a literal. An object whose first key starts with $ is a single-operator expression (and must have exactly one key); any other object is a literal sub-document evaluated field by field; arrays are evaluated element-wise.
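The evaluation rules above can be sketched as a small recursive function. This is a simplified Python model under stated assumptions: `eval_expr`, `resolve_path`, and `apply_operator` are hypothetical names, only `$literal` and a bare-bones `$add` (without type or overflow checks) are included, and errors are raised as plain `ValueError`.

```python
def resolve_path(doc, path):
    # Walk a dotted path such as "a.b"; a missing step yields null (None).
    cur = doc
    for part in path.split("."):
        if not isinstance(cur, dict) or part not in cur:
            return None
        cur = cur[part]
    return cur

def apply_operator(name, arg, doc):
    if name == "$literal":
        return arg  # returned unevaluated
    if name == "$add":
        return sum(eval_expr(a, doc) for a in arg)  # simplified: no checks
    raise ValueError(f"unknown aggregation operator: {name}")

def eval_expr(expr, doc):
    if isinstance(expr, str):
        if expr.startswith("$$"):
            raise ValueError("malformed: system variables are unsupported")
        if expr.startswith("$"):
            return resolve_path(doc, expr[1:])  # field path
        return expr  # any other string is a literal
    if isinstance(expr, list):
        return [eval_expr(e, doc) for e in expr]  # element-wise
    if isinstance(expr, dict):
        keys = list(expr)
        if keys and keys[0].startswith("$"):
            if len(keys) != 1:
                raise ValueError("malformed: operator object needs exactly one key")
            return apply_operator(keys[0], expr[keys[0]], doc)
        # Literal sub-document: evaluate field by field.
        return {k: eval_expr(v, doc) for k, v in expr.items()}
    return expr  # numbers, booleans, null
```

For example, `eval_expr({"x": "$c", "y": ["$a.b", 1]}, doc)` evaluates the outer object as a literal sub-document and the inner array element by element.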
This is a compatible subset of MongoDB’s aggregation framework, not a complete implementation. See query operators for the filter syntax used inside $match.
Stages
Stages run in array order. Each stage object must have exactly one $-key; otherwise the pipeline is malformed.
| Stage | Description | Example |
|---|---|---|
| $match | Keeps documents matching a query filter (compiled via the query engine). | {"$match":{"age":{"$gte":25}}} |
| $project | Inclusion/exclusion projection plus computed fields. 1/true includes, 0/false excludes, any other value is an expression. | {"$project":{"_id":0,"full":1}} |
| $addFields / $set | Adds or overwrites top-level fields from evaluated expressions. $set is an exact alias. | {"$addFields":{"sum":{"$add":["$a","$b"]}}} |
| $group | Groups by an _id expression and computes accumulators per group. | {"$group":{"_id":"$dept","total":{"$sum":"$sal"}}} |
| $sort | Stable-sorts by one or more keys; each direction must be 1 or -1. | {"$sort":{"age":-1}} |
| $limit | Keeps at most the first N documents. | {"$limit":1} |
| $skip | Drops the first N documents. | {"$skip":1} |
| $count | Collapses the stream to one document with the named field set to the document count. | {"$count":"n"} |
| $unwind | Emits one document per element of an array field. | {"$unwind":"$tags"} |
| $replaceRoot / $replaceWith | Promotes an embedded document (or an expression that evaluates to an object) to the root. | {"$replaceRoot":{"newRoot":"$meta"}} |
| $lookup | Left-outer join of another collection by field equality. Requires a store. | {"$lookup":{"from":"items","localField":"item","foreignField":"_id","as":"itemDocs"}} |
| $facet | Runs several named sub-pipelines over the same input and emits one document of result arrays. | {"$facet":{"count":[{"$count":"n"}]}} |
Stage notes
- $project operates on top-level fields. Inclusion and exclusion cannot be mixed (except for _id, which may be excluded with _id:0 in an otherwise-inclusion projection). An inclusion projection keeps _id first unless _id:0.
- $group requires an _id. Every non-_id field must be a single-accumulator object. Groups are keyed by the canonical JSON of the _id value and stored in an ordered map, so output is ordered by that canonical key rather than by input order.
- $sort keys support dotted paths. Missing field values sort before present ones. A direction other than 1 or -1 is a malformed error.
- $unwind accepts a "$field" string or {"path":"$field"}. It operates on a top-level field (the leading $ is stripped). A missing or null array value silently drops the document; a non-array value is passed through as a single document. There are no preserveNullAndEmptyArrays or includeArrayIndex options.
- $replaceRoot errors if newRoot does not evaluate to an object. $replaceWith: <expr> is sugar for $replaceRoot: { newRoot: <expr> }.
- $facet collapses the stream to a single document. Sub-pipelines inherit the store, so they may use $lookup.
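The $unwind rules in the notes above are easy to misread, so here is a minimal Python model of them. The function name `unwind` is hypothetical and the code is a sketch of the documented behaviour, not the actual implementation.

```python
def unwind(docs, spec):
    # Accept either "$field" or {"path": "$field"}.
    path = spec["path"] if isinstance(spec, dict) else spec
    field = path[1:] if path.startswith("$") else path  # strip the leading $
    out = []
    for doc in docs:
        value = doc.get(field)  # top-level field only
        if value is None:
            continue  # missing or null: the document is silently dropped
        if isinstance(value, list):
            # One output document per array element.
            for item in value:
                out.append({**doc, field: item})
        else:
            out.append(doc)  # non-array value passes through unchanged
    return out
```

Given documents with `tags` set to `["a","b"]`, missing, `null`, and `"x"`, this model emits two documents for the first, none for the second and third, and the fourth unchanged.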
Note: $lookup joins another collection and therefore needs a backing store. It is available only when the pipeline runs against a collection (as the omgdb aggregate command does); running a pipeline over an in-memory document list without a store makes $lookup a malformed error. Only the from/localField/foreignField/as form is supported; there is no let + sub-pipeline join form. The join is array-aware on both sides: a scalar on either side matches an element of an array on the other.
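The array-aware matching rule can be illustrated with a short Python sketch. Several things here are assumptions rather than documented behaviour: the function names are hypothetical, Python `==` stands in for the engine's equality, a missing field is treated as null, and unmatched documents receive an empty array (the usual left-outer result).

```python
def as_list(value):
    # Treat a scalar as a one-element list so both sides compare uniformly.
    return value if isinstance(value, list) else [value]

def join_matches(local, foreign):
    # Array-aware on both sides: any element on one side may equal
    # any element (or the scalar) on the other.
    return any(l == f for l in as_list(local) for f in as_list(foreign))

def lookup(docs, foreign_docs, local_field, foreign_field, as_field):
    out = []
    for doc in docs:
        local = doc.get(local_field)
        matched = [f for f in foreign_docs
                   if join_matches(local, f.get(foreign_field))]
        # Left-outer: every input document is kept, matched or not.
        out.append({**doc, as_field: matched})
    return out
```

With this model, an order whose `item` is `["a","b"]` joins to both the `a` and `b` item documents, while an order with a scalar `item` of `"a"` joins to just one.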
Limitation: $bucket and $bucketAuto are not implemented. There is no stage arm for them, and they will return an unknown-operator error.
Accumulators
Accumulators appear inside $group, one per output field, applied to the values produced by evaluating the accumulator’s argument expression for each document in the group.
| Accumulator | Description | Example |
|---|---|---|
| $sum | Sums numeric inputs; all-integer inputs yield an integer (checked, overflow is an error), otherwise a float. Non-numeric values are ignored. Also used as a counter via {"$sum":1}. | {"total":{"$sum":"$sal"}} |
| $avg | Mean of numeric inputs as a float; ignores non-numeric values; null if there are none. | {"avg":{"$avg":"$sal"}} |
| $min | Smallest value by the engine’s comparison ordering; null for empty input. | {"lo":{"$min":"$sal"}} |
| $max | Largest value by the comparison ordering; null for empty input. | {"hi":{"$max":"$sal"}} |
| $first | First accumulated value in the group (input order); null if empty. | {"f":{"$first":"$sal"}} |
| $last | Last accumulated value in the group; null if empty. | {"l":{"$last":"$sal"}} |
| $push | Collects all accumulated values into an array. | {"all":{"$push":"$sal"}} |
Limitation: $addToSet is not implemented. Use $push if duplicates are acceptable.
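The numeric rules for $sum and $avg can be modelled as follows. This is a hedged Python sketch: the function names are hypothetical, the 64-bit signed integer range is an assumption (the text only says integer arithmetic is checked), and `OverflowError` stands in for the engine's error.

```python
I64_MIN, I64_MAX = -2**63, 2**63 - 1  # assumption: 64-bit signed integers

def is_number(v):
    # Booleans are not treated as numbers in this model.
    return isinstance(v, (int, float)) and not isinstance(v, bool)

def acc_sum(values):
    nums = [v for v in values if is_number(v)]  # non-numeric values are ignored
    if all(isinstance(v, int) for v in nums):
        total = 0
        for v in nums:
            total += v
            if not I64_MIN <= total <= I64_MAX:
                raise OverflowError("integer overflow in $sum")
        return total  # all-integer input stays an integer
    return float(sum(nums))  # any float input makes the result a float

def acc_avg(values):
    nums = [v for v in values if is_number(v)]
    # Always a float; null (None) when there are no numeric inputs.
    return sum(nums) / len(nums) if nums else None
```

Counting with `{"$sum":1}` corresponds to `acc_sum([1, 1, 1])` in this model, since the argument expression evaluates to 1 once per document in the group.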
Expression operators
Expression operators are used inside $project, $addFields/$set, $group (the _id expression and accumulator arguments), and $replaceRoot/$replaceWith. $match is different: it takes a query filter rather than an expression. An operator takes a single argument or an argument array, each element of which is itself an expression evaluated against the current document.
Literal and conditional
| Operator | Description | Example |
|---|---|---|
| $literal | Returns its operand unevaluated, so $-prefixed strings and operator objects are treated as literal data. | {"$literal":"$notAFieldPath"} |
| $cond | Ternary: returns the then branch when the condition is truthy, else else. Accepts [if, then, else] or {if, then, else}. | {"$cond":[{"$gte":["$age",18]},true,false]} |
| $ifNull | Returns the first argument unless it is null, in which case the second. Exactly 2 arguments. | {"$ifNull":["$nickname","anon"]} |
| $switch | Evaluates branches in order, returning the then of the first truthy case; falls back to default. | {"$switch":{"branches":[{"case":{"$gte":["$score",90]},"then":"A"}],"default":"F"}} |
| $type | BSON-style type name of the argument as a string (e.g. an integer reports "long"). | {"$type":"$score"} |
Note: A $switch with no matching branch and no default is a type error.
Arithmetic
All arithmetic operators require numeric arguments; a non-numeric argument is a type error.
| Operator | Description | Example |
|---|---|---|
| $add | Adds arguments; all-integer args stay integer (checked overflow), otherwise float. | {"$add":["$a","$b"]} |
| $subtract | First minus second; integer when both integral (checked), else float. Exactly 2 arguments. | {"$subtract":["$a","$b"]} |
| $multiply | Multiplies arguments; integer when all integral (checked overflow), else float. | {"$multiply":[2,3]} |
| $divide | First divided by second; always returns a float. Divide-by-zero is a type error. Exactly 2 arguments. | {"$divide":["$a",2]} |
| $mod | Remainder of first by second; integer when both integral (checked), else float. Mod-by-zero is a type error. | {"$mod":["$a",-1]} |
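The integer/float and error rules for $add and $divide can be sketched in Python. This is a model only: `AggTypeError` and the `op_*` names are hypothetical, and the 64-bit signed range is an assumption, since Python integers do not overflow and the bound has to be checked by hand.

```python
I64_MIN, I64_MAX = -2**63, 2**63 - 1  # assumption: 64-bit signed integers

class AggTypeError(Exception):
    """Hypothetical stand-in for the document's "type error"."""

def is_int(v):
    return isinstance(v, int) and not isinstance(v, bool)

def require_numbers(args):
    # A non-numeric argument is a type error.
    for a in args:
        if not (is_int(a) or isinstance(a, float)):
            raise AggTypeError(f"non-numeric argument: {a!r}")

def checked(n):
    # Emulates checked integer arithmetic: error instead of wrap-around.
    if not I64_MIN <= n <= I64_MAX:
        raise AggTypeError("integer overflow")
    return n

def op_add(args):
    require_numbers(args)
    if all(is_int(a) for a in args):
        total = 0
        for a in args:
            total = checked(total + a)
        return total  # all-integer arguments stay integer
    return float(sum(args))

def op_divide(a, b):
    require_numbers([a, b])
    if b == 0:
        raise AggTypeError("divide by zero")
    return a / b  # always a float, even for 4 / 2
```

Here `op_add([1, 2])` gives the integer 3, `op_add([1, 2.5])` gives the float 3.5, and `op_divide(4, 2)` gives the float 2.0.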
Comparison
Each comparison takes exactly 2 arguments and returns a boolean, comparing under the engine’s ordering.
| Operator | Description | Example |
|---|---|---|
| $eq | True when the arguments compare equal. | {"$eq":["$a",1]} |
| $ne | True when the arguments are not equal. | {"$ne":["$a",1]} |
| $gt | True when the first is greater than the second. | {"$gt":["$score",90]} |
| $gte | True when the first is greater than or equal to the second. | {"$gte":["$score",90]} |
| $lt | True when the first is less than the second. | {"$lt":["$a",10]} |
| $lte | True when the first is less than or equal to the second. | {"$lte":["$a",10]} |
Boolean logic
A value is falsy if it is null, false, integer 0, or float 0.0; everything else (including empty string or array) is truthy.
| Operator | Description | Example |
|---|---|---|
| $and | True when all arguments are truthy. | {"$and":[{"$gte":["$a",1]},{"$lt":["$a",9]}]} |
| $or | True when any argument is truthy. | {"$or":[{"$eq":["$a",1]},{"$eq":["$a",2]}]} |
| $not | Negates the truthiness of its (first) argument. | {"$not":["$flag"]} |
Note: $and and $or do not short-circuit; all arguments are evaluated before the result is combined.
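The truthiness rule and the lack of short-circuiting can be shown with a small Python model. The names `truthy` and `op_and` are hypothetical, and arguments are passed as zero-argument callables purely so that the evaluation order is observable.

```python
def truthy(v):
    # Falsy: null, false, integer 0, float 0.0. Everything else is truthy,
    # including the empty string and the empty array.
    if v is None or v is False:
        return False
    if isinstance(v, (int, float)) and not isinstance(v, bool):
        return v != 0
    return True

def op_and(thunks):
    # Evaluate every argument first, then combine: no short-circuit.
    values = [t() for t in thunks]
    return all(truthy(v) for v in values)

# Usage: record which arguments were actually evaluated.
calls = []

def arg(name, value):
    def thunk():
        calls.append(name)
        return value
    return thunk

result = op_and([arg("a", 0), arg("b", 1)])
```

In this run `result` is false because the first argument is 0, yet `calls` contains both `"a"` and `"b"`. One consequence, under this model, is that an error raised by a later argument would still surface even when an earlier argument already decides the outcome.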
Math
| Operator | Description | Example |
|---|---|---|
| $abs | Absolute value; integer stays integer (checked overflow), float stays float. | {"$abs":-7} |
| $ceil | Smallest integer-valued result not less than the number; an integer is returned unchanged. | {"$ceil":2.1} |
| $floor | Largest integer-valued result not greater than the number; an integer is returned unchanged. | {"$floor":2.9} |
| $round | Rounds to the nearest integer value (ties away from zero); an integer is returned unchanged. | {"$round":2.5} |
| $trunc | Truncates toward zero to an integer value; an integer is returned unchanged. | {"$trunc":2.9} |
| $sqrt | Square root as a float. | {"$sqrt":16} |
Limitation: $round and $trunc take no precision/decimal-places argument (unlike MongoDB). $ceil/$floor/$round/$trunc applied to a float return a float with an integral value (e.g. 3.0), not an integer-typed value.
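The rounding rules are worth a concrete model because "ties away from zero" differs from the default in some languages (Python's built-in `round` rounds ties to even, for instance). The sketch below uses hypothetical `op_*` names and models only the documented results: integers are returned unchanged and floats come back as integral floats.

```python
import math

def op_round(x):
    if isinstance(x, int):
        return x  # an integer is returned unchanged
    # Ties away from zero: round the magnitude half-up, then restore the sign.
    return math.copysign(math.floor(abs(x) + 0.5), x)

def op_trunc(x):
    if isinstance(x, int):
        return x
    return float(math.trunc(x))  # toward zero, still a float

def op_ceil(x):
    if isinstance(x, int):
        return x
    return float(math.ceil(x))

def op_floor(x):
    if isinstance(x, int):
        return x
    return float(math.floor(x))
```

In this model `op_round(2.5)` is 3.0 and `op_round(-2.5)` is -3.0, whereas Python's own `round(2.5)` returns 2.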
Strings
| Operator | Description | Example |
|---|---|---|
| $concat | Concatenates string arguments; returns null if any argument is null; a non-string, non-null argument is a type error. | {"$concat":["$first"," ","$last"]} |
| $toUpper | Uppercases a string; null/missing yields an empty string. | {"$toUpper":"$name"} |
| $toLower | Lowercases a string; null/missing yields an empty string. | {"$toLower":"$name"} |
| $strLenCP | Number of Unicode code points in a string; a non-string argument is a type error. | {"$strLenCP":"héllo"} |
| $split | Splits a string by a delimiter into an array; an empty delimiter returns the whole string as a single element. Requires [string, string]. | {"$split":["a,b,c",","]} |
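The null handling and edge cases in the table can be modelled in Python as below. The `op_*` names are hypothetical. One detail is an assumption: when a $concat argument list contains both a null and a non-string value, this sketch lets the null win, because the table does not say which rule takes precedence.

```python
def op_concat(args):
    # Assumption: null is checked before the type rule.
    if any(a is None for a in args):
        return None
    for a in args:
        if not isinstance(a, str):
            raise TypeError(f"$concat: non-string argument {a!r}")
    return "".join(args)

def op_to_upper(value):
    # null/missing yields an empty string rather than null.
    return "" if value is None else value.upper()

def op_str_len_cp(value):
    if not isinstance(value, str):
        raise TypeError("$strLenCP: argument must be a string")
    return len(value)  # Python's len counts Unicode code points

def op_split(value, delimiter):
    if not isinstance(value, str) or not isinstance(delimiter, str):
        raise TypeError("$split requires [string, string]")
    if delimiter == "":
        return [value]  # empty delimiter: the whole string as one element
    return value.split(delimiter)
```

The empty-delimiter branch is explicit because Python's own `str.split("")` raises an error instead of returning the whole string.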
Arrays
| Operator | Description | Example |
|---|---|---|
| $size | Length of an array as an integer; a non-array argument is a type error. | {"$size":"$tags"} |
| $arrayElemAt | Element at an integer index; negative indexes count from the end; out-of-range returns null. Requires [array, integer]. | {"$arrayElemAt":["$tags",-1]} |
| $in | True when the first argument equals any element of the second (array) argument. Requires [value, array]. | {"$in":["b",["a","b"]]} |
| $isArray | True when the argument is an array. | {"$isArray":"$tags"} |
| $concatArrays | Concatenates array arguments into one array; a non-array argument is a type error. | {"$concatArrays":[[1,2],[3]]} |
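The indexing rule for $arrayElemAt (negative indexes from the end, null when out of range) can be sketched as a short Python function. The name `array_elem_at` is hypothetical, and `TypeError` stands in for the engine's type error.

```python
def array_elem_at(array, index):
    # Requires [array, integer]; booleans are not accepted as integers here.
    if not isinstance(array, list):
        raise TypeError("$arrayElemAt: first argument must be an array")
    if not isinstance(index, int) or isinstance(index, bool):
        raise TypeError("$arrayElemAt: second argument must be an integer")
    if index < 0:
        index += len(array)  # negative indexes count from the end
    if 0 <= index < len(array):
        return array[index]
    return None  # out of range yields null instead of an error
```

For `["a","b","c"]`, index -1 gives `"c"`, while indexes 5 and -4 both give null.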
Limitation: System variables ($$ROOT, $$CURRENT, $$NOW, etc.) are unsupported; a $$-prefixed string raises a malformed error. There are no $map, $filter, $reduce, $mergeObjects, $arrayToObject, $reverseArray, $sortArray, date operators, $regexMatch, $substr/$substrCP, $trim, or type-conversion operators ($toString, $toInt, …).
Worked example
Group by a field with $sum and $avg, then sort the groups. Given a salaries collection of {"dept": ..., "sal": ...} documents:
omgdb aggregate app.omgdb salaries '[
{"$group":{"_id":"$dept","total":{"$sum":"$sal"},"avg":{"$avg":"$sal"},"n":{"$sum":1}}},
{"$sort":{"total":-1}}
]'
For input documents:
[
{"dept":"a","sal":100},
{"dept":"a","sal":200},
{"dept":"b","sal":50}
]
The $group stage produces one document per department, then $sort orders them by total descending:
[
{"_id":"a","total":300,"avg":150.0,"n":2},
{"_id":"b","total":50,"avg":50.0,"n":1}
]
Note that total is an integer (300) because every summed value was an integer, while avg is a float (150.0). The {"$sum":1} accumulator counts documents per group.
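The same result can be reproduced with a small Python model of the two stages. This is an illustrative sketch, not OMGDB code: the function names are hypothetical, the field names `dept` and `sal` are hard-coded, and `json.dumps` with sorted keys stands in for the engine's canonical JSON.

```python
import json

def group_sum_avg(docs):
    groups = {}
    for d in docs:
        # Key each group by a canonical JSON rendering of the _id value.
        key = json.dumps(d.get("dept"), sort_keys=True)
        groups.setdefault(key, []).append(d["sal"])
    out = []
    # Ordered map: groups come out ordered by key, not by input order.
    for key in sorted(groups):
        sals = groups[key]
        out.append({
            "_id": json.loads(key),
            "total": sum(sals),            # integer when all inputs are integers
            "avg": sum(sals) / len(sals),  # always a float
            "n": len(sals),                # the {"$sum":1} counter
        })
    return out

def sort_desc(docs, field):
    # Python's sort is stable, including with reverse=True.
    return sorted(docs, key=lambda d: d[field], reverse=True)

docs = [
    {"dept": "b", "sal": 50},
    {"dept": "a", "sal": 100},
    {"dept": "a", "sal": 200},
]
result = sort_desc(group_sum_avg(docs), "total")
```

The input here deliberately lists department b first: the grouped output is still ordered by key before the sort stage runs, and the final `result` matches the two documents shown above.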
A second example combines computed fields with a projection — adding a concatenated full field, then keeping only it:
omgdb aggregate app.omgdb people '[
{"$addFields":{"full":{"$concat":["$first"," ","$last"]}}},
{"$project":{"_id":0,"full":1}}
]'
For {"first":"ada","last":"lovelace"} this yields {"full":"ada lovelace"}.
Error behavior
Pipeline errors are surfaced rather than silently swallowed:
- A pipeline that is not an array, or a stage that is not a single $-operator object, is a malformed error.
- An unrecognized stage, accumulator, or expression operator is an unknown operator error.
- An operand of the wrong type (e.g. $concat on a number, $divide by zero, integer overflow in $add/$subtract/$multiply/$mod/$sum) is a type error. Integer arithmetic uses checked operations, so overflow returns an error instead of panicking.
- A $match filter that fails to compile surfaces the underlying query error.