xiuhy/mongoDB笔记.md

## mongoDB笔记.md

      
            
          
MongoDB notes

Not yet organized.

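CRUD operations

insert

If the collection does not exist when a document is inserted, the insert operation creates it.

If the inserted document has no _id field, MongoDB adds an _id of type ObjectId.
All write operations in MongoDB are atomic at the level of a single document.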

db.collection.insert

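Takes the same parameters as insertMany(), but the return value differs.
Return value:
A WriteResult object for a single insert; a BulkWriteResult object for a bulk insert.
WriteResult

Structure returned by a successful insert:

WriteResult({ "nInserted" : 1 })

If a write concern error occurs, the result includes a writeConcernError field:

WriteResult({
  "nInserted" : 1,
  "writeConcernError" : {
    "code" : 64,
    "errmsg" : "waiting for replication timed out at shard-a"
  }
})

Official reference
BulkWriteResult

A bulk insert returns a BulkWriteResult object.
Official reference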
db.collection.insertOne


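New in version 3.2.
Syntax:

db.collection.insertOne(
  document,
  {
    writeConcern: document
  }
)

| Parameter | Type | Description |
| --- | --- | --- |
| document | document | The document to insert. |
| writeConcern | document | Optional. A document expressing the write concern. Omit to use the default write concern. (I do not fully understand this yet.) |

Return value:

{
  "acknowledged" : true if the operation ran with write concern, false if write concern was disabled
  "insertedId" : the _id of the inserted document
}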

db.collection.insertMany


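New in version 3.2.
Syntax:

db.collection.insertMany(
  [ document 1, document 2, ... ],
  {
    writeConcern: document,
    ordered: boolean
  }
)

| Parameter | Type | Description |
| --- | --- | --- |
| ordered | boolean | Optional. Whether MongoDB must insert the documents in order. Default: true. Unordered inserts can be faster. |

Other parameters are the same as for insertOne.
Return value:

{
  "acknowledged" : true if the operation ran with write concern, false if write concern was disabled
  "insertedIds" : an array with the _id of each successfully inserted document
}

insertMany() inserts in groups. Each group holds at most 1000 operations; a larger batch is split internally into groups of at most 1000 before the insert runs. (This is the limit from the 3.2-era documentation; since MongoDB 3.6 it is 100,000.)
insertMany() is not compatible with **explain()**; use insert() instead.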
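The grouping rule above can be sketched in plain JavaScript. This is only an illustration of the splitting arithmetic, not the server's implementation; `splitIntoBatches` and its `maxBatchSize` parameter are hypothetical names introduced here.

```javascript
// Hypothetical helper: split an array of documents into groups of at most
// `maxBatchSize`, mirroring how a large insertMany() call is divided.
function splitIntoBatches(documents, maxBatchSize = 1000) {
  if (!Number.isInteger(maxBatchSize) || maxBatchSize < 1) {
    throw new RangeError("maxBatchSize must be a positive integer");
  }
  const batches = [];
  for (let start = 0; start < documents.length; start += maxBatchSize) {
    batches.push(documents.slice(start, start + maxBatchSize));
  }
  return batches;
}

// 2500 documents are divided into groups of 1000, 1000 and 500.
const manyDocs = Array.from({ length: 2500 }, (_, i) => ({ seq: i }));
const batchSizes = splitIntoBatches(manyDocs).map((batch) => batch.length);
console.log(batchSizes);
```

In mongosh the whole array would simply be passed to db.collection.insertMany(); the splitting happens behind that call.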
update


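Once set, the value of the _id field cannot be changed, and an existing document cannot be replaced by a replacement document with a different _id value.
By default, update() modifies only the first matching document; the other matching documents are left unchanged.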

updateOne


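New in version 3.2.
Even if the filter matches several documents, at most one document is updated.
Syntax:

db.collection.updateOne(
  filter,
  update,
  {
    upsert: boolean,
    writeConcern: document,
    collation: document
  }
)

| Parameter | Type | Description |
| --- | --- | --- |
| filter | document | The selection criteria. Uses the same query operators as find(); see the find section. |
| update | document | The modifications to apply, written with update operators, e.g. {$set: {field: value}}. More update operators |
| upsert | boolean | If true and no document matches the filter, a new document is built from the filter and the update and then inserted. Default: false. |
| writeConcern | document | Optional. See insert(). |
| collation | document | Optional. The collation (language-specific string comparison rules). |

Return value:

{
   acknowledged: boolean. true if the operation ran with write concern, false if write concern was disabled
   matchedCount: int. The number of documents matched
   modifiedCount: int. The number of documents modified
   upsertedId: the _id of the upserted document, if an upsert took place
}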
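To make the upsert rule concrete, here is a simplified in-memory model of how the new document is composed when nothing matches. It is a sketch under stated assumptions: only plain equality conditions in the filter and the $set / $setOnInsert operators are handled, and `buildUpsertDocument` is a hypothetical name, not a MongoDB API. A real upsert also assigns an _id.

```javascript
// Simplified model of upsert: equality fields from the filter become fields
// of the new document, then $setOnInsert and $set values are applied on top.
function buildUpsertDocument(filter, update) {
  const doc = {};
  for (const [field, condition] of Object.entries(filter)) {
    if (field.startsWith("$")) continue; // skip $and/$or style clauses
    const isOperatorExpr =
      condition !== null &&
      typeof condition === "object" &&
      !Array.isArray(condition) &&
      Object.keys(condition).some((key) => key.startsWith("$"));
    if (!isOperatorExpr) {
      doc[field] = condition; // plain equality condition
    }
  }
  return Object.assign(doc, update.$setOnInsert || {}, update.$set || {});
}

// The equality condition on name is kept; the $gt condition contributes nothing.
const upserted = buildUpsertDocument(
  { name: "alice", age: { $gt: 18 } },
  { $set: { city: "Paris" } }
);
console.log(upserted);
```

The corresponding mongosh call would be db.collection.updateOne(filter, update, {upsert: true}).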

updateMany


New in version 3.2.
Updates every document that matches the filter.
Syntax:

db.collection.updateMany(
  filter,
  update,
  {
    upsert: boolean,
    writeConcern: document,
    collation: document
  }
)

Parameters are the same as for updateOne.
update


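Modifies one or more existing documents in a collection.
By default update() modifies only the first document found. To update several documents, set multi: true.
Syntax:

db.collection.update(
  query,
  update,
  {
    upsert: boolean,
    writeConcern: document,
    multi: boolean
  }
)

| Parameter | Type | Description |
| --- | --- | --- |
| multi | boolean | Optional. If true, updates all matching documents; otherwise only the first match. Default: false. |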


Collation

The collation document:

{
  locale: string,
  caseLevel: boolean,
  caseFirst: string,
  strength: int,
  numericOrdering: boolean,
  alternate: string,
  maxVariable: string,
  backwards: boolean
}

locale is required; all other fields are optional. Details

remove


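Deleting every document in a collection with remove(), deleteOne() or deleteMany() does not drop its indexes. drop() removes the collection's documents together with its indexes.

remove()

Syntax:

db.collection.remove(
  filter,
  {
    justOne: boolean,
    writeConcern: document,
    collation: document
  }
)

| Parameter | Type | Description |
| --- | --- | --- |
| justOne | boolean | Optional. If true, deletes only the first matching document; otherwise deletes all matching documents. Default: false. |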


deleteOne()


Deletes the first matching document. Filter on a uniquely indexed field to delete exactly one specific document.
Syntax:

db.collection.deleteOne(
  filter,
  {
    writeConcern: document,
    collation: document
  }
)

deleteMany()


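Syntax:

db.collection.deleteMany(
  filter,
  {
    writeConcern: document,
    collation: document
  }
)

Delete operations do not drop indexes. To delete everything, it may be more efficient to drop() the whole collection, indexes included, and then recreate the collection and rebuild the indexes.
If most of the records need to be deleted, it is more efficient to back up the collection first and then drop() it.
Reference
remove() requires a query argument and cannot be called without one (presumably to prevent accidental deletion).
Dropping a collection: db.collectionName.drop() drops that collection.
deleteOne()
Deletes the first document that matches the condition.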
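Aggregation (aggregate)

The aggregation framework cannot yet write to a collection, and the aggregation result must stay within 16 MB (from MongoDB: The Definitive Guide, 2nd edition). This describes the MongoDB version covered by the book; since 2.6 the $out stage can write results to a collection and aggregate() returns a cursor, so the 16 MB limit applies per document.

db.user.aggregate([
  {$match: ...},
  {$project: ...},
  {$group: ...},
  .....
])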

$match


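$match filters the documents; later stages then aggregate the filtered subset.
$match accepts all the usual query operators.
Geospatial operators cannot be used (per the book; in current versions the restriction covers $near and $nearSphere, while $geoWithin is allowed).
Place $match as early in the pipeline as possible. Benefits:
   - It filters out unneeded documents, reducing pipeline I/O.
   - It can use indexes, which makes the query faster.
Syntax

{$match: {query}}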

$project


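Selects which fields appear in the output.
Renames fields in the output, e.g. {$project: {userId: "$_id"}} shows _id under the name userId.
The _id field is included by default; exclude it with _id: 0.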

数学表达式


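$add: [expr1[, expr2, ..., exprN]] adds the expressions together.
$subtract: [expr1, expr2] subtracts the second expression from the first.
$multiply: [expr1[, expr2, ..., exprN]] multiplies the expressions together.
$divide: [expr1, expr2] divides the first expression by the second and returns the quotient.
$mod: [expr1, expr2] divides the first expression by the second and returns the remainder.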

日期表达式


The aggregation framework provides expressions that extract parts of a date. They work only on fields of date type.
$year, $month, $week, $dayOfMonth, $dayOfWeek,
$dayOfYear, $hour, $minute, $second
e.g.:

db.user.aggregate([{
  $project: {
    tmpDate: {"$month": "$createdTime"}
  }
}])

字符串表达式


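Basic string operations

$substr: [expr, startOffset, numToReturn]

| Parameter | Type | Notes |
| --- | --- | --- |
| expr | string | The string to take a substring of. |
| startOffset | int | The byte offset at which the substring starts. |
| numToReturn | int | The number of bytes to return. |

$concat: [expr1[, expr2, ..., exprN]]

Concatenates the given expressions and returns the result.
$toLower: expr

Converts to lower case.
expr: string

$toUpper: expr

Converts to upper case.
expr: string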

逻辑表达式


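$cmp: [expr1, expr2]
Compares expr1 and expr2. Returns 0 if expr1 equals expr2,
a negative number if expr1 < expr2,
and a positive number if expr1 > expr2.


$strcasecmp: [str1, str2]
Compares str1 and str2, ignoring case.


$eq / $ne / $gt / $gte / $lt / $lte: [expr1, expr2]
Compares expr1 and expr2 and returns true or false.


$and: [expr1[, expr2, ..., exprN]]
Returns true if all expressions are true, otherwise false.


$or: [expr1[, expr2, ..., exprN]]
Returns true if at least one expression is true, otherwise false.


$not: expr
Returns the boolean opposite of expr.


$cond: [booleanExpr, trueExpr, falseExpr]
Returns trueExpr if booleanExpr is true, otherwise falseExpr.


$ifNull: [expr, replacementExpr]
Returns replacementExpr if expr is null, otherwise expr.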


$group

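Groups documents by the distinct values of a chosen field. The field to group by is passed as _id.
Arithmetic operators

$sum: value

For each document in the group, adds value to the running result.
$avg: value

Returns the average of each group.
Extreme-value operators


$max: expr returns the largest value in the group.
$min: expr returns the smallest value in the group.
$first: expr returns the first value in the group and ignores all later values. It is meaningful only after a sort, when the order is known.
$last: expr is the opposite of $first and returns the last value in the group.

$max and $min examine every document to find the extreme value. They work correctly on unsorted data, but are wasteful when the data is already sorted.
When the data is already sorted the way you need, $first and $last are more efficient.
Array operators


$addToSet: expr adds the value to the result array only if it is not already present. In the returned array each element appears at most once, in no guaranteed order.


$push: expr adds the value to the array whatever it is, and returns the array of all values.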
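The accumulators above can be illustrated with a small in-memory emulation. This is a sketch, not how the server executes $group; the sample data, the field names (dept, salary, name, city) and the function `groupByDept` are all made up for the example.

```javascript
// Emulates, on a plain array, a stage like:
// {$group: {_id: "$dept", total: {$sum: "$salary"}, average: {$avg: "$salary"},
//           highest: {$max: "$salary"}, firstName: {$first: "$name"},
//           names: {$push: "$name"}, cities: {$addToSet: "$city"}}}
function groupByDept(docs) {
  const groups = new Map();
  for (const doc of docs) {
    let g = groups.get(doc.dept);
    if (!g) {
      g = { _id: doc.dept, total: 0, count: 0, highest: -Infinity,
            firstName: doc.name, names: [], cities: new Set() };
      groups.set(doc.dept, g);
    }
    g.total += doc.salary;                        // $sum
    g.count += 1;                                 // needed for $avg
    g.highest = Math.max(g.highest, doc.salary);  // $max scans every value
    g.names.push(doc.name);                       // $push keeps duplicates
    g.cities.add(doc.city);                       // $addToSet keeps one copy
  }
  return [...groups.values()].map((g) => ({
    _id: g._id, total: g.total, average: g.total / g.count, highest: g.highest,
    firstName: g.firstName, names: g.names, cities: [...g.cities],
  }));
}

const staff = [
  { name: "Ann", dept: "dev", salary: 100, city: "Oslo" },
  { name: "Bob", dept: "dev", salary: 300, city: "Oslo" },
  { name: "Cid", dept: "ops", salary: 200, city: "Rome" },
];
const grouped = groupByDept(staff);
console.log(grouped);
```

Note that firstName depends on input order, which is why $first is only meaningful after a sort.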


$unwind


Splits an array so that each of its elements becomes a separate document.
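A minimal in-memory sketch of that behaviour follows. It is simplified: documents whose field is not an array are dropped here, whereas the real stage has more detailed rules for missing and non-array values. The function name `unwind` and the sample documents are assumptions for the example.

```javascript
// Emulates {$unwind: "$<arrayField>"} on a plain array of documents:
// one output document per array element, other fields copied unchanged.
function unwind(docs, arrayField) {
  return docs.flatMap((doc) => {
    const values = doc[arrayField];
    if (!Array.isArray(values)) return []; // simplification, see lead-in
    return values.map((value) => ({ ...doc, [arrayField]: value }));
  });
}

const posts = [
  { _id: 1, tags: ["db", "nosql"] },
  { _id: 2, tags: [] }, // an empty array produces no output documents
];
const unwound = unwind(posts, "tags");
console.log(unwound);
```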

$sort


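Can sort by any field, including fields renamed by a projection.
Sorting in the first stage of the pipeline is strongly recommended, because it can then use an index.
Like $group, $sort cannot work in a streaming fashion: it must receive all documents before it can sort them.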

$limit


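Takes a number n and returns the first n documents of the result set.

$skip


Takes a number n and discards the first n documents of the result set.
Skipping large numbers of documents this way is not recommended; it is inefficient.


db.collection.aggregate([
   {$skip: n}
])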

管道优化


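Put $sort after $match, so that fewer documents have to be sorted.
$skip + $limit: the optimizer moves the $limit in front of the $skip. The new limit is the original limit plus the skip amount.

More pipeline optimizations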
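Why the $skip + $limit rewrite is safe can be checked on a plain array. The two helper functions below are stand-ins for the stages, introduced only for this sketch.

```javascript
// Array stand-ins for the two stages.
const skipStage = (docs, n) => docs.slice(n);
const limitStage = (docs, n) => docs.slice(0, n);

const numbered = Array.from({ length: 20 }, (_, i) => i + 1); // 1..20
const skipCount = 5;
const limitCount = 3;

// As written in the pipeline: $skip 5, then $limit 3.
const asWritten = limitStage(skipStage(numbered, skipCount), limitCount);

// As rewritten by the optimizer: $limit (3 + 5), then $skip 5.
const rewritten = skipStage(limitStage(numbered, limitCount + skipCount), skipCount);

console.log(asWritten, rewritten); // both are [ 6, 7, 8 ]
```

The rewritten order lets earlier stages stop after producing limit + skip documents instead of the whole input.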
mapReduce

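Queries with db.collection.find()


Syntax
db.collection.find(query, projection)

| Parameter | Type | Description |
| --- | --- | --- |
| query | document | Optional. The query filter, i.e. the selection criteria. |
| projection | document | Optional. Specifies the fields to return; all fields are returned by default. Format: {field: 1}. 1 includes the field; 0 excludes it, as does leaving the field out. [Official reference][projection_url] |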


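Range queries

Matches documents where field lies between value1 and value2.

db.collection.find({field:{$gt:value1,$lt:value2}});

Array queries

Matching a single array element

Matches documents whose array field contains value.

db.collection.find({arrayField:value})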
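The two query shapes above can be emulated in memory to show what they select. This sketch handles only equality, $gt and $lt, applies range operators to scalar fields only, and uses made-up names (`matchesField`, `findInMemory`, the sample people); it is not the server's matching logic.

```javascript
// Returns true if a single field value satisfies a query condition.
function matchesField(value, condition) {
  const isOperatorDoc =
    condition !== null && typeof condition === "object" && !Array.isArray(condition);
  if (isOperatorDoc) {
    // Every operator in the document must hold, e.g. {$gt: 18, $lt: 30}.
    return Object.entries(condition).every(([op, operand]) => {
      if (op === "$gt") return value > operand;
      if (op === "$lt") return value < operand;
      throw new Error(`unsupported operator: ${op}`);
    });
  }
  // A scalar condition matches an equal scalar, or an array containing it.
  return Array.isArray(value) ? value.includes(condition) : value === condition;
}

function findInMemory(docs, query) {
  return docs.filter((doc) =>
    Object.entries(query).every(([field, cond]) => matchesField(doc[field], cond)));
}

const people = [
  { name: "a", age: 17, tags: ["x"] },
  { name: "b", age: 25, tags: ["x", "y"] },
  { name: "c", age: 40, tags: [] },
];
const inRange = findInMemory(people, { age: { $gt: 18, $lt: 30 } }).map((p) => p.name);
const taggedY = findInMemory(people, { tags: "y" }).map((p) => p.name);
console.log(inRange, taggedY);
```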

Query operators

Comparison operators

| Name | Description |
| --- | --- |
| $eq | Matches values that are equal to a specified value. |
| $gt | Matches values that are greater than a specified value. |
| $gte | Matches values that are greater than or equal to a specified value. |
| $lt | Matches values that are less than a specified value. |
| $lte | Matches values that are less than or equal to a specified value. |
| $ne | Matches all values that are not equal to a specified value. |
| $in | Matches any of the values specified in an array. |
| $nin | Matches none of the values specified in an array. |

Logical operators

| Name | Description |
| --- | --- |
| $or | {$or:[ { }, { }, ... , { } ] }. Index scans are used only if every clause is supported by an index; otherwise the whole collection is scanned. |
| $and | Same syntax as $or. Supports short-circuit evaluation. |
| $not | {field:{$not:{expression}}}. |
| $nor | Matches documents that fail every one of the given conditions: {$nor:[{expression1},{expression2}]}. This includes documents that do not contain the field. |

More query operators
[projection_url]:http://docs.mongoing.com/manual-zh/reference/method/db.collection.find.html#find-projection
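Indexes

Index types:


_id index: the default index


Single field index
Indexes a single value.


Multikey index
An index created on a field that stores an array.


Compound index
Used when several fields are queried together. The index helps only when the query or sort uses the leading index keys first.


TTL (expiring) index
1. The indexed field must hold an ISODate or an array of ISODate values; otherwise the document is never deleted.
2. For an array of dates, the earliest date decides when the document expires.
3. It cannot be a compound index.
4. Timing is not exact: the background task runs every 60 seconds.


Text index (full-text)
({$text:{$search:"xx "}})
OR search: separate the terms with spaces
NOT search: prefix the term with -
AND (phrase) search: wrap the term in ""
Text search relevance
$meta operator: {score:{$meta:"textScore"}}
Passed after the query condition, it returns the relevance score of each result.
Combined with sort, it orders results by relevance.
A collection can have at most one text index.


Geospatial index


Hashed index
Supports sharding in MongoDB by indexing the hash of a field's value. These indexes have a more random distribution of values across their range, but they support only equality matches and not range-based queries.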

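Geospatial indexes:
Kinds: 2d index, for storing and finding points on a plane
2dsphere index, for storing and finding points on a sphere
2d:
Creating a 2d index: db.collection.createIndex({w:"2d"})
Location format: [longitude, latitude]
Value ranges: longitude -180 to 180, latitude -90 to 90
Query methods:
1. $near: finds the points closest to a given point. Returns the 100 nearest points. $maxDistance: x sets the maximum distance.
2. $geoWithin: finds the points inside a shape
$box: [[x1,y1],[x2,y2]] rectangle
$center: [[x,y],r] circle
$polygon: [[x1,y1],[x2,y2],[x3,y3],...] polygon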
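The $box and $center shapes use flat (planar) geometry, so their membership tests are easy to sketch. The functions below are illustrative stand-ins with hypothetical names, taking arguments in the same layout as the operators above; they are not MongoDB APIs.

```javascript
// Point-in-rectangle test, with the box given as two opposite corners.
function withinBox([x, y], [[x1, y1], [x2, y2]]) {
  return x >= Math.min(x1, x2) && x <= Math.max(x1, x2) &&
         y >= Math.min(y1, y2) && y <= Math.max(y1, y2);
}

// Point-in-circle test, with the circle given as [[centerX, centerY], radius].
function withinCenter([x, y], [[cx, cy], radius]) {
  return Math.hypot(x - cx, y - cy) <= radius;
}

console.log(withinBox([1, 1], [[0, 0], [2, 2]]));   // true
console.log(withinBox([3, 1], [[0, 0], [2, 2]]));   // false
console.log(withinCenter([3, 4], [[0, 0], 5]));     // true, distance is exactly 5
```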


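Index properties:

1. name: a custom index name. By default MongoDB generates one from its naming rules.
2. unique: boolean, default false. Not available for hashed indexes.
3. sparse: default false. If true, the index only references documents that contain the indexed field.
4. expireAfterSeconds: for TTL indexes, in seconds. Controls how long documents are kept in the collection.
5. background: builds the index in the background so that index creation does not block other database activity. Default false.
6. storageEngine: a document. Not covered here.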
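Index management


Listing the indexes on a collection
db.collection.getIndexes();


Dropping an index
**The default _id index cannot be dropped.**


db.collection.dropIndex(index)
Syntax:

| Parameter | Type | Description |
| --- | --- | --- |
| index | string or document | Specifies the index to drop, either by index name or by index specification document. To drop all indexes except _id, use db.collection.dropIndexes(). |

e.g.
db.users.dropIndex("name_1")
db.users.dropIndex({"name":-1})
Modifying an index

Drop the index first, then create it again.
Alternatively use db.collection.reIndex(). It drops all indexes, including the _id index, and rebuilds them, but it locks the database against writes while it runs.


Creating an index
db.collection.createIndex(keys, options)
| parameter | type | description |
| ----------: | -------: | --------: |
| keys | document | The fields to index and their sort order: 1 for ascending, -1 for descending. e.g. {name:1} or {name:-1} |
| options | document | Optional; usually not needed. |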

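Query optimization

1. Create indexes. An index also speeds up queries that routinely sort on the indexed field.
2. Limit the number of results to reduce network load: db.collection.xxx().limit()
3. Return only the data you need, in this form:
db.collection.find({},{field1:1,field2:1,field3:1});
4. Use hint() to select a specific index
db.collection.find().hint(String|document)
e.g. db.user.find().hint({age:1}) is equivalent to db.user.find().hint("age_1");
Aside
{$natural:1} forces a full collection scan; -1 scans in reverse order.
5. Use the increment operator ($inc)
Increments or decrements a value. It is efficient and avoids lock contention, because an operation on a single document is atomic.
If the field does not exist, it is created and set to the specified value.
If the field is null, an error is raised.
{$inc:{<field1>:<amount1>, <field2>:<amount2>, ...}}
Note: to address a field in an embedded document or an array, use dot notation: field.subfield
Array elements are addressed by zero-based index, and the dotted name must be enclosed in quotes.
Details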
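The $inc rules listed above (create a missing field, fail on null, dot notation for embedded fields and array positions) can be modelled on a plain object. `applyInc` is a hypothetical helper written for this sketch; it mutates the object in place and says nothing about atomicity, which only the server provides.

```javascript
// Applies {path: amount} increments to a plain object, following the
// $inc rules described above. Paths use dot notation, e.g. "stats.likes".
function applyInc(doc, increments) {
  for (const [path, amount] of Object.entries(increments)) {
    const keys = path.split(".");
    let target = doc;
    for (const key of keys.slice(0, -1)) {
      if (target[key] === undefined) target[key] = {}; // create missing parents
      target = target[key];
    }
    const last = keys[keys.length - 1];
    if (target[last] === null) {
      throw new TypeError(`cannot apply $inc to null field "${path}"`);
    }
    // A missing field starts from 0, so it ends up equal to the amount.
    target[last] = (target[last] === undefined ? 0 : target[last]) + amount;
  }
  return doc;
}

const page = { views: 1, stats: { likes: 2 }, scores: [10, 20] };
applyInc(page, { views: 1, "stats.likes": -1, "scores.1": 5, shares: 3 });
console.log(page);
```

In mongosh the same change would be written as an update with {$inc: {views: 1, "stats.likes": -1, "scores.1": 5, shares: 3}}.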
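Using indexes efficiently


Negation operators result in a full collection scan.
Ranges: when designing a compound index for a query with several conditions, put the exact-match fields first and the range-match field last. The query can then be roughly 10 times faster than with the range field first, even though both layouts use an index.
$or runs separate subqueries and merges the result sets. $in performs better, but the order of the returned results is not guaranteed.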


Query plan

Syntax: db.collection.explain(verbosity)

| Parameter | Type | Description |
| --- | --- | --- |
| verbosity | string | "queryPlanner", "executionStats" or "allPlansExecution". Default mode is "queryPlanner". |


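Miscellaneous

ObjectId:

Small, almost certainly unique, and fast to generate. An ObjectId consists of 12 bytes: the first 4 bytes are a timestamp and the next 3 bytes are a machine identifier; in the layout documented at the time these are followed by a 2-byte process id and a 3-byte counter. Sorting by ObjectId sorts by creation time, and ObjectId.getTimestamp() returns that time.
ObjectIds created within the same second by several threads, processes or systems are not strictly ordered by insertion time.
Once set, the value of the _id field cannot be updated, and an existing document cannot be replaced by a replacement document with a different _id value.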
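Because the first 4 bytes hold the creation time in seconds, the timestamp can be read straight from the 24-character hex form of an ObjectId. The sketch below does this without a driver; `objectIdTimestamp` is a hypothetical helper, and the sample id is just an illustrative value.

```javascript
// Reads the creation time from the hex string of an ObjectId.
// The first 8 hex characters (4 bytes) are seconds since the Unix epoch.
function objectIdTimestamp(hex) {
  if (!/^[0-9a-fA-F]{24}$/.test(hex)) {
    throw new TypeError("expected a 24-character hex string");
  }
  const seconds = parseInt(hex.slice(0, 8), 16);
  return new Date(seconds * 1000);
}

const created = objectIdTimestamp("507f1f77bcf86cd799439011");
console.log(created.toISOString()); // 2012-10-17T21:13:27.000Z
```

In mongosh, ObjectId("507f1f77bcf86cd799439011").getTimestamp() gives the same instant.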
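Atomicity

In MongoDB, reads and writes of a single document are atomic. Operations on multiple documents are not atomic (outside of multi-document transactions, which are available since version 4.0).
MongoDB supports multiple storage engines: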

WiredTiger Storage Engine
http://docs.mongoing.com/manual-zh/core/wiredtiger.html
MMAPv1 Storage Engine
http://docs.mongoing.com/manual-zh/core/mmapv1.html
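Embedding or referencing data in MongoDB

Data in MongoDB is not modelled exactly as in a relational database. The table below can help decide between embedding and referencing.

| Embedding | Referencing |
| --- | --- |
| Subdocuments are small | Subdocuments are large |
| Data does not change regularly | Data changes often |
| Eventual consistency is acceptable | Intermediate states must be consistent |
| Documents grow by a small amount | Documents grow by a large amount |
| Data that usually needs a second query to fetch | Data that is usually not included in the results |
| Fast reads | Fast writes |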