Study notes for the MongoDB University course M001: MongoDB Basics.
Course URL: https://university.mongodb.com/mercury/M001/2020_November_17
Chapter 1: What is MongoDB?
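MongoDB is a NoSQL document database: it stores documents, and documents are stored in collections.
A document is made up of a series of fields and values, which can also be thought of as key-value pairs.
MongoDB Atlas is MongoDB's cloud database service. It runs on AWS, Google Cloud Platform, or Azure.
A replica set is a group of MongoDB servers that store the same data, with one primary and the rest secondaries.
A cluster is a group of MongoDB servers that store your data. In a sharded cluster the data is split across shards, and each shard can itself be a replica set.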
If you have a MongoDB University account, you can create a free-tier Atlas environment, configured as follows:
Atlas Organization: MDBU
Project: M001
Free Tier Atlas cluster: Sandbox
The Atlas login URL is:
https://account.mongodb.com/account/login
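A project belongs to an organization, and a cluster belongs to a project.
When creating a cluster you can choose the public cloud infrastructure. I chose the AWS Singapore region, which comes with about 512 MB of storage.
Once the cluster is created, you can load the sample data by following the Atlas documentation on sample data. Based on that documentation, you can also download the sample data and import it into a MongoDB instance you run yourself: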
curl https://atlas-education.s3.amazonaws.com/sampledata.archive -o sampledata.archive
mongorestore --archive=sampledata.archive --drop
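Both data import and monitoring can be done from the Atlas web interface. Connecting to the cluster with the mongo shell then looks like this:
mongo "mongodb+srv://m001-student:m001-mongodb-basics@sandbox.iyxgf.mongodb.net/admin"
The quoted part is the connection string.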
Chapter 2: Importing, Exporting, and Querying Data
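Documents are written in JSON format, but are actually stored in the BSON encoding.
A JSON document is enclosed in {}. A key and its value are separated by :, key/value pairs are separated by ,, and every key must be enclosed in "". In MongoDB a key is called a field. For the full syntax, see the JSON specification.
JSON is readable but space-hungry. BSON is more space-efficient and faster, and also more flexible because it supports more data types.
JSON files are imported and exported with mongoimport and mongoexport; BSON files are imported and exported with mongorestore and mongodump.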
In the examples below, --drop tells mongorestore and mongoimport to drop the existing collection before loading the data:
mongodump --uri "mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies"
mongoexport --uri="mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --collection=sales --out=sales.json
mongorestore --uri "mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --drop dump
mongoimport --uri="mongodb+srv://<your username>:<your password>@<your cluster>.mongodb.net/sample_supplies" --drop sales.json
<databaseName>.<collectionName> is called a namespace. In the shell, db refers to the current database, for example:
> db.getName()
sample_training
> db.zips.findOne()
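Next comes the MongoDB Query Language, MQL for short, which plays a role similar to SQL.
find is similar to SELECT in SQL. In the example below the shell prints 20 documents at a time; typing it (short for iterate) advances the cursor to the next batch: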
> show dbs
admin 0.000GB
config 0.000GB
local 0.000GB
sample_airbnb 0.050GB
sample_analytics 0.009GB
sample_geospatial 0.001GB
sample_mflix 0.040GB
sample_restaurants 0.006GB
sample_supplies 0.001GB
sample_training 0.039GB
sample_weatherdata 0.002GB
> use sample_training
switched to db sample_training
> db.getName()
sample_training
> show collections
companies
grades
inspections
posts
routes
trips
zips
> db.zips.find({"state": "NY"})
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f89"), "city" : "FISHERS ISLAND", "zip" : "06390", "loc" : { "y" : 41.263934, "x" : 72.017834 }, "pop" : 329, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f8a"), "city" : "NEW YORK", "zip" : "10001", "loc" : { "y" : 40.74838, "x" : 73.996705 }, "pop" : 18913, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f8b"), "city" : "NEW YORK", "zip" : "10003", "loc" : { "y" : 40.731253, "x" : 73.989223 }, "pop" : 51224, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f8c"), "city" : "GOVERNORS ISLAND", "zip" : "10004", "loc" : { "y" : 40.693604, "x" : 74.019025 }, "pop" : 3593, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f8d"), "city" : "NEW YORK", "zip" : "10005", "loc" : { "y" : 40.705649, "x" : 74.008344 }, "pop" : 202, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f8f"), "city" : "NEW YORK", "zip" : "10006", "loc" : { "y" : 40.708451, "x" : 74.013474 }, "pop" : 119, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f90"), "city" : "NEW YORK", "zip" : "10009", "loc" : { "y" : 40.726188, "x" : 73.979591 }, "pop" : 57426, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f92"), "city" : "NEW YORK", "zip" : "10010", "loc" : { "y" : 40.737476, "x" : 73.981328 }, "pop" : 24907, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f93"), "city" : "NEW YORK", "zip" : "10002", "loc" : { "y" : 40.715231, "x" : 73.987681 }, "pop" : 84143, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f94"), "city" : "NEW YORK", "zip" : "10012", "loc" : { "y" : 40.72553, "x" : 73.998284 }, "pop" : 26365, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f95"), "city" : "NEW YORK", "zip" : "10011", "loc" : { "y" : 40.740225, "x" : 73.99963 }, "pop" : 46560, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f96"), "city" : "NEW YORK", "zip" : "10007", "loc" : { "y" : 40.713905, "x" : 74.007022 }, "pop" : 3374, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f97"), "city" : "NEW YORK", "zip" : "10013", "loc" : { "y" : 40.718511, "x" : 74.002529 }, "pop" : 21860, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f98"), "city" : "NEW YORK", "zip" : "10014", "loc" : { "y" : 40.73393, "x" : 74.005421 }, "pop" : 31147, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f9a"), "city" : "NEW YORK", "zip" : "10017", "loc" : { "y" : 40.75172, "x" : 73.970661 }, "pop" : 12465, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f9b"), "city" : "NEW YORK", "zip" : "10018", "loc" : { "y" : 40.754713, "x" : 73.992503 }, "pop" : 4834, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f9c"), "city" : "NEW YORK", "zip" : "10019", "loc" : { "y" : 40.765069, "x" : 73.985834 }, "pop" : 36602, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72f9f"), "city" : "NEW YORK", "zip" : "10020", "loc" : { "y" : 40.759729, "x" : 73.982347 }, "pop" : 393, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72fa0"), "city" : "NEW YORK", "zip" : "10021", "loc" : { "y" : 40.768476, "x" : 73.958805 }, "pop" : 106564, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca72fa1"), "city" : "NEW YORK", "zip" : "10016", "loc" : { "y" : 40.744281, "x" : 73.978134 }, "pop" : 51561, "state" : "NY" }
Type "it" for more
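The batch-by-batch behaviour of it can be pictured with a small sketch. This is plain JavaScript over an in-memory array, not the shell's implementation; makeCursor and its batchSize parameter are names invented for this illustration, and the batch size of 20 is taken from the output above.

```javascript
// Illustrative only: an in-memory stand-in for a shell cursor.
// makeCursor is a hypothetical helper, not a MongoDB API.
function makeCursor(docs, batchSize = 20) {
  let position = 0;
  return {
    hasNext: () => position < docs.length,
    // Return the next batch, like typing "it" in the shell.
    it() {
      const batch = docs.slice(position, position + batchSize);
      position += batch.length;
      return batch;
    },
  };
}

// 45 fake documents standing in for a query result.
const results = Array.from({ length: 45 }, (_, i) => ({ _id: i }));
const cursor = makeCursor(results);
const batchSizes = [];
while (cursor.hasNext()) {
  batchSizes.push(cursor.it().length);
}
console.log(batchSizes); // [ 20, 20, 5 ]
```

Nothing is fetched beyond the batch that was asked for, which is why a large result set does not flood the terminal.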
Common forms of find:
> db.zips.find({"state": "NY"}).count()
1596
>
> db.zips.find({"state": "NY", "city": "ALBANY"})
{ "_id" : ObjectId("5c8eccc1caa187d17ca731d0"), "city" : "ALBANY", "zip" : "12204", "loc" : { "y" : 42.684667, "x" : 73.735364 }, "pop" : 6927, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731d4"), "city" : "ALBANY", "zip" : "12206", "loc" : { "y" : 42.668326, "x" : 73.774406 }, "pop" : 17230, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731d5"), "city" : "ALBANY", "zip" : "12207", "loc" : { "y" : 42.658133, "x" : 73.752327 }, "pop" : 2709, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731d6"), "city" : "ALBANY", "zip" : "12208", "loc" : { "y" : 42.655989, "x" : 73.796357 }, "pop" : 22041, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731d7"), "city" : "ALBANY", "zip" : "12209", "loc" : { "y" : 42.641665, "x" : 73.785385 }, "pop" : 10008, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731db"), "city" : "ALBANY", "zip" : "12202", "loc" : { "y" : 42.641314, "x" : 73.764071 }, "pop" : 11097, "state" : "NY" }
{ "_id" : ObjectId("5c8eccc1caa187d17ca731de"), "city" : "ALBANY", "zip" : "12210", "loc" : { "y" : 42.65677, "x" : 73.76052 }, "pop" : 9374, "state" : "NY" }
>
> db.zips.find({"state": "NY", "city": "ALBANY"}).pretty()
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731d0"),
	"city" : "ALBANY",
	"zip" : "12204",
	"loc" : {
		"y" : 42.684667,
		"x" : 73.735364
	},
	"pop" : 6927,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731d4"),
	"city" : "ALBANY",
	"zip" : "12206",
	"loc" : {
		"y" : 42.668326,
		"x" : 73.774406
	},
	"pop" : 17230,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731d5"),
	"city" : "ALBANY",
	"zip" : "12207",
	"loc" : {
		"y" : 42.658133,
		"x" : 73.752327
	},
	"pop" : 2709,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731d6"),
	"city" : "ALBANY",
	"zip" : "12208",
	"loc" : {
		"y" : 42.655989,
		"x" : 73.796357
	},
	"pop" : 22041,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731d7"),
	"city" : "ALBANY",
	"zip" : "12209",
	"loc" : {
		"y" : 42.641665,
		"x" : 73.785385
	},
	"pop" : 10008,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731db"),
	"city" : "ALBANY",
	"zip" : "12202",
	"loc" : {
		"y" : 42.641314,
		"x" : 73.764071
	},
	"pop" : 11097,
	"state" : "NY"
}
{
	"_id" : ObjectId("5c8eccc1caa187d17ca731de"),
	"city" : "ALBANY",
	"zip" : "12210",
	"loc" : {
		"y" : 42.65677,
		"x" : 73.76052
	},
	"pop" : 9374,
	"state" : "NY"
}
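findOne returns a single document (the first match in natural order, not a random one), which is handy for inspecting the shape of the data: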
> db.zips.findOne()
{
	"_id" : ObjectId("5c8eccc1caa187d17ca6ed16"),
	"city" : "ALPINE",
	"zip" : "35014",
	"loc" : {
		"y" : 33.331165,
		"x" : 86.208934
	},
	"pop" : 3062,
	"state" : "AL"
}
Chapter 3: Creating and Manipulating Documents
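This chapter covers inserting (insert), updating (updateOne, updateMany) and deleting (deleteOne, deleteMany) documents.
Every document has an "_id" field whose value makes the document unique within its collection. You can supply the value yourself or let the system generate a default ObjectId().
insert can insert several documents at once: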
db.inspections.insert([ { "test": 1 }, { "test": 2 }, { "test": 3 } ])
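insert accepts an ordered option, which defaults to true: on the first error, the remaining documents are not inserted. For example, the following statement inserts only the first document, because the second one has a duplicate "_id":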
db.inspections.insert([{ "_id": 1, "test": 1 },{ "_id": 1, "test": 2 },{ "_id": 3, "test": 3 }])
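With ordered set to false, the remaining documents are still attempted after an error. The following statement inserts the first and third documents: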
db.inspections.insert([{ "_id": 1, "test": 1 },{ "_id": 1, "test": 2 },{ "_id": 3, "test": 3 }],{ "ordered": false })
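The difference between the two modes can be mimicked in a few lines of plain JavaScript. This is a sketch under stated assumptions, not how the server works internally: insertMany below is a hypothetical in-memory helper that uses a Map keyed by _id, and its result shape is invented for the illustration.

```javascript
// Illustrative only: mimics how the "ordered" option treats a duplicate _id.
// insertMany here is a hypothetical in-memory helper, not the driver method.
function insertMany(collection, docs, { ordered = true } = {}) {
  const errors = [];
  let nInserted = 0;
  for (const doc of docs) {
    if (collection.has(doc._id)) {
      errors.push(`duplicate _id: ${doc._id}`);
      if (ordered) break; // ordered: stop at the first error
      continue;           // unordered: skip the bad document and carry on
    }
    collection.set(doc._id, doc);
    nInserted += 1;
  }
  return { nInserted, errors };
}

const docs = [{ _id: 1, test: 1 }, { _id: 1, test: 2 }, { _id: 3, test: 3 }];
const orderedResult = insertMany(new Map(), docs);
const unorderedResult = insertMany(new Map(), docs, { ordered: false });
console.log(orderedResult.nInserted, unorderedResult.nInserted); // 1 2
```

In both runs the duplicate is rejected; the option only decides whether the documents after it still get a chance.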
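updateOne updates the first document that matches the filter; updateMany updates all matching documents.
As the examples below show, the update operators include $inc, $set, $push and $unset: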
db.zips.updateMany({ "city": "HUDSON" }, { "$inc": { "pop": 10 } })
db.zips.updateOne({ "zip": "12534" }, { "$set": { "pop": 17630 } })
db.grades.updateOne({ "student_id": 250, "class_id": 339 },{ "$push": { "scores": { "type": "extra credit","score": 100 }}})
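$inc increments a numeric field by the given amount. $set assigns a new value to a field, adding the field if it does not exist. $push appends an element to an array field, creating the array if the field does not exist. $unset removes a field.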
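To make the effect of each operator concrete, here is a minimal sketch that applies them to a plain JavaScript object. applyUpdate is a hypothetical helper written for this note; it only handles top-level fields and is not how the server applies updates.

```javascript
// Illustrative only: a tiny in-memory version of four update operators.
// applyUpdate is a hypothetical helper; it handles top-level fields only.
function applyUpdate(doc, update) {
  const result = JSON.parse(JSON.stringify(doc)); // work on a copy
  for (const [field, amount] of Object.entries(update.$inc ?? {})) {
    result[field] = (result[field] ?? 0) + amount; // a missing field starts at 0
  }
  for (const [field, value] of Object.entries(update.$set ?? {})) {
    result[field] = value; // creates the field if it is missing
  }
  for (const [field, value] of Object.entries(update.$push ?? {})) {
    result[field] = [...(result[field] ?? []), value]; // appends to an array
  }
  for (const field of Object.keys(update.$unset ?? {})) {
    delete result[field]; // removes the field entirely
  }
  return result;
}

const before = { zip: "12534", pop: 100, scores: [{ type: "exam", score: 90 }] };
const after = applyUpdate(before, {
  $inc: { pop: 10 },
  $set: { city: "HUDSON" },
  $push: { scores: { type: "extra credit", score: 100 } },
  $unset: { zip: "" },
});
console.log(after.pop, after.city, after.scores.length, "zip" in after); // 110 HUDSON 2 false
```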
Deletion likewise comes as two commands, deleteOne and deleteMany:
db.inspections.deleteMany({ "test": 1 })
db.inspections.deleteOne({ "test": 3 })
To delete a whole collection, use drop:
db.inspections.drop()
A collection continues to exist even when it holds no documents. Once every collection in a database has been dropped, however, the database itself no longer exists.
> db.xy.insert({a:1})
WriteResult({ "nInserted" : 1 })
> show collections
companies
grades
inspections
posts
routes
trips
xy
zips
> db.xy.deleteMany({})
{ "acknowledged" : true, "deletedCount" : 1 }
> show collections
companies
grades
inspections
posts
routes
trips
xy
zips
> db.xy.drop()
true
Chapter 4: Advanced CRUD Operations
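Like update statements, queries have their own operators, such as $eq, $ne, $gt, $lt, $gte and $lte. When no operator is given, $eq is the default.
First come the comparison operators, which take the form: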
{<field> : { <operator> : <value> } }
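Operators are prefixed with $, but $ has other uses as well, for example in the aggregation pipeline.
Operators are used in the form below, and several can be combined: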
{ field : {$operator: value}}
Examples:
db.zips.find({pop:{"$lt":1000}}).count()
db.trips.find({ "tripduration": { "$lte" : 70 },"usertype": { "$ne": "Subscriber" } }).pretty()
db.trips.find({"birth year" : 1998})
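How such a filter is evaluated can be sketched in plain JavaScript. The matches helper and the sample trips below are made up for illustration; the sketch covers top-level fields and the six comparison operators only, and ignores MongoDB's rules for comparing values of different types.

```javascript
// Illustrative only: evaluates { field: { $op: value } } filters in memory.
// matches is a hypothetical helper covering top-level fields and six operators.
const comparisons = {
  $eq: (a, b) => a === b,
  $ne: (a, b) => a !== b,
  $gt: (a, b) => a > b,
  $gte: (a, b) => a >= b,
  $lt: (a, b) => a < b,
  $lte: (a, b) => a <= b,
};

function matches(doc, filter) {
  // Every field condition must hold.
  return Object.entries(filter).every(([field, condition]) => {
    const isOperatorDoc =
      condition !== null && typeof condition === "object" && !Array.isArray(condition);
    // A bare value is shorthand for $eq.
    const ops = isOperatorDoc ? condition : { $eq: condition };
    return Object.entries(ops).every(([op, value]) => comparisons[op](doc[field], value));
  });
}

const trips = [
  { tripduration: 60, usertype: "Customer", "birth year": 1998 },
  { tripduration: 65, usertype: "Subscriber", "birth year": 1998 },
  { tripduration: 300, usertype: "Customer", "birth year": 1975 },
];
const shortNonSubscribers = trips.filter((t) =>
  matches(t, { tripduration: { $lte: 70 }, usertype: { $ne: "Subscriber" } })
);
const bornIn1998 = trips.filter((t) => matches(t, { "birth year": 1998 }));
console.log(shortNonSubscribers.length, bornIn1998.length); // 1 2
```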
Next come the logical operators: $and, $or, $nor and $not.
The syntax of $not is:
{ "operator" : { <clause> }}
The other logical operators take an array of clauses:
{ "operator" : [ { <clause> }, { <clause> }, ...]}
As with $eq, when no logical operator is given the default is $and, known as an implicit and. For example, the following two statements are equivalent:
> db.inspections.find({"$and":[{"result": "Out of Business"}, {"sector": "Home Improvement Contractor - 100"}]}).count()
> db.inspections.find({"result": "Out of Business", "sector": "Home Improvement Contractor - 100"}).count()
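The following two statements also return the same count, but neither performs the intended $and: in the first one, the second document passed to find is treated as a projection, not as a second condition.
db.inspections.find({"result": "Out of Business"}, {"sector": "Home Improvement Contractor - 100"}).count()
db.inspections.find({"result": "Out of Business"}).count()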
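The following three forms express the same condition, each more concise than the last. The second is only a conceptual intermediate step: passed to find as written, its second document would be treated as a projection.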
{"$and" : [ { id : { "$gt" : 25 } }, { id : { "$lt" : 100 }}]}
{ id : { "$gt" : 25 } }, { id : { "$lt" : 100 }}
{ id : {"$gt" : 25, "$lt" : 100} }
Example:
db.zips.find({pop : {"$lte": 1000000, "$gte":5000}}).count()
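Implicit and explicit and are not always interchangeable. The following query needs the explicit $and: written as an implicit and, the two $or keys would sit in the same document and collide, giving a completely different result: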
db.companies.find({"$and" : [{"$or" : [{"category_code" : "web"}, {"category_code" : "social"} ]}, {"$or": [{"founded_month" : 10}, {"founded_year" : 2004}]}]})
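The reason an explicit $and is needed when the same operator appears twice comes from JavaScript itself: the shell builds the filter as an object literal, and an object literal keeps only the last value for a repeated key. The variable names below are invented for this demonstration.

```javascript
// Plain JavaScript behaviour: a repeated key in an object literal keeps only the last value.
const implicitAttempt = {
  $or: [{ category_code: "web" }, { category_code: "social" }],
  $or: [{ founded_month: 10 }, { founded_year: 2004 }],
};
// Wrapping both clauses in an array under $and keeps them both.
const explicitAnd = {
  $and: [
    { $or: [{ category_code: "web" }, { category_code: "social" }] },
    { $or: [{ founded_month: 10 }, { founded_year: 2004 }] },
  ],
};
console.log(Object.keys(implicitAttempt).length); // 1
console.log(implicitAttempt.$or[0]);              // { founded_month: 10 }
console.log(explicitAnd.$and.length);             // 2
```

The first $or is silently discarded before the query is even sent, so no error is reported and the result is simply wrong.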
$expr is the expressive query operator. Its form is:
{ "$expr" : { <expression>}}
$expr allows the use of variables and conditional statements. For example:
db.trips.find({ "$expr": { "$eq": [ "$end station id", "$start station id"] }}).count()
This example shows that, besides marking an operator, $ can also refer to the value of a field.
More examples:
db.trips.find({ "$expr": { "$eq": [ "$end station id", "$start station id"] }}).count()
db.trips.find({ "$expr": { "$and": [ { "$gt": [ "$tripduration", 1200 ]},{ "$eq": [ "$end station id", "$start station id" ]}]}}).count()
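$expr is loosely comparable to eval in a shell: inside $expr, anything that starts with $ and is not an operator is treated as a variable, that is, a reference to the value of the named field.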
Ordinary MQL syntax is { <field> : { <operator> : <value> } }, whereas $expr uses aggregation-style syntax, in which the operator comes first and takes its operands as a list:
{ <operator> : [ <field>, <value> ] }
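The operator-first style and the "$field" references can be illustrated with a very small evaluator. This is a sketch for intuition only: evaluate is a hypothetical function supporting just $eq, $gt and $and, and the sample trips are invented.

```javascript
// Illustrative only: a minimal evaluator for a few $expr-style expressions.
// evaluate is a hypothetical helper; it handles $eq, $gt and $and only.
function evaluate(expr, doc) {
  // A string starting with "$" is a field path: look up that field's value.
  if (typeof expr === "string" && expr.startsWith("$")) {
    return doc[expr.slice(1)];
  }
  // Anything that is not an operator document is a literal value.
  if (expr === null || typeof expr !== "object") {
    return expr;
  }
  const [operator, args] = Object.entries(expr)[0];
  const values = args.map((arg) => evaluate(arg, doc));
  switch (operator) {
    case "$eq":
      return values[0] === values[1];
    case "$gt":
      return values[0] > values[1];
    case "$and":
      return values.every(Boolean);
    default:
      throw new Error(`unsupported operator: ${operator}`);
  }
}

const trips = [
  { tripduration: 1500, "start station id": 3, "end station id": 3 },
  { tripduration: 600, "start station id": 5, "end station id": 5 },
  { tripduration: 2000, "start station id": 7, "end station id": 9 },
];
const sameStation = { $eq: ["$end station id", "$start station id"] };
const roundTrips = trips.filter((t) => evaluate(sameStation, t));
const longRoundTrips = trips.filter((t) =>
  evaluate({ $and: [{ $gt: ["$tripduration", 1200] }, sameStation] }, t)
);
console.log(roundTrips.length, longRoundTrips.length); // 2 1
```

Comparing two fields of the same document, as sameStation does, is exactly what the ordinary field-first syntax cannot express.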
Now for the array operators. $push was introduced earlier. Two more are $all and $size:
db.listingsAndReviews.find({ "amenities": {"$size": 20,"$all": [ "Internet", "Wifi", "Kitchen","Heating", "Family/kid friendly","Washer", "Dryer", "Essentials","Shampoo", "Hangers","Hair dryer", "Iron","Laptop friendly workspace" ]}}).pretty()
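$all matches arrays that contain every one of the listed values, in any order. $size matches arrays with exactly the given number of elements.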
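The two checks are easy to restate in plain JavaScript. hasAll and hasSize are hypothetical helpers written for this note, and the sample listing is invented.

```javascript
// Illustrative only: in-memory versions of the $all and $size checks.
// hasAll and hasSize are hypothetical helpers, not MongoDB functions.
const hasAll = (array, required) => required.every((item) => array.includes(item));
const hasSize = (array, size) => array.length === size;

const listing = { amenities: ["Wifi", "Kitchen", "Heating"] };
console.log(hasAll(listing.amenities, ["Heating", "Wifi"])); // true: order does not matter
console.log(hasAll(listing.amenities, ["Wifi", "Pool"]));    // false: "Pool" is missing
console.log(hasSize(listing.amenities, 3));                  // true: exactly 3 elements
console.log(hasSize(listing.amenities, 2));                  // false: the length must match exactly
```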
Examples:
db.listingsAndReviews.find({ "reviews": { "$size":50 },"accommodates": { "$gt":6 }})
db.listingsAndReviews.find({ "property_type": "House","amenities": "Changing table" }).count()
When find is given two {} arguments, the second is the projection. In a projection, 0 excludes a field and 1 includes it:
db.<collection>.find({<query>}, {<projection>})
For example:
db.listingsAndReviews.find({ "amenities": "Wifi" },{ "price": 1, "address": 1, "_id": 0 }).pretty()
By default "_id" is always returned, unless the projection explicitly excludes it with "_id": 0.
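A short sketch of inclusion-style projection, including the special treatment of _id. The project function and the sample listing are assumptions made for this illustration; only top-level fields and inclusion (1) are handled.

```javascript
// Illustrative only: inclusion-style projection on top-level fields.
// project is a hypothetical helper, not a MongoDB API.
function project(doc, projection) {
  const result = {};
  // _id is returned unless the projection sets it to 0.
  if (projection._id !== 0 && "_id" in doc) {
    result._id = doc._id;
  }
  for (const [field, flag] of Object.entries(projection)) {
    if (field !== "_id" && flag === 1 && field in doc) {
      result[field] = doc[field];
    }
  }
  return result;
}

const listing = { _id: 42, name: "Loft", price: 80, address: { country: "Spain" } };
const withId = project(listing, { price: 1, address: 1 });
const withoutId = project(listing, { price: 1, address: 1, _id: 0 });
console.log(Object.keys(withId));    // [ '_id', 'price', 'address' ]
console.log(Object.keys(withoutId)); // [ 'price', 'address' ]
```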
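Another array operator is $elemMatch, which matches when at least one array element satisfies all the given conditions. It can be used in the projection (first statement) or in the query (second statement):
db.grades.find({ "class_id": 431 },{ "scores": { "$elemMatch": { "score": { "$gt": 85 } } }}).pretty()
db.grades.find({ "scores": { "$elemMatch": { "type": "extra credit" } }}).pretty()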
Sub-documents can be accessed with dot notation:
db.trips.findOne({ "start station location.type": "Point" })
Note that a field name written in dot notation must be quoted:
db.inspections.find({ "address.city": "NEW YORK" }).count()
The following is wrong, because an unquoted key containing a dot is not valid syntax:
db.inspections.find({ address.city: "NEW YORK" }).count()
If the field is an array, a numeric position can be used in the path; 0 refers to the first element:
db.companies.find({ "relationships.0.person.first_name": "Mark","relationships.0.title": {"$regex": "CEO" } },{ "name": 1 }).pretty()
Dot notation can also be combined with the array operator $elemMatch:
db.companies.find({ "relationships":{ "$elemMatch": { "is_past": true,"person.first_name": "Mark" } } },{ "name": 1 }).pretty()
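How a dot-notation path walks into sub-documents and array positions can be sketched as follows. getPath and the sample company are invented for this illustration; unlike MongoDB, the sketch does not search all array elements when the position is omitted.

```javascript
// Illustrative only: resolves a dot-notation path against a nested document.
// getPath is a hypothetical helper; numeric segments index into arrays.
function getPath(doc, path) {
  return path.split(".").reduce(
    (value, key) => (value === null || value === undefined ? undefined : value[key]),
    doc
  );
}

const company = {
  name: "Example Co",
  relationships: [
    { title: "Founder and CEO", person: { first_name: "Mark" } },
    { title: "CTO", person: { first_name: "Ann" } },
  ],
};
console.log(getPath(company, "relationships.0.person.first_name")); // Mark
console.log(getPath(company, "relationships.1.title"));             // CTO
console.log(getPath(company, "relationships.5.title"));             // undefined
```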
Chapter 5: Indexing and Aggregation Pipeline
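Aggregation is called a framework because it builds on top of MQL.
Processing works like a pipeline: the data passes through a series of stages, each working on the output of the previous one.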
Vocabulary from the lecture: siphon (to draw off through a tube), amenity (a convenience or facility).
The following two statements do the same thing:
db.listingsAndReviews.find({ "amenities": "Wifi" },{ "price": 1, "address": 1, "_id": 0 }).pretty()
db.listingsAndReviews.aggregate([{ "$match": { "amenities": "Wifi" } },{ "$project": { "price": 1,"address": 1,"_id": 0 }}]).pretty()
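Aggregation stages also start with $, for example $match, $project and $group.
$group can implement distinct, group by, sum and similar aggregate operations. Note that a field must be referenced with a leading $, as in "$address.country"; writing "address.country" would be taken as a literal string: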
> db.listingsAndReviews.aggregate([ { "$project": { "address": 1, "_id": 0 }},{ "$group": { "_id": "$address.country" }}])
{ "_id" : "Canada" }
{ "_id" : "Brazil" }
{ "_id" : "Australia" }
{ "_id" : "China" }
{ "_id" : "Portugal" }
{ "_id" : "Turkey" }
{ "_id" : "Hong Kong" }
{ "_id" : "Spain" }
{ "_id" : "United States" }
> db.listingsAndReviews.aggregate([{ "$project": { "address": 1, "_id": 0 }},{ "$group": { "_id": "$address.country","count": { "$sum": 1 } } }])
{ "_id" : "Canada", "count" : 649 }
{ "_id" : "Brazil", "count" : 606 }
{ "_id" : "Australia", "count" : 610 }
{ "_id" : "China", "count" : 19 }
{ "_id" : "Portugal", "count" : 555 }
{ "_id" : "Turkey", "count" : 661 }
{ "_id" : "Hong Kong", "count" : 600 }
{ "_id" : "Spain", "count" : 633 }
{ "_id" : "United States", "count" : 1222 }
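What the $group stage with { "$sum": 1 } computes can be restated as an ordinary loop. The three sample listings are invented, and the Map-based approach is just one way to express the idea, not the server's implementation.

```javascript
// Illustrative only: groups documents by address.country and counts each group,
// mirroring { $group: { _id: "$address.country", count: { $sum: 1 } } }.
const listings = [
  { address: { country: "Spain" } },
  { address: { country: "Canada" } },
  { address: { country: "Spain" } },
];

const counts = new Map();
for (const listing of listings) {
  const key = listing.address.country;         // the group _id
  counts.set(key, (counts.get(key) ?? 0) + 1); // $sum: 1 adds one per document
}
const grouped = [...counts].map(([_id, count]) => ({ _id, count }));
console.log(grouped);
// [ { _id: 'Spain', count: 2 }, { _id: 'Canada', count: 1 } ]
```

Dropping the count and keeping only the keys gives the distinct list of countries, which is what the first pipeline above produces.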
Cursor methods such as pretty, sort, limit and count do not modify the stored data:
db.zips.find().sort({ "pop": 1 }).limit(1)
db.zips.find().sort({ "pop": -1 }).limit(10)
db.zips.find().sort({ "pop": 1, "city": -1 })
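When sort and limit are combined, sort is normally written first. If you write them the other way round, MongoDB still applies the sort before the limit. In sort, 1 means ascending and -1 means descending.
Note that null is treated as the lowest value when sorting, so it is sometimes necessary to exclude nulls. The following two statements have the same effect:
db.companies.find({ "founded_year": { "$ne": null }},{ "name": 1, "founded_year": 1 }).sort({ "founded_year": 1 }).limit(5)
db.companies.find({ "founded_year": { "$ne": null }},{ "name": 1, "founded_year": 1 }).limit(5).sort({ "founded_year": 1 })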
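Indexes are the most effective way to speed up queries. They work much like the index at the back of a book, which tells you on which pages a keyword appears.
Examples (in createIndex, the value for each field is the index direction, 1 for ascending):
db.trips.find({ "birth year": 1989 })
db.trips.find({ "start station id": 476 }).sort( { "birth year": 1 } )
db.trips.createIndex({ "birth year": 1 })
db.trips.createIndex({ "start station id": 1, "birth year": 1 })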
Sorting is resource-intensive; where possible, let an index supply the order so that no separate sort step is needed.
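The book-index analogy can be shown with a lookup table. This is only an analogy in code: MongoDB's indexes are B-tree structures, while the sketch below uses a JavaScript Map, and the sample trips are invented.

```javascript
// Illustrative only: a lookup table from "birth year" to matching documents,
// contrasted with scanning every document. A Map stands in for a real index.
const trips = [
  { _id: 1, "birth year": 1989 },
  { _id: 2, "birth year": 1975 },
  { _id: 3, "birth year": 1989 },
];

// Full scan: examines every document.
const scanned = trips.filter((t) => t["birth year"] === 1989);

// "Index": built once, then each lookup goes straight to the matches.
const index = new Map();
for (const trip of trips) {
  const key = trip["birth year"];
  if (!index.has(key)) index.set(key, []);
  index.get(key).push(trip);
}
const viaIndex = index.get(1989) ?? [];
console.log(scanned.length, viaIndex.length); // 2 2
```

Both routes return the same documents; the difference is how many documents have to be examined to find them.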
Data modeling is about how fields are organized within documents so as to support application performance and query capability. The most important principle is:
Data is stored in the way that it is used.
Vocabulary from the lecture: allergy, prescription (both appear in the patient-record example).
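Data that is accessed together should also be stored together, ideally in one document. The data model evolves as the application evolves.
An upsert is a hybrid of update and insert, used for conditional updates: if the filter matches a document it is updated, otherwise a new document is inserted. Use it only when needed. It is enabled through the upsert option, which defaults to false.
Example: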
db.iot.updateOne({ "sensor": r.sensor, "date": r.date,"valcount": { "$lt": 48 } },{ "$push": { "readings": { "v": r.value, "t": r.time } },"$inc": { "valcount": 1, "total": r.value } },{ "upsert": true })
Here upsert is an option of the updateOne statement, which means that it inserts at most one document.
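The update-or-insert decision can be traced with an in-memory sketch of the sensor example. upsertOne and recordReading are hypothetical helpers, the sample readings are invented, and the shape of the inserted document is written out by hand rather than derived from the filter and update as the server would do.

```javascript
// Illustrative only: update the first matching document, or insert one if none matches.
// upsertOne is a hypothetical in-memory helper, not the driver's updateOne.
function upsertOne(collection, matchFn, updateFn, newDoc) {
  const existing = collection.find(matchFn);
  if (existing) {
    updateFn(existing);
    return { matchedCount: 1, upserted: false };
  }
  collection.push(newDoc);
  return { matchedCount: 0, upserted: true };
}

const iot = [];
function recordReading(r) {
  return upsertOne(
    iot,
    (d) => d.sensor === r.sensor && d.date === r.date && d.valcount < 48,
    (d) => {
      d.readings.push({ v: r.value, t: r.time }); // $push
      d.valcount += 1;                            // $inc
      d.total += r.value;                         // $inc
    },
    // Document created when nothing matches.
    { sensor: r.sensor, date: r.date, readings: [{ v: r.value, t: r.time }], valcount: 1, total: r.value }
  );
}

const first = recordReading({ sensor: 5, date: "2020-12-19", value: 70, time: 0 });
const second = recordReading({ sensor: 5, date: "2020-12-19", value: 72, time: 30 });
console.log(first.upserted, second.upserted, iot.length, iot[0].valcount); // true false 1 2
```

The first reading finds nothing to update and creates the document; the second finds it and appends. Once valcount reaches 48 the filter stops matching, so the next reading starts a new document.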
Chapter 6: Next Steps
Atlas Data Explorer offers a number of advanced features, including:
- Aggregation Builder
- Anti-Pattern Advisory
- Performance Advisor
- Advanced Text Search
In Atlas, the organization is the billing entity and contains multiple projects. A project can contain multiple clusters, and cluster names must be unique.
Realm
Charts is for visualization, for example building dashboards.
Compass is the MongoDB management tool with a graphical interface.
References
https://github.com/mongodb/mongo
https://docs.atlas.mongodb.com/sample-data/available-sample-datasets
https://docs.mongodb.com/manual/reference/method
MongoDB Developer Hub
MongoDB Community Forums
Case study: Bosch Leads Charge into Internet of Things
Case study: Coinbase
Case study: SEGA
How the Financial Sector Uses MongoDB
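Summary
This course is a very good introduction and I recommend it. The videos can be downloaded or watched directly.
The Atlas environment works fine for the labs, but I personally suggest setting up a single instance of your own: no cluster and no replica set are needed.
I completed the course on December 19 at 10:52 a.m.: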
https://university.mongodb.com/course_completion/b1acdb82-3667-46a9-abab-320d2627119d?utm_source=copy&utm_medium=social&utm_campaign=university_social_sharing