跳到内容
b 加入表与查询c 迁移,多对多关系

a

用 Sequelize 使用关系型数据库

在本节中,我们将探讨使用关系型数据库的节点应用。在这一节中,我们将为第3-5节中熟悉的笔记应用建立一个使用关系数据库的节点后端。要完成这一部分,你需要对关系型数据库和SQL有合理的了解。有许多关于SQL数据库的在线课程,例如,SQLbolt

可汗学院的SQL介绍

这一部分有24个练习,你需要完成每个练习才能完成课程。练习是通过提交系统提交的,就像前几部分一样,但与第0至7部分不同的是,提交到一个不同的 "课程实例"。

Advantages and disadvantages of document databases

我们在本课程的所有前几部分都使用了MongoDB数据库。Mongo是一个文档数据库,它的一个最大特点是它是无模式的,也就是说,数据库对其集合中存储了什么样的数据只有非常有限的认识。数据库的模式只存在于程序代码中,它以一种特定的方式解释数据,例如,通过识别一些字段是对另一个集合中的对象的引用。

在第3和第4章节的应用实例中,数据库存储了笔记和用户。

一个存储笔记的笔记集合如下所示:下面这样。

[
  {
    "_id": "600c0e410d10256466898a6c",
    "content": "HTML is easy"
    "date": 2021-01-23T11:53:37.292+00:00,
    "important": false
    "__v": 0
  },
  {
    "_id": "600c0edde86c7264ace9bb78",
    "content": "CSS is hard"
    "date": 2021-01-23T11:56:13.912+00:00,
    "important": true
    "__v": 0
  },
]

保存在users集合中的用户如下所示:下面这样。

[
  {
    "_id": "600c0e410d10256466883a6a",
    "username": "mluukkai",
    "name": "Matti Luukkainen",
    "passwordHash" : "$2b$10$Df1yYJRiQuu3Sr4tUrk.SerVz1JKtBHlBOARfY0PBn/Uo7qr8Ocou",
    "__v": 9,
    notes: [
      "600c0edde86c7264ace9bb78",
      "600c0e410d10256466898a6c"
    ]
  },
]

MongoDB确实知道存储实体的字段类型,但是它没有关于用户记录id指的是哪个实体集合的信息。MongoDB也不关心存储在集合中的实体有哪些字段。因此,MongoDB完全让程序员来确保正确的信息被存储在数据库中。

没有模式既有优点也有缺点。其中一个优点是模式无关性带来的灵活性:由于模式不需要在数据库级别上定义,所以在某些情况下,应用开发可能会更快,更容易,在任何情况下,定义和修改模式所需的努力都会减少。没有模式的问题与易错性有关:一切都由程序员决定。数据库本身没有办法检查其中的数据是否是诚实的,即是否所有的强制字段都有值,参考类型字段是否引用了一般正确类型的现有实体,等等。

本节重点讨论的关系型数据库,则在很大程度上依赖于模式的存在,与非模式型数据库相比,模式型数据库的优势和劣势几乎相反。

本课程的前几节之所以使用MongoDB,正是因为它的无模式特性,这使得对关系型数据库了解不多的人更容易使用该数据库。对于本课程的大部分用例,我个人会选择使用关系型数据库。

Application database

对于我们的应用,我们需要一个关系型数据库。有很多选择,但我们将使用目前最流行的开源解决方案PostgreSQL。一个更简单的选择是将Postgres作为云服务,例如ElephantSQL。你也可以利用课程第12部分中的课程,使用Docker在本地使用Postgres。

然而,我们将利用在Heroku云服务平台上为应用创建Postgres数据库的优势,这一点在第3和第4章节中已经很熟悉。

在本节的理论材料中,我们将从第3和第4节中建立的笔记存储应用的后端建立一个支持Postgres的版本。

现在让我们在Heroku应用中创建一个合适的目录,在其中添加一个数据库,并使用heroku config命令来获得连接字符串,这是连接数据库所需要的。

heroku create
# Returns an app-name for the app you just created in heroku.

heroku addons:create heroku-postgresql:hobby-dev -a <app-name>
heroku config -a <app-name>
=== cryptic-everglades-76708 Config Vars
DATABASE_URL: postgres://<username>:thepasswordishere@<host-of-postgres-addon>:5432/<db-name>

特别是在使用关系型数据库时,直接访问数据库也是非常必要的。有很多方法可以做到这一点,有几个不同的图形用户界面,如pgAdmin。然而,我们将使用Postgres psql命令行工具。

通过在Heroku服务器上运行psql命令可以访问数据库,如下所示(注意,命令参数取决于Heroku数据库的连接网址)。

heroku run psql -h <host-of-postgres-addon> -p 5432 -U <username> <dbname> -a <app-name>

输入密码后,让我们试试主要的psql命令d,它告诉你数据库的内容。

Password for user <username>:
psql (13.4 (Ubuntu 13.4-1.pgdg20.04+1))
SSL connection (protocol: TLSv1.3, cipher: TLS_AES_256_GCM_SHA384, bits: 256, compression: off)
Type "help" for help.

postgres=# \d
Did not find any relations.

正如你可能猜到的,目前数据库中没有任何东西。

让我们为笔记创建一个表。

CREATE TABLE notes (
    id SERIAL PRIMARY KEY,
    content text NOT NULL,
    important boolean,
    date time
);

几点:列id被定义为主键,这意味着该列的值对于表中的每一行都必须是唯一的,并且该值不能为空。这个列的类型被定义为SERIAL,这不是实际的类型,而是一个整数列的缩写,Postgres在创建行的时候会自动分配一个唯一的、增加的值。类型为text的名为content的列是这样定义的,它必须被分配一个值。

让我们从控制台看一下这个情况。首先,d命令,它告诉我们数据库中有哪些表。

postgres=# \d
                 List of relations
 Schema | Name | Type | Owner
--------+--------------+----------+----------------
 public | notes | table | username
 public | notes_id_seq | sequence | username
(2 rows)

In addition to the notes table, Postgres created a subtable called notes_id_seq, which keeps track of what value is assigned to the id column when creating the next note.

With the command \d notes, we can see how the notes table is defined:

postgres=# \d notes;
                                     Table "public.notes"
  Column | Type | Collation | Nullable | Default
-----------+------------------------+-----------+----------+-----------------------------------
 id | integer | not null | nextval('notes_id_seq'::regclass)
 content | text | | not null |
 important | boolean | | | |
 date | time without time zone | | | |
Indexes:
    "notes_pkey" PRIMARY KEY, btree (id)

因此列id有一个默认值,它是通过调用Postgres的内部函数nextval获得的。

让我们在表中添加一些内容。

insert into notes (content, important) values ('Relational databases rule the world', true);
insert into notes (content, important) values ('MongoDB is webscale', false);

让我们看看创建的内容是什么样子的。

postgres=# select * from notes;
 id | content | important | date
----+-------------------------------------+-----------+------
  1 | relational databases rule the world | t |
  2 | MongoDB is webscale | f |
(2 rows)

If we try to store data in the database that is not according to the schema, it will not succeed. The value of a mandatory column cannot be missing:

postgres=# insert into notes (important) values (true);
ERROR: null value in column "content" of relation "notes" violates not-null constraint
DETAIL: Failing row contains (9, null, t, null).

The column value cannot be of the wrong type:

postgres=# insert into notes (content, important) values ('only valid data can be saved', 1);
ERROR: column "important" is of type boolean but expression is of type integer
LINE 1: ...tent, important) values ('only valid data can be saved', 1); ^

Columns that don't exist in the schema are not accepted either:

postgres=# insert into notes (content, important, value) values ('only valid data can be saved', true, 10);
ERROR: column "value" of relation "notes" does not exist
LINE 1: insert into notes (content, important, value) values ('only ...

Next it's time to move on to accessing the database from the application.

Node application using a relational database

Let's start the application as usual with the npm init and install nodemon as a development dependency and also the following runtime dependencies:

npm install express dotenv pg sequelize

Of these, the latter sequelize is the library through which we use Postgres. Sequelize is a so-called Object relational mapping (ORM) library that allows you to store JavaScript objects in a relational database without using the SQL language itself, similar to Mongoose that we used with MongoDB.

Let's test that we can connect successfully. Create the file index.js and add the following content:

require('dotenv').config()
const { Sequelize } = require('sequelize')

const sequelize = new Sequelize(process.env.DATABASE_URL, {
  dialectOptions: {
    ssl: {
      require: true,
      rejectUnauthorized: false
    }
  },
})

const main = async () => {
  try {
    await sequelize.authenticate()
    console.log('Connection has been established successfully.')
    sequelize.close()
  } catch (error) {
    console.error('Unable to connect to the database:', error)
  }
}

main()

The database connect string, which is revealed by the heroku config command should be stored in a .env file, the contents should be something like the following:

$ cat .env
DATABASE_URL=postgres://<username>:thepasswordishere@ec2-54-83-137-206.compute-1.amazonaws.com:5432/<databasename>

Let's test for a successful connection:

$ node index.js
Executing (default): SELECT 1+1 AS result
Connection has been established successfully.

If and when the connection works, we can then run the first query. Let's modify the program as follows:

require('dotenv').config()
const { Sequelize, QueryTypes } = require('sequelize')
const sequelize = new Sequelize(process.env.DATABASE_URL, {
  dialectOptions: {
    ssl: {
      require: true,
      rejectUnauthorized: false
    }
  },
});

const main = async () => {
  try {
    await sequelize.authenticate()
    const notes = await sequelize.query("SELECT * FROM notes", { type: QueryTypes.SELECT })    console.log(notes)    sequelize.close()  } catch (error) {
    console.error('Unable to connect to the database:', error)
  }
}

main()

Executing the application should print as follows:

Executing (default): SELECT * FROM notes
[
  {
    id: 1,
    content: 'Relational databases rule the world',
    important: true,
    date: null
  },
  {
    id: 2,
    content: 'MongoDB is webscale',
    important: false,
    date: null
  }
]

Even though Sequelize is an ORM library, which means there is little need to write SQL yourself when using it, we just used direct SQL with the sequelize method query.

Since everything seems to be working, let's change the application into a web application.

require('dotenv').config()
const { Sequelize, QueryTypes } = require('sequelize')
const express = require('express')const app = express()
const sequelize = new Sequelize(process.env.DATABASE_URL, {
  dialectOptions: {
    ssl: {
      require: true,
      rejectUnauthorized: false
    }
  },
});

app.get('/api/notes', async (req, res) => {  const notes = await sequelize.query("SELECT * FROM notes", { type: QueryTypes.SELECT })  res.json(notes)})const PORT = process.env.PORT || 3001app.listen(PORT, () => {  console.log(`Server running on port ${PORT}`)})

The application seems to be working. However, let's now switch to using Sequelize instead of SQL as it is intended to be used.

Model

When using Sequelize, each table in the database is represented by a model, which is effectively it's own JavaScript class. Let's now define the model Note corresponding to the table notes for the application by changing the code to the following format:

require('dotenv').config()
const { Sequelize, Model, DataTypes } = require('sequelize')const express = require('express')
const app = express()

const sequelize = new Sequelize(process.env.DATABASE_URL, {
  dialectOptions: {
    ssl: {
      require: true,
      rejectUnauthorized: false
    }
  },
});

class Note extends Model {}Note.init({  id: {    type: DataTypes.INTEGER,    primaryKey: true,    autoIncrement: true  },  content: {    type: DataTypes.TEXT,    allowNull: false  },  important: {    type: DataTypes.BOOLEAN  },  date: {    type: DataTypes.DATE  }}, {  sequelize,  underscored: true,  timestamps: false,  modelName: 'note'})
app.get('/api/notes', async (req, res) => {
  const notes = await Note.findAll()  res.json(notes)
})

const PORT = process.env.PORT || 3001
app.listen(PORT, () => {
  console.log(`Server running on port ${PORT}`)
})

A few comments on the code. There is nothing very surprising about the Note definition of the model, each column has a type defined, as well as other properties if necessary, such as whether it is the main key of the table. The second parameter in the model definition contains the sequelize attribute as well as other configuration information. We also defined that the table does not have frequently used timestamp columns (created_at and updated_at).

We also defined underscored: true, which means that table names are derived from model names as plural snake case versions. Practically this means that, if the name of the model, as in our case is "Note", then the name of the corresponding table is the plural of the name written in a small initial letter, i.e. notes. If, on the other hand, the name of the model would be "two-part", e.g. StudyGroup, then the name of the table would be study_groups. Instead of automatically inferring table names, Sequelize also allows explicitly defining table names.

The same naming policy applies to columns as well. If we had defined that a note is associated with creationYear, i.e. information about the year it was created, we would define it in the model as follows:

Note.init({
  // ...
  creationYear: {
    type: DataTypes.INTEGER,
  },
})

The name of the corresponding column in the database would be creation_year. In code, reference to the column is always in the same format as in the model, i.e. in "camel case" format.

We have also defined modelName: 'note', the default "model name" would be capitalized Note. However we want to have a lowercase initial, it will make a few things a bit more convenient going forward.

The database operation is easy to do using the query interface provided by models, the method findAll works exactly as it is assumed by it's name to work:

app.get('/api/notes', async (req, res) => {
  const notes = await Note.findAll()  res.json(notes)
})

The console tells you that the method call Note.findAll() causes the following query:

Executing (default): SELECT "id", "content", "important", "date" FROM "notes" AS "note";

Next, let's implement an endpoint for creating new notes:

app.use(express.json())

// ...

app.post('/api/notes', async (req, res) => {
  console.log(req.body)
  const note = await Note.create(req.body)
  res.json(note)
})

Creating a new note is done by calling the model's Note method create and passing as a parameter an object that defines the values of the columns.

Instead of the create method, it is also possible to save to a database using the build method first to create a Model-object from the desired data, and then calling the save method on it:

const note = Note.build(req.body)
await note.save()

Calling the build method does not save the object in the database yet, so it is still possible to edit the object before the actual save event:

const note = Note.build(req.body)
note.important = trueawait note.save()

For the use case of the example code, the create method is better suited, so let's stick to that.

If the object being created is not valid, there is an error message as a result. For example, when trying to create a note without content, the operation fails, and the console reveals the reason to be SequelizeValidationError: notNull Violation Note.content cannot be null:

(node:39109) UnhandledPromiseRejectionWarning: SequelizeValidationError: notNull Violation: Note.content cannot be null
    at InstanceValidator._validate (/Users/mluukkai/opetus/fs-psql/node_modules/sequelize/lib/instance-validator.js:78:13)
    at processTicksAndRejections (internal/process/task_queues.js:93:5)

Let's add some simple error handling when adding a new note:

app.post('/api/notes', async (req, res) => {
  try {
    const note = await Note.create(req.body)
    return res.json(note)
  } catch(error) {
    return res.status(400).json({ error })
  }
})

Creating database tables automatically

Our application now has one unpleasant side, it assumes that a database with exactly the right schema exists, i.e. that the table notes has been created with the appropriate create table command.

Since the program code is being stored on GitHub, it would make sense to also store the commands that create the database in the context of the program code, so that the database schema is definitely the same as what the program code is expecting. Sequelize is actually able to generate a schema automatically from the model definition by using the models method sync.

Let's now destroy the database from the console by entering the following command:

drop table notes;

The \d command reveals that the table has been lost from the database:

postgres=# \d
Did not find any relations.

The application no longer works.

Let's add the following command to the application immediately after the model Note is defined:

Note.sync()

When the application starts, the following is printed on the console:

Executing (default): CREATE TABLE IF NOT EXISTS "notes" ("id" SERIAL , "content" TEXT NOT NULL, "important" BOOLEAN, "date" TIMESTAMP WITH TIME ZONE, PRIMARY KEY ("id"));

That is, when the application starts, the command CREATE TABLE IF NOT EXISTS "notes"... is executed which creates the table notes if it does not already exist.

Other operations

Let's complete the application with a few more operations.

Searching for a single note is possible with the method findByPk, because it is retrieved based on the id of the primary key:

app.get('/api/notes/:id', async (req, res) => {
  const note = await Note.findByPk(req.params.id)
  if (note) {
    res.json(note)
  } else {
    res.status(404).end()
  }
})

Retrieving a single note causes the following SQL command:

Executing (default): SELECT "id", "content", "important", "date" FROM "notes" AS "note" WHERE "note". "id" = '1';

If no note is found, the operation returns null, and in this case the relevant status code is given.

Modifying the note is done as follows. Only the modification of the important field is supported, since the application's frontend does not need anything else:

app.put('/api/notes/:id', async (req, res) => {
  const note = await Note.findByPk(req.params.id)
  if (note) {
    note.important = req.body.important
    await note.save()
    res.json(note)
  } else {
    res.status(404).end()
  }
})

The object corresponding to the database row is retrieved from the database using the findByPk method, the object is modified and the result is saved by calling the save method of the object corresponding to the database row.

The current code for the application is in its entirety on GitHub, branch part13-1.

Printing the objects returned by Sequelize to the console

The JavaScript programmer's most important tool is console.log, whose aggressive use gets even the worst bugs under control. Let's add console printing to the single note path:

app.get('/api/notes/:id', async (req, res) => {
  const note = await Note.findByPk(req.params.id)
  if (note) {
    console.log(note)    res.json(note)
  } else {
    res.status(404).end()
  }
})

We can see that the end result is not exactly what we expected:

note {
  dataValues: {
    id: 1,
    content: 'Notes are attached to a user',
    important: true,
    date: 2021-10-03T15:00:24.582Z,
  },
  _previousDataValues: {
    id: 1,
    content: 'Notes are attached to a user',
    important: true,
    date: 2021-10-03T15:00:24.582Z,
  },
  _changed: Set(0) {},
  _options: {
    isNewRecord: false,
    _schema: null,
    _schemaDelimiter: '',
    raw: true,
    attributes: [ 'id', 'content', 'important', 'date' ]
  },
  isNewRecord: false
}

In addition to the note information, all sorts of other things are printed on the console. We can reach the desired result by calling the model-object method toJSON:

app.get('/api/notes/:id', async (req, res) => {
  const note = await Note.findByPk(req.params.id)
  if (note) {
    console.log(note.toJSON())    res.json(note)
  } else {
    res.status(404).end()
  }
})

Now the result is exactly what we want.

{ id: 1,
  content: 'MongoDB is webscale',
  important: false,
  date: 2021-10-09T13:52:58.693Z }

In the case of a collection of objects, the method toJSON does not work directly, the method must be called separately for each object in the collection:

router.get('/', async (req, res) => {
  const notes = await Note.findAll()

  console.log(notes.map(n=>n.toJSON()))
  res.json(notes)
})

The print looks like the following:

[ { id: 1,
    content: 'MongoDB is webscale',
    important: false,
    date: 2021-10-09T13:52:58.693Z },
  { id: 2,
    content: 'Relational databases rule the world',
    important: true,
    date: 2021-10-09T13:53:10.710Z } ]

However, perhaps a better solution is to turn the collection into JSON for printing by using the method JSON.stringify:

router.get('/', async (req, res) => {
  const notes = await Note.findAll()

  console.log(JSON.stringify(notes))
  res.json(notes)
})

This way is better especially if the objects in the collection contain other objects. It is also often useful to format the objects on the screen in a slightly more reader-friendly format. This can be done with the following command:

console.log(JSON.stringify(notes, null, 2))

The print looks like the following:

[
  {
    "id": 1,
    "content": "MongoDB is webscale",
    "important": false,
    "date": "2021-10-09T13:52:58.693Z"
  },
  {
    "id": 2,
    "content": "Relational databases rule the world",
    "important": true,
    "date": "2021-10-09T13:53:10.710Z"
  }
]