Storing data in Node.js
Published on 2019-01-15
When working with web applications, we often need to work with data. For example, if we're building a social media website, we need to store user posts, friend lists, direct messages, etc. If we're building a project management application, we need to store details about projects, updates, comments, maps and other data.
If you've read some earlier posts on this blog, you already know how to store data with Node.js. To be more specific, in a previous blog post, we talked about the 'fs' module that allows us to work with the file system – therefore, we can read and write files, and store our data there.
However, this problem is so common, that several (better) alternatives exist. Today, when working with data, we often store information in either 1) a relational (SQL) database, or 2) a non-relational (NoSQL) database. A relational database is a lot more common, so let's talk about that first.
Getting started
To get started, make sure you have Node.js installed, and install the "cloudapps" package from npm (if you haven't done so already, in one of our previous blog posts). You can do that by simply running the command "npm install -g cloudapps", or read through our previous blog post about Node.js.
Then, go to a directory where you want to store our new project, open a Terminal or a Command Prompt, and run the "cloudapps new" command to create a new project. Type a name for your new project, and from the list of project templates select an "Express.js app with Authentication". Finally, go into the new project folder, run "npm install" to install all dependencies, and then "npm run start" to run the new app.
You can now use a browser to open "http://localhost:3000". In there, login inside the application (creating/registering a new user if necessary), and try to add a couple of notes inside the app. The app will use a SQL database to store the notes. You can restart the app or refresh your browser, and you'll notice that your data is still there (persisting). If you go into the terminal (or Command Prompt) where you've run the "npm run start" command, you'll notice a couple of "SELECT …" and "INSERT …" statements (if you have debugging enabled). So, let's learn what these mean, and how Node.js (and Sequelize) work with a database in the demo project that was created for us.
Relational databases
A relational database is a digital database that uses the relational model to store data – i.e. a model where data is organized in tables (or relations) with rows and columns, similarly to what you see when working with Excel. Each row (record) has a unique "primary key", allowing us to easily find a record. The tables/relations can be linked to other tables.
For example, we can have a "User" table for storing users, and a "Note" table for storing notes created by users. The Note table can be linked to the User table if we want to know which user created (or edited) specific notes. This is done with something called "foreign keys". To be even more specific, imagine a table called "User" where we have a record with a primary key of "1", the name "Donald Trump" and an email address "trump@us.gov" (inside the table we can have multiple rows, this is just an example). On the other hand, inside the "Note" table we can have a note with a primary key of "9123" (or anything really), the content "This is a note" and a foreign key pointing to the "User" table equal to "1" (i.e. the key of the user "Donald Trump"). This feature allows us to use relational databases to tackle various problems – in practice, this is also a lot faster in comparison to storing data in simple files.
Virtually all relational databases use "SQL" (Structured Query Language) to run queries that select, insert or edit data. Although you don't need to be an expert in SQL to use relational databases with Node.js (because Sequelize offers us functions that create SQL queries for us automatically), you still need to understand the basics of SQL to be able to debug issues with your application, to run complex queries that need to be fast, etc.
So, let's mention the four most common SQL statements: SELECT, INSERT, UPDATE and DELETE. You can probably guess what each of them does (by glancing at their name), but looking at a couple of examples of using each of them will help us understand them better, as well as understand relational databases in more detail.
Let's imagine that we have a table called "User", having the following columns: "id" (the primary key), name (a string containing the user name), email (a string containing an email address) and age (a number containing the user's age – in years). A SELECT statement can be used to extract the necessary data from this table:
SELECT * FROM User; (selects all columns and all users) SELECT name, age FROM User; (selects the name and age columns, for all users) SELECT id, name, age FROM User WHERE age > 30; (selects id, name and age for users older than 30).
Please notice how we can write powerful queries using SQL. For example, after running the last statement, the database management system (called RDBMS in relational databases – short for Relational Database Management System) will only return those rows (records) where the user is more than 30 years old.
For the same table, let's see how we can insert a new record (user):
INSERT INTO User (name, email, age) VALUES ('Donald Trump', 'trump@us.gov', 67);
Please note that we didn't provide an "id" – that's because relational databases can be setup to create separate id's for each record automatically.
To update one (or more records), we can use the UPDATE statement. If we want to update just one record, we usually use the id to point to the record we want updated (example 1). To update multiple records, we can write any condition we want to use - to mark the records that need to be updated (example 2).
UPDATE User SET age = 60 WHERE id = 1; (updates the age of the User with id = 1) UPDATE User SET age = age + 1 WHERE age > 50; (increases the age by 1, of all records where age is larger than 50).
Similarly, we can use DELETE to delete one or more records.
DELETE FROM User where id = 1; DELETE FROM User where age > 50;
SQL is very easy to understand, but also very powerful. Although we're going to be talking about Sequelize below (and we won't be writing SQL directly), it's worth taking the time to read through the examples we showed above. SQL is commonly used in all kinds of applications, written in various programming languages (Sequelize is specific to Node.js).
Ok, let's wrap this part up with something a lot more difficult. In the initial SELECT statement examples, we only extracted data from one table (this is the most common type of SELECT statements). However, what if we want to select a "Note" (with it's content), but we also want to select the "User" that created it (so, we want to SELECT all columns from both the "Note" and "User" tables). To solve that problem, we can use JOINS. The most common type of JOIN is the INNER JOIN, where we select data that can be found in both tables (for example, if we can't find a "User" for a note – i.e. it hasn't been setup in the "Note" table, then it won't show up). We always try to have complete data, so INNER JOINs are usually what you want.
SELECT n.content, u.name, u.email, u.age FROM Note n INNER JOIN User u ON n.userId = u.id;
Ok, a couple of things are happening here. First, note how we named our tables during the query ("n" for the Note table, and "u" for the User table), as this allows us to be more specific when selecting columns (for example, n.content for the "content" column from the Note table). Ok, but how about the last line? Why did we have that "ON …" part of the query? The reason is actually quite simple – we need to let the query know how to connect those tables (i.e. what is the relation between them). Consequentially, the RDBMS system will know that we want to get the data for the User with id equal to the "userId" foreign key of the "Note" table (otherwise, the system has no way of knowing how to JOIN the rows of both tables). Don't be too afraid here – this is really the only thing you need to know for now: tell the system the primary key of one table (u.id) and the foreign key that is stored in the other table (n.userId), so the RDBMS can join them.
Sequelize
Sequelize enables us to work with relational databases, without writing any SQL statements ourselves. It is a promise-based ORM for Node.js – meaning we can easily work with promises and async/await. ORM stands for Object-relational Mapping, which means that database data will be automatically converted into JavaScript objects for us. As an example, if we want to find a user, Sequelize will return a JavaScript object with the user details (object.name, object.age, etc).
Let's see an example of how we can insert a user into a database with Sequelize, and how we can find a user by id (just to get things started). Later, we'll see how we can retrieve multiple rows at once, update objects, and more.
//to insert a new row await models.User.create({ name: 'Donald Trump', email: 'trump@us.gov', age: 67 }); //to find a user by id let user = await models.User.findById(1); console.log(user.name);
We need to mention a couple of important things here. First, you should have in mind that Sequelize operations return promises, that are fulfilled when the database operation has been completed. That's why we used "await" with the two Sequelize operations presented above.
Second, please notice that we ran the "create" and "findById" methods on something called "models.User". With Sequelize, these models need to be defined upfront – i.e., in a separate file, where we tell Sequelize that "User" should point to a database table named "User" (for example). Similarly, we'll define the columns that are part of that table. (The nice thing here is that Sequelize can then create these tables in a database by itself – or we can do that manually).
Models in Sequelize are defined with code that looks something like this:
const Note = sequelize.define("Note", { name: DataTypes.STRING, content: DataTypes.STRING, addedOn: DataTypes.DATE }, { classMethods: { associate: function (models) { models.User.hasMany(Note, {foreignKey: 'userId'}); Note.belongsTo(models.User, {foreignKey: 'userId'}); } } });
Here, we define a model called Note (later, we can use it by accessing models.Note), and we define that the database table for notes contains a name, content, and a timestamp addedOn (where, for example, we can store when the note was added). Similarly, Sequelize allows us to store relations between models. In the example given above, we define a relationship between "User"s and "Note"s, where a user can have multiple notes (in practice, this means that the "userId" foreign key will be stored on the side of the Note).
All in all, Sequelize is a very powerful tool, allowing us to tackle various tasks.
Updating an object
Here comes the fun part. We can easily update the objects we receive from Sequelize (for example, with the findById method), and then call the save() method on that instance to persist the data in the relational database (Sequelize will do that by executing UPDATE statements in SQL by itself).
Let's see an example of how easy this is:
let user = await models.User.findById(1); user.name = 'Mark Jackson'; user.age = 50; await user.save();
The important thing here is that we used await both for retrieving the user, and then for saving the user. This is because both of those operations will result in trips to the database (a SELECT statement for retrieving a user, and then an UPDATE statement for storing the updated data).
As a side-note, we've already talked about how you can insert entities, retrieve them and perform updates. To delete a row from the database, we can just call the destroy() method on the instance, like this:
await user.destroy();
That's it. Sequelize saves us a lot of time, but we failed to mention one of the key benefits of using it – it's support for multiple relational database management systems. This gives us the ability to easily switch from one system to another if we discover some limitation of our current approach, or we find a better hosting alternative with another technology stack.
Multiple rows
We can use the findAll() method to select multiple rows. For example, maybe we want to find all users with the same age, or with a certain name. When using the findAll() method, we can set our conditions with the "where" property. Similarly, we usually set the "raw" parameter to true, because we don't want to retrieve real instances (that have methods such as "save" – see above), and instead want simple JavaScript objects (This is because we usually do updates on single rows after a findById call, and we rarely want to update multiple rows at once).
Let's see an example:
let users = await models.User.findAll({ where: { age: 70 }, raw: true });
The query presented above will return an array of users, having the age of 70 – i.e. age=70. They can be found in the "users" variable after the call. As with other operations that need to access the database, we use "await" to wait for the promise to be fulfilled.
We often want our results to be ordered in some way. We can achieve this with the "order" property. Here is where things get a little complicated: "order" is an array. Imagine we want to sort users by age (descending, from oldest to youngest), but if multiple people have the same age, we want to sort them by name (alphabetically). Then, we need to set order equal to:
[ ['age', 'DESC'], ['name', 'ASC' ] ]
This is why order must be an array – the possibility that you want to sort by multiple columns, as we showed above. So, even if you only want to sort by age (as shown in the next example), order must still be an array (even though it has only one element). Please note that we use 'ASC' (for ascending) and 'DESC' (for descending) as a second parameter for each column.
await models.User.findAll({ where: { name: 'Mark Jackson' }, order: [ ['age', 'ASC'] ], raw: true });
After the findAll() call, we can handle the array (and the elements in it), as we handle any other array in JavaScript (we can iterate through it, get it's length, etc).
Real-world example
You should use Sequelize whenever you create a web application with Node.js that needs to store something into a relational database. Sequelize has a very large feature set, a lot of users (meaning you can easily Google any potential issues), and an active development team.
To test Sequelize on a real-world project, you can use the "cloudapps" command-line tool to start from a project boilerplate that contains both Express.js and Sequelize (as we did at the start of this blog post). Just open a Terminal or a Command Prompt, go to a directory where you want to store your new project, and run the command "cloudapps new". Then, select a name for your new project, and choose the project template that mentions Express.js, Bootstrap and Sequelize. Afterwards, you can go into the project folder, run "npm install" (to install all dependencies) and "npm run start" to run the web server. Your application will be available at "http://localhost:3000" (with any browser).
Feel free to play with the project source code, add new models or API endpoints, and test the various Sequelize operations we mentioned in this blog post. The project templates available with the "cloudapps" cli also feature Authentication (password login), which is one of the hardest things to implement correctly for beginner and intermediate software developers. You should use a project boilerplate like this to bootstrap all (or most) of your new projects, to make development safer, faster and more interesting. Have fun!