Milliseconds vs Microseconds

I recently ran into an issue that I didn’t see coming. I was creating records, and then querying for them from the database.

They were created very close together in time, but in two separate steps, so I knew the timestamps would be different.

Yet when I queried for them and compared the timestamps in my tests, I was getting the same timestamp back on both records.

After some digging, I realized that the records had different timestamps in the database, where things are measured in microseconds. JavaScript, however, only tracks time down to the millisecond, so under certain circumstances one record's timestamp would round up and the other's would round down, leaving the two showing the exact same timestamp (to the millisecond) on my in-memory objects in JavaScript.
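
To make that concrete, here's a minimal sketch of the collapse (whether a given driver rounds or truncates varies, so the rounding here is an assumption):

// Two distinct microsecond timestamps can land on the same millisecond
const toMillis = (micros: number): number => Math.round(micros / 1000);

const recordA = 1_700_000_000_122_600; // ...00.122600 in microseconds
const recordB = 1_700_000_000_123_400; // ...00.123400 in microseconds

console.log(toMillis(recordA)); // 1700000000123 (rounded up)
console.log(toMillis(recordB)); // 1700000000123 (rounded down)
console.log(toMillis(recordA) === toMillis(recordB)); // true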

I never would have expected to be in a situation where I needed or cared about more precision than 1/1000th of a second, but there you have it.

Postgres Create & Return Order

I recently wrote some tests that ended up being flaky. After some investigation, I realized that the objects I was getting back from my Postgres database weren't always in the order I was expecting them to be.

I was using a single update or insert statement to create several records at the same time and then querying that same table. My initial experiments had indicated that the objects were always being returned in the same order that I passed them into the insert/update statement, but as I grew the number of tests using that assumption and ran them dozens of times, it became apparent that this assumption is flawed.

It appears that Postgres by default generally saves the objects in the order they are passed in the insert/update statement, and generally returns them in that same order, but sometimes it doesn't. In fact, without an ORDER BY clause, Postgres makes no guarantee about the order of returned rows at all.

I haven't pinned down whether the breakdown is in the saving, the retrieving, or both. My money is on the saving side of things, but it doesn't actually matter for my purposes. For now, I've been updating these tests to request an explicit order, sorting by something other than the created_at timestamp, as in the sketch below.
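
For example, something like this (a minimal sketch with the PG npm package, assuming a connected client like in the examples further down; the table and column names are illustrative):

// Ask Postgres for a deterministic order instead of relying on insertion order
const result = await client.query(`
  SELECT *
  FROM users
  ORDER BY id ASC
`);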

A good learning for me there about not leaving anything to chance when it comes to a test. Testing…one of the few places where being more explicit is better.

Javascript Milliseconds vs Postgres Microseconds

I recently ran into something that I never would have even considered thinking about before now.

I was trying to query Postgres for all records with a specific value in a timestamp field.

I got zero results, which was a real head scratcher given that I could see the records in the database.

As it turns out, by default Postgres saves timestamps that track six digits to the right of the decimal (microseconds). On the other hand, when you convert a JavaScript Date object to its string equivalent, you only get three digits after the decimal (milliseconds).

So, the query string I passed in was something like YYYY-MM-DD HH:mm:ss.333. Postgres would then look at that and say, 'I have YYYY-MM-DD HH:mm:ss.333125, but that is slightly after what you're asking for, so I have no results to return to you.'

You can override the default precision for a Postgres timestamp to be only three digits past the decimal at the time you create the table/field by defining it as 'timestamp(3)'.
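
For example, a hypothetical table definition (the table and column names here are placeholders, not anything from the project above):

// TIMESTAMP(3) caps the precision at milliseconds, matching JavaScript Dates
await client.query(`
  CREATE TABLE events (
    id SERIAL PRIMARY KEY,
    created_at TIMESTAMP(3) NOT NULL DEFAULT now()
  )
`);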

SQL Injection & Parameterization

After my last post I realized that not everyone knows what SQL injection is. In short, it involves providing SQL commands via user-input fields which results in the database running those commands as part of what is supposed to just be an insert or update. This injection of SQL into fields that aren’t expecting SQL can cause serious problems in your database (like dropping entire tables).

Here’s a brief overview: https://www.w3schools.com/sql/sql_injection.asp

Postgres (and the PG npm package) allows you to prevent SQL injection through a process called parameterization (https://www.postgresql.org/docs/9.2/sql-prepare.html). In short, with a parameterized query you tell your database which pieces of the query are meant to be plain values (which won't run as SQL commands regardless of what a user might try to inject), and which pieces are meant to include actual database-changing commands.
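
For example (a minimal sketch; the table and variable names are just placeholders):

// $1 is a placeholder; the value travels separately and is never run as SQL
const result = await client.query(
  'SELECT * FROM users WHERE email = $1',
  [userSuppliedEmail],
);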

PG Parameterized Database Update Command

In my last post, I talked about using the update command and offered up this simple example:

UPDATE users
SET instances = instances - 1
WHERE id = 102
RETURNING *

Switching this to TypeScript code that would work with PG (and having it update the first_name rather than the instances) would look like this:

import { Pool } from 'pg';

const pool = new Pool({ // omitted settings (database, password, etc.) fall back to PG* environment variables
  host: 'localhost',
  user: 'database-user',
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

const updateFirstName = async (
  firstName: string,
  id: number,
): Promise<void> => {
  const client = await pool.connect(); // create a connection

  try {

    // DANGER: user input is interpolated straight into the SQL string
    const query = `
      UPDATE users
      SET first_name = '${firstName}'
      WHERE id = ${id}
      RETURNING *
    `;

    await client.query(query); // use the connection
  } catch (error) {
    // Deal with your catch error
  } finally {
    client.release(); // release the connection
  }
};

The problem with this, obviously, is that there isn’t any parameterization, which means that it’s vulnerable to SQL injection attacks.
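
To illustrate with a hypothetical malicious input:

// If a user submits this as their 'first name'...
const firstName = `x'; DROP TABLE users; --`;
// ...the interpolated query becomes two statements plus a trailing comment:
// UPDATE users SET first_name = 'x'; DROP TABLE users; --' WHERE id = ...
// and, with no values array, pg's simple query protocol will run both statements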

I looked around trying to find an example of parameterization with the PG npm package as it related to an UPDATE query/command. I found plenty of examples for the INSERT INTO query/command, but nothing with the UPDATE.

So, I did some experimenting and determined that this works:

import { Pool } from 'pg';

const pool = new Pool({ // Customize all of this...
  host: 'localhost',
  user: 'database-user',
  max: 20,
  idleTimeoutMillis: 30000,
  connectionTimeoutMillis: 2000,
});

const updateFirstName = async (
  firstName: string,
  lastName: string,
  id: number,
): Promise<void> => {
  const client = await pool.connect(); // create a connection

  try {

    if (typeof id !== 'number') {
      throw new Error('non-number id passed into updateFirstName');
    }
    // id is interpolated directly, so the typeof guard above matters;
    // it could also be parameterized as $3 alongside the other values
    const query = `
      UPDATE users
      SET first_name = $1, last_name = $2
      WHERE id = ${id}
      RETURNING *
    `;

    const values = [firstName, lastName];

    await client.query(query, values); // use the connection
  } catch (error) {
    // Deal with your catch error
  } finally {
    client.release(); // release the connection
  }
};

You’ll obviously have to customize that some to match your use case, but that should give you the correct approach.

Database Update Command

I've been reworking my side project to use TypeScript and the PG npm package (so that I can use a Postgres database). In the course of doing that, I started out using 'INSERT INTO' statements with an 'ON CONFLICT ... DO UPDATE' clause, since that combination (an upsert) has the benefit of creating a new record if no record currently exists with that primary key, and updating the existing record if one does. (A plain INSERT on its own won't update an existing row.)

That has by and large worked well, because the initial tables I've been rebuilding used a string as a primary key rather than an auto-incrementing integer (or, in Postgres parlance, a SERIAL).

Once I started trying to do INSERT statements into a table with a more ‘normal’ primary key, I ran into problems around the primary key.
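
For reference, the upsert pattern looks something like this (a sketch, assuming the ON CONFLICT clause mentioned above; table and column names are illustrative):

// Insert, or update the existing row when the primary key already exists
await client.query(
  `INSERT INTO users (id, first_name)
   VALUES ($1, $2)
   ON CONFLICT (id) DO UPDATE SET first_name = EXCLUDED.first_name`,
  [id, firstName],
);
// With a SERIAL primary key you usually don't know the id up front,
// which is where this approach starts to break down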

Additionally, I came across a scenario where I needed to decrement a number in the database, and I didn't want to do a query to get the current number and then do an insert, because I suspected that would mean more load on my database and more latency for my application.

You may not run across the same issues, but if you do run into problems with the INSERT approach, it may be worth playing around with the UPDATE command.

Here is an example of an update query:

UPDATE users
SET instances = instances - 1
WHERE id = 102
RETURNING *

That syntax is fairly easy to find, but I couldn’t find an example with parameterization for the PG npm package, so I had to do some experimenting with that. More on that next time.

Converting Database Objects via JSON (part 2)

My last post discussed using a toJSON method/static for converting the result from a database call. I've used that to make sure that sensitive information (like a hashed password) isn't returned to users as the result of an API call.

Today I ran into another unexpected bit of behavior. For some reason, I thought that the toJSON method/static was just changing the string that was returned when JSON.stringify was called.

As it turns out (at least with Sequelize when working with Postgres), it actually changes the underlying object and then produces the corresponding JSON string for the changed object.

This tripped me up while I was trying to debug something: I logged out the object, and then promptly saw a bunch of other stuff break, because the subsequent logic expected the user object to have a password, but the object that had been returned from the database no longer had one.

This is a good reason for jumping into the debugger rather than using logging statements. That was never an option in ServiceNow, so I’ll have to dust off those skills and get back into the habit of using the debugger instead of just using log statements.

Converting Database Objects via JSON

I've spoken before about the two Andrew Mead Udemy classes I took a couple of years ago. That was where I was introduced to the concept of establishing a toJSON method or static on a database object.

In the class, we used the toJSON method to remove the password from the object. That way we could be sure that what was returned to the user after an API call wouldn’t have the password on it.

That meshes well with one of my pet theories, which is fixing stuff in the right spot. By putting the logic on the database model, you don't have to remember to strip the password out each time you return a user via an API call.

I've used this now with Mongoose/MongoDB and Sequelize/Postgres.

Recently, however, I ran into some unexpected behavior. I needed to do some complex logic around a group of records returned from the database.

My approach was to create a new object with some additional information on it for each record that had been returned and then add the database object to this new object. The logic all worked splendidly, but when I converted the new object via JSON.stringify for logging purposes, a bunch of information was logged out that I expected to be removed via my toJSON method.

Apparently, when you copy a database model/object into another object like that, stringifying the new object doesn't take advantage of the logic defined on the model. JSON.stringify does call toJSON on nested objects, so my best guess is that copying the record's properties into a plain object drops the model's prototype, and the custom toJSON along with it.
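
Here's a stripped-down sketch of that pitfall, with a plain class standing in for the database model:

// A plain class standing in for a model with a custom toJSON
class UserRecord {
  constructor(
    public name: string,
    public password: string,
  ) {}

  toJSON() {
    const { password, ...safe } = this;
    return safe; // strip the password whenever this object is serialized
  }
}

const user = new UserRecord('ada', 'hunter2');

console.log(JSON.stringify({ wrapped: user }));
// {"wrapped":{"name":"ada"}} because nesting still triggers toJSON

console.log(JSON.stringify({ ...user, extra: true }));
// {"name":"ada","password":"hunter2","extra":true} because the spread copied
// the properties but dropped the prototype, so toJSON never ran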

I went back to the logic, and instead of creating a new object, I just attached the additional information to each record I got back from the database. That did the trick quite handily, and I still get the advantage of the toJSON method declared on my database model, so that is the pattern I'll use going forward.