The C++ connector for PostgreSQL

libpqxx is the official C++ client API for PostgreSQL, the enterprise-strength open-source relational database. (If "PostgreSQL" is too verbose, call it by its shorter name, postgres).

If you are writing software in C++ that needs to access databases managed by postgres —on just about any platform— then libpqxx is the library you use.

The source code for libpqxx is available under the BSD license, so you're free to download it, pass it on to others, change it, sell it, include it in your own code, and share your changes with anyone you choose. No charge, no catch. Also, no guarantees. :-)

Finding Everything

Quick links:

News

2023-01-12: At last! Faster than C.

I'm calling it, based on an entirely unscientific, off-the-cuff, unbalanced, and non-peer-reviewed toy benchmark. Under some circumstances, your mileage may vary™, querying your database can be faster in libpqxx than a close equivalent in C.

Here's the C-and-libpq programme I benchmarked. Note that it's full of missing error checking & handling. Its central loop doesn't even have a good way of detecting errors. All of that can be fixed, but the upshot is: expect a real-world version to be slower and more cumbersome.

#include <stdio.h> 
#include <stdlib.h>
#include <postgresql/libpq-fe.h>

int main()
{
  PGconn *const conn = PQconnectdb("");
  // TODO: Check for errors, report, clean up.
  PGresult *const res = PQexec(
    conn,
    "SELECT generate_series, 'row #' || generate_series "
    "FROM generate_series(1, 100000000)"
  );
  // TODO: Check for errors, report, clean up.
  const unsigned long sz = PQntuples(res);
  // TODO: Check for errors, report, clean up.
  for (unsigned long i = 0; i < sz; ++i)
  {
    const char *const id_string = PQgetvalue(res, 1, 0);
    const long id = atol(id_string);
    // TODO: Check for errors (how?), report, clean up.
    const char *const payload = PQgetvalue(res, i, 1);
    printf("%ld: %s\n", id, payload);
  }
  PQclear(res);
  PQfinish(conn);
  return 0;
}

And here's the C++-and-libpqxx equivalent:

#include <iostream>
#include <pqxx/pqxx>

int main()
{
  try
  {
    pqxx::connection conn;
    pqxx::transaction tx{conn};
    for (auto const &[id, payload] : tx.stream<long, std::string_view>(
      "SELECT generate_series, 'row #' || generate_series "
      "FROM generate_series(1, 100000000)"))
    {
      std::cout << id << ": " << payload << '\n';
    }
  }
  catch (std::exception const &e)
  {
    std::cerr << "ERROR: " << e.what() << '\n';
    return 1;
  }
}

This code actually has all the error checking and reporting you need. It may be ugly, but for instance, if parsing of a field fails, you'll get an appropriate error message.

I ran both several times, on the same lightly loaded system. Here's the best timing I got for the C/libpq code:

real  0m37,059s
user  0m13,263s
sys   0m3,619s

Here's the best timing I got for the C++/libpqxx code:

real  0m31,333s
user  0m23,687s
sys   0m0,465s

The "real" part is the wall-clock time: the overall time that the loops took. The libpqxx code shaved more than 15% off the libpq version's run time. On the other hand, the libpqxx version used more CPU time. This was ordinary "user" time though; "sys" time is often more of a problem, and libpqxx still did considerably better at that than the C version.

Another difference, not shown here, is that the C++ code starts processing results earlier. The C code waits for the full query to execute, loads all result data into its own memory, and then starts processing it. The C++ code uses "streaming." It gets its first results early on and starts processing them immediately, so it has useful work to do while the server is still sending data.

(In fact when I tried to increase the loop count for the benchmark, I froze my system with the C version because it allocated too much memory at once. That won't happen with the C++ version.)

I'm particularly proud of these results because libpqxx builds on top of libpq, using only its C-level functions to communicate with the database. So it was a very nice surprise to hear that libpqxx result streaming was so fast.

A lot of work went into getting this result. All of that work you could do for yourself with libpq, if you wanted, but it's not trivial. The result streaming uses the SQL COPY command internally, which makes the server send more or less raw data. Then libpqxx code parses this directly, without calling on libpq.

Initially, streaming was slower than normal result sets, because of the work needed to deal with the various encodings in which the server may send the data. In libpqxx you get support for all of the same client encodings that libpq supports, though some will be faster than others. This means that the inner processing loop needs to make a variable call for each and every character in the output, to find the bytes where the character begins and ends.

The trick was in making the compiler generate entire specialised processing loops for each encoding, so it could optimise across these calls as if they were just written directly in the algorithm. And of course, it had to be done without repeating lots of code or using ugly macro tricks. I think it worked out pretty well!

Technical Overview

This library works on top of the C-level API library, libpq. It comes with postgres. You will link your program with both libpqxx and libpq in that order.

Coding with libpqxx revolves around transactions. Transactions are a central concept in database management systems, but they are widely under-appreciated among application developers. In libpqxx, they're fundamental.

With conventional database APIs, you issue commands and queries to a database session or connection, and optionally create the occasional transaction. In libpqxx you start with a connection, but you do all your SQL work in transactions that you open in your connection. You commit each transaction when it's complete. If you don't, all changes made inside the transaction get rolled back.

There are several types of transactions with various "quality of service" properties. If you really don't want a transaction on the database, one of the available transaction types is called nontransaction. It provides basic non-transactional behaviour. (This is sometimes called "autocommit": it commits every successful command right away).

You can execute queries in different ways, each by calling methods on the transaction: * "query" methods execute a query, wait for the full result data to come back, and provide you with each row of data converted to field types of your choice. * "stream" methods execute a query, taking a bit more time to start up, but the data then comes in at a higher rate, and you can start processing each row right as it comes in. Just like the "query" methods, they convert the data to field types of your choice. * "exec" methods execute a query, wait for the full result to come in, and then give you a result object. It's a container of rows (and each row is a container of fields), but it also contains some metadata about the result set. In particular, use "exec0" when you execute a command that should not return any data rows at all.

Either way, don't do any if statements to check for errors when you execute an SQL command. If something goes wrong, you will get an exception.

Quick example

Can't have a database example without an Employee table. Here's a simple application: find an employee by name, and raise their salary by 1 whatever-it-is-they-get-paid-in.

This example is so simple that anything that goes wrong crashes the program. You won't need to do a lot more to fix that, but we'll get to it later.

#include <iostream>
#include <pqxx/pqxx>

int main(int, char *argv[])
{
  // (Normally you'd check for valid command-line arguments.)

  pqxx::connection c{"postgresql://accounting@localhost/company"};
  pqxx::work txn{c};

  // For querying just one single value, the transaction has a shorthand method
  // query_value().
  //
  // Use txn.quote() to escape and quote a C++ string for use as an SQL string
  // in a query's text.
  int employee_id = txn.query_value<int>(
    "SELECT id "
    "FROM Employee "
    "WHERE name =" + txn.quote(argv[1]));

  std::cout << "Updating employee #" << employee_id << '\n';

  // Update the employee's salary.  Use exec0() to perform a command and check
  // that it produces no result rows.  If the result does contain data, this
  // will throw an exception.
  //
  // The ID is an integer, so we don't need to escape and quote it when using
  // it in our query text.  Just convert it to its PostgreSQL textual
  // representation using to_string().
  txn.exec0(
    "UPDATE EMPLOYEE "
    "SET salary = salary + 1 "
    "WHERE id = " + pqxx::to_string(employee_id));

  // Make our change definite.
  txn.commit();
}

You'll find more detailed explanations and reference-style docs on the ReadTheDocs site.

Building your libpqxx program

The details depend on your system and compiler. On a typical Unix-like system, you might do:

c++ add_employee.cxx -lpqxx -lpq

Remember to keep the -lpqxx and -lpq in that order! Otherwise the linker will complain bitterly about missing functions like PQconnectdb and PQexec.

If libpqxx is installed in a nonstandard location, such as /usr/local, you may need to add options like -I/usr/local/include (to make the compiler find headers pqxx/* in /usr/local/include/pqxx), and/or -L/usr/local/lib (to make the linker find the library in /usr/local/lib).

This should work on most GNU/Linux systems (Mint, Debian, Fedora, Gentoo, Red Hat, Slax, Ubuntu, etc.), BSD systems (FreeBSD, NetBSD, OpenBSD), vaguely Unix-like systems such as Apple's macOS, and so on — as long as you have libpqxx, libpq, and a C++ compiler installed. If your C++ compiler has a different name on the command line, use that instead of "c++".

It works differently on Microsoft Windows, though there are development environments out there that behave more like a Unix system.

Handling errors

Errors are exceptions, and derived from std::exception, just like you'd expect. So you can handle database errors like all others:

#include <iostream>
#include "my-db-code.hxx"

int main(int argc, char *argv[])
{
  try
  {
    do_db_work(trans);
  }
  catch (std::exception const &e)
  {
    std::cerr << e.what() << std::endl;
    return 1;
  }
}

Of course libpqxx also defines its own exception hierarchy for errors it throws, so you can handle those specially if you like:

#include <iostream>
#include <pqxx/except>
#include "my-db-code.hxx"

int main(int argc, char *argv[])
{
  try
  {
    do_db_work(trans);
  }
  catch (pqxx::sql_error const &e)
  {
    std::cerr
        << "Database error: " << e.what() << std::endl
        << "Query was: " << e.query() << std::endl;
    return 2;
  }
  catch (std::exception const &e)
  {
    std::cerr << e.what() << std::endl;
    return 1;
  }
}

Just one caveat: not all platforms support throwing an exception in a shared library and catching it outside that shared library. Unless you're building for Windows, it's probably a good habit to use static libraries instead.

Complete example

Here's a more complete example, showing iteration and direct field access.

#include <iostream>
#include <pqxx/pqxx>

/// Query employees from database.  Return result.
pqxx::result query()
{
  pqxx::connection c{"postgresql://accounting@localhost/company"};
  pqxx::work txn{c};

  // Silly example: Add up all salaries.  Normally you'd let the database do
  // this for you.
  long total = 0;
  for (auto [salary] : txn.query("SELECT salary FROM Employee"))
    total += salary;
  std::cout << "Total salary: " << total << '\n';

  // Execute and process some data.
  pqxx::result r{txn.exec("SELECT name, salary FROM Employee")};
  for (auto row: r)
    std::cout
      // Address column by name.  Use c_str() to get C-style string.
      << row["name"].c_str()
      << " makes "
      // Address column by zero-based index.  Use as<int>() to parse as int.
      << row[1].as<int>()
      << "."
      << std::endl;

  // Not really needed, since we made no changes, but good habit to be
  // explicit about when the transaction is done.
  txn.commit();

  // Connection object goes out of scope here.  It closes automatically.
  // But the result object remains valid.
  return r;
}


/// Query employees from database, print results.
int main(int, char *argv[])
{
  try
  {
    pqxx::result r{query()};

    // Results can be accessed and iterated again.  Even after the connection
    // has been closed.
    for (auto row: r)
    {
      std::cout << "Row: ";
      // Iterate over fields in a row.
      for (auto field: row) std::cout << field.c_str() << " ";
      std::cout << std::endl;
    }
  }
  catch (pqxx::sql_error const &e)
  {
    std::cerr << "SQL error: " << e.what() << std::endl;
    std::cerr << "Query was: " << e.query() << std::endl;
    return 2;
  }
  catch (std::exception const &e)
  {
    std::cerr << "Error: " << e.what() << std::endl;
    return 1;
  }
}

Results and result rows have all the member functions you expect to find in a container: begin()/end(), front()/back(), size(), index operator, and so on. The contents are immutable.


GTF Contributor