The C++ connector for PostgreSQL
libpqxx
is the official C++ client API for
PostgreSQL, the enterprise-strength
open-source relational database. (If "PostgreSQL" is too verbose, call it by
its shorter name, postgres).
If you are writing software in C++ that needs to access databases managed by postgres —on just about any platform— then libpqxx is the library you use.
The source code for libpqxx is available under the BSD license, so you're free to download it, pass it on to others, change it, sell it, include it in your own code, and share your changes with anyone you choose. No charge, no catch. Also, no guarantees. :-)
Finding Everything
Quick links:
-
Documentation is at ReadTheDocs.
-
Code, bug tracker, and releases are on Github.
-
Stats and insights are in OpenHub
News
2023-01-12: At last! Faster than C.
I'm calling it, based on an entirely unscientific, off-the-cuff, unbalanced, and non-peer-reviewed toy benchmark. Under some circumstances, your mileage may vary™, querying your database can be faster in libpqxx than a close equivalent in C.
Here's the C-and-libpq programme I benchmarked. Note that it's full of missing error checking & handling. Its central loop doesn't even have a good way of detecting errors. All of that can be fixed, but the upshot is: expect a real-world version to be slower and more cumbersome.
#include <stdio.h>
#include <stdlib.h>
#include <postgresql/libpq-fe.h>
int main()
{
PGconn *const conn = PQconnectdb("");
// TODO: Check for errors, report, clean up.
PGresult *const res = PQexec(
conn,
"SELECT generate_series, 'row #' || generate_series "
"FROM generate_series(1, 100000000)"
);
// TODO: Check for errors, report, clean up.
const unsigned long sz = PQntuples(res);
// TODO: Check for errors, report, clean up.
for (unsigned long i = 0; i < sz; ++i)
{
const char *const id_string = PQgetvalue(res, 1, 0);
const long id = atol(id_string);
// TODO: Check for errors (how?), report, clean up.
const char *const payload = PQgetvalue(res, i, 1);
printf("%ld: %s\n", id, payload);
}
PQclear(res);
PQfinish(conn);
return 0;
}
And here's the C++-and-libpqxx equivalent:
#include <iostream>
#include <pqxx/pqxx>
int main()
{
try
{
pqxx::connection conn;
pqxx::transaction tx{conn};
for (auto const &[id, payload] : tx.stream<long, std::string_view>(
"SELECT generate_series, 'row #' || generate_series "
"FROM generate_series(1, 100000000)"))
{
std::cout << id << ": " << payload << '\n';
}
}
catch (std::exception const &e)
{
std::cerr << "ERROR: " << e.what() << '\n';
return 1;
}
}
This code actually has all the error checking and reporting you need. It may be ugly, but for instance, if parsing of a field fails, you'll get an appropriate error message.
I ran both several times, on the same lightly loaded system. Here's the best timing I got for the C/libpq code:
real 0m37,059s
user 0m13,263s
sys 0m3,619s
Here's the best timing I got for the C++/libpqxx code:
real 0m31,333s
user 0m23,687s
sys 0m0,465s
The "real" part is the wall-clock time: the overall time that the loops took. The libpqxx code shaved more than 15% off the libpq version's run time. On the other hand, the libpqxx version used more CPU time. This was ordinary "user" time though; "sys" time is often more of a problem, and libpqxx still did considerably better at that than the C version.
Another difference, not shown here, is that the C++ code starts processing results earlier. The C code waits for the full query to execute, loads all result data into its own memory, and then starts processing it. The C++ code uses "streaming." It gets its first results early on and starts processing them immediately, so it has useful work to do while the server is still sending data.
(In fact when I tried to increase the loop count for the benchmark, I froze my system with the C version because it allocated too much memory at once. That won't happen with the C++ version.)
I'm particularly proud of these results because libpqxx builds on top of libpq, using only its C-level functions to communicate with the database. So it was a very nice surprise to hear that libpqxx result streaming was so fast.
A lot of work went into getting this result. All of that work you could do for yourself with libpq, if you wanted, but it's not trivial. The result streaming uses the SQL COPY command internally, which makes the server send more or less raw data. Then libpqxx code parses this directly, without calling on libpq.
Initially, streaming was slower than normal result sets, because of the work needed to deal with the various encodings in which the server may send the data. In libpqxx you get support for all of the same client encodings that libpq supports, though some will be faster than others. This means that the inner processing loop needs to make a variable call for each and every character in the output, to find the bytes where the character begins and ends.
The trick was in making the compiler generate entire specialised processing loops for each encoding, so it could optimise across these calls as if they were just written directly in the algorithm. And of course, it had to be done without repeating lots of code or using ugly macro tricks. I think it worked out pretty well!
Technical Overview
This library works on top of the C-level API library, libpq
. It comes with
postgres. You will link your program with both libpqxx
and libpq
in that
order.
Coding with libpqxx revolves around transactions. Transactions are a central concept in database management systems, but they are widely under-appreciated among application developers. In libpqxx, they're fundamental.
With conventional database APIs, you issue commands and queries to a database session or connection, and optionally create the occasional transaction. In libpqxx you start with a connection, but you do all your SQL work in transactions that you open in your connection. You commit each transaction when it's complete. If you don't, all changes made inside the transaction get rolled back.
There are several types of transactions with various "quality of service" properties. If you really don't want a transaction on the database, one of the available transaction types is called nontransaction. It provides basic non-transactional behaviour. (This is sometimes called "autocommit": it commits every successful command right away).
You can execute queries in different ways, each by calling methods on the transaction: * "query" methods execute a query, wait for the full result data to come back, and provide you with each row of data converted to field types of your choice. * "stream" methods execute a query, taking a bit more time to start up, but the data then comes in at a higher rate, and you can start processing each row right as it comes in. Just like the "query" methods, they convert the data to field types of your choice. * "exec" methods execute a query, wait for the full result to come in, and then give you a result object. It's a container of rows (and each row is a container of fields), but it also contains some metadata about the result set. In particular, use "exec0" when you execute a command that should not return any data rows at all.
Either way, don't do any if
statements to check for errors when you execute an
SQL command. If something
goes wrong, you will get an exception.
Quick example
Can't have a database example without an Employee table. Here's a simple application: find an employee by name, and raise their salary by 1 whatever-it-is-they-get-paid-in.
This example is so simple that anything that goes wrong crashes the program. You won't need to do a lot more to fix that, but we'll get to it later.
#include <iostream>
#include <pqxx/pqxx>
int main(int, char *argv[])
{
// (Normally you'd check for valid command-line arguments.)
pqxx::connection c{"postgresql://accounting@localhost/company"};
pqxx::work txn{c};
// For querying just one single value, the transaction has a shorthand method
// query_value().
//
// Use txn.quote() to escape and quote a C++ string for use as an SQL string
// in a query's text.
int employee_id = txn.query_value<int>(
"SELECT id "
"FROM Employee "
"WHERE name =" + txn.quote(argv[1]));
std::cout << "Updating employee #" << employee_id << '\n';
// Update the employee's salary. Use exec0() to perform a command and check
// that it produces no result rows. If the result does contain data, this
// will throw an exception.
//
// The ID is an integer, so we don't need to escape and quote it when using
// it in our query text. Just convert it to its PostgreSQL textual
// representation using to_string().
txn.exec0(
"UPDATE EMPLOYEE "
"SET salary = salary + 1 "
"WHERE id = " + pqxx::to_string(employee_id));
// Make our change definite.
txn.commit();
}
You'll find more detailed explanations and reference-style docs on the ReadTheDocs site.
Building your libpqxx program
The details depend on your system and compiler. On a typical Unix-like system, you might do:
c++ add_employee.cxx -lpqxx -lpq
Remember to keep the -lpqxx
and -lpq
in that order! Otherwise the linker
will complain bitterly about missing functions like PQconnectdb
and PQexec
.
If libpqxx is installed in a nonstandard location, such as /usr/local
, you
may need to add options like -I/usr/local/include
(to make the compiler find
headers pqxx/*
in /usr/local/include/pqxx
), and/or -L/usr/local/lib
(to
make the linker find the library in /usr/local/lib
).
This should work on most GNU/Linux systems (Mint, Debian, Fedora, Gentoo,
Red Hat, Slax, Ubuntu, etc.), BSD systems (FreeBSD, NetBSD, OpenBSD), vaguely
Unix-like systems such as Apple's macOS, and so on — as long as you have
libpqxx, libpq, and a C++ compiler installed. If your C++ compiler has a
different name on the command line, use that instead of "c++
".
It works differently on Microsoft Windows, though there are development environments out there that behave more like a Unix system.
Handling errors
Errors are exceptions, and derived from std::exception
, just like you'd
expect. So you can handle database errors like all others:
#include <iostream>
#include "my-db-code.hxx"
int main(int argc, char *argv[])
{
try
{
do_db_work(trans);
}
catch (std::exception const &e)
{
std::cerr << e.what() << std::endl;
return 1;
}
}
Of course libpqxx also defines its own exception hierarchy for errors it throws, so you can handle those specially if you like:
#include <iostream>
#include <pqxx/except>
#include "my-db-code.hxx"
int main(int argc, char *argv[])
{
try
{
do_db_work(trans);
}
catch (pqxx::sql_error const &e)
{
std::cerr
<< "Database error: " << e.what() << std::endl
<< "Query was: " << e.query() << std::endl;
return 2;
}
catch (std::exception const &e)
{
std::cerr << e.what() << std::endl;
return 1;
}
}
Just one caveat: not all platforms support throwing an exception in a shared library and catching it outside that shared library. Unless you're building for Windows, it's probably a good habit to use static libraries instead.
Complete example
Here's a more complete example, showing iteration and direct field access.
#include <iostream>
#include <pqxx/pqxx>
/// Query employees from database. Return result.
pqxx::result query()
{
pqxx::connection c{"postgresql://accounting@localhost/company"};
pqxx::work txn{c};
// Silly example: Add up all salaries. Normally you'd let the database do
// this for you.
long total = 0;
for (auto [salary] : txn.query("SELECT salary FROM Employee"))
total += salary;
std::cout << "Total salary: " << total << '\n';
// Execute and process some data.
pqxx::result r{txn.exec("SELECT name, salary FROM Employee")};
for (auto row: r)
std::cout
// Address column by name. Use c_str() to get C-style string.
<< row["name"].c_str()
<< " makes "
// Address column by zero-based index. Use as<int>() to parse as int.
<< row[1].as<int>()
<< "."
<< std::endl;
// Not really needed, since we made no changes, but good habit to be
// explicit about when the transaction is done.
txn.commit();
// Connection object goes out of scope here. It closes automatically.
// But the result object remains valid.
return r;
}
/// Query employees from database, print results.
int main(int, char *argv[])
{
try
{
pqxx::result r{query()};
// Results can be accessed and iterated again. Even after the connection
// has been closed.
for (auto row: r)
{
std::cout << "Row: ";
// Iterate over fields in a row.
for (auto field: row) std::cout << field.c_str() << " ";
std::cout << std::endl;
}
}
catch (pqxx::sql_error const &e)
{
std::cerr << "SQL error: " << e.what() << std::endl;
std::cerr << "Query was: " << e.query() << std::endl;
return 2;
}
catch (std::exception const &e)
{
std::cerr << "Error: " << e.what() << std::endl;
return 1;
}
}
Results and result rows have all the member functions you expect to find in a
container: begin()
/end()
, front()
/back()
, size()
, index operator, and
so on. The contents are immutable.