wiki:WikiStart

Version 200 (modified by jtv, 3 months ago) (diff)

--

The C++ connector for PostgreSQL

libpqxx is the official C++ client API for PostgreSQL, the enterprise-strength open-source relational database. (If "PostgreSQL" is too verbose, call it by its shorter name, postgres).

If you are writing software in C++ that needs to access databases managed by postgres—on just about any platform—then libpqxx is the library you use. It is the standard C++ language binding for the postgres RDBMS.

The source code for libpqxx is available under the BSD license, so you're free to download it, pass it on to others, change it, sell it, include it in your own code, and share your changes with anyone you choose. No charge, no catch. Also, no guarantees.

News

2019-03-19: Bug in result objects outliving their connection objects

This one took a while to figure out. The underlying C driver was doing something subtly unexpected. As a result, a libpqxx application will accidentally access de-allocated memory (and probably crash) when all of the following things happen together:

  1. The application receives a result object from a connection.
  2. It destroys the connection object, but keeps the result.
  3. And, it continues using the result.
  4. Then, some operation on the result causes the underlying C driver, libpq, to issue an error or warning message.

Nothing happens in "normal" cases because error or warning messages are pretty rare. For instance, the problem was first spotted in a test which tries to retrieve the name of a nonexistent result column — not something most applications will try.

To make it even more obscure, the crash only happened with some compilers. Don't think that makes them bad compilers; it's sheer stupid coincidence that we never noticed any symptoms of the problem before.

How to proceed? There isn't much that I can do about this without risking performance degradation. The next release though, 6.4.0, will make the bug yet easier to avoid. The bug can then only happen if the connection also has an errorhandler registered on it at the time when it produced the result. And of course documentation will warn you about this, as a risk of installing error handlers.

It's still a nasty complication though, and if you have any clever ideas on how to improve the situation, please file a bug on our Github page!

2019-02-02: 6.3.0 released

Greetings from FOSDEM. After a lot of changes, 6.3.0 has now been tagged and announced. See below for the changes, and directions for the future.

2019-01-20: Big changes coming

We're getting close to a 6.3 release, which introduces some important new features. But it also gets us ready for some big incompatible changes in 7.0. Beyond that, I have what could be a theme for 8.0. Let's go through these one by one:

What's new in 6.3

Thanks to some major contributions from Joseph "JadeMatrix" Durel, we now have:

Type-safe replacements for tablereader and tablewriter: stream_from and stream_to. They support tuples (or tuple-like types) to represent database rows on the client side, so you could stream a std::tuple or std::vector straight into a database row, or vice versa.

Broader support for variable-width multibyte client encodings. UTF-8 has always worked, but other encodings such as SJIS might confuse the code by embedding bytes that looked like ASCII quotes or backslashes and such inside a multibyte character. From now on, the library should work properly with all client encodings that PostgreSQL supports.

In other news:

  • libpqxx lets you assign names to some types of objects, such as cursors or transactions. These names are now more tolerant of "special" characters such as embedded quotes or non-ASCII letters.
  • The CMake build has been overhauled. There may be teething problems, so please report anything you run into.
  • Copying result objects for long SQL queries is now faster.

For the really big changes though, we're going to have to break stuff. Many items have also been deprecated. See the section on my 7.0 plans below.

Also, many items which had been documented as being deprecated are now actually marked deprecated, using the C++ deprecated attribute if your compiler supports it. This means that you may start seeing warnings where your code has been using deprecated features. At some point those features really will disappear.

Preparing for 7.0

We'll see some profound change in libpqxx 7.0. Some of details still have to be worked out, but in 6.3 you'll start seeing deprecation warnings for features that will no longer work as before.

C++ upgrade

C++11 has been great, but now I want more. There are some wonderful things we can do if we bump the minimum requirement to C++17. Is your compiler ready for them?

The C++17 features I'd particularly like to use are:

  • std::optional as the one standard way to accommodate null values.
  • Nested namespaces. It'll make pqxx::internal easier for me to manage.
  • std::string_view may become really important later on; see my "libpqxx 8" plans below.
  • std::to_chars and std::from_chars could replace a lot of libpqxx code.

If your compiler is not ready for any of these new features, please file a bug on the Github page so I can avoid breaking your build.

Connection classes

The connection class hierarchy is going to change drastically. If you have defined your own custom connection classes, these will no longer work, and you will start to see deprecation warnings in libpqxx 6.3. The new design is not final, but it will be a lot simpler than what we have now. Some esoteric features are going to disappear, and in return you'll get a move constructor and a much easier class hierarchy to work with.

The plan is to fold essentially the entire class hierarchy into one single class. It will support the current built-in connection types, but it won't be extensible. We have always had an API for defining your own connection policies, and the library uses it to implement different types of connections, but I've never heard of anyone else using this API to define their own connection types. The existing built-in connection types will just be thin wrappers for the single connection class, and eventually the single connection class should replace them all.

Replacing lazyconnection and asyncconnection

If you're using lazyconnection or asyncconnection, the API will change. You'll have to do a bit of extra work.

In older libpqxx versions, you got a connection object which looks just like a regular connection but only completes its actual connection to the database once you actually start using it. Your code will have to go through an explicit check-or-wait step before using the connection. It may end up looking similar to std::future, or it may just be a member function. I'm not sure yet.

No more connection reactivation

It's just a fact of life: network connections to the database can break. It could happen because of a network problem, or the database may have been restarted. The existing connection classes hide this from you. They notice that the connection is broken and quietly try to re-establish it for you. This feature is going to be removed.

That may sound like a problem. But your application already knows how to deal with a broken connection, right? It may simply happen slightly more often.

I believe doing it this way will make the application design issues clearer, eliminate a duplication of effort between libpqxx and your application, and clear out some dark corners where bugs might hide. It will also remove an enormous source of complexity inside the connection classes.

As a bonus, several operations on connection objects can finally get the const qualifier. That's because they will no longer try to re-establish a broken connection, or finalise an incomplete lazy or asynchronous connection. It will be a lot more obvious which operations make changes to your connection and which ones will not. Some references in your code that don't look like they modify the connection may even become const.

A theme for 8.0?

Kirit "KayEss" Sælensminde has taught me a lot about C++ API performance. And before he makes libpqxx obsolete with a library of his own, I'd like to give you some of his speedups myself.

Here's what I have in mind. (Your feedback would be welcome; feel free to file issues on the Github page.)

  • String conversions will change. If you were defining your own string_traits types, you'll have to support some additional features.
  • You'll be able to avoid a lot of std::string copying, which is a major performance drag, at the cost of some convenience.

I'm not quite sure how this will work yet. One really simple idea is: a new version of to_string might accept a buffer of some kind, so it can write its output there without allocating a new one for you. At a minimum it could take a std::string and reuse the storage it already has allocated.

A more general idea is: you pass a lambda (or some other callable) to to_string as a new parameter. Instead of just creating and returning a new std::string, to_string will construct its output in whatever form is convenient internally.

Then to_string calls your lambda, passing the output into it as (for example) a std::string_view, which is much cheaper than creating a new std::string. The to_string call will return whatever your lambda returns. The default lambda will simply construct a std::string and return it, so by default you keep the existing behaviour.

As a silly example, let's say that you're writing numbers to a stream:

for (auto i: items)
    stream << pqxx::to_string(i);

It's not very efficient for to_string to allocate a new buffer on the heap, for each i, just so its contents can immediately be copied into the internal buffer for stream.

If to_string took a lambda, you could write something like:

for (auto i: items)
    pqxx::to_string(i, [&stream](std::string_view text){ stream << text; });

This lets to_string construct its output in an efficient little buffer on the stack, and writes it straight to the stream buffer. The to_string call doesn't even need to return a value.

But there are some problems still. In some cases it could be really helpful if the text buffer had a terminating zero byte. Some to_string implementations may be able to provide that very cheaply, while it could be expensive for others. Let me know what you think!

2018-09-22: libpqxx turns 18

Today marks 18 years since I committed the first revision of libpqxx in its original CVS repo. I later converted that repo to Subversion (svn), and imported it to Bazaar (bzr), and finally converted the Subversion repo to Git. But amazingly, the 18-year revision history is still there.

Happy birthday, libpqxx — you are now old enough to vote!

It's been a real education to maintain my own code for 18 years. The most important property a piece of code can have is clarity. Not functionality, not correctness: you can fix those later, so long as the clarity is there. To write software is to explain, to both computers and humans, a solution to a problem.

Over 18 years you often find yourself wondering "what on Earth was I thinking when I wrote this?" This code has provided me with a constant reminder that something which seemed perfectly obvious when you wrote it, can become a total mystery to you later. Think kindly of the poor human who's struggling to understand your code, because it may be a future you.

And so, I hope libpqxx today is clearer and more maintainable than it was even a few years ago. C++11 helped a lot, and I hope to move the baseline to C++14 before too long. C++17 will bring more radical improvements, streamlining the string conversions in particular, and maybe adding some optimisations. The language is still evolving in wonderful ways, PostgreSQL is still getting better, and libpqxx has more changes to look forward to in its adult life.

Finding Everything

Where What
Online Documentation Documentation hosted on Read The Docs
Github page Get source, report bugs, submit patches, request changes
Author and Contributors Who made all this?

Technical Overview

This library works on top of the C-level API library, libpq. It comes with postgres. You will link your program with both libpqxx and libpq, in that order.

Coding with libpqxx revolves around transactions. Transactions are a central concept in database management systems, but they are widely under-appreciated among application developers. In libpqxx, they're fundamental.

With conventional database APIs, you issue commands and queries to a database session or connection, and optionally create the occasional transaction. In libpqxx you start with a connection, but you do all your SQL work in transactions that you open in your connection. You commit each transaction when it's complete; if you don't, all changes made inside the transaction get rolled back.

There are several types of transactions with various "quality of service" properties. If you really don't want a transaction, one of the available transaction types is called nontransaction. This transaction type provides basic non-transactional behaviour. (This is sometimes called "autocommit": it commits every command right away).

Every command or query returns a result. Your query fetches its result data immediately when you execute it, and stores it in the result. Don't check your result for errors; failures show up as regular C++ exceptions.

Result objects can be kept around for as long as they are needed, completely separate from the connections and transactions that originated them. You can access the rows in a result using standard iterators, or more like an array using numerical indexes. Inside each row you can access the fields by standard iterators, numerical indexes, or using column names.

Brief example

Can't have a database example without an Employee table. Here's a simple program: find an employee by name, and raise their salary by 1 whatever-it-is-they-get-paid-in.

This example is so simple that anything that goes wrong crashes the program. You won't need to do that much more to fix that, but we'll get to it later.

#include <iostream>
#include <pqxx/pqxx>

int main(int, char *argv[])
{
  pqxx::connection c("dbname=company user=accounting");
  pqxx::work txn(c);

  pqxx::result r = txn.exec(
    "SELECT id "
    "FROM Employee "
    "WHERE name =" + txn.quote(argv[1]));

  if (r.size() != 1)
  {
    std::cerr
      << "Expected 1 employee with name " << argv[1] << ", "
      << "but found " << r.size() << std::endl;
    return 1;
  }

  int employee_id = r[0][0].as<int>();
  std::cout << "Updating employee #" << employee_id << std::endl;

  txn.exec(
    "UPDATE EMPLOYEE "
    "SET salary = salary + 1 "
    "WHERE id = " + txn.quote(employee_id));

  txn.commit();
}

For more realistic examples with detailed explanations, see CodeExamples. Also, see the API reference for details about what everything does.

Building your libpqxx program

The details depend on your system and compiler. On a typical Unix-like system, you might do:

c++ add_employee.cxx -lpqxx -lpq

Remember to keep the -lpqxx and -lpq in that order! Otherwise the linker will complain bitterly about missing functions like PQconnectdb and PQexec.

If libpqxx is installed in a nonstandard location, such as /usr/local, you may need to add options like -I/usr/local/include (to make the compiler find headers pqxx/* in /usr/local/include/pqxx), and/or -L/usr/local/lib (to make the linker find the library in /usr/local/lib).

This should work on most GNU/Linux systems (Mint, Debian, Fedora, Gentoo, Red Hat, Slax, Ubuntu, etc.), BSD systems (FreeBSD, NetBSD, OpenBSD), vaguely Unix-like systems such as Apple OS X, and so on—as long as you have libpqxx, libpq, and a C++ compiler installed. If your C++ compiler has a different name on the command line, use that instead of "c++").

It works differently on Microsoft Windows, though there are development environments out there that behave more like a Unix system.

Handling errors

Errors are exceptions, derived from std::exception, just like you'd expect. So you can handle database errors like all others:

#include <iostream>
#include "my-db-code.hxx"

int main(int argc, char *argv[])
{
  try
  {
    do_db_work(trans);
  }
  catch (const std::exception &e)
  {
    std::cerr << e.what() << std::endl;
    return 1;
  }
}

Of course libpqxx also defines its own exception hierarchy for errors it throws, so you can handle those specially if you like:

#include <iostream>
#include <pqxx/except>
#include "my-db-code.hxx"

int main(int argc, char *argv[])
{
  try
  {
    do_db_work(trans);
  }
  catch (const pqxx::sql_error &e)
  {
    std::cerr
        << "Database error: " << e.what() << std::endl
        << "Query was: " << e.query() << std::endl;
    return 2;
  }
  catch (const std::exception &e)
  {
    std::cerr << e.what() << std::endl;
    return 1;
  }
}

Just one caveat: not all platforms support throwing an exception in a shared library and catching it outside that shared library. It's probably a good habit to use static libraries instead.

Complete example

Here's a more complete example, in C++11, showing iteration and direct field access.

#include <iostream>
#include <pqxx/pqxx>

/// Query employees from database.  Return result.
pqxx::result query()
{
  pqxx::connection c{"dbname=company user=accounting"};
  pqxx::work txn{c};

  pqxx::result r = txn.exec("SELECT name, salary FROM Employee");
  for (auto row: r)
    std::cout
      // Address column by name.  Use c_str() to get C-style string.
      << row["name"].c_str()
      << " makes "
      // Address column by zero-based index.  Use as<int>() to parse as int.
      << row[1].as<int>()
      << "."
      << std::endl;

  // Not really needed, since we made no changes, but good habit to be
  // explicit about when the transaction is done.
  txn.commit();

  // Connection object goes out of scope here.  It closes automatically.
  return r;
}


/// Query employees from database, print results.
int main(int, char *argv[])
{
  try
  {
    pqxx::result r = query();

    // Results can be accessed and iterated again.  Even after the connection
    // has been closed.
    for (auto row: r)
    {
      std::cout << "Row: ";
      // Iterate over fields in a row.
      for (auto field: row) std::cout << field.c_str() << " ";
      std::cout << std::endl;
    }
  }
  catch (const pqxx::sql_error &e)
  {
    std::cerr << "SQL error: " << e.what() << std::endl;
    std::cerr << "Query was: " << e.query() << std::endl;
    return 2;
  }
  catch (const std::exception &e)
  {
    std::cerr << "Error: " << e.what() << std::endl;
    return 1;
  }
}

Results and result rows have all the member functions you expect to find in a container: front()/back(), size(), index operator, and so on. You can't modify the result's contents, however.


Ohloh Metrics GTF Contributor