Wednesday, November 27, 2013

Variadic templates and all that stuff

In general, we mean by any concept nothing more than a set of operations; the concept is synonymous with a corresponding set of operations.

-- Percy Bridgman, The Logic of Modern Physics, paper

For the last year or so I've drifted towards using variadic templates, tuples and meta functions in my day to day programming. It's not clear if this is a good thing or not, but the reality is that it's fun and I continuously find new ways of solving old problems. Once in a while I also find elegant ways of solving new problems using variadic templates, tuples and meta programming techniques.

This entry is about a small collection of functions which I tend to use regularly.

Downloading the code

The code from this entry and some of the previous entries can be retrieved from: https://github.com/hansewetz/blog

A few examples before starting to code

Here is an example showing basic printing of a tuple:

// create a tuple and print it
auto tu=std::make_tuple(1,"Hello",string("world"));
cout<<tu<<endl;

The result is: [1 Hello world ]

Sometimes it's also useful to print the types of a tuple's elements:

auto z1=make_tuple(1,"hello",2,std::plus<int>());
cout<<type2string(z1)<<endl;

The result is: (std::tuple<int, char const*, int, std::plus<int> >)

Another way to print the element types, without printing the tuple type itself, is as follows:

auto z2=std::make_tuple(1,"hello",2,std::plus<int>());
cout<<transform_tuple(type2string_helper(),z2)<<endl;

The result is: [(int) (char const*) (int) (std::plus<int>) ]

The function transform_tuple, which I'll get to later, generates a new tuple by applying a function object to each of the tuple's elements.

Another type of example folds the tuple into a single value:

auto c=make_tuple(1,2,3,4,5,6);
cout<<"six factorial: "<<foldl_tuple(multiplies<int>(),c)<<endl;

The result is: six factorial: 720.

Clearly any binary function can be used. For example, a function which stringifies its parameters and concatenates them, or a function adding two values, often comes in handy.

Sometimes it's useful to rearrange elements in a tuple. For example, say we want to create a tuple which has the element at index 4 first and the element at index 2 last:

auto z3=make_tuple(0,1,2,3,4);
cout<<"rearranged tuple: "<<transform_tuple(indlist<4,0,1,3,2>(),z3)<<endl;

The result is: rearranged tuple: [4 0 1 3 2 ]

The indlist is a template struct which will be part in one way or another of most of the tools in this blog entry.

Before getting started with implementations I'll give a code sample of a very powerful idea. The example below shows how a tuple can be chopped up into individual types and passed as a set of arguments to a function:

// add tuple elements together
auto z5=make_tuple(1,2.3,7);
cout<<apply_tuple(adder(),z5)<<endl;

The output is: 10.3.

The type adder is a utility struct (which I'll show later) calculating the sum of its input arguments.

The indlist type

It turns out that a compile time list of integers is very helpful when manipulating tuples. An indlist is a simple struct defined as:

template<std::size_t...i>struct indlist{};

With indlists it's possible to manipulate tuple elements in pretty amazing ways. So before starting to make use of indlists let's develop a few tools for managing them. First I need to be able to create indlists. A straightforward way is to write something like:

using IL=indlist<0,4>;

It will become tedious to always create them manually by enumerating integers. Instead I want to create a couple of utilities for creating them based on some patterns. Before doing this I'll create two meta functions:

// push front an index to an indlist
template<std::size_t a,typename IL>struct push_front;
template<std::size_t a,std::size_t...i>
struct push_front<a,indlist<i...>>{
  using type=indlist<a,i...>;
};
// push back an index to an indlist
template<std::size_t a,typename IL>struct push_back;
template<std::size_t a,std::size_t...i>
struct push_back<a,indlist<i...>>{
  using type=indlist<i...,a>;
};

Now it's easy to define a few indlist creation functions:

// make an indlist from [start ... end]
template<std::size_t start,std::size_t end>
struct make_indlist_from_range{
  using type=typename push_front<start,typename make_indlist_from_range<start+1,end>::type>::type;
};
template<std::size_t i>
struct make_indlist_from_range<i,i>{
  using type=indlist<i>;
};
// make an indlist with N elements of value V
template<std::size_t N,std::size_t V>
struct make_uniform_indlist{
  using type=typename push_back<V,typename make_uniform_indlist<N-1,V>::type>::type;
};
template<std::size_t V>
struct make_uniform_indlist<0,V>{
  using type=indlist<>;
};
// make an indlist from a tuple
template<typename Tuple>struct make_indlist_from_tuple;
template<typename...Args>
struct make_indlist_from_tuple<std::tuple<Args...>>{
  using type=typename make_indlist_from_range<0,sizeof...(Args)-1>::type;
};
template<>
struct make_indlist_from_tuple<std::tuple<>>{
  using type=indlist<>;
};
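
A quick way to convince yourself that the creation functions behave as described is to pin down a few results with static_assert. The block below is a self-contained sketch that repeats the definitions from above in compact form; the asserted values are just examples I picked. Note that the range is inclusive at both ends.

```cpp
#include <cstddef>
#include <tuple>
#include <type_traits>

template<std::size_t...i>struct indlist{};

// push front / push back an index
template<std::size_t a,typename IL>struct push_front;
template<std::size_t a,std::size_t...i>
struct push_front<a,indlist<i...>>{using type=indlist<a,i...>;};
template<std::size_t a,typename IL>struct push_back;
template<std::size_t a,std::size_t...i>
struct push_back<a,indlist<i...>>{using type=indlist<i...,a>;};

// [start ... end], both ends included
template<std::size_t start,std::size_t end>
struct make_indlist_from_range{
  using type=typename push_front<start,typename make_indlist_from_range<start+1,end>::type>::type;
};
template<std::size_t i>
struct make_indlist_from_range<i,i>{using type=indlist<i>;};

// N copies of V
template<std::size_t N,std::size_t V>
struct make_uniform_indlist{
  using type=typename push_back<V,typename make_uniform_indlist<N-1,V>::type>::type;
};
template<std::size_t V>
struct make_uniform_indlist<0,V>{using type=indlist<>;};

// one index per tuple element
template<typename Tuple>struct make_indlist_from_tuple;
template<typename...Args>
struct make_indlist_from_tuple<std::tuple<Args...>>{
  using type=typename make_indlist_from_range<0,sizeof...(Args)-1>::type;
};
template<>
struct make_indlist_from_tuple<std::tuple<>>{using type=indlist<>;};

// compile-time checks
static_assert(std::is_same<make_indlist_from_range<2,5>::type,
                           indlist<2,3,4,5>>::value,"range is inclusive");
static_assert(std::is_same<make_uniform_indlist<3,1>::type,
                           indlist<1,1,1>>::value,"three copies of 1");
static_assert(std::is_same<make_indlist_from_tuple<std::tuple<int,char,double>>::type,
                           indlist<0,1,2>>::value,"indices 0..size-1");
static_assert(std::is_same<make_indlist_from_tuple<std::tuple<>>::type,
                           indlist<>>::value,"empty tuple gives empty indlist");
```

If any of the definitions were off by one, the compiler would reject the file, which is a lot cheaper than chasing the error later through a page of template diagnostics.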

Some readers may have seen other ways of constructing indlists. For example, it is possible to construct an indlist as follows:

// indlist
template<std::size_t...i>struct indlist{};

// make an indlist from [start ... end]
template<std::size_t endind,std::size_t ind,std::size_t...i>
struct indlist_builder:indlist_builder<endind,ind+1,i...,ind>{
};
template<std::size_t endind,std::size_t...i>
struct indlist_builder<endind,endind,i...>{
  using type=indlist<i...,endind>;
};
template<std::size_t start,std::size_t end>
struct make_indlist_from_range{
  using type=typename indlist_builder<end,start>::type;
};

Manipulating indlists

Now that we know how to create indlists it's time to code up a few tools for manipulating them. The tools will come in handy later on so it's not a bad idea to have them ready. I've already coded push_front and push_back. Before coding the related pop functions I'll code a helper meta function which reverses an indlist:

// reverse an indlist (re-use push_back)
template<typename IL>struct reverse;
template<std::size_t i,std::size_t...j>
struct reverse<indlist<i,j...>>{
  using R=typename reverse<indlist<j...>>::type;
  using type=typename push_back<i,R>::type;
};
template<>
struct reverse<indlist<>>{
  using type=indlist<>;
};

It's now easy to code the push and pop functions:

// pop front index from indlist
template<typename IL>struct pop_front;
template<std::size_t i,std::size_t...j>
struct pop_front<indlist<i,j...>>{
  using type=indlist<j...>;
};
// pop back element from indlist
// (reverse, pop front, reverse)
template<typename IL>
struct pop_back{
  using R0=typename reverse<IL>::type;
  using R1=typename pop_front<R0>::type;
  using type=typename reverse<R1>::type;
};
// pop n elements from front
template<std::size_t N,typename IL>
struct popn_front{
  using TMP=typename pop_front<IL>::type;
  using type=typename popn_front<N-1,TMP>::type;
};
template<typename IL>
struct popn_front<0,IL>{
  using type=IL;
};
// pop n elements from back
template<std::size_t N,typename IL>
struct popn_back{
  using TMP=typename pop_back<IL>::type;
  using type=typename popn_back<N-1,TMP>::type;
};
template<typename IL>
struct popn_back<0,IL>{
  using type=IL;
};
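
To make the reverse, pop front, reverse trick behind pop_back concrete, here is a small self-contained sketch that spells out the three steps for indlist<1,2,3>. It repeats only the definitions it needs. Note that reverse requires a primary template declaration before its specializations. The alias names IL, R0, R1 and R2 are only for illustration.

```cpp
#include <cstddef>
#include <type_traits>

template<std::size_t...i>struct indlist{};

template<std::size_t a,typename IL>struct push_back;
template<std::size_t a,std::size_t...i>
struct push_back<a,indlist<i...>>{using type=indlist<i...,a>;};

// reverse: the primary template is declared before the specializations
template<typename IL>struct reverse;
template<std::size_t i,std::size_t...j>
struct reverse<indlist<i,j...>>{
  using type=typename push_back<i,typename reverse<indlist<j...>>::type>::type;
};
template<>
struct reverse<indlist<>>{using type=indlist<>;};

template<typename IL>struct pop_front;
template<std::size_t i,std::size_t...j>
struct pop_front<indlist<i,j...>>{using type=indlist<j...>;};

// the three steps of pop_back spelled out for indlist<1,2,3>
using IL=indlist<1,2,3>;
using R0=reverse<IL>::type;     // indlist<3,2,1>
using R1=pop_front<R0>::type;   // indlist<2,1>
using R2=reverse<R1>::type;     // indlist<1,2>

static_assert(std::is_same<R0,indlist<3,2,1>>::value,"step 1: reverse");
static_assert(std::is_same<R1,indlist<2,1>>::value,"step 2: pop front");
static_assert(std::is_same<R2,indlist<1,2>>::value,"step 3: reverse back");
```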

By now it should be clear that I've not always coded the most efficient solution. Since compilation of meta programming code often generates obscure error messages I prefer code which is simple to read.

A predicate which can be used with std::enable_if is sometimes useful when we want to enable a function to be used only for indlists:

// check if a type is an indlist
template<typename T>struct is_indlist:std::false_type{};
template<std::size_t...i>struct is_indlist<indlist<i...>>:std::true_type{};

I'll also add (unrelated to is_indlist) a predicate for checking if a type is a tuple:

// check if a type is a tuple
template<typename T>struct is_tuple:std::false_type{};
template<typename...T>struct is_tuple<std::tuple<T...>>:std::true_type{};

For debugging purposes printing an indlist is indispensable:

// print an indlist object
template<std::size_t i,std::size_t...j>
std::ostream&operator<<(std::ostream&os,indlist<i,j...>const&){
  return os<<i<<" "<<indlist<j...>();
}
std::ostream&operator<<(std::ostream&os,indlist<>const&){
  return os;
}

Below are a couple of other useful meta functions:

// get size of an indlist
template<typename IL>struct size;
template<std::size_t...i>
struct size<indlist<i...>>{
  static std::size_t const value=sizeof...(i);
};
// get first element as an indlist
template<typename IL>struct front;
template<std::size_t i,std::size_t...j>
struct front<indlist<i,j...>>{
  using type=indlist<i>;
};
// get last element as an indlist
template<typename IL>
struct back{
  using R=typename reverse<IL>::type;
  using type=typename front<R>::type;
};
// get first N elements as an indlist
template<std::size_t N,typename IL>
struct firstn{
  using type=typename popn_back<size<IL>::value-N,IL>::type;
};
// get last N elements as an indlist
template<std::size_t N,typename IL>
struct lastn{
  using R1=typename reverse<IL>::type;
  using R2=typename firstn<N,R1>::type;
  using type=typename reverse<R2>::type;
};
// get an element as a size_t given the index
template<std::size_t IND,typename IL>struct get;
template<std::size_t IND,std::size_t i,std::size_t...j>
struct get<IND,indlist<i,j...>>{
  static constexpr std::size_t value=get<IND-1,indlist<j...>>::value;
};
template<std::size_t i,std::size_t...j>
struct get<0,indlist<i,j...>>{
  static constexpr std::size_t value=i;
};
// append an indlist to another indlist
template<typename IL1,typename IL2>struct append;
template<typename IL1,std::size_t i,std::size_t...j>
struct append<IL1,indlist<i,j...>>{
  using type=typename append<typename push_back<i,IL1>::type,indlist<j...>>::type;
};
template<typename IL1>
struct append<IL1,indlist<>>{
  using type=IL1;
};
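
As a small illustration of the last two meta functions, the sketch below checks get and append at compile time. It is self-contained and repeats only the definitions involved; the concrete lists are arbitrary examples.

```cpp
#include <cstddef>
#include <type_traits>

template<std::size_t...i>struct indlist{};

template<std::size_t a,typename IL>struct push_back;
template<std::size_t a,std::size_t...i>
struct push_back<a,indlist<i...>>{using type=indlist<i...,a>;};

// get the element at position IND
template<std::size_t IND,typename IL>struct get;
template<std::size_t IND,std::size_t i,std::size_t...j>
struct get<IND,indlist<i,j...>>{
  static constexpr std::size_t value=get<IND-1,indlist<j...>>::value;
};
template<std::size_t i,std::size_t...j>
struct get<0,indlist<i,j...>>{
  static constexpr std::size_t value=i;
};

// append IL2 to IL1, moving one index at a time
template<typename IL1,typename IL2>struct append;
template<typename IL1,std::size_t i,std::size_t...j>
struct append<IL1,indlist<i,j...>>{
  using type=typename append<typename push_back<i,IL1>::type,indlist<j...>>::type;
};
template<typename IL1>
struct append<IL1,indlist<>>{using type=IL1;};

static_assert(get<0,indlist<7,8,9>>::value==7,"first element");
static_assert(get<2,indlist<7,8,9>>::value==9,"last element");
static_assert(std::is_same<append<indlist<0,1>,indlist<5,6>>::type,
                           indlist<0,1,5,6>>::value,"append keeps order");
```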

A few esoteric indlist functions

Just to show how easy it is (and also just because we can) here are a few more – not so useful – meta functions:

// get a slice of an indlist
template<std::size_t IND,std::size_t LEN,typename IL>
struct slice{
  using IL1=typename popn_front<IND,IL>::type;
  using type=typename popn_back<size<IL1>::value-LEN,IL1>::type;
};
// reverse the N elements starting at index IND
template<std::size_t IND,std::size_t N,typename IL>
struct reverse_slice{
  using IL_LEFT=typename slice<0,IND,IL>::type;
  using IL_MID=typename slice<IND,N,IL>::type;
  using IL_RIGHT=typename slice<IND+N,size<IL>::value-(IND+N),IL>::type;
  using IL_MID_R=typename reverse<IL_MID>::type;
  using A=typename append<IL_LEFT,IL_MID_R>::type;
  using type=typename append<A,IL_RIGHT>::type;
};
// rotate an indlist left N steps
template<std::size_t N,typename IL>
struct rotate_left{
  using R1=typename slice<0,N,typename reverse_slice<0,N,IL>::type>::type;
  using R2=typename slice<N,size<IL>::value-N,typename reverse_slice<N,size<IL>::value-N,IL>::type>::type;
  using A=typename append<R1,R2>::type;
  using type=typename reverse<A>::type;
};
// add +1 to each index in indlist
template<typename IL>struct add1;
template<std::size_t i,std::size_t...j>
struct add1<indlist<i,j...>>{
  using type=typename push_front<i+1,typename add1<indlist<j...>>::type>::type;
};
template<>
struct add1<indlist<>>{
  using type=indlist<>;
};
// fold list to an int with a binary meta function (taking two ints)
//                  foldl1 f [x]      = x
//                  foldl1 f (x:y:xs) = foldl1 f (f x y : xs)
template<template<std::size_t,std::size_t>class FBIN,typename IL>struct foldl_indlist2int;
template<template<std::size_t,std::size_t>class FBIN,std::size_t i,std::size_t j>
struct foldl_indlist2int<FBIN,indlist<i,j>>{
  constexpr static std::size_t value=FBIN<i,j>::value;
};
template<template<std::size_t,std::size_t>class FBIN,std::size_t i,std::size_t j,std::size_t...k>
struct foldl_indlist2int<FBIN,indlist<i,j,k...>>{
  constexpr static std::size_t value=foldl_indlist2int<FBIN,indlist<FBIN<i,j>::value,k...>>::value;
};
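
The fold takes a binary meta function as a template template parameter, which is easiest to see with an example. In the sketch below plus_sz and max_sz are hypothetical helpers I made up for the illustration; they are not part of the code above. The fold itself is repeated so the block stands on its own. Note that, as coded, the fold needs at least two indices.

```cpp
#include <cstddef>

template<std::size_t...i>struct indlist{};

// left fold of an indlist with a binary meta function
template<template<std::size_t,std::size_t>class FBIN,typename IL>struct foldl_indlist2int;
template<template<std::size_t,std::size_t>class FBIN,std::size_t i,std::size_t j>
struct foldl_indlist2int<FBIN,indlist<i,j>>{
  constexpr static std::size_t value=FBIN<i,j>::value;
};
template<template<std::size_t,std::size_t>class FBIN,std::size_t i,std::size_t j,std::size_t...k>
struct foldl_indlist2int<FBIN,indlist<i,j,k...>>{
  constexpr static std::size_t value=foldl_indlist2int<FBIN,indlist<FBIN<i,j>::value,k...>>::value;
};

// hypothetical binary meta functions used for the illustration
template<std::size_t a,std::size_t b>
struct plus_sz{constexpr static std::size_t value=a+b;};
template<std::size_t a,std::size_t b>
struct max_sz{constexpr static std::size_t value=(a>b)?a:b;};

// ((1+2)+3)+4
static_assert(foldl_indlist2int<plus_sz,indlist<1,2,3,4>>::value==10,"sum of indices");
// max(max(3,9),4)
static_assert(foldl_indlist2int<max_sz,indlist<3,9,4>>::value==9,"largest index");
```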

That's it for functions managing indlists. At this point it should be easy to write other customised indlist functions.

Putting indlists to work – the basics

One of the workhorses when dealing with tuples is a function which takes a callable object and a tuple as parameters. The function passes the tuple elements to the callable as a list of arguments:

// dummy meta function returning 'return type' for 'apply_tuple_with_indlist'
template<typename IL,typename F,typename T>struct apply_tuple_with_indlist_ret;
template<std::size_t...i,typename F,typename T>
struct apply_tuple_with_indlist_ret<indlist<i...>,F,T>{
  using type=typename std::result_of<F(decltype(std::get<i>(std::forward<T>(std::declval<T>())))...)>::type;
};
// pass tuple to a function taking list of arguments (control which args and the order using indlist)
template<std::size_t...i,typename F,typename T,typename R=typename apply_tuple_with_indlist_ret<indlist<i...>,F,T>::type>
constexpr auto apply_tuple_with_indlist(indlist<i...>const&,F f,T&&tu)->R
{
  return f(std::get<i>(std::forward<T>(tu))...);
}

Notice that the apply_tuple_with_indlist_ret only serves the purpose of evaluating the return type.
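
The apply_tuple function used in the opening examples is not listed in this entry. A plausible definition, assuming it simply passes every element in order, is a thin wrapper which builds the full index list and forwards to apply_tuple_with_indlist. The sketch below is self-contained and is only my guess at the shape; make_seq is a compact stand-in for the indlist builders above, the return type is computed with decltype instead of the helper struct, and sum3 and demo_apply_tuple are hypothetical names used for the check.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <type_traits>
#include <utility>

template<std::size_t...i>struct indlist{};

// compact builder for indlist<0,...,N-1> (a stand-in, not the builder from the text)
template<std::size_t N,std::size_t...i>
struct make_seq:make_seq<N-1,N-1,i...>{};
template<std::size_t...i>
struct make_seq<0,i...>{using type=indlist<i...>;};

// call f with the tuple elements selected by the indlist
template<std::size_t...i,typename F,typename T>
auto apply_tuple_with_indlist(indlist<i...>const&,F f,T&&tu)
    ->decltype(f(std::get<i>(std::forward<T>(tu))...)){
  return f(std::get<i>(std::forward<T>(tu))...);
}
// ASSUMPTION: apply_tuple passes every element, in order, to f
template<typename F,typename T,
         typename TD=typename std::decay<T>::type,
         typename IL=typename make_seq<std::tuple_size<TD>::value>::type>
auto apply_tuple(F f,T&&tu)
    ->decltype(apply_tuple_with_indlist(IL(),f,std::forward<T>(tu))){
  return apply_tuple_with_indlist(IL(),f,std::forward<T>(tu));
}

// hypothetical callable used only for the check below
struct sum3{
  double operator()(int a,double b,int c)const{return a+b+c;}
};
// small self-check: the tuple is exploded into sum3()(1,2.5,7)
inline void demo_apply_tuple(){
  auto tu=std::make_tuple(1,2.5,7);
  assert(apply_tuple(sum3(),tu)==10.5);
}
```

Later standards ship the same idea as std::index_sequence (C++14) and std::apply (C++17), so on a newer compiler most of this machinery comes for free.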

A simple application of the function is to create a new tuple from an old one by picking elements from the tuple. Given the function maketuple:

// create a tuple from a list of args (not same as make_tuple - this is a structure)
struct maketuple{
  template<typename...T>
  constexpr std::tuple<T...>operator()(T&&...t){
    return std::tuple<T...>(std::forward<T>(t)...);
  }
};

I can do the following:

// create a tuple with elements in reverse order
auto ttu1=make_tuple(1,2,3,4,5);
using IL2=make_indlist_from_tuple<decltype(ttu1)>::type;
cout<<apply_tuple_with_indlist(reverse<IL2>::type(),maketuple(),ttu1)<<endl;

The output is: [5 4 3 2 1 ]

Another application is to shave off or add elements to a tuple. For instance, say I want to pop the first element of a tuple:

// pop first element of a tuple
auto ttu1=make_tuple(1,2,3,4,5);
using IL2=make_indlist_from_tuple<decltype(ttu1)>::type;
cout<<apply_tuple_with_indlist(pop_front<IL2>::type(),maketuple(),ttu1)<<endl;

The output is: [2 3 4 5 ]

I can simplify the use of apply_tuple_with_indlist by directly passing a meta function as a parameter:

// transformation of a tuple to another tuple
// (template template parameter can be 'reverse', 'pop_front', ... but must transform an indlist)
template<template<class>class FIT,
         typename T,
         typename TD=typename std::decay<T>::type,
         typename IL=typename make_indlist_from_tuple<TD>::type,
         typename IR=typename FIT<IL>::type,
         typename F=maketuple,
         typename R=typename apply_tuple_with_indlist_ret<IR,F,T>::type>
constexpr R transform_tuple(T&&t){
  return apply_tuple_with_indlist(IR(),F(),std::forward<T>(t));
}

Now I can simply write the previous reverse example as:

// reverse tuple elements
auto ttu1=make_tuple(1,2,3,4,5);
cout<<transform_tuple<reverse>(ttu1)<<endl;

Of course I can also create a genuine utility function for reversing tuples as:

// reverse tuple (utility function)
template<typename T,
         typename TD=typename std::decay<T>::type,
         typename IL=typename make_indlist_from_tuple<TD>::type,
         typename IR=typename reverse<IL>::type,
         typename F=maketuple,
         typename R=typename apply_tuple_with_indlist_ret<IR,F,T>::type>
constexpr R reverse_tuple(T&&t){
  return apply_tuple_with_indlist(IR(),F(),std::forward<T>(t));
}

Continuing along the same lines I'll create utility functions for popping elements off a tuple:

// pop front
template<std::size_t N,typename T,
         typename F=maketuple,
         typename TD=typename std::decay<T>::type,
         typename IL=typename popn_front<N,typename make_indlist_from_tuple<TD>::type>::type,
         typename R=typename apply_tuple_with_indlist_ret<IL,F,T>::type>
constexpr auto popn_front_tuple(T&&t)->R{
  return apply_tuple_with_indlist(IL(),F(),std::forward<T>(t));
}
// pop back
template<std::size_t N,typename T,
         typename F=maketuple,
         typename TD=typename std::decay<T>::type,
         typename IL=typename popn_back<N,typename make_indlist_from_tuple<TD>::type>::type,
         typename R=typename apply_tuple_with_indlist_ret<IL,F,T>::type>
constexpr auto popn_back_tuple(T&&t)->R{
  return apply_tuple_with_indlist(IL(),F(),std::forward<T>(t));
}

An example is shown here:

// pop 2 elements from front of tuple
auto a1=make_tuple(1,2,3,4);
cout<<"popn_front<2>: "<<popn_front_tuple<2>(a1)<<endl;

The result is: popn_front<2>: [3 4 ]

So far I've exploded a tuple into elements before passing the elements to the function. Sometimes it may be useful to call a function by passing a list of arguments directly while controlling the order in which the arguments are passed to the function, or picking only the arguments we want to forward. The code looks like:

// pass list of types to a function (control order of arguments to F with an 'indlist')
template<std::size_t...i,typename F,typename...T,
         typename IL=indlist<i...>,
         typename TU=std::tuple<T...>,
         typename R=typename apply_tuple_with_indlist_ret<IL,F,TU>::type>
constexpr auto apply_with_indlist(IL,F f,T&&...t)->R
{
  // cannot have more than one statement here (won't compile: gcc4.8 ?)
  return apply_tuple_with_indlist(IL(),f,TU{std::forward<T>(t)...});
}

For example:

// test function for tuple
struct PrintNValues{
  template<typename...T>
  void operator()(T&&...t){
    auto tu=make_tuple(std::forward<T>(t)...);
    cout<<tu<<endl;
  }
};
…
// call a function by reversing arguments
using IL8=make_indlist_from_range<0,5>::type;
apply_with_indlist(reverse<IL8>::type(),PrintNValues(),0,1,2,3,4,5);

The output is: [5 4 3 2 1 0 ]

Applying a tuple element function before calling a function

It's convenient to apply an element specific function to a tuple element before forwarding the transformed element to a function. This is easily done as follows:

// dummy meta function returning 'return type' for 'fapply_tuple_with_indlist'
template<typename IL,typename F,typename FX,typename T>struct fapply_tuple_with_indlist_ret;
template<std::size_t...i,typename F,typename FX,typename T>
struct fapply_tuple_with_indlist_ret<indlist<i...>,F,FX,T>{
  using type=decltype(std::declval<F>()(std::declval<FX>()(std::get<i>(std::forward<T>(std::declval<T>())))...));
};
// pass tuple to a function taking list of FX(arguments) (control which args and the order using indlist)
template<std::size_t...i,typename F,typename FX,typename T,typename R=typename fapply_tuple_with_indlist_ret<indlist<i...>,F,FX,T>::type>
constexpr auto fapply_tuple_with_indlist(indlist<i...>const&,F f,FX fx,T&&tu)->R
{ 
  return f(fx(std::get<i>(std::forward<T>(tu)))...);
}

I'll use this function when implementing a function which transforms each element in a tuple and returns a new transformed tuple:

// transform a tuple by applying a function to each element
// (may return a tuple having different element types)
template<typename FX,typename T,
         typename ISCALL=typename std::enable_if<is_callable<FX>::value>::type,
         typename TD=typename std::decay<T>::type,
         typename IL=typename make_indlist_from_tuple<TD>::type,
         typename R=typename fapply_tuple_with_indlist_ret<IL,maketuple,FX,T>::type>
auto transform_tuple(FX fx,T&&t)->R{
    return fapply_tuple_with_indlist(IL(),maketuple(),fx,std::forward<T>(t));
}

where is_callable is implemented as:

// check if a type is callable (class types only, since the check derives from T)
template<typename T>
struct is_callable {
private:
    typedef char(&yes)[1];
    typedef char(&no)[2];
    struct fallback{void operator()();};
    struct derived:T,fallback{};
    template<typename U,U>struct check;
    template<typename>static yes test(...);
    template<typename C>static no test(check<void(fallback::*)(),&C::operator()>*);
public:
    static const bool value =sizeof(test<derived>(0))==sizeof(yes);
};
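
The trick in is_callable is that derived inherits an operator() from fallback. If T also has one, the name operator() becomes ambiguous in derived, the overload returning no drops out through SFINAE and the yes overload is selected. If T has none, &derived::operator() is exactly fallback's member and the no overload wins because a pointer conversion beats the ellipsis. The self-contained sketch below checks this for three hypothetical types of my own (functor, plain and a lambda).

```cpp
#include <type_traits>

// check if a class type has an operator() (copied from above)
template<typename T>
struct is_callable{
private:
  typedef char(&yes)[1];
  typedef char(&no)[2];
  struct fallback{void operator()();};
  struct derived:T,fallback{};
  template<typename U,U>struct check;
  template<typename>static yes test(...);
  template<typename C>static no test(check<void(fallback::*)(),&C::operator()>*);
public:
  static const bool value=sizeof(test<derived>(0))==sizeof(yes);
};

// hypothetical test types
struct functor{int operator()(int x)const{return x;}};  // has operator()
struct plain{int x;};                                   // no operator()
auto doubler=[](int v){return v*2;};                    // closure types are class types

static_assert(is_callable<functor>::value,"functor is callable");
static_assert(!is_callable<plain>::value,"plain struct is not callable");
static_assert(is_callable<decltype(doubler)>::value,"lambda is callable");
```

Since the check derives from T, it only works for class types which can be inherited from; a function pointer or a final class gives a hard compile error instead of false.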

Here is an example multiplying each element in the tuple by 2 generating a new tuple:

  // transform each element in a tuple generating a new tuple
  auto z7=make_tuple(1,2,3,4);
  cout<<"multiply by 2: "<<transform_tuple([](int val){return val*2;},z7)<<endl;

The output is: multiply by 2: [2 4 6 8 ]

Applying a function on each tuple element one at a time

A recurring task when programming with tuples is to apply a function to each element of the tuple. This is easily accomplished as:

// pass each element of a tuple to a function (control call order using 'indlist')
template<std::size_t i,std::size_t...j,typename F,typename T>
constexpr void apply_tuple_ntimes_with_indlist(indlist<i,j...>const&,F f,T&&tu)
{
  f(std::get<i>(std::forward<T>(tu)));
  apply_tuple_ntimes_with_indlist(indlist<j...>(),f,std::forward<T>(tu));
}
template<typename F,typename T>
constexpr void apply_tuple_ntimes_with_indlist(indlist<>const&,F f,T&&tu){
}

The following snippet prints the tuple elements in reverse order, one on each line:

// print 1 value
struct Print1Value{
  template<typename T>
  void operator()(T&&t){cout<<t<<endl;}
};
...
// call a function for each element in a tuple in reverse order
auto z6=make_tuple(1,2,"three");
using IL6=make_indlist_from_tuple<decltype(z6)>::type;
apply_tuple_ntimes_with_indlist(reverse<IL6>::type(),Print1Value(),z6);

The output is:
three
2
1

Folding a tuple

Folding a tuple can be done in a neat and elegant way. The code below is not so pleasing to the eye, but it is pretty straightforward. Most likely it should be refactored into a more readable form. Being too lazy, I'll keep this brute-force implementation:

// fold a tuple (binary function)
// (truly horrific syntax)
struct foldl_tuple_helper{
  template<typename F,typename T,typename U,typename...V,
           typename TU=std::tuple<T,U,V...>,
           typename FRONT=decltype(std::declval<F>()(std::get<0>(std::declval<TU>()),std::get<1>(std::declval<TU>()))),
           typename TAIL=decltype(popn_front_tuple<2>(std::declval<TU>()))>
  auto operator()(F f,std::tuple<T,U,V...>const&tu)->decltype((*this)(f,std::tuple_cat(std::tuple<FRONT>(std::declval<FRONT>()),std::declval<TAIL>()))){
    FRONT front=f(std::get<0>(tu),std::get<1>(tu));
    TAIL tail=popn_front_tuple<2>(tu);
    return operator()(f,std::tuple_cat(std::tuple<FRONT>(front),tail));
  }
  template<typename F,typename T>
  constexpr auto operator()(F f,std::tuple<T>const&tu)->T{
    return std::get<0>(tu);
  }
};
template<typename F,typename T>
constexpr auto foldl_tuple(F f,T&&t)->decltype(foldl_tuple_helper()(std::declval<F>(),std::forward<T>(std::declval<T>()))){
  return foldl_tuple_helper()(f,std::forward<T>(t));
}
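
For comparison, here is a sketch of one possible more readable fold. Instead of rebuilding a shorter tuple at every step it walks the tuple by index and carries an accumulator. This is not the implementation used in the rest of this entry; fold_step, foldl_tuple_simple, add_any and demo_fold are names I made up, and the sketch assumes a non-empty tuple.

```cpp
#include <cassert>
#include <cstddef>
#include <functional>
#include <tuple>
#include <type_traits>
#include <utility>

// fold elements I..N-1 of a tuple into an accumulator (left fold)
template<std::size_t I,std::size_t N>
struct fold_step{
  template<typename F,typename Acc,typename Tuple>
  static auto run(F f,Acc&&acc,Tuple const&tu)
      ->decltype(fold_step<I+1,N>::run(f,f(std::forward<Acc>(acc),std::get<I>(tu)),tu)){
    return fold_step<I+1,N>::run(f,f(std::forward<Acc>(acc),std::get<I>(tu)),tu);
  }
};
// end of tuple: return the accumulated value by value
template<std::size_t N>
struct fold_step<N,N>{
  template<typename F,typename Acc,typename Tuple>
  static typename std::decay<Acc>::type run(F,Acc&&acc,Tuple const&){
    return std::forward<Acc>(acc);
  }
};
// entry point: seed the accumulator with element 0 (tuple must not be empty)
template<typename F,typename...T>
auto foldl_tuple_simple(F f,std::tuple<T...>const&tu)
    ->decltype(fold_step<1,sizeof...(T)>::run(f,std::get<0>(tu),tu)){
  return fold_step<1,sizeof...(T)>::run(f,std::get<0>(tu),tu);
}

// hypothetical generic adder, so mixed element types can be folded
struct add_any{
  template<typename A,typename B>
  auto operator()(A const&a,B const&b)const->decltype(a+b){return a+b;}
};
// small self-check mirroring the factorial example
inline void demo_fold(){
  assert(foldl_tuple_simple(std::multiplies<int>(),std::make_tuple(1,2,3,4,5,6))==720);
  assert(foldl_tuple_simple(add_any(),std::make_tuple(1,2.5,3))==6.5);
}
```

The trade-off is that this version takes the tuple by const reference only, so it never moves elements out of the tuple.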

A simple example is shown here:

  auto c=make_tuple(1,2,3,4,5,6);
  cout<<"six factorial: "<<foldl_tuple(multiplies<int>(),c)<<endl;

The output is: six factorial: 720

Another example, concatenating all elements into a string, can be written as:

// concatenate arguments into a string
struct Concatenate2{
  template<typename T1,typename T2>
  string operator()(T1&&t1,T2&&t2){
    stringstream str;
    str<<t1<<t2;
    return str.str();
  }
};
...
// fold tuple
auto b=make_tuple("one=",1," two=",2," five=",5);
cout<<"fold to strings: "<<foldl_tuple(Concatenate2(),b)<<endl;

Using fold to code tuple inner product

Just for fun, I'll write a function taking the inner product of two (or more) tuples. First I'll code two general functions for multiplying and adding elements:

// binary adder
struct binadd{
  template<typename T1,typename T2>
  constexpr auto operator()(T1&&t1,T2&&t2)->decltype(t1+t2){return t1+t2;}
};
// binary multiplier
struct binmul{
  template<typename T1,typename T2>
  constexpr auto operator()(T1&&t1,T2&&t2)->decltype(t1*t2){return t1*t2;}
};
// general adder
struct adder{
  template<typename...T>
  constexpr auto operator()(T&&...t)->decltype(foldl_tuple(binadd(),maketuple()(std::forward<T>(t)...))){
    return foldl_tuple(binadd(),maketuple()(std::forward<T>(t)...));
  }
};
// general multiplier
struct multiplier{
  template<typename...T>
  constexpr auto operator()(T&&...t)->decltype(foldl_tuple(binmul(),maketuple()(std::forward<T>(t)...))){
    return foldl_tuple(binmul(),maketuple()(std::forward<T>(t)...));
  }
};

Next I'll code a function which multiplies element i of each tuple in a list of tuples and then sums up the results. Clearly the multiply and sum do not have to be the usual arithmetic operations. Here is the implementation:

// inner product of tuples 
template<typename ADDER,typename MULTIPLIER>
struct tuple_inner_product_helper{
  template<std::size_t...i,typename...TU>
  constexpr auto operator()(indlist<i...>const&,TU&&...tu)
      ->decltype(foldl_tuple(ADDER(),maketuple()(apply_tuple(MULTIPLIER(),transform_tuples<i>(std::forward<TU>(tu)...))...))){
    return foldl_tuple(ADDER(),maketuple()(apply_tuple(MULTIPLIER(),transform_tuples<i>(std::forward<TU>(tu)...))...));
  }
};
template<typename ADDER=adder,typename MULTIPLIER=multiplier,typename T1,typename...T2,
         typename TD=typename std::decay<T1>::type,
         typename IL=typename make_indlist_from_tuple<TD>::type,
         typename HELPER=tuple_inner_product_helper<ADDER,MULTIPLIER>>
constexpr auto tuple_inner_product(T1&&t1,T2&&...t2)->decltype(HELPER()(IL(),std::forward<T1>(t1),std::forward<T2>(t2)...)){
  return HELPER()(IL(),std::forward<T1>(t1),std::forward<T2>(t2)...);
}

Yes, I know it looks pretty bad, but the implementation is really easier to write than to read! However, I'm pretty sure that a simpler implementation can be written using folding. Again, I'm too lazy to rewrite it so I'll stick to the one above.

Here is a simple example:

  // do an inner product (1^4+2^4+...)
  auto w1=make_tuple(1,2,3,4);
  cout<<tuple_inner_product(w1,w1,w1,w1)<<endl;

Dealing with multiple tuples

So far I've only dealt with a single tuple at a time. I'll start off with a function which takes a list of tuples, picks one element from each tuple and passes them to a function:

// call a function passing one element from each of a set of tuples (control which element using an indlist)
template<std::size_t...i,typename F,typename...T>
constexpr auto apply_elem_tuples(indlist<i...>const&,F f,T&&...t)->decltype(f(std::get<i>(std::forward<T>(t))...))
{
  return f(std::get<i>(std::forward<T>(t))...);
}

The function is useful when building a new tuple from the elements of a set of tuples. Here I'll build a new tuple by picking the second element from each tuple in a set of tuples:

// build a tuple by picking the second element from each tuple
auto x1=make_tuple(1,2,3,4,5,6);
auto x2=make_tuple(3,4,5,6,7);
auto x3=make_tuple(10,11);
cout<<apply_elem_tuples(indlist<1,1,1>(),maketuple(),x1,x2,x3);

The output is: [2 4 11 ]

Often we'll want to pick the same element from each tuple. Having a tailored function for this makes the task simpler:

// apply a function on the nth element of a list of tuples
template<std::size_t N,typename F,typename...T,typename IL=typename make_uniform_indlist<sizeof...(T),N>::type>
constexpr auto apply_nth_elem_tuples(F f,T&&...t)->decltype(apply_elem_tuples(IL(),f,std::forward<T>(t)...))
{
  return apply_elem_tuples(IL(),f,std::forward<T>(t)...);
}

The previous example can then be written as:

// build a tuple by taking the second element (index 1) from a set of tuples
auto x1=make_tuple(1,2,3,4,5,6);
auto x2=make_tuple(3,4,5,6,7);
auto x3=make_tuple(10,11);
cout<<apply_nth_elem_tuples<1>(maketuple(),x1,x2,x3);

The output is again: [2 4 11 ]

Simplifying it even more I can write:

// pick one element in order from a set of tuples creating a new tuple
// (first element is picked from first tuple at index specified by first index in indlist)
template<typename IL,typename...T,
         typename R=decltype(apply_elem_tuples(IL(),maketuple(),std::forward<T>(std::declval<T>())...))>
constexpr R transform_tuples(IL&&il,T&&...t)
{
  return apply_elem_tuples(std::forward<IL>(il),maketuple(),std::forward<T>(t)...);
}
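
The inner product code earlier calls transform_tuples<i>(...) with the index given as a template argument, an overload which is not listed in this entry. A minimal sketch of what such a function might look like, assuming it picks element N from every tuple and packs the results into a new tuple, is shown below. To avoid claiming this is the real overload I've called it nth_elems; uniform is a compact stand-in for make_uniform_indlist, maketuple is a simplified copy, and demo_nth_elems is only a self-check.

```cpp
#include <cassert>
#include <cstddef>
#include <tuple>
#include <utility>

template<std::size_t...i>struct indlist{};

// N copies of V (compact stand-in for make_uniform_indlist)
template<std::size_t N,std::size_t V,std::size_t...i>
struct uniform:uniform<N-1,V,V,i...>{};
template<std::size_t V,std::size_t...i>
struct uniform<0,V,i...>{using type=indlist<i...>;};

// simplified copy of maketuple: lvalue arguments are stored as references
struct maketuple{
  template<typename...T>
  std::tuple<T...>operator()(T&&...t)const{
    return std::tuple<T...>(std::forward<T>(t)...);
  }
};
// pass one element from each tuple to f (index k of the indlist goes with tuple k)
template<std::size_t...i,typename F,typename...T>
auto apply_elem_tuples(indlist<i...>const&,F f,T&&...t)
    ->decltype(f(std::get<i>(std::forward<T>(t))...)){
  return f(std::get<i>(std::forward<T>(t))...);
}
// ASSUMPTION: pick element N from every tuple and build a new tuple
template<std::size_t N,typename...T,
         typename IL=typename uniform<sizeof...(T),N>::type>
auto nth_elems(T&&...t)
    ->decltype(apply_elem_tuples(IL(),maketuple(),std::forward<T>(t)...)){
  return apply_elem_tuples(IL(),maketuple(),std::forward<T>(t)...);
}

// small self-check using the tuples from the example above
inline void demo_nth_elems(){
  auto x1=std::make_tuple(1,2,3,4,5,6);
  auto x2=std::make_tuple(3,4,5,6,7);
  auto x3=std::make_tuple(10,11);
  assert(nth_elems<1>(x1,x2,x3)==std::make_tuple(2,4,11));
}
```

Because maketuple keeps lvalue arguments as references, the result here is a tuple of references into x1, x2 and x3; copy it into a tuple of values if it has to outlive them.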

Conclusions

Indlists turn out to be critical when controlling tuple elements: the order in which they are passed to functions, which elements to select etc. Once a set of functions is available for manipulating indlists, manipulating tuple elements almost looks like magic.

A curious thing when writing meta functions is that most of the time it's easier to write the code than to read it and understand it afterwards. Maybe this is because the syntax is rather convoluted. As a consequence debugging meta functions is often a non-trivial task.

Writing a large set of meta functions becomes tiresome after a while. Hundreds of compiler errors for even minor bugs and the somewhat cryptic notation become exhausting when pounding the keyboard after midnight.

However, once the decision has been taken to code in C++ there is not really any way out of learning meta programming techniques. It's pointless to code in C++ if only techniques from the 80s and 90s are used. Meta programming is here to stay; it's just a matter of deciding whether you want to be left behind or stay on the path C++ has taken. A valid question to ask is how much meta programming should be used in day to day work. Should it only be reserved for coding libraries or should it seep into application code?

The use of meta functions is simple enough. Using meta functions is often elegant and makes the tiresome work of coding them look like a worthwhile endeavour. A question that lingers, though, is whether the ratio between benefits and costs (in terms of frustration, anger and obsession trying to decipher compilation errors) is large enough to justify the type of work described in this blog entry.

Friday, November 8, 2013

Compact C++11 code for Oracle - Part III

Newton was a genius, but not because of the superior computational power of his brain. Newton’s genius was, on the contrary, his ability to simplify, idealize, and streamline the world so that it became, in some measure, tractable to the brains of perfectly ordinary men.

-- G. Weinberg, An Introduction to General Systems Thinking

In this blog entry I'll show a utility which simplifies modifications to data in an Oracle database table. However, before starting I'll quickly show a few changes to the code from the previous entry.

Changes to code from previous blog entries

First, based on a few comments from colleagues I renamed the classes. Specifically I use source as opposed to collection mostly because my previous OcciSelectCollection was not really a collection. I also added a constructor which creates an OCCI environment and a connection in order to simplify quick prototyping. Finally I made the collection (now called occi_source) movable.

Here is a small program using the renamed classes and the new constructor printing data specified by a select statement to standard out:

#include "occi_utils.h"
#include <iostream>
#include <string>
#include <tuple>
using namespace std;

// --- main test program
int main(){
  // auth credentials
  occi_auth const auth{"hans","ewetz","mydb"};

  // sql statement, row and bind types, bind data
  typedef tuple<int,string,string,string>row_t;
  typedef tuple<string>bind_t;
  string sql{"select rownum,dbid,tab,cutoff_dt from MT_DBSYNC_CUTOFF_DT where dbid=:1"};
  bind_t bind{"Policy"};

  // read data specified by select statement
  occi_source<row_t,bind_t>rows{auth,sql,bind};
  for(auto r:rows)cout<<r<<endl;
}

Letting the occi_source manage all OCCI resources is not a realistic choice for most applications. However, when writing code for tests and prototypes, pushing as much OCCI specific code to the occi_source speeds up development significantly.

Modelling database access

The implementation of the occi_input_iterator was based on the boost::iterator_facade. A better alternative would have been to use the boost::function_input_iterator. In the next blog entry I'll revisit the implementation and will most likely change it to use the boost::function_input_iterator.

It's worth noticing that there are other ways to model access to a database. For example, the concept of streams could be used as opposed to the use of source. Yet another way would be to create a thick layer which models a generic database interface.

I'll continue to stick with a thin layer over OCCI which hides the actual database access but lets the user have control over OCCI resources. Specifically I don't want to invent new ways of specifying SQL statements and I definitely don't want my utilities to look inside SQL statements. Anything related to parsing SQL statements I'll gladly pass along to Oracle.

I will continue to stay close to core C++11 features such as tuples and strings instead of building special classes for rows, bind variables, SQL statements etc. For those interested in more generic alternatives, the Database Template Library (DTL) may be of interest.

An iterator for modifying data

Now, let's take a look at how to model modifications to data in an Oracle table. I'll follow along the same lines as before where I used an occi_source for managing OCCI resources (now I'll use an occi_sink) and an iterator for executing SQL statements. However, this time around I'll use the boost function_output_iterator as the base for implementing an output iterator.

The model where I assign bind variables to an iterator worked fine for select statements. It turns out that the model is OK for modification statements also. However, as you'll see later the code executing a modification statement without bind variables looks a little awkward.

OK, let's start with a piece of code showing what I would like to be able to do:

#include "occi_tools.h"
#include <string>
#include <algorithm>
#include <tuple>
#include <vector>

using namespace std;

// --- test insert
void update(occi_auth const&auth){
  typedef tuple<int,string>bind_t;
  string sql{"insert into names (id,name,isrt_tmstmp) values(:1,:2,sysdate)"};
  occi_sink<bind_t>sink(auth,sql);
  vector<bind_t>b{bind_t{1,"hans"},bind_t{2,"ingvar"}, bind_t{3,"valdemar"}};
  copy(b.begin(),b.end(),sink.begin());
}
// --- main test program
int main(){
  // auth credentials
  occi_auth const auth{"hans","ewetz","mydb"};
  update(auth);
}

The code snippet inserts 3 rows in the table names by creating an occi_sink, getting an output iterator from the occi_sink and finally copying a vector of tuples corresponding to the bind variables to the iterator. In the sample code I let the sink manage any OCCI resources including the environment and the connection. As will be clear soon I'll define a set of occi_sink constructors which will allow me to manage some of the OCCI resources.

A boost::function_output_iterator requires a function which will be called each time an item is assigned to the iterator. I'll write it as a function which simply delegates back to the iterator:

// forward decl
template<typename Bind>class occi_output_iterator;
  
// function called for each modification
// (function delegates back to iterator to do work)
template<typename Bind>
class occi_unary_output:std::unary_function<Bind const&,void>{
public:
  // ctors
  occi_unary_output(occi_output_iterator<Bind>*it):it_(it){}

  // modification function
  void operator()(Bind const&bind){
    it_->operator()(bind);
  }
private:
  occi_output_iterator<Bind>*it_;
};
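To make the mechanics of a function output iterator concrete, here is a minimal sketch that uses only the standard library. It is not the Boost implementation; fn_output_iterator, row_recorder and record_all are hypothetical names made up for illustration. The point is that dereferencing and incrementing do nothing, while assignment forwards the value to the wrapped function object:

```cpp
#include <algorithm>
#include <cstddef>
#include <iterator>
#include <vector>

// hypothetical stand-in for boost::function_output_iterator:
// assigning through the iterator calls f(value)
template<typename Func>
class fn_output_iterator{
public:
  typedef std::output_iterator_tag iterator_category;
  typedef void value_type;
  typedef std::ptrdiff_t difference_type;
  typedef void pointer;
  typedef void reference;

  explicit fn_output_iterator(Func f):f_(f){}

  // '*it', '++it' and 'it++' all return the iterator itself ...
  fn_output_iterator&operator*(){return *this;}
  fn_output_iterator&operator++(){return *this;}
  fn_output_iterator&operator++(int){return *this;}

  // ... so that '*it=value' ends up here and is forwarded to the function
  template<typename T>
  fn_output_iterator&operator=(T const&value){f_(value);return *this;}
private:
  Func f_;
};
// function object recording every assigned value
// (holds a pointer so that copies of the iterator share the same target)
struct row_recorder{
  std::vector<int>*rows;
  void operator()(int v)const{rows->push_back(v);}
};
// copy 'in' through the iterator and return what the function object saw
std::vector<int>record_all(std::vector<int>const&in){
  std::vector<int>out;
  std::copy(in.begin(),in.end(),fn_output_iterator<row_recorder>(row_recorder{&out}));
  return out;
}
```

Because algorithms such as std::copy take the iterator by value, any state has to live outside the iterator, which is why the function object above holds a pointer, much like occi_unary_output holds a pointer back to its iterator.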

The actual iterator is coded as follows:

// wrapper around boost::function_output_iterator
template<typename Bind>
class occi_output_iterator:public boost::function_output_iterator<occi_unary_output<Bind>>{
friend class occi_unary_output<Bind>;
public:
  // typedef for simpler declarations (tuple<size_t,size_t,...> with as many elements as Bind)
  using Size=typename uniform_tuple_builder<std::tuple_size<Bind>::value,std::size_t>::type;

  // ctors, assign, dtor
  occi_output_iterator(oracle::occi::Connection*conn,std::string const&sql,std::size_t batchsize=1,Size const&size=Size()):
      boost::function_output_iterator<occi_unary_output<Bind>>(occi_unary_output<Bind>(this)),
      conn_(conn),sql_(sql),batchsize_(batchsize),nwaiting_(0){

    // create statement, set batch modification, create statement and binder
    stmt_=std::shared_ptr<oracle::occi::Statement>{conn_->createStatement(sql_),occi_stmt_deleter(conn_)};
    stmt_->setMaxIterations(batchsize_);

    // set max size of bind variables which have variable length
    if(batchsize==0){
      throw std::runtime_error("invalid batchsize: 0 while constructing occi_output_iterator");
    }else
    if(batchsize>1){
      // set size for variable size bind variables (will throw exception if size==0 for variable size bind variable)
      occi_bind_sizer<Bind>sizer{stmt_};
      apply_with_index_template(sizer,size);
    }
    // create binder object
    binder_=occi_data_binder(stmt_);
  }
  occi_output_iterator(occi_output_iterator const&)=default;
  occi_output_iterator(occi_output_iterator&&)=default;
  occi_output_iterator&operator=(occi_output_iterator&)=default;
  occi_output_iterator&operator=(occi_output_iterator&&)=default;
  ~occi_output_iterator()=default;

  // explicitly execute buffered statements
  void flush(){flushAux();}
private:
  // modification function
  void operator()(Bind const&bind){
    // check if we need to add previous row, bind new row and check if we need to flush (execute)
    ++nwaiting_;
    if(nwaiting_>1)stmt_->addIteration();
    apply_with_index(binder_,bind);
    if(nwaiting_==batchsize_)flushAux();
  }
  // flush remaining statements
  void flushAux(){
    if(nwaiting_>0){
      stmt_->executeUpdate();
      nwaiting_=0;
    }
  }
private:
  oracle::occi::Connection*conn_;
  std::string sql_;   // stored by value: a reference member would delete the defaulted assignment operators and could dangle
  std::shared_ptr<oracle::occi::Statement>stmt_;
  occi_data_binder binder_;
  std::size_t batchsize_;
  std::size_t nwaiting_;
};

The code requires some explanation. OCCI provides a feature that batches requests and sends them to the server in one go. The batchsize_ and nwaiting_ members keep track of when it's time to flush a batch of statements to the server. A flush method is provided so a user can explicitly flush remaining statements. For example, if the batch size is set to 3 and 4 items have been assigned to the iterator, the first 3 have been sent to the server and 1 statement is still buffered. The user can then call flush explicitly, forcing the iterator to send the remaining statement to the server.
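The counting logic can be tried out without a database. The following is only a sketch under stated assumptions: fake_statement, batcher and run_batches are hypothetical stand-ins, and each recorded entry plays the role of one executeUpdate() round trip:

```cpp
#include <cstddef>
#include <stdexcept>
#include <vector>

// hypothetical stand-in for an OCCI statement:
// records how many rows each simulated round trip carried
struct fake_statement{
  std::vector<std::size_t>executed;
};
// same counting scheme as batchsize_/nwaiting_ in occi_output_iterator
class batcher{
public:
  batcher(fake_statement*stmt,std::size_t batchsize):
      stmt_(stmt),batchsize_(batchsize),nwaiting_(0){
    if(batchsize_==0)throw std::invalid_argument("invalid batchsize: 0");
  }
  // buffer one row and flush automatically when the batch is full
  void add(){
    ++nwaiting_;
    if(nwaiting_==batchsize_)flush();
  }
  // send whatever is buffered (no-op if nothing is waiting)
  void flush(){
    if(nwaiting_>0){
      stmt_->executed.push_back(nwaiting_);
      nwaiting_=0;
    }
  }
private:
  fake_statement*stmt_;
  std::size_t batchsize_;
  std::size_t nwaiting_;
};
// add 'nrows' rows, flush explicitly at the end and return the size of each round trip
std::vector<std::size_t>run_batches(std::size_t nrows,std::size_t batchsize){
  fake_statement stmt;
  batcher b(&stmt,batchsize);
  for(std::size_t i=0;i<nrows;++i)b.add();
  b.flush();
  return stmt.executed;
}
```

With 4 rows and a batch size of 3, run_batches reports one round trip with 3 rows followed by one with the single remaining row, which is the situation described above.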

There is a little more to managing batching of statements. OCCI must know beforehand the maximum size of a statement. To calculate the size OCCI requires the maximum size of any variable length bind variables. For fixed length bind variables OCCI can calculate the size. To pass this information through the iterator to OCCI I use a tuple having the same number of items but each item having the type size_t. A user must then set the maximum length of a variable length bind variable in the tuple. The functionality for informing OCCI about the size of a variable length bind variable is implemented in occi_bind_sizer:

// set maximum size a bind variable can have (used in update/insert/delete statements)
template<typename T>struct occi_bind_sizer_aux;
template<>struct occi_bind_sizer_aux<int>{
  static void setsize(std::size_t ind,std::shared_ptr<oracle::occi::Statement>stmt,size_t size){
    // nothing to do - not a variable size type
  }
};
template<>struct occi_bind_sizer_aux<std::string>{
  static void setsize(std::size_t ind,std::shared_ptr<oracle::occi::Statement>stmt,std::size_t size){
    // don't allow string length to have max size == 0
    if(size==0)throw std::runtime_error("invalid size: 0 for std::string bind variable while setting 'stmt->setMaxParamSize(ind,size)'");
    stmt->setMaxParamSize(ind+1,size);
  }
};
// TODO: add more types to sizer 
// ...
template<typename ... Args>class occi_bind_sizer;
template<typename ... Args>
class occi_bind_sizer<std::tuple<Args ...>>{
public:
  // get bind type
  using Bind=std::tuple<Args ...>;

  occi_bind_sizer(std::shared_ptr<oracle::occi::Statement>stmt):stmt_(stmt){}
  occi_bind_sizer()=default;
  occi_bind_sizer(occi_bind_sizer const&)=default;
  occi_bind_sizer(occi_bind_sizer&&)=default;
  occi_bind_sizer&operator=(occi_bind_sizer&)=default;
  occi_bind_sizer&operator=(occi_bind_sizer&&)=default;
  ~occi_bind_sizer()=default;
  template<std::size_t Ind>
  void apply(size_t size){
    using T=typename std::decay<decltype(std::get<Ind>(Bind()))>::type;
    occi_bind_sizer_aux<T>::setsize(Ind,stmt_,size);
  }
private:
  std::shared_ptr<oracle::occi::Statement>stmt_;
};
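The Size typedef used by the iterator relies on uniform_tuple_builder, which is not shown in this entry. As an illustration only, a meta function with the same assumed shape (N elements, all of type T) could be sketched as follows; uniform_tuple_sketch, size_tuple_t and name_max_len are hypothetical names:

```cpp
#include <cstddef>
#include <string>
#include <tuple>
#include <type_traits>

// assumed shape: 'type' is std::tuple<T,...,T> with N elements
// (each recursion step appends one more T to the pack Ts)
template<std::size_t N,typename T,typename ... Ts>
struct uniform_tuple_sketch{
  typedef typename uniform_tuple_sketch<N-1,T,T,Ts ...>::type type;
};
// recursion ends when no more elements must be added
template<typename T,typename ... Ts>
struct uniform_tuple_sketch<0,T,Ts ...>{
  typedef std::tuple<Ts ...>type;
};
// a bind tuple and the matching tuple of maximum sizes
typedef std::tuple<int,std::string>bind_t;
typedef uniform_tuple_sketch<std::tuple_size<bind_t>::value,std::size_t>::type size_tuple_t;

static_assert(std::is_same<size_tuple_t,std::tuple<std::size_t,std::size_t>>::value,
              "one size_t per bind variable");

// maximum length a user would set for the string bind variable (the int entry is ignored)
std::size_t name_max_len(){
  size_tuple_t sizes(0,100);
  return std::get<1>(sizes);
}
```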

That's all there is to the iterator implementation.

An occi_sink

Once the idea of assigning a tuple representing a set of bind variables to an iterator is clear, the implementation is straightforward and the iterator can be used as is. However, I started to write the code to show that it was possible to write a small C++ library which would be a typesafe alternative to using PERL for DB access. Using the occi_output_iterator directly is simple but not simple enough. I'll make it simple enough by adding a sink whose main responsibility is to create iterators and manage OCCI resources.

The implementation of occi_sink is rather long mostly because it's meant to provide a simple to use interface – not because it's complicated. A large part of the code consists of constructors and move support. The occi_sink is movable but not copyable.

Here is the code:

// sink 
template<typename Bind=std::tuple<>>
class occi_sink{
public:
  // typedef for simpler declarations (tuple<size_t,size_t,...> with as many elements as Bind)
  using Size=typename uniform_tuple_builder<std::tuple_size<Bind>::value,std::size_t>::type;

  // enum for controlling commit, rollback or do nothing
  enum commit_t:int{Rollback=0,Commit=1,Nop=2};
  // typedefs
  typedef typename occi_output_iterator<Bind>::value_type value_type;
  typedef typename occi_output_iterator<Bind>::pointer pointer;
  typedef typename occi_output_iterator<Bind>::reference reference;
  typedef occi_output_iterator<Bind>iterator;

  // ctor taking authentication (commit default == true since sink manages resources)
  explicit occi_sink(occi_auth const&auth,std::string const&sql,std::size_t batchsize=1,Size const&size=Size(),commit_t commit=Commit):
      conn_(nullptr),connpool_(nullptr),env_(nullptr),auth_(auth),sql(sql),closeConn_(true),
      releaseConn_(false),terminateEnv_(true),batchsize_(batchsize),size_(size),commit_(commit){
    check_batchsize(batchsize_);
    env_=oracle::occi::Environment::createEnvironment(oracle::occi::Environment::DEFAULT);
    conn_=env_->createConnection(std::get<0>(auth_),std::get<1>(auth_),std::get<2>(auth_));
  }
  // ctor taking environment + authentication (commit default == true since sink manages resources)
  explicit occi_sink(oracle::occi::Environment*env,occi_auth const&auth,std::string const&sql,std::size_t batchsize=1,Size const&size=Size(),commit_t commit=Commit):
      conn_(nullptr),connpool_(nullptr),env_(env),auth_(auth),sql(sql),closeConn_(true),
      releaseConn_(false),terminateEnv_(false),batchsize_(batchsize),size_(size),commit_(commit){
    check_batchsize(batchsize_);
    conn_=env_->createConnection(std::get<0>(auth_),std::get<1>(auth_),std::get<2>(auth_));
  }
  // ctor taking an open connection (commit default == false since sink does not manage resources)
  explicit occi_sink(oracle::occi::Connection*conn,std::string const&sql,std::size_t batchsize=1,Size const&size=Size(),commit_t commit=Nop):
      conn_(conn),connpool_(nullptr),env_(nullptr),sql(sql),closeConn_(false),
      releaseConn_(false),terminateEnv_(false),batchsize_(batchsize),size_(size),commit_(commit){
    check_batchsize(batchsize_);
  }
  // ctor taking a stateless connection pool (commit default == false since sink does not manage resources)
  explicit occi_sink(oracle::occi::StatelessConnectionPool*connpool,std::string const&sql,std::size_t batchsize=1,Size const&size=Size(),commit_t commit=Nop):
      conn_(nullptr),connpool_(connpool),env_(nullptr),sql(sql),closeConn_(false),
      releaseConn_(true),terminateEnv_(false),batchsize_(batchsize),size_(size),commit_(commit){
    check_batchsize(batchsize_);
    conn_=connpool_->getConnection();
  }
  // ctors, assign (movable but not copyable)
  occi_sink()=delete;
  occi_sink(occi_sink const&)=delete;
  occi_sink(occi_sink&&s):
      conn_(s.conn_),connpool_(s.connpool_),env_(s.env_),
      auth_(s.auth_),sql(s.sql),batchsize_(s.batchsize_),size_(s.size_),
      closeConn_(s.closeConn_),releaseConn_(s.releaseConn_),terminateEnv_(s.terminateEnv_),
      commit_(s.commit_){
    // reset all relevant state in the moved-from sink
    reset_state(std::move(s));
  }
  occi_sink&operator=(occi_sink const&)=delete;
  occi_sink&operator=(occi_sink&&s){
    // swap with no throw; our old resources end up in 's' and are released by its destructor
    swap(s);
    return *this;
  }
  // dtor - close occi resources if needed
  ~occi_sink(){
    // check if we should commit
    if(commit_==Commit)conn_->commit();else
    if(commit_==Rollback)conn_->rollback();

    // take care of connection
    if(closeConn_)env_->terminateConnection(conn_);else
    if(releaseConn_)connpool_->releaseConnection(conn_);

    // take care of environment
    if(terminateEnv_)oracle::occi::Environment::terminateEnvironment(env_);
  }
  // get begin/end iterators
  iterator begin()const{return iterator(conn_,sql,batchsize_,size_);}

  // swap function
  void swap(occi_sink&s)noexcept(true){
    std::swap(s.conn_,conn_);std::swap(s.connpool_,connpool_);std::swap(s.env_,env_);
    std::swap(s.closeConn_,closeConn_);std::swap(s.releaseConn_,releaseConn_);std::swap(s.terminateEnv_,terminateEnv_);
    // (an unqualified swap(a,b) would find this member function and fail to compile)
    std::swap(s.auth_,auth_);std::swap(s.sql,sql);std::swap(s.batchsize_,batchsize_);std::swap(s.size_,size_);std::swap(s.commit_,commit_);
  }
private:
  // check if batchsize is valid
  void check_batchsize(std::size_t batchsize){
    if(batchsize==0)throw std::runtime_error("invalid batchsize: 0 while constructing occi_sink");
  }
  // reset state for a sink (only called with r-value)
  void reset_state(occi_sink&&s){
    s.conn_=nullptr;s.connpool_=nullptr;s.env_=nullptr;
    s.closeConn_=false;s.releaseConn_=false;s.terminateEnv_=false;
    s.auth_=occi_auth{};s.sql="";s.batchsize_=0;s.size_=Size{};s.commit_=Nop;
  }
  // occi resources
  oracle::occi::Connection*conn_;
  oracle::occi::StatelessConnectionPool*connpool_;
  oracle::occi::Environment*env_;

  // authentication + sql statement
  occi_auth auth_;
  std::string sql;
  std::size_t batchsize_;
  Size size_;

  // control of what to do with occi resources
  bool closeConn_;
  bool releaseConn_;
  bool terminateEnv_;
  commit_t commit_;
};
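The part of occi_sink that is easiest to get wrong is the move support: exactly one object may release a resource, and a moved-from object must release nothing. Stripped of all OCCI details, the pattern can be sketched like this; fake_connection, owner and closes_after_moves are hypothetical names used only for illustration:

```cpp
#include <utility>

// hypothetical stand-in for an OCCI connection: counts how often it was closed
struct fake_connection{
  int closed=0;
};
// movable but not copyable owner, following the same idea as occi_sink
class owner{
public:
  owner(fake_connection*conn,bool close):conn_(conn),close_(close){}
  owner(owner const&)=delete;
  owner&operator=(owner const&)=delete;

  // move ctor: take over the resource and disarm the source
  owner(owner&&o)noexcept:conn_(o.conn_),close_(o.close_){o.release();}

  // move assign: release what we hold, take over, disarm the source
  owner&operator=(owner&&o)noexcept{
    if(this!=&o){
      cleanup();
      conn_=o.conn_;close_=o.close_;
      o.release();
    }
    return *this;
  }
  ~owner(){cleanup();}
private:
  void cleanup(){if(close_&&conn_)++conn_->closed;}
  void release(){conn_=nullptr;close_=false;}
  fake_connection*conn_;
  bool close_;
};
// move a connection through three owners and count how often it gets closed
int closes_after_moves(){
  fake_connection conn;
  {
    owner a(&conn,true);
    owner b(std::move(a));   // a no longer owns the connection
    owner c(nullptr,false);
    c=std::move(b);          // b no longer owns the connection
  }                          // only c closes the connection
  return conn.closed;
}
```

However many times the owner is moved, the connection is closed exactly once.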

A few examples

1 - Deleting without bind variables

Here is an example showing how to execute a delete statement without bind variables:

  string sql{"delete from names"};
  occi_sink<tuple<>>sink(auth,sql);
  *sink.begin()=tuple<>();

I mentioned earlier in this blog item that executing statements without bind variables would look a little awkward. I could probably build in some feature in the occi_sink to execute simple statements like the one here. However, I prefer to see it as a feature that there is a single consistent way of executing modification statements.

2 - Using plain iterator without sink

Here is a (slightly modified) production example using both an occi_source and an occi_output_iterator directly. The function is part of a class holding a connection in a shared_ptr attribute conn_:

// add a package to database
int PackageMetadata::addPackage(string const&name,string const&location,string const&clientid,string const&docformat){
  // get new package identifier
  using row_t=tuple<int>;
  occi_source<row_t>rows(conn_.get(),"select EXMT_PACKAGE_ID_SEQ.nextval from dual");
  int pkgid{get<0>(*rows.begin())};

  // insert package into db
  using bind_t=tuple<int,string,string,string,string>;
  string sql{"insert into EXMT_PACKAGE (ID,NAME,PKG_LOCATION,CLIENT_ID,DOC_FORMAT,ISRT_TMSTMP,LST_UPD_TMSTMP)VALUES(:1,:2,:3,:4,:5,sysdate,sysdate)"};
  occi_output_iterator<bind_t>it{conn_.get(),sql,1};
  *it=bind_t{pkgid,name,location,clientid,docformat};

  // return package identifier
  return pkgid;
}

3 - Batching statements

Finally here is an example which sets the batch size, inserts some rows and flushes remaining (buffered) statements:

  typedef tuple<int,string>bind_t;
  typedef tuple<size_t,size_t>maxsize_t;   // one maximum size per bind variable
  string sql{"insert into names (id,name,isrt_tmstmp) values(:1,:2,sysdate)"};
  occi_sink<bind_t>sink(auth,sql,2,maxsize_t(0,100));
  vector<bind_t>b{bind_t{1,"hans"},bind_t{2,"ingvar"},bind_t{3,"valdemar"}};
  auto it=sink.begin();
  copy(b.begin(),b.end(),it);
  it.flush();

Notice that when explicitly setting the batch size to a number greater than 1, statements will be buffered and flushed. When buffering statements, OCCI must have information available so it can calculate the maximum size of a single statement. Because of this OCCI must know the maximum size of any variable length bind variables. That information is passed to the occi_sink as a tuple holding size_t types. For non-variable length bind variables the corresponding length is ignored by the occi_output_iterator.

Conclusions

OCCI contains a large number of features. Here I only support a few of them such as batching of statements sent to the Oracle server. It's clear that there is room for supporting more of the OCCI features.

An alternative to adding more features would be to expose the statement type to a user. A user would then be able to retrieve the statement from an occi_output_iterator and call various OCCI methods on the statement. The risk here is of course that it would interfere with the function of the occi_output_iterator. Until there is time to analyse what should be added (and possibly removed) I'll leave the wrappers alone.

Even though it took a little while before I had time to punch out the blog entries, the code was written in a hurry. I'm sure that there are issues and bugs in the code. I'm also sure that some of the design decisions were not optimal. For example, some constructors take default parameters and I have a feeling that I didn't declare them in the correct order when I last used the code.

Overall I'm pretty happy that by using some of the basic C++11 features such as tuples and variadic parameters it was possible to write a thin, relatively non-intrusive, wrapper around OCCI making Oracle access in C++ as simple as Oracle access in PERL.

I'll probably write one more entry about this topic once I've had more time to think about what should and could be done better. One thing I'll change as soon as time allows is to use boost::function_input_iterator when implementing the occi_input_iterator. I'll also look at possibly modifying the tuple utilities to use index lists to manage recursions.

I should mention that when showing the code using the occi_sink and the occi_source to a PERL developer the reaction was 'so what's the advantage?'. Well … let's start with type safety.

Friday, November 1, 2013

Compact C++11 code for Oracle – Part II

Progress is not possible without deviation

-- Frank Zappa

I'll continue right from where I left off in my previous entry which ended with two utilities:

template<typename Func,typename Tuple>void applyWithIndex(Func f,Tuple&&t);
template<typename ... Args>std::ostream&operator<<(std::ostream&os,std::tuple<Args ...>const&t);

The first function passes a tuple element index together with an element to the function object f. The second one is a simple utility which prints each tuple element together with the type of the element. Because f must be able to take any type as the second parameter (i.e., the tuple element), its type should be implemented as a struct or class with a template call operator.
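For readers who don't have the previous entry at hand, here is a rough sketch of how such a function can be written with plain C++11 recursion over the tuple indices. It is not the code from the previous entry: the names apply_with_index_sketch, element_printer and describe are hypothetical, and the function object is taken by reference here so that its state can be inspected afterwards:

```cpp
#include <cstddef>
#include <sstream>
#include <string>
#include <tuple>

// recursion over tuple indices: call f(index,element) for I = 0 .. N-1
template<std::size_t I,std::size_t N>
struct apply_with_index_sketch_aux{
  template<typename Func,typename Tuple>
  static void apply(Func&f,Tuple&t){
    f(I,std::get<I>(t));
    apply_with_index_sketch_aux<I+1,N>::apply(f,t);
  }
};
// end of recursion: nothing left to visit
template<std::size_t N>
struct apply_with_index_sketch_aux<N,N>{
  template<typename Func,typename Tuple>
  static void apply(Func&,Tuple&){}
};
template<typename Func,typename Tuple>
void apply_with_index_sketch(Func&f,Tuple&t){
  apply_with_index_sketch_aux<0,std::tuple_size<Tuple>::value>::apply(f,t);
}
// function object with a template call operator so it accepts any element type
struct element_printer{
  std::ostringstream os;
  template<typename T>
  void operator()(std::size_t ind,T const&t){os<<ind<<'='<<t<<' ';}
};
// visit a small tuple and return what the printer collected
std::string describe(){
  std::tuple<int,std::string,double>t(1,"hello",2.5);
  element_printer p;
  apply_with_index_sketch(p,t);
  return p.os.str();
}
```

Calling describe() yields the string "0=1 1=hello 2=2.5 ", one index=value pair per tuple element.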

Before attacking the design of an input iterator iterating through a collection of rows specified by a select statement I need two more utilities. The first one is a fetcher capable of copying data from an OCCI result set into a tuple. The second utility is a binder which I will use to set bind variables in an OCCI statement. Both utilities are implemented along the same principles.

The fetcher is relatively simple to implement given that I already have a function - applyWithIndex - which passes each element of a tuple together with its index to a callable object. The class I'll use to fetch data from an OCCI result set looks like this:

class DataFetcher{
public:
  DataFetcher(std::shared_ptr<oracle::occi::ResultSet>rs):rs_(rs){}
  DataFetcher()=default;
  DataFetcher(DataFetcher const&)=default;
  DataFetcher(DataFetcher&&)=default;
  DataFetcher&operator=(DataFetcher const&)=default;
  DataFetcher&operator=(DataFetcher&&)=default;
  ~DataFetcher()=default;
  std::shared_ptr<oracle::occi::ResultSet>getResultSet()const{return rs_;}
  bool operator==(DataFetcher const&other)const{return rs_==other.rs_;}
  bool operator==(DataFetcher&&other)const{return rs_==other.rs_;}
  template<typename T>
  void operator()(int ind,T&t)const{
    DataFetcherAux<T>::fetch(ind,rs_,t);
  }
private:
  std::shared_ptr<oracle::occi::ResultSet>rs_;
};

The key is the call operator which is a template taking an index together with a tuple element as parameters. Once inside the call operator I know the type of the tuple element and can call the helper - DataFetcherAux - which will do the job of actually getting data from the result set into the tuple element.

What's left to do now is to write a set of functions which get the data from a result set and store it in a tuple element. Here I'll need one function for each type since the OCCI getter functions have type dependent names. Here is the implementation of two of the functions:

// fetch data from a result set and store in a variable (one struct for each type)
template<typename T>struct DataFetcherAux;
template<>struct DataFetcherAux<int>{
  static void fetch(int ind,std::shared_ptr<oracle::occi::ResultSet>rs,int&i){i=rs->getInt(ind+1);}
};
template<>struct DataFetcherAux<std::string>{
  static void fetch(int ind,std::shared_ptr<oracle::occi::ResultSet>rs,std::string&s){s=rs->getString(ind+1);}
};

Here I only showed two type specific fetchers. But, adding new ones for other types is trivial.
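As an illustration of what adding a type involves, here is a sketch that adds a fetcher for double next to the int and string ones. To keep the example compilable without an Oracle installation it runs against fake_result_set, a hypothetical stand-in holding a single hard-coded row; fetcher_aux_sketch and fetch_row are likewise made-up names. In real code the specialization would take the OCCI result set and call its getter for doubles instead:

```cpp
#include <memory>
#include <string>
#include <tuple>

// hypothetical stand-in for a result set holding one row (7, "hans", 1.5);
// as with OCCI, column indices start at 1
struct fake_result_set{
  int getInt(unsigned int col)const{return col==1?7:0;}
  std::string getString(unsigned int col)const{return col==2?"hans":"";}
  double getDouble(unsigned int col)const{return col==3?1.5:0.0;}
};
// one struct per type, same layout as DataFetcherAux
template<typename T>struct fetcher_aux_sketch;
template<>struct fetcher_aux_sketch<int>{
  static void fetch(int ind,std::shared_ptr<fake_result_set>rs,int&i){i=rs->getInt(ind+1);}
};
template<>struct fetcher_aux_sketch<std::string>{
  static void fetch(int ind,std::shared_ptr<fake_result_set>rs,std::string&s){s=rs->getString(ind+1);}
};
// the new type: only the getter name and the target type differ
template<>struct fetcher_aux_sketch<double>{
  static void fetch(int ind,std::shared_ptr<fake_result_set>rs,double&d){d=rs->getDouble(ind+1);}
};
// fetch the single row into a tuple, one element at a time
std::tuple<int,std::string,double>fetch_row(){
  auto rs=std::make_shared<fake_result_set>();
  std::tuple<int,std::string,double>row;
  fetcher_aux_sketch<int>::fetch(0,rs,std::get<0>(row));
  fetcher_aux_sketch<std::string>::fetch(1,rs,std::get<1>(row));
  fetcher_aux_sketch<double>::fetch(2,rs,std::get<2>(row));
  return row;
}
```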

The binder follows the same principles as the fetcher. I'll show the code here:

// bind data to select statement (one variable at a time)
template<typename T>struct DataBinderAux;
template<>struct DataBinderAux<int>{
  static void bind(int ind,std::shared_ptr<oracle::occi::Statement>stmt,int val){
    stmt->setInt(ind+1,val);
  }
};
template<>struct DataBinderAux<std::string>{
  static void bind(int ind,std::shared_ptr<oracle::occi::Statement>stmt,std::string const&val){
    stmt->setString(ind+1,val);
  }
};
class DataBinder{
public:
  DataBinder(std::shared_ptr<oracle::occi::Statement>stmt):stmt_(stmt){}
  DataBinder()=default;
  DataBinder(DataBinder const&)=default;
  DataBinder(DataBinder&&)=default;
  DataBinder&operator=(DataBinder&)=default;
  DataBinder&operator=(DataBinder&&)=default;
  ~DataBinder()=default;
  template<typename T>
  void operator()(int ind,T const&t){
    DataBinderAux<T>::bind(ind,stmt_,t);
  }
private:
  std::shared_ptr<oracle::occi::Statement>stmt_;
};

As was the case with the DataFetcher I only show two binders, one for string and one for int. Adding binders for other types is simple.

Now I'm at the point where it's time to implement an iterator over a set of rows defined by a select statement. But first, let's recap what we have so far. First, we have a function which applies a user specified callable object to each one of the elements in a tuple by calling the object with a tuple index together with a tuple element. Next, we have two classes which make use of this function to retrieve data from an OCCI result set into a tuple and bind data in a tuple to an OCCI statement. It seems like we now have the tools needed to go ahead with an iterator implementation!

The simplest way of coding up an iterator is to go for the boost::iterator_facade. The façade takes care of lots of details that are easy to get wrong when manually coding an STL compliant iterator. What's left to code ourselves are the functions increment, dereference and equal together with constructors and other things that are specific to an OCCI iterator.

The iterator, even though it has a few lines of code, is simple. The full implementation of the iterator is shown here:

template<typename Row,typename Bind=std::tuple<>>
class OcciSelectIterator:public boost::iterator_facade<OcciSelectIterator<Row,Bind>,Row const,boost::forward_traversal_tag>{
friend class boost::iterator_core_access;
public:
  // ctor, assign,dtor
  explicit OcciSelectIterator(oracle::occi::Connection*conn,std::string const&select,Bind const&bind=Bind{}):
      conn_(conn),select_(select),bind_(bind),fetcher_(),stmt_(nullptr),end_(false){
    // create row fetcher and fetch first row
    stmt_=std::shared_ptr<oracle::occi::Statement>{conn_->createStatement(),occi_stmt_deleter(conn_)};
    stmt_->setSQL(select);
    DataBinder binder(stmt_);
    applyWithIndex(binder,bind_);
    std::shared_ptr<oracle::occi::ResultSet>rs(stmt_->executeQuery(),occi_rs_deleter(stmt_.get()));
    fetcher_=DataFetcher(rs);
    nextrow();
  }
  OcciSelectIterator():end_(true),bind_(Bind{}){}
  OcciSelectIterator(OcciSelectIterator const&)=default;
  OcciSelectIterator(OcciSelectIterator&&)=default;
  OcciSelectIterator&operator=(OcciSelectIterator const&)=default;
  OcciSelectIterator&operator=(OcciSelectIterator&&)=default;
  ~OcciSelectIterator()=default;
private:
  // iterator functions
  void increment(){nextrow();}
  bool equal(OcciSelectIterator const&other)const{
    // all iterators marked with 'end_' compare equal to each other
    if(end_||other.end_)return end_&&other.end_;
    return conn_==other.conn_&&select_==other.select_&&stmt_==other.stmt_&&fetcher_==other.fetcher_;
  }
  Row const&dereference()const{
    if(end_)throw std::runtime_error("OcciSelectIterator<>: attempt to dereference end iterator");
    return currentRow_;
  }
  // get next row
  void nextrow(){
    if(end_)throw std::runtime_error("OcciSelectIterator<>: attempt to step past end iterator");
    if(fetcher_.getResultSet()->next())applyWithIndex(fetcher_,currentRow_);
    else end_=true;
  }
  // state
  const std::string select_;
  const Bind bind_;
  oracle::occi::Connection*conn_;
  std::shared_ptr<oracle::occi::Statement>stmt_;
  DataFetcher fetcher_;
  Row currentRow_;
  bool end_;
};

It's worth noticing how the end iterator is implemented: an end iterator is simply an iterator created using the default constructor. Of course this means that an end iterator can be used with any select statement as long as the template parameters are of the same type. This may seem strange, but in reality it does not create any problems.
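The same convention is used by the standard stream iterators, which may make it feel less strange. In the small sketch below (read_all is a made-up helper name) a default constructed std::istream_iterator serves as the end iterator for whatever stream the begin iterator reads from:

```cpp
#include <iterator>
#include <sstream>
#include <string>
#include <vector>

// read all whitespace separated integers from a string
std::vector<int>read_all(std::string const&text){
  std::istringstream in(text);
  std::istream_iterator<int>begin(in);
  std::istream_iterator<int>end;   // default constructed: end of any stream
  return std::vector<int>(begin,end);
}
```

Just as with OcciSelectIterator, the begin iterator turns into an end iterator once there is nothing more to read.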

It's easy to use the iterator to select rows from the database. For example:

    // authentication info (user, passwd, database)
    OcciAuth const auth{"hans","ewetz","mydb"};

    // row and bind 
    typedef tuple<int,string,string,string>row_t;
    typedef tuple<string>bind_t;
    string select{"select rownum,dbid,tab,cutoff_dt from MT_DBSYNC_CUTOFF_DT where dbid=:1"};
    bind_t bind{"Policy"};

    // use iterator interface
    oracle::occi::Connection*conn{env->createConnection(std::get<0>(auth),std::get<1>(auth),std::get<2>(auth))};
    OcciSelectIterator<row_t,bind_t>begin{conn,select,bind};
    OcciSelectIterator<row_t,bind_t>end;
    for(auto it=begin;it!=end;++it)cout<<*it<<endl;

The code generates the output:

[(int: 1)(std::string: Policy)(std::string: TMLO)(std::string: 12-08-13)]
[(int: 2)(std::string: Policy)(std::string: TMMAT)(std::string: 12-08-13)]

The OcciAuth type is a simple typedef:

// typedef for authentication data
typedef std::tuple<std::string,std::string,std::string>OcciAuth;

That seems easy enough. However, wrapping the iterator inside a collection simplifies execution of a select statement even more. Specifically I want to have the choice of letting the collection manage the database connection. A simple collection can be implemented as follows:

// select collection
template<typename Row,typename Bind=std::tuple<>>
class OcciSelectCollection{
public:
  // typedefs
  typedef typename OcciSelectIterator<Row,Bind>::value_type value_type;
  typedef typename OcciSelectIterator<Row,Bind>::pointer pointer;
  typedef typename OcciSelectIterator<Row,Bind>::reference reference;
  typedef OcciSelectIterator<Row,Bind>iterator;

  // ctor taking an already created environment
  explicit OcciSelectCollection(oracle::occi::Environment*env,OcciAuth const&auth,std::string const&select,Bind const&bind=Bind{}):
      conn_(nullptr),connpool_(nullptr),env_(env),auth_(auth),select_(select),bind_(bind),closeConn_(true),releaseConn_(false){
    conn_=env->createConnection(std::get<0>(auth_),std::get<1>(auth_),std::get<2>(auth_));
  } 
  // ctor taking an open connection
  explicit OcciSelectCollection(oracle::occi::Connection*conn,std::string const&select,Bind const&bind=Bind{}):
      conn_(conn),connpool_(nullptr),env_(nullptr),select_(select),bind_(bind),closeConn_(false),releaseConn_(false){
  }
  // ctor taking a connection pool
  explicit OcciSelectCollection(oracle::occi::StatelessConnectionPool*connpool,std::string const&select,Bind const&bind=Bind{}):
      conn_(nullptr),connpool_(connpool),env_(nullptr),select_(select),bind_(bind),closeConn_(false),releaseConn_(true){
    conn_=connpool_->getConnection();
  }
  // ctors (all deleted since we don't want to duplicate connection)
  OcciSelectCollection()=delete;
  OcciSelectCollection(OcciSelectCollection const&)=delete;
  OcciSelectCollection(OcciSelectCollection&&)=delete;
  OcciSelectCollection&operator=(OcciSelectCollection const&)=delete;
  OcciSelectCollection&operator=(OcciSelectCollection&&)=delete;

  // dtor - close open occi connection
  ~OcciSelectCollection(){
    if(closeConn_)env_->terminateConnection(conn_);else
    if(releaseConn_)connpool_->releaseConnection(conn_);
  } 
  // get begin/end iterators
  iterator begin()const{return iterator(conn_,select_,bind_);}
  iterator end()const{return iterator();}

  // utility functions
  void setBind(Bind const&bind){bind_=bind;}
  Bind const&getBind()const{return bind_;}
private:
  oracle::occi::Connection*conn_;    
  oracle::occi::StatelessConnectionPool*connpool_;    
  oracle::occi::Environment*env_;
  const OcciAuth auth_;
  const std::string select_;
  Bind bind_;
  bool closeConn_;
  bool releaseConn_;
};

Management of OCCI objects can be simplified using smart pointers together with deleters. Internally in the implementation I've used smart pointers together with deleters. However, when passing parameters to the collection I've used raw pointers so that a user is not locked into the use of smart pointers. The deleters have the following implementation:

#ifndef OCCI_BASIC_UTILS_H
#define OCCI_BASIC_UTILS_H
#include <occiData.h>

// deleter for environment.
struct occi_env_deleter{
  void operator()(oracle::occi::Environment*env)const{
    if(env)oracle::occi::Environment::terminateEnvironment(env);
  }
};
// deleter for stateless connection pool
class occi_stateless_pool_deleter{
public:
  explicit occi_stateless_pool_deleter(oracle::occi::Environment*env):env_(env){}
  void operator()(oracle::occi::StatelessConnectionPool*connpool)const{
    if(env_&&connpool)env_->terminateStatelessConnectionPool(connpool);
  }
private:
  oracle::occi::Environment*env_;
};
// deleter for connection.
class occi_conn_deleter{
public:
  explicit occi_conn_deleter(oracle::occi::Environment*env):env_(env){}
  void operator()(oracle::occi::Connection*conn)const{
    if(env_&&conn)env_->terminateConnection(conn);
  }
private:
  oracle::occi::Environment*env_;
};
// deleter for statement.
class occi_stmt_deleter{
public:
  explicit occi_stmt_deleter(oracle::occi::Connection*conn):conn_(conn){}
  void operator()(oracle::occi::Statement*stmt)const{
    if(conn_&&stmt)conn_->terminateStatement(stmt);
  }
private:
  oracle::occi::Connection*conn_;
};
// deleter for result set.
class occi_rs_deleter{
public:
  explicit occi_rs_deleter(oracle::occi::Statement*stmt):stmt_(stmt){}
  void operator()(oracle::occi::ResultSet*rs)const{
    if(stmt_&&rs)stmt_->closeResultSet(rs);
  }
private:
  oracle::occi::Statement*stmt_;
};
#endif
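To show how deleters of this shape plug into the standard smart pointers, here is a minimal sketch. It does not use OCCI at all: the mock::Environment and mock::Connection types, the env_deleter and conn_deleter names and the release_order function are hypothetical stand-ins I made up so the example compiles without an Oracle installation. The only point is that a deleter carrying a pointer to its parent object releases the child through the parent, and that the child is released before the parent as long as it is declared after it.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Stand-in types (hypothetical, NOT the OCCI classes): they only record
// the order in which resources are released.
namespace mock{
struct Connection{};
struct Environment{
  std::vector<std::string>*log;
  Connection*createConnection(){return new Connection;}
  void terminateConnection(Connection*conn){
    log->push_back("connection");
    delete conn;
  }
  static void terminateEnvironment(Environment*env){
    env->log->push_back("environment");
    delete env;
  }
};
}
// stateless deleter, same shape as occi_env_deleter
struct env_deleter{
  void operator()(mock::Environment*env)const{
    if(env)mock::Environment::terminateEnvironment(env);
  }
};
// stateful deleter, same shape as occi_conn_deleter: it remembers the parent
class conn_deleter{
public:
  explicit conn_deleter(mock::Environment*env):env_(env){}
  void operator()(mock::Connection*conn)const{
    if(env_&&conn)env_->terminateConnection(conn);
  }
private:
  mock::Environment*env_;
};
// acquire an environment and a connection, then report the release order
std::vector<std::string>release_order(){
  std::vector<std::string>log;
  {
    std::shared_ptr<mock::Environment>env(new mock::Environment{&log},env_deleter());
    std::unique_ptr<mock::Connection,conn_deleter>conn(
      env->createConnection(),conn_deleter(env.get()));
    // locals are destroyed in reverse order of declaration: conn, then env
  }
  return log;
}
```

Note that the deleter type is part of the std::unique_ptr type, while std::shared_ptr erases it. That makes std::shared_ptr a little more convenient when the pointer has to be stored as a class member, at the price of a control block.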

So what's the conclusion of all this fiddling with templates? Yes, the template machinery is not so nice unless you are into C++ template meta programming. On the other hand, I didn't really use any horrendously complicated template constructs. Whether you like template meta programming or not, the result is not too bad; without even breaking a sweat I can pick up data from an Oracle database with just a handful of lines of code:

#include "occi_utils.h"
#include <occiData.h>
#include <iostream>
#include <memory>
#include <string>
#include <tuple>
using namespace std;
using namespace oracle::occi;

// main test program
int main(){
  // auth credentials and OCCI environment
  OcciAuth const auth{"hans","ewetz","mydb"};
  std::shared_ptr<Environment>env(Environment::createEnvironment(Environment::DEFAULT),occi_env_deleter());
  try {
    // row and bind 
    typedef tuple<int,string,string,string>row_t;
    typedef tuple<string>bind_t;
    string select{"select rownum,dbid,tab,cutoff_dt from MT_DBSYNC_CUTOFF_DT where dbid=:1"};
    bind_t bind{"Policy"};

    // use collection interface with authentication
    OcciSelectCollection<row_t,bind_t>rows{env.get(),auth,select,bind};
    for(auto r:rows)cout<<r<<endl;
  }
  catch(std::exception const&e){
    cerr<<"caught exception: "<<e.what()<<endl;
    return 1;
  }
  return 0;
}

Now, maybe there are one or two more lines here than in a Perl program. After all, the reason I wrote the code was to show that C++ code can be just as compact as Perl code. So why do I have a few more lines than a Perl program? Well, type safety has some cost associated with it: variables have to be declared to ensure that we don't write rubbish code. OK, I could have reduced the code by a few lines at the cost of readability. But all in all, compiled type-safe C++ code is still superior to non-compiled, non-type-safe Perl code.

The next step is to write some support classes for updating, inserting and deleting in a database. This part I'll leave for another entry since I haven't written the code yet.

In case you want a copy of the code, feel free to send me an email at: hansewetz@hotmail.com.