Friday, 27 June 2014

A C++ Variant type using type_info

Many programming languages either don't use strict typing at all or provide variant types which can hold anything from POD types such as int and double to containers or user-defined classes. The C++03 and C++11 standard libraries have no variant type. But does one need a variant type at all? After all, I know exactly which types I transport (by value, by reference or by pointer) over my interfaces. While this is usually true for a well-designed project, situations sometimes come up where a variant would be helpful. In this post I sketch one possible way of programming a variant in C++.

Suppose you have a couple of classes derived from the same interface. Each class contains a potentially different set of member variables: one contains an integer and a string, another contains an integer and a vector of int, and a third contains yet another combination.
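#include <string>
#include <vector>

struct Interface
{
  Interface () {}
  virtual ~Interface () {}  // objects are handled through Interface*
};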
   
struct A : public Interface  
{  
  A () : Interface () {}  
   
  int length;  
  std::string name;  
};  
    
struct B : public Interface  
{  
  B () : Interface () {}  
    
  int length;  
  std::vector<int> ages;
};  
   
struct C : public Interface  
{  
  C () : Interface () {}  
    
  std::string name;  
  std::vector<int> ages;
};  


In your code you've created a container of pointers to the interface.
int main ()
{
  std::vector<Interface*> myContainer;
  myContainer.push_back (new A ());
  myContainer.push_back (new B ());
  myContainer.push_back (new C ());
  myContainer.push_back (new B ());
}
For some reason you want to collect all the values of all the A, B and C objects stored in the container. One use case is logging: you want to extract the values of all your objects and put them into a list, a diagram or maybe an XML file. You don't want to implement all the formatting options in the classes A, B or C, nor do you want to implement them in the interface. Maybe you plan to extend the logging feature with other views which you haven't defined yet. So you just want to fetch the values and do the formatting in your logging code. Hence, the task is to fetch all the differently typed values of all the objects, preferably in an easy manner.

What are the options?

  • You could write a getter function for each variable. This is a lot of work, and every time a new variable is added, a getter has to be added to all classes (= more work). Some of the getters would have no useful value to return for classes that don't contain the variable in question. 
  • You could try to cast the interface pointer to each of the available classes (e.g. with dynamic_cast). Once a cast succeeds, the class-specific getters can be used. 
  • Template getters? Templates and virtual functions don't mix well: a member function template cannot be virtual. 
  • Create a variant type which can hold all the necessary variables. 
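To make the second option more concrete, here is a minimal sketch of the casting approach. The helper describe and the reduced classes are my own illustration and not part of the code in this post. It assumes the interface has at least one virtual function (here the destructor), which dynamic_cast requires.

```cpp
#include <string>
#include <vector>

struct Interface
{
    // a virtual function makes the hierarchy polymorphic,
    // which dynamic_cast needs
    virtual ~Interface () {}
};

struct A : public Interface
{
    A () : length (0) {}
    int length;
    std::string name;
};

struct B : public Interface
{
    B () : length (0) {}
    int length;
    std::vector<int> ages;
};

// hypothetical helper: tries each known class in turn
std::string describe (const Interface* p)
{
    if (const A* a = dynamic_cast<const A*> (p))
        return "A with name=" + a->name;
    if (const B* b = dynamic_cast<const B*> (p))
        return b->ages.empty () ? "B without ages" : "B with ages";
    return "unknown";  // a class this function doesn't know about
}
```

The drawback is visible in describe: the logging code has to know every derived class, and each new class needs another branch.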

Of all the mentioned ways, creating a variant type is the most convenient way to handle the situation. But how could we write such a variant type? 
Using a variant type the solution could look like this:
enum EnumDataKey
{
  eLength,
  eName,
  eAges
};

 
struct Interface
{
  Interface () {}
  virtual ~Interface () {}  // objects are handled through Interface*
  virtual Variant getData (EnumDataKey eKey) const = 0;
};
where the virtual getData function has to be implemented for each of the derived classes. The enum values denote which data the function shall return. They could simply encode the data type (int, double, string, etc.) or, as in this case, name a specific variable (e.g. length, name, ages).

But we don't have a variant type yet. 

We don't, and as written earlier, there is no such generic type in standard C++03 or C++11. There are several ways to achieve such behaviour.
  • The first possibility is a class which contains a member variable for each of the types the classes might give back. This technique works, but it's not very elegant, because we'd always have to carry around a fat data structure. If we wanted to give back a new data type, we'd have to extend the fat data structure by this new type. This is inflexible and error-prone.
  • The second possibility is to store the values behind void pointers and use type_info to encode the type information. On the receiver side, the typed value then has to be recovered from the void*. I will explore this option in the following.
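The copy constructor and assignment operator of type_info are private, so type_info objects cannot be copied. However, typeid returns a reference to an object which lives until the end of the program, so a pointer to it can be stored safely, and the type_info objects obtained for the same type always compare equal with operator== (see: type_info). Note that the standard does not guarantee that they are the very same object, so the objects should be compared, not their addresses. I want to exploit this.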
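The idea of remembering a type through a pointer to its type_info can be tried out in isolation. The following is only a sketch; the helper names rememberType and holdsType are made up for this illustration.

```cpp
#include <string>
#include <typeinfo>

// hypothetical helper: remembers the static type of its argument
// as a pointer to the corresponding type_info object
template <typename T>
const std::type_info* rememberType (const T&)
{
    return &typeid(T);
}

// hypothetical helper: checks a remembered type against T
template <typename T>
bool holdsType (const std::type_info* pType)
{
    // compare the type_info objects, not their addresses
    return *pType == typeid(T);
}
```

For example, holdsType<int> (rememberType (42)) is true, while holdsType<double> (rememberType (42)) is false.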

How to build a variant type?

We can program a variant type for C++ using two classes: one class which carries the data together with the corresponding type_info, and another (templated) class which unwraps the carried data once it is needed again. I like that the middleman (the interface) doesn't have to know anything about which types might be wrapped. The type has to be known at the time of wrapping (obviously) and then by the receiver which unwraps it. A similar technique is used by boost::any. Since boost is not available in every project, it's good to know the technique well enough to implement it yourself. I have added the explanation as comments to the code.
#ifndef  WRAP_H   // names containing double underscores are reserved
#define  WRAP_H

#include <cstddef>   // NULL
#include <typeinfo>

class IsNullException {};
class IsIncompatibleException {};

// this class wraps an arbitrary data type and can be 
// transported over virtual member functions
class Wrap
{
public:
    // c'tor 
    Wrap () 
    : pValue (0)
    , pType (&typeid(void))
    , bDoDelete(false) 
    {}

    // c'tor for an arbitrary type
    // store the variable as non-const void* in pValue
    // remember the type information in pType
    template <typename T>
    Wrap (const T& value) 
    : pValue (static_cast<void*>(const_cast<T*>(new T(value)))) 
    , pType (&typeid(T)) // store the type info in pType
    , bDoDelete(true)  // we've done "new T", remember to delete afterwards
    {}

    // assignment operator to assign an arbitrarily 
    // typed variable to the wrapper 
    // store the variable as non-const void* in pValue
    // store the type info in pType
    // we've done "new T", remember that we have to delete it afterwards
    // (caution: a value wrapped earlier is not freed by this assignment)
    template <typename T>
    Wrap& operator= (const T& value) 
    {
        pValue = static_cast<void*>(const_cast<T*>(new T(value)));
        pType = &typeid(T);
        bDoDelete = true;
        return *this;
    }

    // get the pointer
    inline const void* getPointer () const { return pValue; }

    // get the type info
    inline const std::type_info& getTypeInfo () const { return *pType; }

    // if no variable has been set, the wrapper is empty
    inline bool empty () const { return pValue == NULL; }

    // do we have to delete it?
    inline bool doDelete() const { return bDoDelete; }

private:
    void*                 pValue;  // value information
    const std::type_info* pType;   // type information 
    bool                  bDoDelete; // has to be deleted
};
#include <typeinfo>
template <typename T>
class UnWrap
{
public:
    typedef T value_type;

    // c'tor which takes a wrapped value
    UnWrap (const Wrap& wrap) : pValue(0), bDoDelete(false)
    {
        // check if the type of T corresponds to the type of the
        // wrapped object
        if (typeid(T) != wrap.getTypeInfo())  // if not
            throw IsIncompatibleException (); // throw an exception
        if (!wrap.getPointer()) // check that something has been wrapped
            throw IsNullException ();
        // cast the void* back to the desired type
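        pValue = const_cast<T*>(static_cast<const T*>(wrap.getPointer ()));
        // in case a clone has been created at wrapping
        // the UnWrap-type has the duty to delete the value
        // and thus free the memory
        // (caution: each Wrap must therefore be unwrapped exactly once;
        // if the type check above throws, the value is never freed)
        bDoDelete = wrap.doDelete ();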
    }

    // free the memory if UnWrap is the owner
    ~UnWrap() { if (bDoDelete) delete pValue; }

    // conversion operator; enables UnWrap<anytype> to be converted to anytype
    // where anytype is the type provided to UnWrap
    // which has to be the same type as the one which 
    // has been wrapped. 
    operator const T& () const
    { 
        if (pValue==NULL) // just to be secure 
            throw IsNullException(); 
        return *pValue; // only const value is provided
    }

private:
    T* pValue;
    bool bDoDelete;
};

#endif // include guard
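The Wrap/UnWrap pair above hands ownership over through a flag, so a wrapped value leaks if it is never unwrapped. As a possible alternative, here is a sketch of a wrapper which keeps ownership to itself by storing the value in a small templated holder with a virtual clone function. The names OwningWrap, Holder and get are my own and not part of the code above. The sketch also shows how getData might be implemented for class A, assuming an empty wrapper is returned for variables the class doesn't have.

```cpp
#include <algorithm>
#include <string>
#include <typeinfo>
#include <utility>

// hypothetical ownership-safe alternative to Wrap/UnWrap
class OwningWrap
{
    struct HolderBase
    {
        virtual ~HolderBase () {}
        virtual HolderBase* clone () const = 0;
        virtual const std::type_info& type () const = 0;
    };

    template <typename T>
    struct Holder : HolderBase
    {
        explicit Holder (const T& v) : value (v) {}
        virtual HolderBase* clone () const { return new Holder<T> (value); }
        virtual const std::type_info& type () const { return typeid(T); }
        T value;
    };

public:
    OwningWrap () : pHolder (0) {}

    // wrap a copy of an arbitrarily typed value
    template <typename T>
    OwningWrap (const T& value) : pHolder (new Holder<T> (value)) {}

    // deep copy, so that every wrapper owns its own value
    OwningWrap (const OwningWrap& other)
    : pHolder (other.pHolder ? other.pHolder->clone () : 0) {}

    // copy-and-swap; the old value is freed when "other" dies
    OwningWrap& operator= (OwningWrap other)
    {
        std::swap (pHolder, other.pHolder);
        return *this;
    }

    ~OwningWrap () { delete pHolder; }

    bool empty () const { return pHolder == 0; }

    // returns a pointer to the value, or 0 if the wrapper is empty
    // or holds a different type
    template <typename T>
    const T* get () const
    {
        if (!pHolder || pHolder->type () != typeid(T))
            return 0;
        return &static_cast<const Holder<T>*> (pHolder)->value;
    }

private:
    HolderBase* pHolder;
};

enum EnumDataKey { eLength, eName, eAges };

struct Interface
{
    virtual ~Interface () {}
    virtual OwningWrap getData (EnumDataKey eKey) const = 0;
};

struct A : public Interface
{
    A () : length (0) {}

    virtual OwningWrap getData (EnumDataKey eKey) const
    {
        switch (eKey)
        {
        case eLength: return OwningWrap (length);
        case eName:   return OwningWrap (name);
        default:      return OwningWrap ();  // A has no such variable
        }
    }

    int length;
    std::string name;
};
```

With this sketch, a receiver would write something like const std::string* p = obj.getData (eName).get<std::string> (); and test p against 0 instead of catching an exception. Whether a null pointer or an exception is the better way to signal a mismatch is a matter of taste.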
We can now use the variant type as sketched below. This assumes that Variant in the interface is simply another name for Wrap (e.g. typedef Wrap Variant;). The printData function tries to unwrap each of the possible types in turn. If the wrapped type does not match, an exception is thrown; otherwise the value is unwrapped, bound to a local const reference and printed.
#include <iostream>
#include <string>
#include <vector>
#include "wrap.h"

void printData (const Interface* data)
{
    // unwrap "eName" (a std::string)
    try
    {
        UnWrap<std::string> unWrapName (data->getData (eName));
        const std::string& name = unWrapName;
        // can do some formatting here
        std::cout << "name= " << name << std::endl;
    }
    catch (...)
    {
        // can do something in the case the data doesn't
        // contain eName
    }

    // unwrap "eLength" (an integer)
    try
    {
        UnWrap<int> unWrapLength (data->getData (eLength));
        const int& length = unWrapLength;
        std::cout << "length= " << length << std::endl;
    }
    catch (...)
    {
    }

    // unwrap "eAges" (a std::vector<int>)
    try
    {
        UnWrap<std::vector<int> > unWrapAges (data->getData (eAges));
        const std::vector<int>& ages = unWrapAges;
        for (std::vector<int>::const_iterator it = ages.begin (), 
             itEnd = ages.end (); it != itEnd; ++it)
        {
             std::cout << "age= " << (*it) << std::endl;
        }
    }
    catch (...)
    {
    }
}
The main function for testing this type prepares some data and calls the printData function for each object:
int main ()
{
  std::vector<Interface*> myContainer;
  myContainer.push_back (new A ());
  myContainer.push_back (new B ());
  myContainer.push_back (new C ());
  myContainer.push_back (new B ());
  
  //doSomethingWith (myContainer);

  // loop over all the data in the container 
  // (or whichever data structure is holding the data)
  for (std::vector<Interface*>::const_iterator it = myContainer.begin (), 
    itEnd = myContainer.end (); it != itEnd; ++it)
  {
      printData (*it);
  }
}
I hope this code helps you and that you can adapt it to your own needs. This blog post was written with C++03 in mind. With C++11 features like decltype and auto as well as move semantics, this variant type can be improved. Also check out the implementations of variant-like types in boost (e.g. boost::any, boost::variant) and see whether they fit your needs. If you can use boost in your project, you may be spared from implementing such types yourself.
You are very welcome to leave your comments! Let me know what you think.
A C++ Variant type using type_info by Peter Speckmayer is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.