Thursday, 11 September 2014

Defensive Programming - BSLS_ASSERT

One of the tenets of the BDE is its approach to defensive programming, which addresses how you deal with undefined behaviour: those inputs that fall outside the preconditions of a method.

The basic idea is that production code should not cater for undefined behaviour, since if it did, one could argue that the behaviour is now defined. However, doing nothing at all is of no help when debugging, or more to the point, when trying to expose latent bugs.

The good old ASSERT has been around to solve exactly this problem, as it disappears in release mode builds.
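To make "disappears in release mode builds" concrete, here is a minimal sketch using the standard assert macro. The helper names (positive, halveRelease, halveDebug) are made up for illustration, and defining NDEBUG in the middle of a file is only a trick to show both behaviours side by side; a real project would set it once from the build system.

```cpp
// Counts how often the checked expression is actually evaluated.
static int g_evaluations = 0;

inline bool positive(int x)
{
    ++g_evaluations;
    return x > 0;
}

#define NDEBUG            // what a typical release build defines
#include <cassert>

int halveRelease(int x)
{
    assert(positive(x));  // compiled out: positive() is never called
    return x / 2;
}

#undef NDEBUG             // back to a debug-style build
#include <cassert>        // <cassert> may be re-included to redefine assert

int halveDebug(int x)
{
    assert(positive(x));  // evaluated: aborts if x is not positive
    return x / 2;
}
```

Note that the "release" version happily accepts a negative input: the precondition is simply not checked, which is exactly the all-or-nothing behaviour the rest of this post is about.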

The BDE, however, offers its own set of assert macros that take the existing ASSERT concept to a different level, giving both the function author and the function consumer more flexibility over how the asserts behave.

There are two dimensions of behavioural control on offer:

  1. When should the assert take effect?
  2. What should the effect be?

When should the assert take effect?

To look at this dimension of control, we need to introduce the Assert macros that are supported for the function author to choose from:

BSLS_ASSERT(X)
When the cost of running the assert is within 5-10% of the cost of the raw production code
BSLS_ASSERT_SAFE(X)
When the cost of running the assert is orders of magnitude greater than the cost of the raw production code
BSLS_ASSERT_OPT(X)
When the cost of the assert is negligible compared to the raw production code

So why do we give the author of the code the choice to make these calls about the relative cost of the ASSERT macros, especially if the idea is to have the assert code go away in release builds?

That is the point... we don't want the assert choice to be that coarse! Especially when you are writing library code like the BDE. As the BDE you have absolutely no knowledge of where you are being used, or of what tolerance the consuming application has for runtime checks. Some applications might have a high tolerance for CPU cycles being spent on runtime validation; others might be so seriously constrained that even in debug mode, performing aggressive runtime checks would be prohibitive.

The onus is left on the application developer to choose which of these asserts to accept, using one of the following preprocessor definitions:

  1. BSLS_ASSERT_LEVEL_ASSERT_SAFE
  2. BSLS_ASSERT_LEVEL_ASSERT
  3. BSLS_ASSERT_LEVEL_ASSERT_OPT
  4. BSLS_ASSERT_LEVEL_NONE
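To get a feel for how a build-time level can switch whole families of asserts on and off, here is a minimal sketch. It is not the BDE implementation: every DEMO_ name, the numeric DEMO_LEVEL scheme, and the contains/isSorted functions are my own inventions for illustration.

```cpp
#include <cstdio>
#include <cstdlib>

// Hypothetical level chosen by the application at compile time,
// e.g. -DDEMO_LEVEL=3.
//   0 = no checks, 1 = OPT only, 2 = OPT + ASSERT, 3 = OPT + ASSERT + SAFE
#ifndef DEMO_LEVEL
#define DEMO_LEVEL 2
#endif

static int g_checksRun = 0;  // how many checks were actually evaluated

inline void demoCheck(bool ok, const char *text, const char *file, int line)
{
    ++g_checksRun;
    if (!ok) {
        std::fprintf(stderr, "Assertion failed: %s, %s:%d\n", text, file, line);
        std::abort();
    }
}

#define DEMO_ENABLED(X)  demoCheck((X), #X, __FILE__, __LINE__)
#define DEMO_DISABLED(X) ((void)sizeof(!(X)))  // type-checked, never evaluated

#if DEMO_LEVEL >= 1
#define DEMO_ASSERT_OPT(X) DEMO_ENABLED(X)
#else
#define DEMO_ASSERT_OPT(X) DEMO_DISABLED(X)
#endif

#if DEMO_LEVEL >= 2
#define DEMO_ASSERT(X) DEMO_ENABLED(X)
#else
#define DEMO_ASSERT(X) DEMO_DISABLED(X)
#endif

#if DEMO_LEVEL >= 3
#define DEMO_ASSERT_SAFE(X) DEMO_ENABLED(X)
#else
#define DEMO_ASSERT_SAFE(X) DEMO_DISABLED(X)
#endif

bool isSorted(const int *data, int n)
{
    for (int i = 1; i < n; ++i) {
        if (data[i - 1] > data[i]) return false;
    }
    return true;
}

bool contains(const int *sorted, int n, int key)
    // Binary search.  The behavior is undefined unless 'sorted' holds 'n'
    // values in ascending order.
{
    DEMO_ASSERT_OPT(n >= 0);
    DEMO_ASSERT(sorted != 0 || n == 0);
    DEMO_ASSERT_SAFE(isSorted(sorted, n));  // O(n) check on an O(log n) search

    int lo = 0, hi = n;
    while (lo < hi) {
        int mid = lo + (hi - lo) / 2;
        if (sorted[mid] < key) lo = mid + 1;
        else                   hi = mid;
    }
    return lo < n && sorted[lo] == key;
}
```

With the default level of 2 in this sketch, each call to contains evaluates the two cheap checks and skips the linear isSorted scan; the function author wrote all three, but the application build decides which ones it pays for.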

Let's look at an example of when you might use these as the function developer, by looking at a somewhat random method in the BDE. The indented comments inside the method body are mine.

int systemProtect(void *address, int pageSize)
    // Protect from read/write access the page of memory at the specified
    // 'address' having the specified 'pageSize' (in bytes).  The behavior is
    // undefined unless 'pageSize == getSystemPageSize()'.
{
    BSLS_ASSERT(address);
        // Notice that our precondition here is assumed for production code, and not validated,
        // so this is a typical candidate for validation... and further we know that relative to
        // our function the overhead of this check is quite low

    BSLS_ASSERT_SAFE(pageSize == getSystemPageSize());
        // Notice again, our function simply assumes that you will pass it the correct page size,
        // which means it is a good candidate to assert.
        // However, in this case the cost of the check relative to the function is quite high, and
        // in bug-free production code those would be massively wasted cycles -- so we use SAFE

#ifdef BSLS_PLATFORM_OS_WINDOWS
    DWORD oldProtect;
    return !VirtualProtect(address, pageSize, PAGE_NOACCESS, &oldProtect);
#else
    return mprotect(static_cast<char*>(address), pageSize, PROT_NONE);
#endif
}

On the other hand, there might be a function like this:

int restoreData(int dataIndex)
{
    BSLS_ASSERT_OPT(validateIndex(dataIndex));
        //do some validation on this index, but since we are about to make a call out to 
        //some persistence layer.. we know that the relative cost of this validation is
        //entirely negligible 
        
    return getDataFromDeepStorage( dataIndex );
}

Someone might now argue that if you know OPT-type asserts are so cheap relative to the function, why even put them in an assert? If you were thinking that, then I would argue that you are falling into the trap of conflating the relative performance of a method with that of an application. This speaks directly to the fact that a function developer (especially a library developer) has no knowledge of the domain the code is running in.

While the cost might be small in relation to this method, in the context of the application this method might be called orders of magnitude more often than any other method. That in turn might make the "negligible" run-time check the second most expensive part of the application, and one that is entirely redundant (assuming a bug-free application).

What should the effect be?

Let's now look at the second dimension: what exactly should the assert do? Should we write to the console (on embedded devices that might be an issue)? Pop up a dialog box (for a service without a UI, that sucks)? Throw an exception? Write to a log file? Send an email?

The answer to all of these is yes... because, well, it depends. It depends on the type of application you are: are you running on the developer's machine, are you embedded in medical hardware, are you under load and performance testing, and so on.

Again we are back to the position where only at the application level do we know the answer to this, so we can only make this call then.

The BDE assert macros are all configured to invoke a function that matches this signature, which is set up like so:

static void assertHandler(const char *text, const char *file, int line)
{
    //.. do something appropriate
}

int main()
{
    bsls::Assert::setFailureHandler(&assertHandler);
        //Tell the BDE that should any assert fire off in this program, it should
        //invoke this handler

    doProgram();
    return 0;
}

Like all good things though, the BDE comes with a default handler, and a few other handlers that you might find handy:

    bsls::Assert::setFailureHandler(&bsls::Assert::failThrow);
        //Will throw bsls::AssertTestException
    bsls::Assert::setFailureHandler(&bsls::Assert::failAbort);
        //The default -- aborts the application with std::abort()
    bsls::Assert::setFailureHandler(&bsls::Assert::failSleep);
        //Will simply go into a repeating loop of sleeps to give you the chance to attach a debugger
        //to the live crash

All in all, some nice extensions to the simple little assert, giving you the freedom to extend your runtime checks deeper into the life cycle of your application -- even live -- without taking on the innocent but ignorant judgement call of a library developer.

Also something you will need to know how to control if you are going to be developing with the BDE libraries.
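If you are curious how a replaceable failure handler might hang together, here is a minimal sketch of the idea. It is not how bsls::Assert is implemented: the demo namespace, the DEMO_ASSERT macro and the ratio function are all hypothetical, and the sketch makes no attempt at thread safety.

```cpp
#include <cstdio>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical names throughout: 'demo' is not a BDE namespace.
namespace demo {

typedef void (*FailureHandler)(const char *text, const char *file, int line);

// Default behaviour: report the failed expression and abort.
inline void failAbort(const char *text, const char *file, int line)
{
    std::fprintf(stderr, "Assertion failed: %s, %s:%d\n", text, file, line);
    std::abort();
}

// Alternative: throw, so that a test harness can observe the failure.
inline void failThrow(const char *text, const char *, int)
{
    throw std::logic_error(std::string("Assertion failed: ") + text);
}

// The single, application-wide handler slot.
inline FailureHandler& currentHandler()
{
    static FailureHandler handler = &failAbort;
    return handler;
}

inline void setFailureHandler(FailureHandler handler)
{
    currentHandler() = handler;
}

}  // close namespace demo

// The library only knows that *something* gets called on failure.
#define DEMO_ASSERT(X) \
    ((X) ? (void)0 : demo::currentHandler()(#X, __FILE__, __LINE__))

double ratio(int numerator, int denominator)
    // The behavior is undefined unless 'denominator != 0'.
{
    DEMO_ASSERT(denominator != 0);
    return static_cast<double>(numerator) / denominator;
}
```

The library function ratio never decides what a failure means; an application that calls demo::setFailureHandler(&demo::failThrow) early in main gets exceptions, while one that does nothing gets the abort.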

Happy defensive programming!

Wednesday, 10 September 2014

Setting up to Explore the BDE

First things first: before you look around the BDE it would be nice to get it onto your machine and execute some code, so let's see if we can set you up with that environment.

The Short Story

1) git clone https://github.com/bloomberg/bde.git
2) git clone https://github.com/bloomberg/bde-tools.git 
3) Add bde-tools/bin to your path
4) Install Python 2.7 and add it to your path
5) waf configure
6) waf build
7) waf install

The Longer Story

Not too surprisingly we are going to need to get the code first

git clone https://github.com/bloomberg/bde.git

Next we are going to want to build the code. To do this we make use of bde-tools, which is based on WAF, so let's go and grab that too. It is probably a good idea to clone it as well so we can keep it up to date.

git clone https://github.com/bloomberg/bde-tools.git

However, before you get carried away, WAF depends on Python 2.6.x-2.7.x, not 3.x, so install that if you don't already have it, and make sure you have it on your path.

BDE-Tools has bde-tools/bin/waf|waf.bat, which also needs to be on your path, so add that. Right, now we are good to go...

If you browse to the root of your BDE clone, you can now run waf configure which should run like this:

>waf configure
Using C:\Python27\python.exe
Setting top to                           : E:\Bloomberg\bde
Setting out to                           : E:\Bloomberg\bde\build
Checking for 'msvc' (c compiler)         : C:\Program Files\Microsoft Visual Studio 12.0\VC\BIN\CL.exe
Checking for 'msvc' (c++ compiler)       : C:\Program Files\Microsoft Visual Studio 12.0\VC\BIN\CL.exe
os_type                                  : windows
os_name                                  : windows_nt
cpu_type                                 : x86
os_ver                                   : 6.1
comp_type                                : cl
comp_ver                                 : 18.00
uplid                                    : windows-windows_nt-x86-6.1-cl-18.00
ufid                                     : dbg_exc_mt
prefix                                   : C:\users\stephenb\appdata\local\temp

Loading BDE metadata                     : ok
Evaluating options for 'bsl'             : ok
Evaluating options for 'bdl'             : ok
Saving configuration                     : ok
'configure' finished successfully (2.732s)

Note however, if you do have both a supported and unsupported python version installed on your machine, WAF may pick up the wrong version, but it will not error at this point, only later.. so take note of the version it selects in line 2 above.

So the output gives you an idea of what is going on here... WAF is interrogating your system to find the appropriate configuration. But what does it do with this information? If you look at your root folder now you will see a new build folder, which has the "executable" it needs to do the build, along with all the appropriate flags and variables in the build/c4che sub folder.

Okay so now we are bootstrapped and ready to build with waf build:

>waf build
Using C:\Python27\python.exe
Waf: Entering directory `E:\Bloomberg\bde\build'
Waf: using 2 jobs (change with -j)
[  1/312] cxx: groups\bsl\bsltf\bsltf_allocbitwisemoveabletesttype.cpp -> bugroups\bsl\bsltf\bsltf_allocbitwisemoveabletesttype.cpp.1.o
[  2/312] cxx: groups\bsl\bsltf\bsltf_alloctesttype.cpp -> build\groups\bsl\f\bsltf_alloctesttype.cpp.1.o

... lots more ...

[312/312] cxxstlib: build\groups\bsl\bslx\bslx_byteinstream.cpp.22.o build\gs\bsl\bslx\bslx_byteoutstream.cpp.22.o build\groups\bsl\bslx\bslx_instreamfuons.cpp.22.o build\groups\bsl\bslx\bslx_marshallingutil.cpp.22.o build\groupl\bslx\bslx_outstreamfunctions.cpp.22.o build\groups\bsl\bslx\bslx_testinstrxception.cpp.22.o build\groups\bsl\bslx\bslx_testinstream.cpp.22.o build\grobsl\bslx\bslx_testoutstream.cpp.22.o build\groups\bsl\bslx\bslx_typecode.cpp.o build\groups\bsl\bslx\bslx_versionfunctions.cpp.22.o -> build\groups\bsl\bbslx.lib
Waf: Leaving directory `E:\Bloomberg\bde\build'
'build' finished successfully (1m44.321s)

Now that that is all done, we see a whole bunch of folders under build/groups, one for each of the different package groups (which we will get into in a later post), where our libs have been built. However, we still have one more step, which will make referencing these libraries much easier: waf install, which copies our libs and headers into an install location for easy inclusion into our projects.

>waf install
Using C:\Python27\python.exe
Waf: Entering directory `E:\Bloomberg\bde\build'
- install C:\users\stephenb\appdata\local\temp\include\bsl\bsltf_allocbitwisemoveabletesttype.h (from groups\bsl\bsltf\bsltf_allocbitwisemoveabletesttype.h)
- install C:\users\stephenb\appdata\local\temp\include\bsl\bsltf_alloctesttype.h (from groups\bsl\bsltf\bsltf_alloctesttype.h)

... lots more ...

- install C:\users\stephenb\appdata\local\temp\include\bdl\bdlma_sequentialpool.h (from groups\bdl\bdlma\bdlma_sequentialpool.h)
Waf: Leaving directory `E:\Bloomberg\bde\build'
'install' finished successfully (1.080s)

If, however, you are not happy with the default install location, you can simply go back to waf configure and pass in a new install location:

>waf configure --prefix c:\bloomberg\bde
>waf install

Sweet.. now we have our lib and header files in an easy-to-reference location for experimenting with:

c:\bloomberg\bde>dir /b
include
lib

c:\bloomberg\bde>dir include /b
bdl
bsl

c:\bloomberg\bde>dir lib /b
bdl.lib
bsl.lib
pkgconfig

Lastly, it is probably a good idea to run the unit tests to see that everything is working as expected, although they may not all pass. To do this, we run the build again, this time telling it to build and run the tests.

>waf build --test=run
Using C:\Python27\python.exe
Waf: Entering directory `E:\Bloomberg\bde\build'
Waf: using 2 jobs (change with -j)
[  19/1164] cxx: groups\bsl\bsltf\bsltf_allocbitwisemoveabletesttype.t.cpp -> build\groups\bsl\bsltf\bsltf_allocbitwisemoveabletesttype.t.cpp.2.o
[  20/1164] cxx: groups\bsl\bsltf\bsltf_alloctesttype.t.cpp -> build\groups\bsl\bsltf\bsltf_alloctesttype.t.cpp.3.o
[  31/1164] cxx: groups\bsl\bsltf\bsltf_bitwisemoveabletesttype.t.cpp -> build\groups\bsl\bsltf\bsltf_bitwisemoveabletesttype.t.cpp.4.o
[  32/1164] cxx: groups\bsl\bsltf\bsltf_convertiblevaluewrapper.t.cpp -> build\groups\bsl\bsltf\bsltf_convertiblevaluewrapper.t.cpp.5.o
[  33/1164] cxx: groups\bsl\bsltf\bsltf_degeneratefunctor.t.cpp -> build\groups\bsl\bsltf\bsltf_degeneratefunctor.t.cpp.6.o

... lots more ..

What you may notice is that the tests are each standalone applications... which we can take advantage of in a future post :)

ps> Get-Process bs*

Handles  NPM(K)    PM(K)      WS(K) VM(M)   CPU(s)     Id ProcessName
-------  ------    -----      ----- -----   ------     -- -----------
     13       1      320       1292     9    27.18  10468 bsls_atomic.t
     12       1      280       1160     7     0.00   8380 bsls_bsllock.t

You should be good to go exploring now... Happy BDE hacking!

Monday, 8 September 2014

Lower Comments for Readability

The first thing that struck me was actually a really small thing.. but one of those things so intuitive that it makes you wonder: why did I never think of that, or how is it I have never come across this?

Take a look at a snippet from one of their base classes .. what do you notice different?

namespace BloombergLP {

namespace bslma {

                        // ===============
                        // class Allocator
                        // ===============

class Allocator {
    // This protocol class provides a pure abstract interface and contract for
    // clients and suppliers of raw memory.  If the requested memory cannot be
    // returned; the contract requires that an 'std::bad_alloc' exception be
    // thrown.  Note that memory is guaranteed to be sufficiently aligned for
    // any object of the requested size on the current platform, which may be
    // less than the maximal alignment guarantee afforded by global
    // 'operator new'.

  public:
    // PUBLIC TYPES
    typedef bsls::Types::size_type size_type;
        // Alias for a signed integral type capable of representing the number
        // of bytes in this platform's virtual address space.


    // CREATORS
    virtual ~Allocator();
        // Destroy this allocator.  Note that the behavior of destroying an
        // allocator while memory is allocated from it is not specified.
        // (Unless you *know* that it is valid to do so, don't!)
        
    ...    
 }

Perhaps I have just been sheltered, but seeing the comments *below* the class name and method declarations.. is well just beautiful. It only stands to reason that I would like to know what is being commented on, before I read the comment.
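To try the style on for size, here is a small class of my own written the same way. Counter and its members are hypothetical and not part of the BDE; I have only borrowed the layout (banner, category tags, and comments indented beneath each declaration) from the snippet above.

```cpp
#include <cassert>

                        // =============
                        // class Counter
                        // =============

class Counter {
    // This class provides a simple non-negative tally.  (Hypothetical
    // example, not part of the BDE.)

    // DATA
    int d_count;  // current tally

  public:
    // CREATORS
    Counter();
        // Create a counter having the value 0.

    // MANIPULATORS
    void add(int amount);
        // Increase this counter by the specified 'amount'.  The behavior is
        // undefined unless '0 <= amount'.

    // ACCESSORS
    int value() const;
        // Return the current value of this counter.
};

inline Counter::Counter()
: d_count(0)
{
}

inline void Counter::add(int amount)
{
    assert(0 <= amount);  // plain assert standing in for a BDE macro
    d_count += amount;
}

inline int Counter::value() const
{
    return d_count;
}
```

Reading the declarations top to bottom, the name and signature arrive first and the contract second, which is exactly what made the original snippet so pleasant to scan.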

Introduction

Bloomberg have open sourced some of their core internal libraries, which they collectively call the BDE (Bloomberg Development Environment) and which they have been actively developing and using in their high-throughput production environment for over 10 years.

So I thought I would take a look around to see what I could learn. This blog will be my collection of lessons, and possibly a useful guide to help others find their way around these components.