Simple question… very hard answer… Talk amongst yourselves…

I’m going to try a completely different approach to this post. I’ll post a question and simply let the discussion ensue. I would even encourage the discussion to spill over to the public newsgroups/forums. Question for today is:

How can you effectively unit-test synchronization primitives for correctness or, more generally, how would you test a concurrency library?

Let’s see how far we can get down this rabbit hole ;-).

Comments

  • Guest
    Pawel Glowacki Monday, 22 March 2010

    I would put it in the concurrent work and see how it performs...

  • Guest
    Allen Bauer Monday, 22 March 2010

    Performance is a separate issue. I said *correctness*, not performance. Yes it is a critical issue, but without correctness, performance is irrelevant.

  • Guest
    Pawel Glowacki Monday, 22 March 2010

    I would first define tests:-) What do we want to test? "Library" should be something reusable across a number of projects. What kinds of projects are these?
    I would start from use-cases of using such a library. In other words: cases of apps using the library being tested.

  • Guest
    David Heffernan Monday, 22 March 2010

    Off-topic I know, but *you* don't need to test primitives because the OS vendor does that. If you really want to know how they are tested then I'd look at the tests for the Linux kernel.

    Starting from scratch I think the first thing you need to do is to enumerate how your primitive could fail and then try to devise a way to make that failure manifest. Easy to say, not so easy to do. Since these primitives will very likely be quite hardware-dependent, you need a lab with a lot of different hardware!

  • Guest
    Jacopo Monday, 22 March 2010

    I am not an expert in the field, but Helgrind and the UPPAAL project (http://www.uppaal.com/) might be relevant to the discussion.

    Many hard real-time systems research groups deal with this kind of problem every day; they probably have a lot to offer in this area.

    The traditional "academic" approach would probably be to formalize some invariants and show that they hold in every possible situation. I am not sure this is always feasible.

  • Guest
    Jonas Maebe Monday, 22 March 2010

    For primitives, I usually write a test that contains as many opportunities as possible for a data race to cause an observable side effect. E.g., http://svn.freepascal.org/svn/fpc/trunk/tests/test/units/sysutils/trwsync.pp is the test I wrote for TMultiReadExclusiveWriteSynchronizer (I hereby release it to the Public Domain, so feel free to look at it and use it in any way).
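
    As a rough illustration of that idea (this is not the linked Free Pascal test), here is a minimal Python sketch. It assumes a plain threading.Lock stands in for the primitive under test; Pair, stress, and the thread and iteration counts are made-up names and values. Writers briefly break the invariant a == b inside the critical section, and readers record every broken state they manage to observe.

```python
import threading
import time


class Pair:
    """Two fields that must be equal whenever they are observed under the lock."""

    def __init__(self):
        self.a = 0
        self.b = 0


def stress(lock, writers=4, readers=4, iterations=2000):
    pair = Pair()
    violations = []

    def writer():
        for _ in range(iterations):
            with lock:
                pair.a += 1
                time.sleep(0)  # invite a thread switch while the invariant is broken
                pair.b += 1

    def reader():
        for _ in range(iterations):
            with lock:
                if pair.a != pair.b:
                    violations.append((pair.a, pair.b))

    threads = [threading.Thread(target=writer) for _ in range(writers)]
    threads += [threading.Thread(target=reader) for _ in range(readers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(violations), pair.a, pair.b


violation_count, a, b = stress(threading.Lock())
print(violation_count, a, b)
```

    A correct lock should report zero violations. Passing in a deliberately broken lock (for example, an object whose enter and exit methods do nothing) is a useful check that the test can fail at all, although how quickly it fails depends on scheduling.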

  • Guest
    David Heffernan Monday, 22 March 2010

    Is TMultiReadExclusiveWriteSynchronizer a primitive?

    In fact Allen, what do you mean by primitive?

  • Guest
    Jonas Maebe Monday, 22 March 2010

    Synchronisation primitives are functionality that does not do anything by itself save for providing functionality to synchronise multiple threads of execution.

    There are obviously simple (mutex, semaphore, ...) and more complex (barrier, reader/writer locks, ...) primitives, and many primitives can be expressed/implemented using other primitives (semaphore using a mutex, barrier using mutexes or semaphores, ...).

  • Guest
    ahmoy Monday, 22 March 2010

    1. synchronization + deadlock
    2. speed - compare to using array of TThread(s)
    3. memory leak?

  • Guest
    alex Monday, 22 March 2010

    i would not use windows standard synchronization primitives, but use something similar to http://golang.org/ instead

  • Guest
    Márton Balassa Monday, 22 March 2010

    Give the problem to mathematicians, they can prove or disprove this kind of thing :) Seriously, I would do that.

  • Guest
    Josh Kelley Monday, 22 March 2010

  • Guest
    Bruce McGee Tuesday, 23 March 2010

    In this case, I think the term "integration test" is more appropriate than "unit test".

    I assume you already have a good handle on the idea of TDD and have at least a basic framework of tests in place. If not, that's a whole other discussion.

    For this kind of fundamental core functionality, that HAS to work reliably every time, I think you need to re-engineer some of these into long-running tests and run them on as many different hardware/OS combinations as you can get your hands on for extended periods of time. Basically, a dedicated testing lab. And/or a USB key launchable suite that a random co-worker can run on their PC overnight.

    No matter how many ways I anticipate how something will fail, something unexpected almost always comes up.

  • Guest
    Giel Tuesday, 23 March 2010

    I remember seeing a video on Channel 9. Microsoft was working on a tool which could test multithreaded code by finding out all possible sequences in which the code can execute. The tool could then execute all these paths.
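
    The general idea of enumerating schedules can be sketched on a toy model. The Python below is not that tool and does not control real threads; it assumes each "thread" is just a list of labeled steps (a non-atomic increment split into a read and a write), and interleavings and run are hypothetical helper names. It replays every order-preserving interleaving of two such threads and collects the final counter values.

```python
def interleavings(a, b):
    """Yield every merge of step lists a and b that keeps each list's own order."""
    if not a:
        yield list(b)
        return
    if not b:
        yield list(a)
        return
    for rest in interleavings(a[1:], b):
        yield [a[0]] + rest
    for rest in interleavings(a, b[1:]):
        yield [b[0]] + rest


def run(schedule):
    """Replay one schedule of (thread_id, step) pairs against a shared counter."""
    counter = 0
    local = {}  # per-thread register holding the value that thread last read
    for thread_id, step in schedule:
        if step == "read":
            local[thread_id] = counter
        else:  # "write"
            counter = local[thread_id] + 1
    return counter


t1 = [("T1", "read"), ("T1", "write")]
t2 = [("T2", "read"), ("T2", "write")]

results = [run(s) for s in interleavings(t1, t2)]
lost_updates = [r for r in results if r != 2]
print(len(results), len(lost_updates))
```

    Even this tiny case has six schedules, and four of them lose an update. The number of schedules grows combinatorially with the number of steps, so a real tool has to be far smarter about which ones it explores.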

  • Guest
    Eric Tuesday, 23 March 2010

    Make them run "slow", and throw execution test cases in these slow scenarios.

    Run those primitives in debugger mode, instruction by instruction, and at each step, have a variety of tests run against that primitive's state, so that your primitive isn't just tested at each of its high-level states, but at each of its actual CPU-level states (which is usually where hard to spot trouble lives).

    For instance, for an atomic assignment, this is intended to spot that there isn't an intermediate state where the assignment isn't atomic, even on instructions that *usually* last no more than a fraction of a cycle (thanks to pipelining, thus potentially resulting in extremely infrequent issues).

    If you don't have that exhaustive approach, but go for traditional unit tests in threads, odds are your correctness will only be statistical, and potentially merely circumstantial.

    That or going for a mathematical proof of course, as already said.

    Also, I think performance should be part of the unit test framework, not so much for pure performance's sake, but to spot race conditions and implicit serializations (which would render your primitives entirely pointless if these are not serialization primitives you're testing!)

  • Guest
    Roddy Pratt Tuesday, 23 March 2010

    This sounds like a perfect Stack Overflow question!

    This line from the Wikipedia entry on unit testing is pertinent: "Like all forms of software testing, unit tests can only show the presence of errors; they cannot show the absence of errors."

    I'd start by unit-testing it like any other module. Have simple unit tests for the various use cases, add new tests as appropriate to catch 'normal' bugs that are found during the unit's life.

    But, as you know, this isn't really going to catch many of the concurrency edge cases, except by accident, so there are a couple of different techniques to add.

    a: eyeballs. There is really no substitute for major, frequent code reviews on this kind of stuff, with as many good engineers as you can find. If your tests (or 'live' use) do throw up a problem, you'll probably have to do this anyway to find it.

    b: soak testing. Set up a machine (or several) running a serious multithreaded application (it probably doesn't matter what, but you should extend it according to the bugs you find), and run it 24/7. No VMs - this needs real metal.

    c: Unask the question. Instead, ask "why am I reinventing the wheel?" I've been round this myself too often to mention in years gone by, and would rather eat catfood than write/debug another set of synchronization primitives.

    I have the dubious honor of having found/fixed a fairly ugly BCB2009 RTL unicode string concurrency bug in the not-too-distant past. Found by eyeballing the code.

  • Guest
    Jolyon Smith Tuesday, 23 March 2010

    It's a null question. Without knowing what functionality the "Concurrency Library" is supposed to provide it's impossible to devise tests or envisage what tests might be necessary.

    "How would you test a Data Access library?"
    "How would you test a Mathematics library?"
    "How would you test a Graphics library?"
    "How would you test a Language library?"

    You can't test a name/label, you can only test functionality.

    So, reverting to the more specific question about synch primitives, still the question remains, what are the synch primitives involved? What are they supposed to do? How are they supposed to work?

    Once armed with that information, the tests should be quite straightforward to devise I should think.

  • Guest
    Darian Miller Tuesday, 23 March 2010

    One obvious piece of this equation is how to execute the tests. The test bed, once written, needs to be run across a range of single, dual+ core systems with various speeds and capabilities, along with a combination of OS releases.
    (A single core running on XP will behave very differently from a dual quad core Windows Server 2008)

    As far as the test bed goes, I would lean towards massive stress testing. Something on the order of the MS web stress tester, where you vary the number of threads (or processes, depending on the type of access required) and workers: some workers are set to continuously write to different types of shared value(s), some are set to continuously read from these shared values, and some are set to randomly read/write these shared values. Randomized wait lengths would be introduced before and during a read or write block. Enabled workers would need to log start and successful completion using some sort of non-synchronized logging mechanism, which can be analyzed later for success rate.
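
    A scaled-down sketch of such a test bed might look like the Python below. It is only an illustration under stated assumptions: threading.Lock stands in for the primitive being tested, each worker appends to its own private list as the "non-synchronized" log, and run_stress, analyze, the worker roles, and all counts and delays are invented for the example.

```python
import random
import threading
import time


def run_stress(num_workers=6, ops_per_worker=100, seed=1234):
    lock = threading.Lock()                  # stand-in for the primitive under test
    shared = {"value": 0, "shadow": 0}       # equal whenever no write is in progress
    logs = [[] for _ in range(num_workers)]  # one private, unsynchronized log per worker

    def worker(idx):
        rng = random.Random(seed + idx)
        role = ("writer", "reader", "mixed")[idx % 3]
        log = logs[idx]
        for op in range(ops_per_worker):
            write = role == "writer" or (role == "mixed" and rng.random() < 0.5)
            log.append(("start", op))
            time.sleep(rng.random() * 0.0001)          # random wait before the block
            with lock:
                if write:
                    shared["value"] += 1
                    time.sleep(rng.random() * 0.0001)  # random wait during the block
                    shared["shadow"] += 1
                    ok = True
                else:
                    ok = shared["value"] == shared["shadow"]
            log.append(("done" if ok else "failed", op))

    threads = [threading.Thread(target=worker, args=(i,)) for i in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return logs


def analyze(logs):
    """Offline pass over the per-worker logs: count starts and the success rate."""
    events = [kind for log in logs for kind, _ in log]
    starts = events.count("start")
    return starts, events.count("done") / starts


starts, success_rate = analyze(run_stress())
print(starts, success_rate)
```

    Keeping one log per worker avoids adding extra synchronization to the run itself, which could otherwise hide the very races the test is trying to provoke.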


    $0.02 Good luck, sounds fun.

  • Guest
    m. Th. Tuesday, 23 March 2010

    Release it. Ship some time-bombed DCUs or a similar way of delivery. The biggest problems in any library are the design problems. Don't ignore them.

    Also, in order to "effectively test" you must envisage real-world cases. But your 'real world' is very different than ours. And this is natural. Hence one more reason to have a public beta.

    Have a look here:

    http://msdn.microsoft.com/en-us/devlabs/cc950526.aspx

    ...and take ideas.

    HTH

  • Guest
    Primoz Tuesday, 23 March 2010

    In OmniThreadLibrary, I'm stress-testing all lock-free code. The code runs with different numbers of threads, from one to *2, and repeats all tests "indefinitely".

    Typically, all bugs were detected in less than 15 minutes. However, I'm running all such tests overnight, for at least 12 hours.
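
    The shape of such a sweep can be sketched in Python. This is not OmniThreadLibrary code: it assumes the standard queue.Queue as a stand-in for the structure under test, run_once and sweep are hypothetical names, and the upper bound of 8 threads and the 3 rounds are arbitrary choices. Each run checks that every item produced is consumed exactly once.

```python
import queue
import threading


def run_once(num_threads, items_per_thread=500):
    """Run num_threads producers and consumers; return True if no item is lost or duplicated."""
    q = queue.Queue()
    consumed = [[] for _ in range(num_threads)]

    def producer(idx):
        for i in range(items_per_thread):
            q.put((idx, i))

    def consumer(idx):
        for _ in range(items_per_thread):
            consumed[idx].append(q.get())

    threads = [threading.Thread(target=producer, args=(i,)) for i in range(num_threads)]
    threads += [threading.Thread(target=consumer, args=(i,)) for i in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()

    seen = [item for chunk in consumed for item in chunk]
    expected = {(p, i) for p in range(num_threads) for i in range(items_per_thread)}
    return len(seen) == len(expected) and set(seen) == expected


def sweep(max_threads=8, rounds=3):
    """Repeat the test for every thread count from 1 to max_threads."""
    failures = []
    for round_no in range(rounds):
        for n in range(1, max_threads + 1):
            if not run_once(n):
                failures.append((round_no, n))
    return failures


failures = sweep()
print(failures)
```

    For an overnight run, the fixed rounds count would be replaced by a loop that runs until a deadline and records the round and thread count of every failure.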
