Message Oriented Object Design and James Shore's Challenge

James Shore posted an architectural challenge this week on his blog and personally threw his gauntlet in my face to answer the challenge using this message oriented design stuff I've been ranting and raving about. Of course when I say "threw his gauntlet in my face" I really mean he said it might be interesting to see... BUT STILL! A man can't back down from that! ;)

 

You can read about his challenge in detail here: http://jamesshore.com/Blog/Architectural-Design-Challenge.html
 
To sum it up, the idea is to build a ROT13 file encoder, TDD'd and beautiful. There were two parts to his challenge. The first was just to get everything done by reading the whole file into memory, and then to refine our design around this idea. Once that was done we could move on to part two of the challenge, which required us to process the file incrementally as we load it off of disk and save it back to disk.
 
What did I learn from this experience? I'm extremely happy with the flexibility and robustness of the designs I get when I approach things from a message oriented point of view. There are times when it's too much (there are no absolute laws, right?) but for any system of real complexity that I work on, it is a great guiding hand for me.
 
 
To get a general idea of what I did I've provided the way I connected my objects together below:
 
First my mistakes:
 
1) This part is confusing. I'm basically just creating an object that will forward every message it receives to both other objects but it isn't executed well:
 
2) Instead of having a separate configuration command, the things I wanted configured should have just been configured on the fly. Marking them for configuration and then calling for configuration to occur seems way too meh.
 
3) The line where I call out fileReader.Read(); is where the whole system comes to life but I fear that's not obvious.
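To make those three points a little more concrete, here's a minimal sketch of the same kind of wiring. It is written in JavaScript, it is not the code from my challenge entry, and apart from the `read()` call every name in it (`makeRot13Encoder`, `makeBroadcaster`, `makeReader`, `makeCollector`, the `text` message) is an assumption made up for illustration. The reader is handed its chunks directly instead of touching the disk so the sketch can run anywhere.

```javascript
// Hypothetical sketch: only the read() call that starts everything is taken
// from the post; every other name here is made up for illustration.

// Encodes each chunk of text with ROT13 and passes it along its channel.
function makeRot13Encoder() {
  let channel = { text: function () {} }; // default: messages go nowhere
  const rotate = (c, base) =>
    String.fromCharCode(((c.charCodeAt(0) - base + 13) % 26) + base);
  return {
    sendEncodedTextTo: function (newChannel) { channel = newChannel; },
    text: function (chunk) {
      channel.text(
        chunk
          .replace(/[a-z]/g, (c) => rotate(c, 97))
          .replace(/[A-Z]/g, (c) => rotate(c, 65))
      );
    },
  };
}

// Mistake 1, done plainly: forwards every message it receives to each target.
function makeBroadcaster(targets) {
  return {
    text: function (chunk) { targets.forEach((target) => target.text(chunk)); },
  };
}

// Stand-in for the file reader: the chunks are injected, not read from disk.
function makeReader(chunks) {
  let channel = { text: function () {} };
  return {
    sendTextTo: function (newChannel) { channel = newChannel; },
    read: function () { chunks.forEach((chunk) => channel.text(chunk)); },
  };
}

// Stand-in for a file writer or the console: records what it is sent.
function makeCollector(sink) {
  return { text: function (chunk) { sink.push(chunk); } };
}

// Mistake 2 avoided: each connection is made on the fly, with no separate
// "configure" step afterwards.
const fileOutput = [];
const consoleOutput = [];
const fileReader = makeReader(["Hello, ", "World!"]);
const encoder = makeRot13Encoder();
fileReader.sendTextTo(encoder);
encoder.sendEncodedTextTo(
  makeBroadcaster([makeCollector(fileOutput), makeCollector(consoleOutput)])
);

// Mistake 3 made obvious: nothing happens until this message is sent.
fileReader.read();
// fileOutput.join("") is now "Uryyb, Jbeyq!"
```

Because the reader pushes one chunk at a time, the same wiring covers both parts of the challenge: hand it one big chunk for part one, or many small ones for part two.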
 
Now what I like:
 
1) Whenever I create a message oriented design, I can discuss the whole system by pointing to the place where I configure my dependencies. The overall system flow may not be perfectly digestible, but anyone who tried to create a flow chart from this configuration would find it very easy (I have, and it lends itself well to presentations ;) ). Another thing: in my experience, instead of needing a call graph that shows how objects talk to one another, the same ideas fall out of the view of how the objects depend on one another.
 
2) Whenever I run into too much pain with this approach it's a smell I did something wrong. Case in point: While working on part I of the challenge, I had begun to write and test a class that was essentially going to orchestrate all of the other classes together on top of the class which configures which objects talk to one another. Essentially I was building a router. The pain for me was that I was creating WAY too many fakes and needing to care WAY too much about what they were doing. So I took a step back and drew my objects on a piece of paper and then reconnected them per the new design I drew. There were hardly any code changes necessary and it was pretty short work.
 
3) I tend to write tiny objects. Some people hate having too many objects, or objects that don't do much, so your tastes may vary. I've found, however, that smaller, more focused classes help me. When they encompass literally only a single responsibility I find them easier to replace or modify when they no longer fit my needs, and I only need to mock when I absolutely have to.
 
4) I like how little code there is in my console application.

If you haven't been talking with me or reading what I've been writing about Message Oriented Object Design, here's a brief summary:

Message Oriented Object Design is an object oriented design philosophy wherein we view objects as sending immutable messages/publishing events on channels. MOOD systems also rely on the configuration of object networks to enable collaboration between them. A core tenet is the lack of inter-object getters (be it method or property calls).
 
That's it! Leave me a comment if you want to lend your own critique of what I've done. I also encourage you to head over to James' site and throw your own hat in the ring and critique other people's designs (be harder on the other designs though of course!).
 
Till next time.

Message Oriented Object Design and Machine Learning in JavaScript

This article will show how to use Message Oriented Object Design (not unlike Message Oriented Programming aka MOP or Actor Model) to model your user interface as an actor and handle some more complex processing while updating the user interface. Specifically, the sample code implements a simple machine learning exercise wherein you enter any character on your keyboard and the program attempts to guess what you chose (without cheating ;).
 
First, what is Message Oriented Object Design (henceforth referred to as MOOD)? Message Oriented Object Design is an object oriented design philosophy wherein we view objects as sending immutable messages/publishing events on channels. MOOD systems also rely on the configuration of object networks to enable collaboration between them. A core tenet is the lack of inter-object getters (be it method or property calls). Since I wrote this example in JavaScript, which has no inherent support for this concept, all of the ideals of MOOD will need to be enforced by convention. Message Oriented Object Design is a term I made up. I'm not sure that it's sufficiently different from Message Oriented Programming or Message Based Programming to warrant existence, but I also don't want to sully those terms with my own ideas if there are important subtleties I'm missing.
 
I'm in the process of writing a very in-depth article on Message Oriented Object Design, so if you want to know more just let me know and I'll contact you when it's available. For now, suffice it to say that the words object and actor are interchangeable, as are the words message, method, and event.
 
The problem we're going to solve is this: given a text box where a user can enter any character literal, we will create a system that uses that information to predict what the user's next entry will be and also updates the web page with our stats on how we're doing. Because we're using the MOOD philosophy, there will be no getters between objects (using them on private methods is perfectly acceptable though).
 
To get started I wrote the following very simple JavaScript object to represent the user interface:
 
 
One of the key ideas that makes MOOD so powerful is that it views your user interface as just another MOOD object (basically as an actor). This means that all of the UI eventing that can be so troublesome finds a home here. The idea of asynchronous actions will be built into all of our objects, so even as we switch contexts to work on the machine learning portion of the system, the overall object design will look very familiar.
 
Here you can also see the concept of a channel in my objects. In MOOD (and Message Oriented Programming) we always assume that we're using a channel which will pass along our message to the correct object. So while we will end up passing an object reference as the channel, this assumption forces us to view our code as though it is an isolated object unaware of how its method calls will affect others. This enables an extremely clean separation of concerns (SRP) and makes it easier to spot when we violate SRP. How? Look at the semantic meaning of the code in the object. Does any of it seem out of place for an object that's managing the type of UI we are? Why isn't there any knowledge of the learning that this program will be doing? Think about this as we continue.
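For readers who want something to run, here's a minimal sketch of a UI actor with a channel. It is an illustration, not the original object: `make_user_interface`, `set_entry_channel_to`, `character_entered`, `show_stats`, the shape of the stats message, and the injected `page` object (which stands in for the jQuery/DOM calls so the sketch runs outside a browser) are all assumptions. Only the `next` message name comes from the rest of this article.

```javascript
// Hypothetical UI actor. `page` stands in for the DOM/jQuery calls a real
// version would make, so this sketch runs without a browser.
function make_user_interface(page) {
  let entry_channel = { next: function () {} }; // default: messages go nowhere

  return {
    // Configuration message: tells the UI where to publish user entries.
    set_entry_channel_to: function (channel) { entry_channel = channel; },

    // Called by the event wiring whenever the user types a character.
    character_entered: function (character) { entry_channel.next(character); },

    // Incoming message: display the latest stats. Nothing is ever read back
    // out of this object, which is the "no getters" rule in practice.
    show_stats: function (stats) {
      page.write("guess", stats.guess);
      page.write("score", stats.correct + " of " + stats.total);
    },
  };
}

// Usage with a fake page and a fake channel.
const written = {};
const entries = [];
const ui = make_user_interface({
  write: function (id, text) { written[id] = text; },
});
ui.set_entry_channel_to({ next: function (value) { entries.push(value); } });

ui.character_entered("a");
ui.show_stats(Object.freeze({ guess: "b", correct: 1, total: 2 }));
// entries is now ["a"] and written.score is "1 of 2"
```

Notice that nothing in this object knows what happens to an entry after it goes down the channel.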
 
Once I had this code written, I wrote some quick test code just to make sure it was outputting the correct values to the correct spots in the HTML. I'll leave writing that code as an exercise to the reader as it is fairly trivial.
 
Next up, I iterated on the actor that manages the learning task. While the latest code uses a Markov chain to learn the user's patterns, I started incrementally by just having it guess "yesterday's weather" (i.e. use the current input as our prediction of the next input). This is the completed implementation:
 
 
As a refresher, the Markov chain as I've implemented it tells us which value is most likely to be entered next given the previous value. I won't go into the implementation details, but the code is fairly concise and is hopefully legible enough to be deciphered.
 
The learning actor has just a couple of main parts to it.
  • The next(value) message that is passed the value that the user entered.
  • The _learn_from_new_information(previous_value, current_value) method that trains our Markov chain.
  • The _make_best_guess(value) method that utilizes our trained Markov chain to make an educated guess about the user's next entry.
  • Last but not least, a simple set_guess_channel_to(channel) message that we can use to publish what we guessed and what the right guess actually was.
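The four parts listed above can be sketched roughly like this. It is a minimal first-order Markov chain written for illustration, not the original implementation: the four message and method names come from the list, while `make_learning_actor`, the `guessed` message, and the `{ guess, actual }` message shape are assumptions.

```javascript
// Hypothetical learning actor: a first-order Markov chain over entered values.
function make_learning_actor() {
  let guess_channel = { guessed: function () {} }; // default: goes nowhere
  const transitions = new Map(); // previous value -> Map(next value -> count)
  let previous_value = null;
  let current_guess = null;

  // Trains the chain: remembers that `current` followed `previous`.
  function _learn_from_new_information(previous, current) {
    if (previous === null) return; // first entry: nothing to learn from yet
    if (!transitions.has(previous)) transitions.set(previous, new Map());
    const counts = transitions.get(previous);
    counts.set(current, (counts.get(current) || 0) + 1);
  }

  // Picks the value that has most often followed `value` so far.
  function _make_best_guess(value) {
    const counts = transitions.get(value);
    if (!counts) return value; // no data yet: guess "yesterday's weather"
    let best = null;
    let best_count = 0;
    counts.forEach(function (count, candidate) {
      if (count > best_count) { best = candidate; best_count = count; }
    });
    return best;
  }

  return {
    set_guess_channel_to: function (channel) { guess_channel = channel; },

    next: function (value) {
      // Publish what we guessed and what the right answer turned out to be.
      if (current_guess !== null) {
        guess_channel.guessed(
          Object.freeze({ guess: current_guess, actual: value })
        );
      }
      _learn_from_new_information(previous_value, value);
      current_guess = _make_best_guess(value);
      previous_value = value;
    },
  };
}

// Usage: feed it "abcabc" and record every published guess.
const results = [];
const learner = make_learning_actor();
learner.set_guess_channel_to({
  guessed: function (result) { results.push(result); },
});
"abcabc".split("").forEach(function (c) { learner.next(c); });
// The first pass is guesswork; by the second pass the actor has learned
// that "b" follows "a" and "c" follows "b", so the last two guesses are right.
```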
Initially, I had actually written the code that is now in the scoreboard actor as a part of the learning actor. Here's that code:
 
 
You can see it's fairly simple, and likewise I was hesitant to move it to a new class. As you get started with this style, you will feel this quite often. I recommend fighting through the pain until you come upon the first "major" refactoring you need to do. I guarantee the ease with which you'll be able to make that change will astound you, and you'll be hooked.

Another reason I hesitated to move this out of my learning actor is that I assumed I would be duplicating the concept of "the previous value must equal the last". Since MOOD doesn't allow for getters, I knew that the only way I could have shared that logic would be copy 'n' paste reuse (read: ewww). Look at the algorithm left over in the learning actor though. It never cares whether or not we guessed right. It only tracks the guesses and makes a hypothesis regarding them. So if guess checking wasn't a concern of the learning algorithm, why did I have it there to begin with? I simply wanted to display a scoreboard. Hence the creation of my scoreboard actor.
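A scoreboard actor along those lines might look something like the sketch below. Every name in it (`make_scoreboard`, `set_stats_channel_to`, `guessed`, `show_stats`, and the message fields) is an assumption for illustration, not the original code. The point is only that guess checking lives in its own tiny object.

```javascript
// Hypothetical scoreboard actor: the only object that cares whether a guess
// was right. It keeps score and publishes the running totals.
function make_scoreboard() {
  let stats_channel = { show_stats: function () {} }; // default: goes nowhere
  let correct = 0;
  let total = 0;

  return {
    set_stats_channel_to: function (channel) { stats_channel = channel; },

    // Incoming message: one guess together with what was actually entered.
    guessed: function (result) {
      total += 1;
      if (result.guess === result.actual) correct += 1;
      stats_channel.show_stats(
        Object.freeze({ guess: result.guess, correct: correct, total: total })
      );
    },
  };
}

// Usage: three guesses, two of them right.
const published = [];
const scoreboard = make_scoreboard();
scoreboard.set_stats_channel_to({
  show_stats: function (stats) { published.push(stats); },
});
scoreboard.guessed({ guess: "a", actual: "a" });
scoreboard.guessed({ guess: "a", actual: "b" });
scoreboard.guessed({ guess: "c", actual: "c" });
// The last published stats are { guess: "c", correct: 2, total: 3 }
```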
 
We've got all of these objects but what to do with them? The configuration of our objects is referred to as the network configuration. This is essentially just a different flavor of dependency injection. The difference here being that your configuration will be able to be factored away from the rest of your code and isolated if you so choose. Here's the object network configuration for this code:
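As a rough illustration of what a network configuration looks like, here's a stripped-down sketch. The three actors are deliberately simplified stand-ins (the learner only guesses "yesterday's weather"), and every name except `next` and `set_guess_channel_to` is an assumption rather than the original code. The part to look at is the last block, which is the only place that knows who talks to whom.

```javascript
// Simplified stand-in actors so the wiring below can run on its own.
function make_user_interface(log) {
  let entries = { next: function () {} };
  return {
    set_entry_channel_to: function (channel) { entries = channel; },
    character_entered: function (character) { entries.next(character); },
    show_stats: function (stats) { log.push(stats.correct + "/" + stats.total); },
  };
}

function make_learner() {
  let guesses = { guessed: function () {} };
  let guess = null;
  return {
    set_guess_channel_to: function (channel) { guesses = channel; },
    next: function (value) {
      if (guess !== null) guesses.guessed({ guess: guess, actual: value });
      guess = value; // "yesterday's weather"
    },
  };
}

function make_scoreboard() {
  let stats = { show_stats: function () {} };
  let correct = 0;
  let total = 0;
  return {
    set_stats_channel_to: function (channel) { stats = channel; },
    guessed: function (result) {
      total += 1;
      if (result.guess === result.actual) correct += 1;
      stats.show_stats({ correct: correct, total: total });
    },
  };
}

// The network configuration: UI -> learner -> scoreboard -> UI.
const log = [];
const user_interface = make_user_interface(log);
const learner = make_learner();
const scoreboard = make_scoreboard();
user_interface.set_entry_channel_to(learner);
learner.set_guess_channel_to(scoreboard);
scoreboard.set_stats_channel_to(user_interface);

// Simulate the user typing "aab".
"aab".split("").forEach(function (c) { user_interface.character_entered(c); });
// log is now ["1/1", "1/2"]
```

Swapping the learner for a smarter one, or putting something between two actors, only touches those three wiring lines.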
 
 
The first thing that should stand out to you is that we are making no attempt to make our objects immutable. In MOOD, just like in the Actor Model, we are guaranteed that an actor will only ever be used from the context of a single thread throughout its lifetime. This might seem to be a poor constraint. Here is why it's not: imagine the learning actor gets some VERY complex logic. That isn't a stretch, depending on how accurate you want the guesses to be. So, if you had written this code without using this style and didn't explicitly design for asynchronicity, what might happen? The first time that learning actor needs to really think, your UI will freeze up. Because we wrote this using Message Oriented Object Design, however, we can throw that logic _anywhere_ and it won't block our UI.

What do I mean by anywhere? I mean we could literally host it on a web service, and instead of implementing our actor in the HTML page we could have an actor that was responsible for interacting with the web service. Someday, if JavaScript gets threads, we could even throw the extra work onto a thread and create a channel object to manage the threading context on the passed messages. The rest of our code wouldn't change in either case. If you need an actual example, leave me a comment to that effect, because for now this seems easy to see, especially once someone has pointed it out. In the meantime, if you've been thinking that Message Oriented Object Design is a lot of extra work for pedantic self-indulgent programmers, think about whether or not your code could do that.
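One way to picture such a channel object is the sketch below. It is an assumption-laden illustration, not part of the sample: `make_deferred_channel`, its `message_names` list, and the injectable `schedule` function are all made up here. By default it hands each message to `setTimeout` so the sender returns immediately; a web service or thread-backed channel would have the same outward shape.

```javascript
// Hypothetical channel: looks like the target to the sender, but delivers
// each message later via `schedule` instead of inside the sender's call.
function make_deferred_channel(target, message_names, schedule) {
  const deliver_later =
    schedule || function (task) { setTimeout(task, 0); };
  const channel = {};
  message_names.forEach(function (name) {
    channel[name] = function (...args) {
      deliver_later(function () { target[name](...args); });
    };
  });
  return channel;
}

// Usage with a hand-cranked scheduler so the effect is easy to see.
const pending = [];
const heard = [];
const slow_learner = { next: function (value) { heard.push(value); } };
const channel = make_deferred_channel(slow_learner, ["next"], function (task) {
  pending.push(task);
});

channel.next("a"); // returns immediately; nothing has been delivered yet
const delivered_before_flush = heard.length;

pending.forEach(function (task) { task(); }); // the "other context" runs
// heard is now ["a"]
```

The sender still just calls `next` on whatever it was configured with, which is why the rest of the code would not need to change.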
 
Oh yeah. Also, notice that there is only one VERY thin object in all of that code that has anything to do with the DOM. The rest is trivially unit testable. And not just testable in a small way, but testable as in only the object under test will be exercised. I didn't TDD this code. That's just the way the MOOD pulls me.
 
Also, I apologize in advance for my horrible naming of methods and objects. Hopefully it still gets the point across.
 
That's it! Go ahead and try it, it's pretty neat. With me just "randomly" pressing keys on the keyboard the way I do, the code was able to guess correctly 40% of the time or so. Not bad at all! It will also pick up regular patterns like "abcabcabc" pretty quickly, and you'll see the code try to follow you if you do something like "aaaabababaaaababababab". Of course, like all learning agents, the more random the string you enter, the worse the agent will perform.
 
The full HTML source code is here for you to download and try. It does require you to include jQuery for it to work. Leave a comment if you have any questions! :)
 

TDD-ing Concurrent Code

A Method for Modelling Concurrency
 

I'm prepping code for Code Camp Boise and Seattle and I thought I'd share some of the simple stuff I'm writing as I'm writing it to act as an introduction of sorts to the concepts.

 
I hear a lot of people say things like "Well we made this process concurrent so now we can't test it." That has always felt wrong to me, and I've kept it in mind over the past year or two as I've been reading about threading. Like any concern, concurrency is difficult to test around if it isn't abstracted away from the code under test.
 

Testing concurrent software can be extremely difficult. While debugging, breakpoints can be seemingly randomly tripped by other threads that you don't care about, your data can change right under your nose whether or not you're paused, etc.
 
Another issue I hear is that synchronizing across threads is a pain. What happens if after verifying the object you want to use is in the appropriate state, some other thread changes it and then when you use it it throws an exception? In this way, race conditions can be extremely difficult to manage.
 
One way to handle this is to use a message based data flow oriented model. Why? Well first and foremost because this allows you to model your data dependencies and allow the abstraction to suss out the details. By just declaring a network of processes (which are essentially objects) as a directed graph you gain the ability to do this. Now you've explicitly declared how these different processes will interact with one another and since data flow programming uses immutable objects you won't have to worry about any processes interfering with each other.
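Here's a minimal sketch of that idea, in JavaScript rather than the .NET code this post is about. The names `makeLineMessage`, `makeProcess`, `connectTo`, and `onNext` are assumptions for illustration. Messages are frozen so no process can change what another one sees, and the network is just a directed graph of connections.

```javascript
// Hypothetical immutable message: frozen so no process can alter it.
function makeLineMessage(text, lineNumber) {
  return Object.freeze({ text: text, lineNumber: lineNumber });
}

// Hypothetical process: applies `transform` to each incoming message and
// pushes the result (if any) to every process it is connected to.
function makeProcess(transform) {
  const downstream = [];
  return {
    connectTo: function (process) { downstream.push(process); },
    onNext: function (message) {
      const result = transform(message);
      if (result !== undefined) {
        downstream.forEach(function (process) { process.onNext(result); });
      }
    },
  };
}

// Declare the graph: source feeds two branches that never see each other.
const seenByA = [];
const seenByB = [];
const source = makeProcess(function (m) { return m; });
const shouting = makeProcess(function (m) {
  // Builds a new message instead of modifying the one it was given.
  return makeLineMessage(m.text.toUpperCase(), m.lineNumber);
});
const collectorA = makeProcess(function (m) { seenByA.push(m.text); });
const collectorB = makeProcess(function (m) { seenByB.push(m.text); });

source.connectTo(shouting);
source.connectTo(collectorB);
shouting.connectTo(collectorA);

source.onNext(makeLineMessage("hello", 1));
// seenByA is ["HELLO"] and seenByB is ["hello"]
```

Here the connections run synchronously; the same graph could sit on top of threads without the processes themselves changing.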
 
Yet another great thing about developing a data flow network is that you can test each process in isolation without it even having any knowledge of threading. That's what I will be talking about for the remainder of this post.
 
Some Context
 
A friend of mine needed a computer program that would go through a text file of over 10,000 lines of text (sometimes more) and find all of the valid email addresses. First and foremost I came up with a quick description and overview of the process I could see going through:
 
FileReadingAgent
Reads lines from the file one at a time and passes them along to the ObviousEmailExtractionAgent, skipping blank lines.

ObviousEmailExtractionAgent
Extracts obviously good email addresses from each line and passes them on to the GoodEmailCollectionAgent.
Passes lines without obviously good email addresses (or with none at all) to the NonObviousEmailExtractionAgent for further processing.

NonObviousEmailExtractionAgent
Uses more intelligent email extraction rules to find less obvious email addresses.
Passes any found email addresses on to the GoodEmailCollectionAgent.

GoodEmailCollectionAgent
Aggregates known good email addresses.
 
So to reiterate, all of these agents should be assumed to be running in their own threads. Also, they only communicate to one another via immutable messages. 
 
The FileReadingAgent would have a connection to the ObviousEmailExtractionAgent. The ObviousEmailExtractionAgent would have *two* connections. One to the GoodEmailCollectionAgent and another to the NonObviousEmailExtractionAgent.
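The connections described above can be sketched like this. It is a JavaScript illustration wired synchronously, not the .NET application itself. The agent names follow the descriptions above, but the factory functions, the `sendEmailsTo`/`sendUnresolvedLinesTo`/`sendLinesTo` messages, the injected array of lines, and both extraction rules (a simple regular expression, and treating "name at domain dot com" as an address) are assumptions made up for the example.

```javascript
// Aggregates known good email addresses into the injected array.
function makeGoodEmailCollectionAgent(collected) {
  return { onNext: function (email) { collected.push(email); } };
}

// Assumed "smarter" rule: "name at example dot com" is an obfuscated address.
function makeNonObviousEmailExtractionAgent() {
  let goodEmails = { onNext: function () {} };
  return {
    sendEmailsTo: function (agent) { goodEmails = agent; },
    onNext: function (line) {
      const match = /(\w+) at (\w+) dot (\w+)/.exec(line);
      if (match) goodEmails.onNext(match[1] + "@" + match[2] + "." + match[3]);
    },
  };
}

// Assumed "obvious" rule: anything shaped like name@host.tld.
function makeObviousEmailExtractionAgent() {
  let goodEmails = { onNext: function () {} };
  let unresolvedLines = { onNext: function () {} };
  return {
    sendEmailsTo: function (agent) { goodEmails = agent; },
    sendUnresolvedLinesTo: function (agent) { unresolvedLines = agent; },
    onNext: function (line) {
      const found = line.match(/[\w.+-]+@[\w-]+(\.[\w-]+)+/g);
      if (found) found.forEach(function (email) { goodEmails.onNext(email); });
      else unresolvedLines.onNext(line);
    },
  };
}

// Stand-in for the file reader: the lines are injected, blank ones skipped.
function makeFileReadingAgent(lines) {
  let nextAgent = { onNext: function () {} };
  return {
    sendLinesTo: function (agent) { nextAgent = agent; },
    start: function () {
      lines
        .filter(function (line) { return line.trim() !== ""; })
        .forEach(function (line) { nextAgent.onNext(line); });
    },
  };
}

// One connection out of the reader, two out of the obvious extractor.
const collected = [];
const collector = makeGoodEmailCollectionAgent(collected);
const nonObvious = makeNonObviousEmailExtractionAgent();
const obvious = makeObviousEmailExtractionAgent();
const reader = makeFileReadingAgent([
  "Contact: bob@example.com, thanks",
  "",
  "write to alice at example dot org",
  "no address here",
]);
reader.sendLinesTo(obvious);
obvious.sendEmailsTo(collector);
obvious.sendUnresolvedLinesTo(nonObvious);
nonObvious.sendEmailsTo(collector);

reader.start();
// collected is now ["bob@example.com", "alice@example.org"]
```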
 
In this post I'd like to share the tests that went into creating the FileReadingAgent. 
 
TDD-ing a Process
 
The first context I worked on assumed there was only one line of text in a file. This is a basic context that just helps to ensure that the basic plumbing for my agent is all hooked up. This is how I tested this context:
 
 
The next context I worked on contained blank text lines. I wanted to ensure that those lines of text didn't get passed on to my email finding agents.
 
 
The final context has lines of text with only whitespace characters and one line of text that is an email address. I wanted to ensure that only lines with some kind of text moved on to the agents that would actually try to parse out email addresses. In hindsight, this probably should have gone in the ObviousEmailExtractionAgent. It seems like the FileReadingAgent shouldn't really be concerned with this. I could probably just change the name of my FileReadingAgent to NonBlankLineReadingAgent and get by that way. ;)
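The three contexts above can be checked against an agent in complete isolation, with no threads anywhere in sight. The sketch below is JavaScript, not the original test code. `shouldSendLinesOfTextTo` and `onNext` mirror names used later in this post, while `makeLineByLineFileReader`, the injected `readLines` function (a stand-in for real file access), and the `linesPassedOnFor` helper are assumptions.

```javascript
// Hypothetical reader agent. `readLines` is injected so the checks never
// have to touch the disk.
function makeLineByLineFileReader(readLines) {
  let observer = { onNext: function () {} };
  return {
    shouldSendLinesOfTextTo: function (nextObserver) { observer = nextObserver; },
    onNext: function (filePath) {
      readLines(filePath)
        .filter(function (line) { return line.trim() !== ""; })
        .forEach(function (line) { observer.onNext(line); });
    },
  };
}

// Helper for the checks: runs the reader over fake file contents and returns
// every line the downstream (fake) observer received.
function linesPassedOnFor(fileContents) {
  const received = [];
  const reader = makeLineByLineFileReader(function () { return fileContents; });
  reader.shouldSendLinesOfTextTo({
    onNext: function (line) { received.push(line); },
  });
  reader.onNext("any/path.txt");
  return received;
}

// Context 1: a file with a single line of text.
const oneLine = linesPassedOnFor(["only line"]);

// Context 2: blank lines are not passed on.
const withBlanks = linesPassedOnFor(["first", "", "second", ""]);

// Context 3: whitespace-only lines are not passed on either.
const withWhitespace = linesPassedOnFor(["   ", "\t", "bob@example.com"]);
```

The only fake needed is the observer on the receiving end, and it does nothing but record what it was sent.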
 
 
My "final" code doesn't handle disposal or anything and it definitely should! That's an oversight on my part. Aside from that this code should be pretty much complete:
 
 
Notice the use of the IObserver interface? I'm stealing a bit from the new .NET Reactive framework (an idea that I got from Robert Ream). By using the OnNext method I can make my network of agents push oriented rather than pull oriented. The benefits of this can be enumerated in another blog post. :)
 
How could I connect these to run synchronously? Super easy. This is how I could link the LineByLineFileReader to the ObviousGoodEmailExtractionAgent:
 
lineByLineFileReader.ShouldSendLinesOfTextTo(obviousGoodEmailExtractionAgent);
 
Then to start I'd send the filepath I wanted to be processed to the lineByLineFileReader like so: 
 
lineByLineFileReader.OnNext("c:/myfile.txt");
 
Next time I'll show an overview of the whole application and how it works concurrently with a WPF UI.