Tag Archives: lab49

Concurrency Pattern: Finding and exploiting areas of latent parallelism

With the JDK 7 developer preview out and a final release fast approaching, it’s important not only to become aware of what the new version offers but also, in areas where existing programming paradigms have changed radically, to make a mental shift in the way we think and to understand how best to leverage these new paradigms to our advantage. One such area is finding and exploiting areas of latent parallelism using a coarse-grained parallelism approach.

As I mentioned in my previous post about the JDK7 developer preview being released, we’ve been using jsr166y and extra166y at work for some time now, and this post stems from an impassioned discussion on finding and exploiting areas of latent parallelism in code. Here’s what I have to say on the matter (inspired, obviously, by Doug Lea, Brian Goetz and my esteemed colleagues). The traditional and very much outdated mindset has understood only threads and, since Java 5, the executor framework on top of them. However, this mechanism is fundamentally limited in its design in the extent of parallelism it can offer.

Firstly, threads are expensive, not only in their creation and stack size allocation but also in the cost of context switching between them. Deciding how many threads to use is also, at best, an educated guess. A particular service within a process may decide to use all available cores, but if every service in the process does the same you end up with a disproportionately large number of threads; I have worked with applications running 150-200 threads at a time. Secondly, although the executor framework has helped considerably by taking some of this decision making away from the developer and absorbing that complexity, it still suffers from heavy contention between multiple threads on its single internal task queue, which again hurts performance. Thirdly, threads and executors do not normally scale up or down based on the hardware they are running on, and they certainly do not scale based on load; their performance is essentially fixed by their underlying design.

Enter the fork join framework and parallel arrays. This is not a paragraph about how to use these new features but, in my opinion, a far more important note on how to rid ourselves of a legacy mindset on parallelism and make room for a new one. The fork join framework and parallel arrays (which are backed internally by the fork join framework and fork join pool) should not be perceived merely as threading tools. That perception is dangerous because it means we are only likely to use them in those areas where we previously used threads. They can in fact help us find and exploit areas of latent parallelism.

What does that mean? In all applications there are areas of code that run sequentially. This code may be thread confined or stack confined, and we almost never reconsider how it performs. With FJ/PA we can now start making these areas concurrent. How is this an improvement? FJ/PA offer the following key features, which make them an ideal fit for such a use case.

  • They are fundamentally decoupled from the number of threads in the way they add value, which is a good thing: they tend to perform well regardless of how many threads they are using.
  • Instead of a single work queue shared by all threads, they use one work queue per thread. This means further decoupling between threads and the way tasks are stored.
  • Given multiple work queues and multiple threads, FJ/PA perform work stealing. Every queue is a double ended queue, and when one thread has completed all of its own tasks it starts processing tasks from the tail of another thread’s queue; because it dequeues from the tail there is no contention with the owner of that queue, which dequeues from the head. Not only that, but the largest tasks are placed towards the tail of each queue, so when a thread does steal work it gets enough of it to lengthen the interval before it has to steal again, further reducing contention.
  • Finally, and most importantly, FJ/PA code not only scales up but also scales down effectively, based both on the hardware it runs on and on the load of incoming work.

When you understand this new paradigm the legacy paradigm suddenly seems primitive and fundamentally stunted.
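To make this concrete, below is a minimal sketch of a divide and conquer sum expressed as a fork join task. It is illustrative only: the class name, array and threshold are assumptions of mine, and it uses the jsr166y package names available on Java 6 (under JDK 7 the same classes live in java.util.concurrent).

import jsr166y.ForkJoinPool;
import jsr166y.RecursiveTask;

// Recursively sums a slice of an array, splitting the work until a slice is
// small enough to process sequentially. Idle worker threads steal the forked
// halves from the tails of other workers' queues.
class SumTask extends RecursiveTask<Long> {
    private static final int THRESHOLD = 10000; // illustrative cut-off for sequential processing
    private final long[] data;
    private final int from, to;

    SumTask(long[] data, int from, int to) {
        this.data = data;
        this.from = from;
        this.to = to;
    }

    @Override
    protected Long compute() {
        if (to - from <= THRESHOLD) {
            long sum = 0;
            for (int i = from; i < to; i++) {
                sum += data[i];
            }
            return sum;
        }
        int mid = (from + to) >>> 1;
        SumTask left = new SumTask(data, from, mid);
        SumTask right = new SumTask(data, mid, to);
        left.fork();                     // make the left half available for stealing
        long rightSum = right.compute(); // continue with the right half on this thread
        return left.join() + rightSum;
    }
}

Usage is then a one-liner: new ForkJoinPool() sizes itself to the number of available processors by default, and pool.invoke(new SumTask(values, 0, values.length)) returns the total.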

So the next time you are browsing your code, consider using jsr166y and extra166y to find and exploit latent areas of parallelism. As a rule of thumb, this approach works best for operations that are CPU intensive, while the legacy paradigm remains better suited to IO or network bound operations: when operations are IO or network bound there is less contention, so the limitations of the legacy paradigm are less exposed. Don’t forget that both libraries can be used on Java 6, so there’s no need to wait for Java 7!

RSA asymmetric cryptography in Java

I’ve been working with RSA asymmetric cryptography in Java recently and found that this area was very poorly documented. The article, RSA Public Key Cryptography in Java, provided the breakthrough I needed: how to generate asymmetric keys in a way that allows Java to read those keys in and understand them for use in encryption and decryption. Thanks to the author, whom I’m crediting here through this post.
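The article is primarily about reading externally generated keys into Java. Purely as a companion sketch, and not the article’s approach, here is the minimal JCA flow for generating and using an RSA key pair entirely within the JVM (the class name, key size and padding choice are illustrative).

import java.security.KeyPair;
import java.security.KeyPairGenerator;
import javax.crypto.Cipher;

public class RsaSketch {
    public static void main(String[] args) throws Exception {
        // Generate a 2048-bit RSA key pair.
        KeyPairGenerator generator = KeyPairGenerator.getInstance("RSA");
        generator.initialize(2048);
        KeyPair keyPair = generator.generateKeyPair();

        // Encrypt a short message with the public key...
        Cipher cipher = Cipher.getInstance("RSA/ECB/PKCS1Padding");
        cipher.init(Cipher.ENCRYPT_MODE, keyPair.getPublic());
        byte[] cipherText = cipher.doFinal("a short secret".getBytes("UTF-8"));

        // ...and decrypt it with the private key.
        cipher.init(Cipher.DECRYPT_MODE, keyPair.getPrivate());
        System.out.println(new String(cipher.doFinal(cipherText), "UTF-8"));
    }
}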

JDK7 developer preview available

The JDK7 developer preview is now available.

The following are what I’m looking forward to the most amongst all the new features.

Amazingly, we’re already using jsr166y and extra166y with Java 6 at work.

Good code

Good code. The Reality.

Good code and bad code. The Naked Truth.

How true. Is there really such a thing as good code? If so, how can it be attained, qualified and quantified? Perhaps there is only code that serves a purpose for a length of time, and other code that is retired or never adopted. This is a philosophical debate, I fear, so I’m going to stop here. The above osnews sketch is probably my favourite programming cartoon of all time, as nothing is truer.

Presentation: Development at the Speed and Scale of Google

Since I’ve never had the good fortune of being able to afford QCon (one day this will change), I appreciate the fact that InfoQ posts QCon videos online for free, albeit late. Recently I watched ‘Development at the Speed and Scale of Google’.

Prior to watching this presentation I knew only what I had encountered in the wider industry and really could not have foreseen any of what I was about to watch. The tools I use on a daily basis and the difficulties that impede me now both seem primitive and outdated in comparison to the progress Google has made. The key point of this presentation is that it is not about development itself but about what makes it possible to develop at the speed and scale of Google: in this case, build and release engineering.

Highlights from the talk that I found worthy of note are listed below.

  • Working on build and release engineering tools requires strong computer science skills and, as such, the best people.
  • We cannot improve what we cannot measure. Measure everything. This, in my opinion, is a fantastic quote. This stops a team going off on open ended endeavours that yield either intangible or no results.
  • Compute intensive IDE functions, such as generating and searching the indexes used for cross referencing types across a large codebase, have been migrated to the cloud.
  • The codebase required for building and running tests is generally larger than the part being worked upon, but delivering the entire codebase to every developer, either in source or in binary form, would kill the network. Here a FUSE (user space filesystem) daemon detects when surrounding code is required and retrieves it incrementally on demand.
  • For similar reasons to the above point, they’ve developed a virtual filesystem under Eclipse and contributed it back. The obvious benefit is that directly importing a large codebase into Eclipse kills it, whereas incremental loading performs well.
  • They build off source and not binaries and maintain an extremely stable trunk from which they release. If you imagine that all code is in a single repository (in fact the largest Perforce repository in the world) then it really puts into perspective the achievement of using only trunk.
  • The designated owners for a given project, who review code, have at their fingertips all the intelligence metadata on the code to assist them in the reviewing process. If you think about it that makes a lot of sense: to spend your review time effectively you need more than just the code. You may want the output of introspection, test runs etc.
  • Compilations are distributed and parallelised in the cloud and output is aggressively cached. It’s fascinating to hear a case study where this has actually been implemented. I’ve often considered remote compilations but never come across a concrete implementation until now.

The importance of build and release engineering is often underestimated. It is often portrayed and perceived as an area of work that’s second class in nature and rather unglamorous. However, as this talk attests, it is very much the contrary. It can massively boost developer and organisational productivity and efficiency and requires the best people. I’ll end with a quote from the presenter: “Every developer worth their salt has worked on a build system at least once”.

Java concurrency bug patterns

Rather randomly and on a legacy note, here’s a series of links on the subject of Java concurrency bug patterns, from both IBM DeveloperWorks and Alex Miller at Pure Danger Tech, for my reference and yours. Most readers, I expect, will know all of this, but if you don’t the list is worth reading. It’s also good to raise awareness of such concurrency bug patterns in general, as a surprisingly large number of interview candidates fail to answer questions correctly in this area.
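To give a flavour of what those articles cover, here is a hypothetical example (names and types are my own) of perhaps the most common pattern of all: check-then-act on an unsynchronised collection. Neither the check nor the read-modify-write is atomic, and HashMap itself is not thread safe.

import java.util.HashMap;
import java.util.Map;

// A classic check-then-act bug: the containsKey/put pair and the increment
// are not atomic, so concurrent callers can lose updates or corrupt the map.
public class HitCounter {
    private final Map<String, Integer> hits = new HashMap<String, Integer>();

    public void record(String page) {
        if (!hits.containsKey(page)) {       // check
            hits.put(page, 0);               // act: another thread may interleave here
        }
        hits.put(page, hits.get(page) + 1);  // unsynchronised read-modify-write
    }
}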

In general, although most people tend to consider this optional or are blissfully unaware of it, documenting the thread safety of your code (with annotations such as @ThreadSafe and @GuardedBy) is very good practice, and essential if you consider that FindBugs detects these annotations and validates your assertions.
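As a minimal sketch, assuming the JCIP annotations jar (net.jcip.annotations) is on the classpath, documenting the policy looks like this:

import net.jcip.annotations.GuardedBy;
import net.jcip.annotations.ThreadSafe;

// The annotations state the synchronisation policy explicitly: every access
// to 'balance' must hold 'lock', and tools such as FindBugs can flag accesses
// that violate that assertion.
@ThreadSafe
public class Account {
    private final Object lock = new Object();

    @GuardedBy("lock")
    private long balance;

    public void deposit(long amount) {
        synchronized (lock) {
            balance += amount;
        }
    }

    public long balance() {
        synchronized (lock) {
            return balance;
        }
    }
}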

Intel Sandy Bridge announced

The eagerly awaited Intel Sandy Bridge processors were finally announced yesterday and have received superb reviews. Read more at Macrumors, Engadget, TechReport, Intel and Intel Blogs (an older link). They feature, amongst overall improvements on all fronts, vastly improved graphics performance and battery life. These really can’t come to the Mac line soon enough. No doubt Apple will be touting 15 hours of battery life with these if they’re touting 10 hours now. A mid-year release, I reckon, along with Lion, though I’d like to see those on the MacBook Air more than any other model as they are, without a doubt, best of breed now. To quote Intel on what I consider the most significant feature of this release:

Improved Cores with Innovative Ring Interconnect: The 2nd generation Intel Core Processor family microarchitecture features vastly improved cores that are better connected with an innovative ring interconnect for improved data bandwidth, performance and power efficiency. The ring interconnect is a high bandwidth, low latency modular on-die system for connection between processor components for improved performance. The ring interconnect enables high speed and low latency communication between the upgraded processor cores, processor graphics, and other integrated components such as memory controller and display.

Tech arms race in the Tron landscape

On the train back to the real world this evening, for another year at work, I came across a fascinating article in the New York Times titled ‘Electronic Trading Creates A New Trading Landscape – The New Speed Of Money Reshaping Markets’. For the duration of that journey I was wholly engrossed in the article and in the thought processes on technology and finance that it triggered effortlessly and constantly. It was an inspiring read and one that made me glad and relieved to work in this industry.

Predominantly it talked about how, over time, smaller exchanges (such as Direct Edge) had eroded the overwhelming dominance and market share of the historic NASDAQ-NYSE exchange duopoly, and how, in the process, New Jersey had been transformed into ‘the heart of Wall St’ through the placement within it of data centres that now host and operate some of the largest stock exchanges in the US. The article’s charming reference to a ‘Tron landscape’ was based on the likeness of the blue phosphorescent lighting used to illuminate the datacentres to that in the film.

More interesting to me, however, was the story of how this progression had been driven at its core by the breakneck speed and sheer extent of technological automation, advancement and innovation, leaving traders, regulators and the market struggling to keep up in its trail. So where are we now? Exchanges are distributed, leaner and more competitive. Through colocation, software advancement, closer proximity to targets and new fibre optic pathways constantly being laid along critical routes between exchange data centres, trading is faster. Through high frequency trading, dark pools and strategic algorithms, trading is more intelligent, allowing arbitrage and price exploitation through micro trading under stealth.

What have been the consequences of such advancements over time, however? The use of HFT to place a very large order in small increments was found to be the root cause of a market crash last May, when one such algorithm continued placing trades as part of a large order despite prices sinking part way through. As a result the SEC and the exchanges introduced a halt to trading in individual stocks if the price falls more than ten percent in a five minute period. Dark pools have been in the spotlight for being opaque and exempt from public scrutiny. And there is talk of regulating not only data centres and colocation but perhaps technology’s greatest achievement of all – speed. The unattended and perhaps ill-considered advancement of technology for mostly selfish motives has shifted control, transparency and ethical considerations away from human discretion and towards machine code. Can technology continue to dominate this industry’s progression to its advantage, or will it become the victim of its own success? I wonder where we go from here. What do you think?

Java still #1

OracleTechNet have just posted: “New Tiobe Index is out – Java still #1, C# on the march”. Cool. It’s obvious that the first five are major players, but the one that makes me curious is Python (Tiobe). I wonder in which areas it’s gaining its popularity and adoption. Objective-C also appears to be making major strides, but the single reason for that is obvious, and its trend will be nothing but steeply upward.

VirtualBox: Sharing folders between host and guest

Recently, having finally refused to surrender to Windows, I installed Ubuntu virtualised as a guest on a Windows host using Oracle’s recently released VirtualBox. Here’s a tip on how to share folders between guest and host the official way.

In the guest VM’s VirtualBox menu, open ‘Shared Folders’.

Open Shared Folders

At the top right of the dialogue box that comes up, click the ‘+’ icon, then fill in the dialogue by adding a name and location.

Add share

After that you should have a shares dialogue as below.

Shares Dialogue

Next, as root, mount manually.

# Create a mount point and mount the share named in the dialogue above
# (the vboxsf filesystem requires the VirtualBox Guest Additions in the guest).
mkdir /mnt/share
mount -t vboxsf virtual-box-ubuntu-share /mnt/share/

And, finally, add the following entry into /etc/fstab for future boots.

# share name    mount point   type    options   dump  pass
virtual-box-ubuntu-share /mnt/share vboxsf defaults 0 0

Done. Ubuntu on VirtualBox running as a guest on a Windows host is by far the best and most compelling complement to your development environment if you are forced into using Windows. VirtualBox even supports seamless mode, which means you can have Linux and Windows windows intermingled on the Windows desktop. Superb. And best of all, both VirtualBox and Ubuntu are completely free.

Update: Great news. VirtualBox 4.0 is out. Here’s what’s new.