Multicore Drives Parallelism Towards Impending Crisis
Compiler Community Prepares for Parallel Programming Challenge
Amsterdam, 1 October 2008
Parallelism was the dominant thread of the third CoSy Community Gathering, organised by ACE Associated Compiler Experts in Amsterdam last week. The emergence of multicore technology and multiprocessor design has generated a huge upsurge in interest in parallel programming and in the tools required to compile applications software so that it takes best advantage of the new hardware platforms. ACE, a world-class compiler development company, organised a two-day technical programme plus a strategic-level discussion on parallelism, at which customers, researchers and partners debated the challenges facing the industry.
Introducing the technical programme, ACE CEO and founder Martijn de Lange set the scene with his prediction of an impending crisis. “Within three years the multicore and multiprocessor silicon to support the broad-based move into parallelism will be ready. But the tool chain needed to support applications software development will not,” he said. “In short, we will have silicon but no systems.” A group of experts from all over the world, comprising CoSy users and developers plus ACE partners from the commercial arena, research organisations and academia, gathered in Amsterdam to exchange ideas and share their knowledge.

ACE, developer of the successful CoSy compiler development system, hosted the meeting to learn more about the technical challenges customers are facing as they move towards multicore technology and parallelism, the tools and solutions they are seeking and, importantly, the timeframe. CoSy users were attracted by the opportunity to learn first-hand from the expert developers how best to exploit the new features of the latest CoSy release, and to hear how independent researchers are using the advanced features of the compiler framework.
De Lange estimates that an investment of EUR 80m is needed to develop a consistent and coherent tool chain to enable and exploit parallelism. ACE is already making a significant contribution, and is facing an investment of EUR 20m over the next couple of years in compiler technology. “We are looking for partners to join us to develop complementary tools such as profilers, debuggers, visualisers and OS technology. This is a very large undertaking: a leap, not a gradual process. And we need to accelerate now,” he concluded.
Compiler-Centric Solutions
Technical presentations from ACE experts on the first day focused on specific techniques to help optimise efficiency, make code more compact and support parallelism. Topics included: delay slot filling, the use of backtracking, out-of-order execution and the perils of negative latency; predicated execution support in the form of new engines to analyse source code, new annotations and advanced cost estimation; and value range analysis using an innovative extended algorithm which can be incorporated with the CoSy toolset and with users' proprietary back-end tools. The in-depth technical sessions continued for the specialists on the second day, spanning retargetable SIMD recognition, code generator feedback and applications of the polyhedral model.
Lars Gesellensetter of the Technical University of Berlin delighted the audience with his ideas on making compilers more intelligent. He outlined techniques his department had been researching to address the memory gap. “Over the last 10 years, CPU speed has increased dramatically, but memory speed has simply not kept up, making memory accesses very expensive,” he said. During compilation, data dependencies hinder optimisation, while alias analyses are too conservative and over-approximate, leading to 50% stalls at run time, he explained. The solution is to use speculative techniques which ignore unlikely dependencies. To achieve this, his department has been exploring artificial intelligence techniques, including heuristics and machine learning, to develop a memory dependence predictor. “Early results are most promising,” he concluded.
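The idea of speculating past unlikely dependencies can be illustrated with a small C sketch. It is not the Berlin group's predictor, nor anything taken from CoSy: the function names `update_speculative`, `update_conservative`, `update_independent` and `ranges_overlap` are hypothetical, and an exact run-time overlap check stands in for a learned prediction. It shows why a possible alias forces conservative code, and how a cheap check lets the common, non-aliasing case take a path the compiler is free to reorder.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: returns 1 if the n-element int ranges at a and b
   overlap. Comparing addresses as integers is implementation-defined,
   but behaves as expected on common flat-memory targets. */
static int ranges_overlap(const int *a, const int *b, size_t n)
{
    uintptr_t pa = (uintptr_t)a, pb = (uintptr_t)b;
    uintptr_t bytes = (uintptr_t)(n * sizeof(int));
    return pa < pb + bytes && pb < pa + bytes;
}

/* Conservative path: dst and src may alias, so a store to dst[i] might
   feed a later load from src, and the original order must be kept. */
static void update_conservative(int *dst, const int *src, size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * 2 + 1;
}

/* Fast path: restrict promises the ranges do not overlap, so the
   compiler may reorder or vectorise the loads and stores. */
static void update_independent(int *restrict dst, const int *restrict src,
                               size_t n)
{
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i] * 2 + 1;
}

/* Assume independence is the likely case, verify it cheaply, and fall
   back to the conservative loop only when the ranges really overlap. */
void update_speculative(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    if (!ranges_overlap(dst, src, n))
        update_independent(dst, src, n);
    else
        update_conservative(dst, src, n);
}
```

With separate arrays the fast path runs; with `dst = buf + 1` and `src = buf` the ranges overlap and the conservative loop preserves the store-to-load chain. A predictor that guesses rather than checks exactly would also need a way to recover from a wrong guess, which this sketch leaves out.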

C Stands for Compiler
Among the audience were several comparatively new commercial users of CoSy. “We have chosen the multicore route to keep the power budget down,” said a delegate from one telecom company. “Compilers are becoming a strategic issue as we must generate compilers that take full advantage of the target hardware. CoSy is definitely regarded as a productivity tool in this context,” he added. Another firm in the wireless sector has, until now, always developed applications software directly in assembly code. But a shift to more advanced products with increased functionality, incorporating DSP technology, has led it firmly into the C language domain. Several delegates agreed that the popular open source GCC compiler technology was neither effective nor efficient when it comes to multiprocessing and multicore targets. Increasingly, they are looking for new tools to address data and task parallelism, to deal with shared and distributed memory, and to support dynamic compilation for heterogeneous architectures. Generally, they believe, tools need to be compiler-centric and fully integrated.

Exploding the Myths
The strategic-level parallel session on parallelism served to explode a number of myths about multicore design and the need for parallelism to address systems development issues. A select group of strategic thinkers and compiler experts was led into discussion by Martijn de Lange, CEO of ACE.
“That multicore is new is the first misperception in this market,” he said. De Lange recalled the first multiprocessor-based system (the 'Naked-mini') that ACE worked on back in 1978; a few years later, the firm developed a Unix kernel for a heterogeneous shared memory multiprocessor workstation for graphics applications. By 1990, ACE had developed the parallel C compiler for the Transputer, and in 1995 it announced the first validated F90 High Performance Fortran compiler. “Don't forget the lessons learned,” de Lange emphasised. “In the software and parallel programming world, experience is everything. It is just as important to recall what you chose not to do, and why.”
Parallel programming is difficult, he continued, as the human brain is not good at thinking in parallel while writing sequentially. Although incoming 'killer' applications will prove the benefits of parallel programming, de Lange pointed out that existing code will not be rewritten. The key will be efficient re-compilation optimised for multicore hardware. Portability is another critical consideration. “Hardware life cycles are reducing while software life cycles are increasing. Today's hardware will not be tomorrow's best choice,” he added.
Tools are the key ingredients for successful parallel programming for multicore systems, de Lange continued. The tools must address parallelism at all levels, from instructions through loops, algorithms and applications programming to systems integration. “And they must be a coherent, integrated set of tools with the same notion of a parallel system to ensure portability,” he added.
Reiterating his earlier point that multicore silicon will be ready but not deployable due to the lack of tools, de Lange highlighted the huge gap between what is currently being invested in tool development and the level of investment required to meet forecast demand in the coming years. “There will be panic as people realise they have a powerful need for tools for parallel systems development,” he said. De Lange is equally concerned about the attitude of software development teams towards purchasing tools: their tool budgets are often an order of magnitude below the budgets for hardware development. “Open source solutions will never be developed swiftly enough, nor sufficiently to meet the quality standards demanded,” he predicted. “There will have to be a mindset change.”
It was pointed out that huge investments are being made by Intel, Microsoft, IBM and others, in conjunction with US universities, into training for parallel programming. But the resulting solutions are only going to suit certain proprietary architectures. On the question of finding EU funding, it was argued that such projects typically end at the proof-of-concept stage and it can take at least five times the investment to reach a viable, commercial product. “Software is a strange product,” de Lange explained. “The value of the investment will be close to zero for most of the way through the development process. Typically investors want to see that their projects are making progress.”

Portability and Parallelism
Meanwhile, ACE's partners, CoWare and Compaan, are not only committed to developing generic solutions to ensure portability, but also have the stamina to invest for the long term. This raised an interesting discussion on the different models in common use for parallelism, and on whether there is a need to bridge these models or whether a single model will ultimately prevail. The philosophers in the group extolled their ideals: 'parallelism is all about dataflow'; 'without locality and safe composability, it won't work'; 'determinism: do we want predictable performance or predictable results?'. The interchange continued with views on the necessity of dynamic compilation and the incompatibility of a dynamic approach with heterogeneous architectures, and on the benefits of sequential programming with message passing, multithreading and compensating for mixed-memory, multiprocessor architectures. Ultimately, the group concluded, we need to increase awareness of the importance of these issues and of the much more central role that compilers can play in exploiting parallelism. In the short term and the long term, whether the drive comes from the processor vendors or the software developers, the compiler is the bridge.
Evolution or Revolution
Customers attending the ACE CoSy Community Gathering were largely agreed that an interactive, integrated set of coherent tools supporting parallelism is essential. Some are already deploying multicore-based systems and, although they are not yet panicking, they are looking for programming tool solutions. The boundaries between high performance computing and embedded systems are blurring. OEMs want increased performance and functionality, but they want tools that are simple and elegant and, above all, easy to adopt into current design flows by both hardware and software teams. The message to ACE was clear: evolution in a radically changing environment. ACE's response is one of quiet optimism, as the firm repeats its urgent call for experts to join it from allied fields, including virtual platforms, simulation and debugging, among others. It will be interesting to see what progress has been made by the time of the next CoSy Community Gathering.