3rd CoSy Community Gathering Report

Multicore Drives Parallelism Towards Impending Crisis

Compiler Community Prepares for Parallel Programming Challenge

Amsterdam, 1 October 2008. Parallelism was the dominant thread of the third CoSy Community Gathering, organised by ACE Associated Compiler Experts in Amsterdam last week. The emergence of multicore technology and multiprocessor design has generated a huge upsurge in interest in parallel programming and the tools required to compile applications software to take best advantage of the new hardware platforms. World-class compiler development company ACE organised a two-day technical programme, plus a strategic-level discussion on parallelism, for customers, researchers and partners to debate the challenges facing the industry.

Introducing the technical programme, ACE CEO and founder Martijn de Lange set the scene with his predictions of an impending crisis. "Within three years the multicore and multiprocessor silicon to support the broad-based move into parallelism will be ready. But the tool chain needed to support applications software development will not," he said. "In short, we will have silicon but no systems."

A group of experts in the field from all over the world, comprising CoSy users and developers plus ACE partners from the commercial arena, research organisations and academia, gathered in Amsterdam to exchange ideas and share their knowledge. ACE, developer of the successful CoSy compiler development system, hosted the meeting to learn more about the technical challenges customers are facing as they move towards multicore technology and parallelism, the tools and solutions they are seeking and, importantly, the timeframe. CoSy users were attracted by the opportunity to learn first-hand from the expert developers how best to exploit the new features of the latest CoSy release, and to hear how independent researchers are utilising the advanced features of the compiler framework.

De Lange estimates that an investment of EUR 80m is needed to develop a consistent and coherent tool chain to enable and exploit parallelism. ACE is already making a significant contribution, and is facing an investment of EUR 20m over the next couple of years in compiler technology. "We are looking for partners to join us to develop complementary tools such as profilers, debuggers, visualisers and OS technology. This is a very large undertaking - a leap, not a gradual process. And we need to accelerate now," he concluded.

Compiler-Centric Solutions

Technical presentations from ACE experts on the first day focused on specific techniques to help optimise efficiency, make code more compact and support parallelism. Topics included: delay slot filling, the use of backtracking, out-of-order execution and the perils of negative latency; predicated execution support in the form of new engines to analyse source code, new annotations and advanced cost estimation; and value range analysis using an innovative extended algorithm which can be incorporated with the CoSy toolset and with users' proprietary back-end tools. The in-depth technical sessions continued for the specialists on the second day, spanning retargetable SIMD recognition, code generator feedback and applications of the polyhedral model.

Lars Gesellensetter of the Technical University of Berlin delighted the audience with his ideas on making compilers more intelligent. He outlined techniques his department has been researching to address the memory gap. "Over the last 10 years, CPU speed has increased dramatically, but memory speed has simply not kept up, making memory accesses very expensive," he said. During compilation, data dependencies hinder optimisation, while alias analyses are too conservative and over-approximate, leading to 50% stalls at run time, he explained. The solution is to use speculative techniques which ignore unlikely dependencies. To achieve this, his department has been exploring artificial intelligence techniques, including heuristics and machine learning, to develop a memory dependence predictor. "Early results are most promising," he concluded.
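To make the speculation idea concrete, the sketch below shows one simple way a history-based memory dependence predictor could work: speculate that a load and a store are independent until run-time observations say otherwise. This is purely an illustrative assumption on our part (the class name, the confidence-counter scheme and its thresholds are all invented here), not the design presented at the gathering.

```python
# Minimal sketch of a history-based memory dependence predictor.
# Assumption: each load/store instruction pair is tracked with a small
# confidence counter; speculation (e.g. reordering the load past the
# store) is allowed only while confidence in independence stays high.

class DependencePredictor:
    def __init__(self, threshold=2, max_count=3):
        self.counters = {}          # (load_id, store_id) -> confidence counter
        self.threshold = threshold  # minimum confidence needed to speculate
        self.max_count = max_count  # counter saturates here

    def may_speculate(self, load_id, store_id):
        # Unknown pairs start optimistic: assume the pair is independent.
        c = self.counters.get((load_id, store_id), self.max_count)
        return c >= self.threshold

    def observe(self, load_id, store_id, conflicted):
        # Update confidence from run-time feedback: penalise observed
        # conflicts heavily, reward conflict-free executions gently.
        key = (load_id, store_id)
        c = self.counters.get(key, self.max_count)
        if conflicted:
            c = max(0, c - 2)
        else:
            c = min(self.max_count, c + 1)
        self.counters[key] = c
```

In use, the predictor starts out speculating on every pair, backs off quickly for a pair that turns out to alias, and gradually re-enables speculation if the conflict stops recurring; a compiler-side variant would train such a predictor on profile data rather than updating it live.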

C Stands for Compiler

Among the audience were several comparatively new commercial users of CoSy. "We have chosen the multicore route to keep the power budget down," said a delegate from one telecom company. "Compilers are becoming a strategic issue as we must generate compilers that take full advantage of the target hardware. CoSy is definitely regarded as a productivity tool in this context," he added. Another firm, in the wireless sector, has until now always developed applications software directly in assembly code. But a shift to more advanced products with increased functionality, incorporating DSP technology, has led it firmly into the C language domain.

Several delegates agreed that the popular open source GCC compiler technology is neither effective nor efficient when it comes to multiprocessing and multicore targets. Increasingly, they are looking for new tools to address data and task parallelism, to deal with shared and distributed memory, and to support dynamic compilation for heterogeneous architectures. Generally, they believe, tools need to be compiler-centric and fully integrated.

Exploding the Myths

The strategic-level session on parallelism served to explode a number of myths about multicore design and the need for parallelism to address systems development issues. A select group of strategic thinkers and compiler experts was led into discussion by Martijn de Lange, CEO of ACE.

"That multicore is new is the first misperception in this market," he said. De Lange recalled the first multiprocessor-based system (the 'Naked-mini') that ACE worked on back in 1978; a few years later, the firm developed a Unix kernel for a heterogeneous shared-memory multiprocessor workstation for graphics applications. By 1990, ACE had developed the parallel C compiler for the Transputer, and in 1995 it announced the first validated F90 High Performance Fortran compiler. "Don't forget the lessons learned," de Lange emphasised. "In the software and parallel programming world, experience is everything. It is just as important to recall what you chose not to do, and why."
Parallel programming is difficult, he continued, as the human brain is not good at thinking in parallel while writing sequentially. Although incoming 'killer' applications will prove the benefits of parallel programming, de Lange pointed out that existing code will not be rewritten. The key will be efficient re-compilation optimised for multicore hardware. Portability is another critical consideration. "Hardware life cycles are reducing while software life cycles are increasing. Today's hardware will not be tomorrow's best choice," he added.

Tools are the key ingredients for successful parallel programming for multicore systems, de Lange continued. The tools must address parallelism at all levels, from instructions, through loops, algorithms and applications programming, to systems integration. "And they must be a coherent, integrated set of tools with the same notion of a parallel system to ensure portability," he added. Reiterating his earlier point that multicore silicon will be ready but not deployable due to the lack of tools, de Lange highlighted the huge gap between what is currently being invested in tool development and the level of investment required to meet forecast demand in the coming years. "There will be panic as people realise they have a powerful need for tools for parallel systems development," he said.

De Lange is equally concerned about attitudes among software development teams towards purchasing tools, with budgets often an order of magnitude below those for hardware development. "Open source solutions will never be developed swiftly enough, nor sufficiently to meet the quality standards demanded," he predicted. "There will have to be a mindset change." It was pointed out that huge investments are being made by Intel, Microsoft, IBM and others, in conjunction with US universities, in training for parallel programming. But the resulting solutions are only going to suit certain proprietary architectures.
On the question of finding EU funding, it was argued that such projects typically end at the proof-of-concept stage, and that it can take at least five times the initial investment to reach a viable commercial product. "Software is a strange product," de Lange explained. "The value of the investment will be close to zero for most of the way through the development process. Typically, investors want to see that their projects are making progress."

Portability and Parallelism

Meanwhile, ACE's partners CoWare and Compaan are not only committed to developing generic solutions to ensure portability, but also have the stamina to invest for the long term. This prompted an interesting discussion on the different models in common use for parallelism, and whether there is a need to bridge these models or whether a single model will ultimately prevail. The philosophers in the group extolled their ideals: 'parallelism is all about dataflow'; 'without locality and safe composability, it won't work'; 'determinism: do we want predictable performance or predictable results?'. The interchange continued with views on the necessity of dynamic compilation and the incompatibility of a dynamic approach with heterogeneous architectures, the benefits of sequential programming with message passing, and multithreading and compensating for mixed-memory, multiprocessor architectures. Ultimately, the group concluded, we need to increase awareness of the importance of these issues, and compilers can play a much more central role in exploiting parallelism. In the short term and the long term, whether the drive comes from the processor vendors or the software developers, the compiler is the bridge.

Evolution or Revolution

Customers attending the ACE CoSy Community Gathering largely agreed that an interactive, integrated set of coherent tools supporting parallelism is essential. Some are already deploying multicore-based systems, and although they are not yet panicking, they are looking for programming tool solutions. The boundaries between high-performance computing and embedded systems are blurring. OEMs want increased performance and functionality, but they want tools that are simple and elegant and, above all, easy to adopt into current design flows by both hardware and software teams. The message to ACE was clear - evolution in a radically changing environment. ACE's response is one of quiet optimism, as the firm repeats its urgent call for experts to join it from allied fields, including virtual platforms, simulation and debugging, among others. It will be interesting to see what progress has been made by the time of the next CCG.