
Polyhedral compilation for the masses

The PARKAS (INRIA and ENS/DI) and PEQUAN (UPMC/LIP6) teams, together with IRILL, are glad to invite you to a mini-workshop bringing together researchers in parallelizing compilation, polyhedral algorithms, scientific computing on GPUs, and formal methods for signal-processing and control systems.


Université Pierre et Marie Curie
Room 25-26/105 (first floor, enter through tower 26)

Tuesday October 21st
Mini-workshop:   9:30 - 12:15
Thesis defense: 14:15 - ?

For logistical purposes, please send an email before October 14
if you plan to attend the mini-workshop:

Preliminary program

09:30-09:35   Welcome

09:35-10:10 - P. Sadayappan, Ohio State U.
10:10-10:45 - Sven Verdoolaege, INRIA and K. U. Leuven

10:45-11:00   Break

11:00-11:35 - Franz Franchetti, Carnegie Mellon U.
11:35-12:10 - Issam Said, U. Pierre et Marie Curie

14:15-      - Thesis defense, Tobias Grosser, INRIA

Titles and abstracts

Speaker: P. Sadayappan, Ohio State U.
Title: On Other Uses for the Polyhedral Model Besides Performance

The use of the polyhedral model is now widely acknowledged as the
most effective approach to compiler optimization of affine loop
computations. However, a significant unsolved challenge is that of
effectively using the polyhedral model to optimize real application
codes, which are rarely fully affine and invariably contain
interleaved non-affine computations. In this talk, two uses of the
polyhedral model for other goals than loop transformations will be
discussed: 1) a compiler-assisted approach to detecting transient
memory errors, and 2) compiler-assistance for software cache
coherence. With both these uses, arbitrarily interleaved
affine/non-affine codes can be handled, where the polyhedral
analysis offers benefits for the affine code portions, along with a
conservative handling of the non-affine computations.

Speaker: Sven Verdoolaege, INRIA and K. U. Leuven
Title: Integer Set Coalescing for Polyhedral Compilation

Polyhedral compilation is widely used in high-level synthesis tools
and in production compilers such as gcc, LLVM and IBM/XL.  It is
based on the polyhedral model, a powerful abstraction for analyzing
and transforming (parts of) programs that are "sufficiently
regular".  The key feature of this model is that it is instance
based, allowing for a representation and treatment of individual
dynamic executions of a statement inside a loop nest and/or
individual array elements.  These individual elements are
represented by tuples of integers which are collected in sets and
relations described by affine constraints.  Polyhedral compilation
then mainly consists of an analysis and transformation of these sets
and relations.
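As a concrete (and deliberately tiny) illustration of this instance-based view, the iteration domain of a triangular loop nest can be written as the set { (i, j) : 0 <= i < n and 0 <= j <= i }. The sketch below, in plain Python rather than a polyhedral library, simply enumerates the integer tuples of that set; the function name is illustrative.

```python
def triangular_set(n):
    """All integer points (i, j) with 0 <= i < n and 0 <= j <= i,
    i.e. the instances of a statement inside a triangular loop nest."""
    return [(i, j) for i in range(n) for j in range(i + 1)]

# For n = 3 the set contains 1 + 2 + 3 = 6 points.
print(triangular_set(3))
```

A polyhedral tool never enumerates these points explicitly; it manipulates the affine constraints symbolically, so the set stays finite to describe even when n is a parameter.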

There are typically many ways to describe the same set of integer
tuples using affine constraints.  Most operations that can be
performed on these sets do not depend on which particular
representation is used, in the sense that only the time required for
the computation is affected by this choice but not the value of the
result.  There are however some operations where this choice can
have a significant impact, notably the computation of
(overapproximations of) transitive closures and the construction of
an AST for visiting each point in a set.  Assuming a representation
in disjunctive normal form, it is in all cases important to keep the
number of disjuncts as small as possible.  Excess disjuncts may
result from operations such as subtraction, lexicographic
optimization and integer projection.  Set coalescing tries to
combine several disjuncts into a single disjunct without affecting
the elements in the set.  We describe several cases where coalescing
can be applied.
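A minimal sketch of the idea, assuming one-dimensional sets whose disjuncts are integer intervals (coalescing in a real polyhedral library works on arbitrary affine constraints): two disjuncts can be merged exactly when their union is again a single interval, i.e. when they overlap or are adjacent.

```python
def coalesce(disjuncts):
    """Merge 1-D integer intervals (lo, hi), each playing the role of a
    disjunct, whenever their union is itself a single interval."""
    merged = []
    for lo, hi in sorted(disjuncts):
        if merged and lo <= merged[-1][1] + 1:   # adjacent or overlapping
            merged[-1] = (merged[-1][0], max(merged[-1][1], hi))
        else:
            merged.append((lo, hi))
    return merged

# {0..4} union {5..9} coalesces into the single disjunct {0..9};
# {0..2} and {5..9} cannot be combined without adding extra elements.
print(coalesce([(0, 4), (5, 9)]))
print(coalesce([(0, 2), (5, 9)]))
```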

Speaker: Franz Franchetti, Carnegie Mellon U.
Title: High Assurance Spiral: Co-Synthesizing Proof and Implementation
From High-Level Specification

In this talk we introduce "High Assurance SPIRAL" to solve the "last
mile" problem for the synthesis of high assurance implementations of
controllers for vehicular systems that are executed on today's and
future embedded and high-performance embedded
processors. High Assurance SPIRAL is a methodology to translate a
high level specification of a high assurance controller into a
highly resource-efficient, platform-adapted, verified control
software implementation for a given platform in a language like C or
C++. High Assurance SPIRAL proves that the implementation is
equivalent to the specification written in the control engineer's
domain language. Our approach supports problems involving
floating-point calculations and provides highly optimized
synthesized code.  At the core of High Assurance SPIRAL is the
Hybrid Control Operator Language (HCOL) that leverages advanced
mathematical constructs expressing the controller specification to
provide high quality translation capabilities.

Speaker: Issam Said, U. Pierre et Marie Curie
Title: Evaluating APU for high performance scientific computing

AMD APUs eliminate the PCI Express bus which bottlenecks many GPU
scientific applications. However, integrated GPUs in APUs are less
powerful than mainstream discrete GPUs, and while the upcoming APUs
will rely on a unified memory, the first APUs still have a distinct
GPU memory partition. Hence, it is worthwhile to investigate the
applications, as well as problem sizes, for which the GPU part of an
APU may outperform a discrete GPU. In this talk we assess the
relevance of the early generations of APUs for scientific HPC via
hardware and applicative OpenCL micro-benchmarks.  We present
detailed measurements of CPU-GPU data transfers and performance
tests of highly optimized 3D finite difference stencils.  We also
emphasize the importance of data placement when APUs are used.  Our
results show that integrated GPUs of upcoming APUs can outperform
discrete high-end GPUs for medium sized problems with high
communication requirements.

PhD candidate: Tobias Grosser, INRIA
Title: A Decoupled Approach to High-Level Loop Optimization,
tile shapes, polyhedral building blocks and low-level compilers

Despite decades of research on high-level loop optimizations and
their successful integration in production C/C++/FORTRAN compilers,
most compiler internal loop transformation systems only partially
address the challenges posed by the increased complexity and
diversity of today's hardware. Especially when exploiting domain
specific knowledge to obtain optimal code for complex targets such
as accelerators or many-core processors, many existing loop
optimization frameworks have difficulties exploiting this
hardware. As a result, new domain specific optimization schemes are
developed independently without taking advantage of existing loop
optimization technology. This results both in missed optimization
opportunities as well as low portability of these optimization
schemes to different compilers. One area where we see the need for
better optimizations are iterative stencil computations, an
important computational problem that is regularly optimized by
specialized, domain specific compilers, but where generating
efficient code is difficult.

In this work we present new domain specific optimization strategies
that enable the generation of high-performance GPU code for stencil
computations. Unlike most existing domain-specific
compilers, we decouple the high-level optimization
strategy from the low-level optimization and specialization
necessary to yield optimal performance. As high-level optimization
scheme we present a new formulation of split tiling, a tiling
technique that ensures reuse along the time dimension as well as
balanced coarse grained parallelism without the need for redundant
computations.  Using split tiling we show how to integrate a domain
specific optimization into a general purpose C-to-CUDA translator,
an approach that allows us to reuse existing non-domain specific
optimizations.  We then evolve split tiling into a hybrid
hexagonal/parallelogram tiling scheme that allows us to generate code
that better addresses GPU-specific concerns. To conclude our
work on tiling we investigate the relation between diamond and
hexagonal tiling.  Starting with a detailed analysis of diamond
tiling including the requirements it poses on tile sizes and
wavefront coefficients, we provide a unified formulation of
hexagonal and diamond tiling to perform hexagonal tiling for two
dimensional problems (one time, one space) in the context of a
general purpose optimizer such as Pluto. Finally, we use this
formulation to evaluate hexagonal and diamond tiling in terms of
compute-to-communication and compute-to-synchronization ratios.
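For orientation, the sketch below shows only classical rectangular tiling of the space loop of a one-dimensional Jacobi-style stencil. It is not the split or hexagonal schemes discussed above, which additionally tile the time dimension to obtain reuse, but it illustrates the basic loop restructuring involved.

```python
def step(a):
    """One untiled Jacobi-style relaxation step (boundaries kept fixed)."""
    out = a[:]
    for i in range(1, len(a) - 1):
        out[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return out

def step_tiled(a, tile=4):
    """The same step with the space loop split into tiles of size `tile`.
    The result is identical; only the iteration order changes, which is
    what gives tiling its locality and parallelism benefits."""
    out = a[:]
    n = len(a)
    for t0 in range(1, n - 1, tile):                 # loop over tiles
        for i in range(t0, min(t0 + tile, n - 1)):   # loop within a tile
            out[i] = (a[i - 1] + a[i] + a[i + 1]) / 3.0
    return out
```

Time tiling, the hard part addressed by split and hexagonal tiling, additionally fuses several applications of `step` into one tile while respecting the stencil's data dependences.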

In the second part of this work, we discuss our contributions to
important infrastructure components, our building blocks, that
enable us to decouple our high-level optimizations from both the
necessary code-generation optimizations and the compiler
infrastructure we apply the optimizations to. We start by
presenting a new polyhedral extractor that obtains a polyhedral
representation from a piece of C code, widening the supported C code
to exploit the full generality of Presburger arithmetic and taking
special care of modeling language semantics even in the presence of
defined integer wrapping. As a next step, we present a new
polyhedral AST generation approach, which extends AST generation
beyond classical control flow generation by allowing the generation
of user-provided mappings. Through a fine-grained option
mechanism, we give the user control over AST-generator
decisions and add extensive support for specialization, e.g., with a
new generalized form of polyhedral unrolling. To facilitate the
implementation of polyhedral transformations, we present a new
schedule representation, schedule trees, which proposes to make the
inherent tree structure of schedules explicit to simplify the work
with complex polyhedral schedules.

The last part of this work takes a look at our contributions to
low-level compilers. The main focus in this part is our work on
optimistic delinearization, an approach to derive a
multi-dimensional array view for multi-variate polynomial
expressions which commonly result from code that models data as
multi-dimensional arrays of parametric size.
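A toy version of the idea: a flat index expression such as i*n + j can be inverted into a two-dimensional view, but only under the assumption 0 <= j < n, which is exactly the kind of condition an "optimistic" approach assumes and must validate (or guard at run time). The function name below is illustrative.

```python
def delinearize(flat, n):
    """Recover (i, j) from flat = i*n + j, assuming 0 <= j < n.
    Without that assumption the inversion is not unique, e.g.
    flat = 17, n = 5 could also be i = 2, j = 7."""
    return divmod(flat, n)

i, j, n = 3, 2, 5
print(delinearize(i * n + j, n))   # recovers the pair (i, j)
```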

Livrets bleus du logiciel libre

IRILL, through its Logiciel Libre thematic group, contributed to the new "livrets bleus" (blue booklets).

The three published booklets address key topics championed by IRILL. They are available at the following addresses (PDF):

More information

Opam Weather Service now online!

IRILL starts a package weather service for Opam and the OCaml community.

IRILL is now providing a valuable service to the OCaml community, via a static analysis of the packages present in the Opam repository, based on technology developed in the Mancoosi project.

An overview of the OCaml packages that can or cannot be properly installed is now available, with detailed explanations of the reasons for the failures.

More detailed information can be found in this blog article.

Meetup Salt

For the third edition, IRILL is glad to host the Salt meetup. The event will take place Monday 19th, from 19:00 to 23:00.

SaltStack is an open source configuration management and remote execution application, similar to Chef or Puppet. More information:

Thomas S Hatch, the original author of Salt, will be present at this event.

More information on the Logilab blog.

This meetup is organized by our friends from Logilab.

SaltStack Paris Meetup

Logilab has set up the second meetup for salt users in Paris on Feb 6th, 2014 at IRILL, near Place d'Italie, starting at 18:00.

Here is the announcement in French

Please forward it to anyone who may be interested, noting that pizzas will be offered to refuel the chatters ;)

Conveniently placed a week after the Salt Conference, topics will include anything related to Salt and its uses: demos, new ideas, exchanges of Salt formulas, discussion of the SaltConf talks/videos, etc.

If you are interested in Salt, Python, and DevOps and will be in Paris at that time, we hope to see you there!

Analyzing the memory usage of OCaml applications without changing their behavior

Thursday January 30th, Cagdas Bozman (OCamlPro - ENSTA) will come to IRILL to talk about "Analyzing the memory usage of OCaml applications without changing their behavior"

We study the memory behavior of OCaml applications, with the goal of better understanding memory usage and of detecting problems such as memory leaks. In the talk, we will present a first tool that generates graphs of the memory of OCaml applications.

Debian Ruby Team 2014 meeting

IRILL hosts the Debian team in charge of Ruby.

On January 15-17, IRILL will host a meeting of the Debian team in charge of Ruby. Organized as a satellite event of the Paris MiniDebConf, this meeting will be focused on two main goals: the transition toward the removal of ruby1.8 from the Debian archive and quality assurance work around Ruby on Rails Debian packages.

QEMU/CPC: Static Analysis and CPS Conversion for Safe, Portable, and Efficient Coroutines

Thursday January 16th, Gabriel Kerneis (Computer Laboratory - University of Cambridge) will come to IRILL to talk about "QEMU/CPC: Static Analysis and CPS Conversion for Safe, Portable, and Efficient Coroutines"

Coroutines and events are two common abstractions for writing concurrent programs. Because coroutines are often more convenient, but events more portable and efficient, it is natural to want to translate the former into the latter. CPC is such a source-to-source translator for C programs, based on a partial conversion into continuation-passing style (CPS conversion) of functions annotated as cooperative.

In this article, we study the application of the CPC translator to QEMU, an open-source machine emulator which also uses annotated coroutine functions for concurrency. We first propose a new type of annotation to identify functions which never cooperate, and we introduce CoroCheck, a tool for the static analysis and inference of cooperation annotations. Then, we improve the CPC translator, defining CPS conversion as a calling convention for the C language, with support for indirect calls to CPS-converted functions through function pointers. Finally, we apply CoroCheck and CPC to QEMU (750,000 lines of C code), fixing hundreds of missing annotations and comparing the performance of the translated code with existing implementations of coroutines in QEMU.

Our work shows the importance of static annotation checking to prevent actual concurrency bugs, and demonstrates that CPS conversion is a flexible, portable, and efficient compilation technique, even for very large programs written in an imperative language.
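To make the notion of CPS conversion concrete, here is a minimal Python illustration (the CPC translator itself works on C): a direct-style function returns to its caller, while its CPS counterpart passes the result to an explicit continuation, which is the hook a cooperative scheduler can use to suspend and resume a computation.

```python
# Direct style: the function returns its result to the caller.
def add_direct(a, b):
    return a + b

# Continuation-passing style: instead of returning, the function hands its
# result to an explicit continuation `k`.  A cooperative runtime can capture
# such continuations to suspend a computation and resume it later, which is
# how a CPS-converted coroutine yields control.
def add_cps(a, b, k):
    k(a + b)

results = []
add_cps(1, 2, results.append)   # the continuation records the result
print(results)
```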

Ceylon Tour Paris 2014

January 24th, IRILL will host the Ceylon Tour Paris 2014. The Ceylon Project is a relatively new, high-level, statically and strongly typed programming language and SDK, created by Red Hat. It is based on the Java programming language.

It is a free, day-long Ceylon conference in Paris. We will have most of the Ceylon core team, as well as members of the Ceylon community, who will give short talks and a workshop. Whether you don't know Ceylon yet or want to learn more, this is the place to be. We will present many aspects of the Ceylon language and ecosystem, as well as discuss the future of Ceylon.

Naturally, the conference should be a place of exchange and discussions, so we expect as much value between the talks as during the presentations.


More information on the event

Ceylon on Wikipedia

More videos on IRILL website

IRILL is glad to publish three more event videos

Three more events recorded by IRILL are now available in video.