EuroGP is the premier annual conference on Genetic Programming, the oldest and the only meeting worldwide devoted specifically to this branch of evolutionary computation. It is always a very enjoyable event attracting participants from all continents, offering excellent opportunities for networking, informal contact, exchange of ideas and discussions with fellow researchers, in a friendly and relaxed environment. High quality papers describing new original research are sought on topics strongly related to the evolution of computer programs, ranging from theoretical work to innovative applications. The conference will feature a mixture of oral presentations and poster sessions. In 2015, the EuroGP acceptance rate was 50% (33.3% for oral presentations).
Topics include but are not limited to:
A comprehensive bibliography of genetic programming literature and links to related material is accessible at the Genetic Programming Bibliography web page, part of the Collection of Computer Science Bibliographies maintained and managed by William Langdon, Steven Gustafson, and John Koza.
Title: Evolutionary Program Sketching
Authors: Iwo Błądek, Krzysztof Krawiec
Abstract: Program synthesis can be posed as a satisfiability problem and approached
with generic SAT solvers. Only short programs can be however synthesized in this
way. Program sketching by Solar-Lezama assumes that a human provides a partial
program (sketch), and that synthesis takes place only within the uncompleted parts of
that program. This allows synthesizing programs that are overall longer, while
maintaining manageable computational effort. In this paper, we propose Evolutionary
Program Sketching (EPS), in which the role of sketch provider is handed over to
genetic programming (GP). A GP algorithm evolves a population of partial programs,
which are being completed by a solver while evaluated. We consider several variants
of EPS, which vary in program terminals used for completion (constants, variables, or
both) and in the way the completion outcomes are propagated to future generations.
When applied to a range of benchmarks, EPS outperforms the conventional GP, also
when the latter is given similar time budget.
Title: Exploring Fitness and Edit Distance of Mutated Python Programs
Authors: Saemundur O. Haraldsson, John R. Woodward, Alexander I.E. Brownlee,
David Cairns
Abstract: Genetic Improvement (GI) is the process of using computational search
techniques to improve existing software e.g. in terms of execution time, power
consumption or correctness. As in most heuristic search algorithms, the search is
guided by fitness with GI searching the space of program variants of the original
software. The relationship between the program space and fitness is seldom simple
and often quite difficult to analyse. This paper makes a preliminary analysis of GI's
fitness distance measure on program repair with three small Python programs. Each
program undergoes incremental mutations while the change in fitness as measured by
proportion of tests passed is monitored. We conclude that the fitnesses of these
programs often does not change with single mutations and we also confirm the
inherent discreteness of bug fixing fitness functions. Although our findings cannot be
assumed to be general for other software they provide us with interesting directions
for further investigation.
Title: Differentiable Genetic Programming
Authors: Dario Izzo, Francesco Biscani, Alessio Mereta
Abstract: We introduce the use of high order automatic differentiation, implemented
via the algebra of truncated Taylor polynomials, in genetic programming. Using the
Cartesian Genetic Programming encoding we obtain a high-order Taylor
representation of the program output that is then used to back-propagate errors during
learning. The resulting machine learning framework is called differentiable Cartesian
Genetic Programming (dCGP). In the context of symbolic regression, dCGP offers a
new approach to the long unsolved problem of constant representation in GP
expressions. On several problems of increasing complexity we find that dCGP is able
to find the exact form of the symbolic expression as well as the constants values. We
also demonstrate the use of dCGP to solve a large class of differential equations and
to find prime integrals of dynamical systems, presenting, in both cases, results that
confirm the efficacy of our approach.
Title: Evolving Game State Features from Raw Pixels
Authors: Baozhu Jia, Marc Ebner
Abstract: General video game playing is the art of designing artificial intelligence
programs that are capable of playing different video games with little domain
knowledge. One of the great challenges is how to capture game state features from
different video games in a general way. The main contribution of this paper is to apply
genetic programming to evolve game state features from raw pixels. A voting method
is implemented to determine the actions of the game agent. Three different video
games are used to evaluate the effectiveness of the algorithm: Missile Command,
Frogger, and Space Invaders. The results show that genetic programming is able to
find useful game state features for all three games.
Title: Emergent Tangled Graph Representations for Atari Game Playing Agents
Authors: Stephen Kelly, Malcolm I. Heywood
Abstract: Organizing code into coherent programs and relating different programs to
each other represents an underlying requirement for scaling genetic programming to
more difficult task domains. Assuming a model in which policies are defined by teams
of programs, in which team and program are represented using independent
populations and coevolved, has previously been shown to support the development of
variable sized teams. In this work, we generalize the approach to provide a complete
framework for organizing multiple teams into arbitrarily deep/wide structures through a
process of continuous evolution; hereafter the Tangled Program Graph (TPG).
Benchmarking is conducted using a subset of 20 games from the Arcade Learning
Environment (ALE), an Atari 2600 video game emulator. The games considered here
correspond to those in which deep learning was unable to reach a threshold of play
consistent with that of a human. Information provided to the learning agent is limited to
that which a human would experience. That is, screen capture sensory input, Atari
joystick actions, and game score. The performance of the proposed approach
exceeds that of deep learning in 15 of the 20 games, with 7 of the 15 also exceeding
that associated with a human level of competence. Moreover, in contrast to solutions
from deep learning, solutions discovered by TPG are also very `sparse'. Rather than
assuming thatall of the state space contributes to every decision, each action in TPG
is resolved following execution of a subset of an individual's graph. This results in
significantly lower computational requirements for model building than presently the
case for deep learning.
Title: A General Feature Engineering Wrapper for Machine Learning Using ε-Lexicase
Survival
Authors: William La Cava, Jason Moore
Abstract: We propose a general wrapper for feature learning that interfaces with other
machine learning methods to compose effective data representations. The proposed
feature engineering wrapper (FEW) uses genetic programming to represent and
evolve individual features tailored to the machine learning method with which it is
paired. In order to maintain feature diversity, ε-lexicase survival is introduced, a
method based on ε-lexicase selection. This survival method preserves semantically
unique individuals in the population based on their ability to solve difficult subsets of
training cases, thereby yielding a population of uncorrelated features. We
demonstrate FEW with five different off-the-shelf machine learning methods and test it
on a set of real-world and synthetic regression problems with dimensions varying
across three orders of magnitude. The results show that FEW is able to improve
model test predictions across problems for several ML methods. We discuss and test
the scalability of FEW in comparison to other feature composition strategies, most
notably polynomial feature expansion.
Title: Visualising the Search Landscape of the Triangle Program
Authors: William B. Langdon, Nadarajen Veerapen, Gabriela Ochoa
Abstract: High order mutation analysis of a software engineering benchmark,
including schema and local optima networks, suggests program improvements may
not be as hard to find as is often assumed. 1) Bit-wise genetic building blocks are not
deceptive and can lead to all global optima. 2) There are many neutral networks,
plateaux and local optima, nevertheless in most cases near the human written C
source code there are hill climbing routes including neutral moves to solutions.
Title: RANSAC-GP: Dealing with Outliers in Symbolic Regression with Genetic
Programming
Authors: Uriel López, Leonardo Trujillo, Yuliana Martinez, Pierrick Legrand, Enrique
Naredo, Sara Silva
Abstract: Genetic programming (GP) has been shown to be a powerful tool for
automatic modeling and program induction. It is often used to solve difficult symbolic
regression tasks, with many examples in real-world domains. However, the
robustness of GP-based approaches has not been substantially studied. In particular,
the present work deals with the issue of outliers, data in the training set that represent
severe errors in the measuring process. In general, a datum is considered an outlier
when it sharply deviates from the true behavior of the system of interest. GP
practitioners know that such data points usually bias the search and produce
inaccurate models. Therefore, this work presents a hybrid methodology based on the
RAndom SAmpling Consensus (RANSAC) algorithm and GP, which we call RANSAC-
GP. RANSAC is an approach to deal with outliers in parameter estimation problems,
widely used in computer vision and related fields. On the other hand, this work
presents the first application of RANSAC to symbolic regression with GP, with
impressive results. The proposed algorithm is able to deal with extreme amounts of
contamination in the training set, evolving highly accurate models even when the
amount of outliers reaches 90%.
Title: Symbolic Regression on Network Properties
Authors: Marcus Märtens, Fernando Kuipers, Piet Van Mieghem
Abstract: Networks are continuously growing in complexity, which creates challenges
for determining their most important characteristics. While analytical bounds are often
too conservative, the computational effort of algorithmic approaches does not scale
well with network size. This work uses Cartesian Genetic Programming for symbolic
regression to evolve mathematical equations that relate network properties directly to
the eigenvalues of network adjacency and Laplacian matrices. In particular, we show
that these eigenvalues are powerful features to evolve approximate equations for the
network diameter and the isoperimetric number, which are hard to compute
algorithmically. Our experiments indicate a good performance of the evolved
equations for several real-world networks and we demonstrate how the generalization
power can be influenced by the selection of training networks and feature sets.
Title: Evolving Time-Invariant Dispatching Rules in Job Shop Scheduling with Genetic
Programming
Authors: Yi Mei, Su Nguyen, Mengjie Zhang
Abstract: Genetic Programming (GP) has achieved success in evolving dispatching
rules for job shop scheduling problems, particularly in dynamic environments.
However, there is still great potential to improve the performance of GP. One
challenge that is yet to be addressed is the huge search space. In this paper, we
propose a simple yet effective approach to improve the effectiveness and efficiency of
GP. The new approach is based on a newly defined time-invariance property of
dispatching rules, which is derived from the idea of translational invariance from
machine learning. Then, we develop a new terminal selection scheme to guarantee
the time-invariance throughout the GP process. The experimental studies show that
by considering the time-invariance, GP can achieve much better rules in a much
shorter time.
Title: Strategies for Improving the Distribution of Random Function Outputs in GSGP
Authors: Luiz Otavio V. B. Oliveira, Felipe Casadei, Gisele Pappa
Abstract: In the last years, different approaches have been proposed to introduce
semantic information to genetic programming. In particular, the geometric semantic
genetic programming (GSGP) and the interesting properties of its evolutionary
operators have gotten the attention of the community. This paper is interested in the
use of GSGP to solve symbolic regression problems, where semantics is defined by
the output set generated by a given individual when applied to the training cases. In
this scenario, both mutation and crossover operators defined with fitness function
based on Manhattan distance use randomly built functions to generate offspring.
However, the outputs of these random functions are not guaranteed to be uniformly
distributed in the semantic space, as the functions are generated considering the
syntactic space. We hypothesize that the non-uniformity of the semantics of these
functions may bias the search, and propose three different standard normalization
techniques to improve the distribution of the outputs of these random functions over
the semantic space. The results are compared with a popular strategy that uses a
logistic function as a wrapper to the outputs, and show that the strategies tested can
improve the results of the previous method. The experimental analysis also indicates
that a more uniform distribution of the semantics of these functions does not
necessarily imply in better results in terms of test error.
Title: Synthesis of Mathematical Programming Constraints with Genetic Programming
Authors: Tomasz P. Pawlak, Krzysztof Krawiec
Abstract: We identify a novel application of Genetic Programming to automatic
synthesis of mathematical programming (MP) models for business processes. Given a
set of examples of states of a business process, the proposed Genetic Constraint
Synthesis (GenetiCS) method constructs well-formed constraints for an MP model.
The form of synthesized constraints (e.g., linear or polynomial) can be chosen
accordingly to the nature of the process and the desired type of MP problem. In
experimental part, we verify syntactic and semantic fidelity of the synthesized models
to the actual benchmark models of varying complexity. The obtained symbolic models
of constraints can be combined with an objective function of choice, fed into an off-
shelf MP solver, and optimized.
Title: Grammatical Evolution of Robust Controller Structures using Wilson Scoring
and Criticality Ranking
Authors: Elias Reichensdörfer, Dirk Odenthal, Dirk Wollherr
Abstract: In process control it is essential that disturbances and parameter
uncertainties do not affect the process in a negative way. Simultaneously optimizing
an objective function for different scenarios can be solved in theory by evaluating
candidate solutions on all scenarios. This is not feasible in real-world applications,
where the scenario space often forms a continuum. A traditional approach is to
approximate this evaluation using Monte Carlo sampling. To overcome the difficulty of
choosing an appropriate sampling count and to reduce evaluations of low-quality
solutions, a novel approach using Wilson scoring and criticality ranking within a
grammatical evolution framework is presented. A nonlinear spring mass system is
considered as benchmark example from robust control. The method is tested against
Monte Carlo sampling and the results are compared to a backstepping controller. It is
shown that the method is capable of outperforming state of the art methods.
Title: Using Feature Clustering for GP-Based Feature Construction on High-
Dimensional Data
Authors: Binh Tran, Bing Xue, Mengjie Zhang
Abstract: Feature construction is a pre-processing technique to create new features
with better discriminating ability from the original features. Genetic programming (GP)
has been shown to be a prominent technique for this task. However, applying GP to
high-dimensional data is still challenging due to the large search space. Feature
clustering groups similar features into clusters, which can be used for dimensionality
reduction by choosing representative features from each cluster to form the feature
subset. Feature clustering has been shown promising in feature selection; but has not
been investigated in feature construction for classification. This paper presents the
first work of utilising feature clustering in this area. We propose a cluster-based GP
feature construction method called CGPFC which uses feature clustering to improve
the performance of GP for feature construction on high-dimensional data. Results on
eight high-dimensional datasets with varying difficulties show that the CGPFC
constructed features perform better than the original full feature set and features
constructed by the standard GP constructor based on the whole feature set.
Title: Geometric Semantic Crossover with an Angle-aware Mating Scheme in Genetic
Programming for Symbolic Regression
Authors: Qi Chen, Bing Xue, Yi Mei, Mengjie Zhang
Abstract: Recent research shows that incorporating semantic knowledge into the
genetic programming (GP) evolutionary process can improve its performance. This
work proposes an angle-aware mating scheme for geometric semantic crossover in
GP for symbolic regression. The angle-awareness guides the crossover operating on
parents which have a large angle between their relative semantics to the target
semantics. The proposed idea of angle-awareness has been incorporated into one
state-of-the-art geometric crossover, the locally geometric semantic crossover. The
experimental results show that, compared with locally geometric semantic crossover
and the regular GP crossover,
the locally geometric crossover with angle-awareness not only has a significantly
better learning performance but also has a notable generalisation gain on unseen test
data. Further analysis has been conducted to see the difference between the angle
distribution of crossovers with and without angle-awareness, which confirms that the
angle-awareness changes the original distribution of angles by decreasing the number
of parents with zero degree while increasing their counterparts with large angles,
leading to better performance.
Title: RECIPE: A Grammar-based Framework for Automatically Evolving
Classification Pipelines
Authors: Alex G. C. de Sá, Walter José G. S. Pinto, Luiz Otavio V. B. Oliveira, Gisele
Pappa
Abstract: Automatic Machine Learning is a growing area of machine learning that has
a similar objective to the area of hyper-heuristics: to automatically recommend
optimized pipelines, algorithms or appropriate parameters to specific tasks without
much dependency on user knowledge. The background knowledge required to solve
the task at hand is actually embedded into a search mechanism that builds
personalized solutions to the task. Following this idea, this paper proposes RECIPE
(REsilient ClassifIcation Pipeline Evolution), a framework based on grammar-based
genetic programming that builds customized classification pipelines. The framework is
flexible enough to receive different grammars and can be easily extended to other
machine learning tasks. RECIPE overcomes the drawbacks of previous evolutionary-
based frameworks, such as generating invalid individuals, and organizes a high
number of possible suitable data pre-processing and classification methods into a
grammar. Results of f-measure obtained by RECIPE are compared to those two state-
of-the-art methods, and shown to be as good as or better than those previously
reported in the literature. RECIPE represents a first step towards a complete
framework for dealing with different machine learning tasks with the minimum required
human intervention.
Title: A Grammar Design Pattern for Arbitrary Program Synthesis Problems in Genetic
Programming
Authors: Stefan Forstenlechner, David Fagan, Miguel Nicolau, Michael O'Neill
Abstract: Grammar Guided Genetic Programming has been applied to many problem
domains. It is well suited to tackle program synthesis, as it has the capability to evolve
code in arbitrary languages. Nevertheless, grammars designed to evolve code have
always been tailored to specific problems resulting in bespoke grammars, which
makes them difficult to reuse. In this study a more general approach to grammar
design in the program synthesis domain is presented. The approach undertaken is to
create a grammar for each data type of a language and combine these grammars for
the problem at hand, without having to tailor a grammar for every single problem. The
approach can be applied to arbitrary problem instances of program synthesis and can
be used with any programming language. The approach is also extensible to use
libraries available in a given language. The grammars presented can be applied to
any grammar-based Genetic Programming approach and make it easy for researches
to rerun experiments or test new problems. The approach is tested on a suite of
benchmark problems and compared to PushGP, as it is the only GP system that has
presented results on a wide range of benchmark problems. The object of this study is
to match or outperform PushGP on these problems without tuning grammars to solve
each specific problem.
Title: Improving the Tartarus Problem as a Benchmark in Genetic Programming
Authors: Thomas D. Griffiths, Anikó Ekárt
Abstract: For empirical research on computer algorithms, it is essential to have a set
of benchmark problems on which the relative performance of different methods and
their applicability can be assessed. In the majority of computational research fields
there are established sets of benchmark problems; however, the field of genetic
programming lacks a similarly rigorously defined set of benchmarks. There is a strong
interest within the genetic programming community to develop a suite of benchmarks.
Following recent surveys, the desirable characteristics of a benchmark problem are
now better defined. In this paper the Tartarus problem is proposed as a tunably
difficult benchmark problem for use in Genetic Programming. The justification for this
proposal is presented, together with guidance on its usage as a benchmark.
Title: A New Subgraph Crossover for Cartesian Genetic Programming
Authors: Roman Kalkreuth, Günter Rudolph, Andre Droschinsky
Abstract: While tree-based Genetic Programming is often used with crossover,
Cartesian Genetic Programming is mostly used only with mutation as genetic
operator. In this paper, a new crossover technique is introduced which recombines
subgraphs of two selected graphs. Experiments on symbolic regression, boolean
functions and image operator design problems indicate that the use of the subgraph
crossover improves the search performance of Cartesian Genetic Programming. A
preliminary comparison to a former proposed crossover technique indicates that the
subgraph crossover performs better on our tested problems.
Title: A Comparative Study of Different Grammar-based Genetic Programming
Approaches
Authors: Nuno Lourenço, Joaquim Ferrer, Francisco B. Pereira, Ernesto Costa
Abstract: Grammars are useful formalisms to specify constraints, and not
surprisingly, they have attracted the attention of Evolutionary Computation (EC)
researchers to enforce problem restrictions. Context-Free-Grammar GP (CFG-GP)
established the foundations for the application of grammars in Genetic Programming
(GP), whilst Grammatical Evolution (GE) popularised the use of these approaches,
becoming one of the most used GP variants. However, studies have shown that GE
suffers from issues that have impact on its performance. To minimise these issues,
several extensions have been proposed, which made the distinction between GE and
CFG-GP less noticeable. Another direction was followed by Structured Grammatical
Evolution (SGE) that maintains the separation between genotype and phenotype from
GE, but overcomes most of its issues. Our goal is to perform a comparative study
between CFG-GP, GE and SGE to examine their relative performance. The results
show that in most of the selected benchmarks, CFG-GP and SGE have a similar
performance, showing that SGE is a good alternative to GE.
Title: A Comparative Analysis of Dynamic Locality and Redundancy in Grammatical
Evolution
Authors:Eric Medvet
Abstract: The most salient feature of Grammatical Evolution (GE) is a procedure
which maps genotypes to phenotypes using the grammar production rules; however,
the search effectiveness of GE may be affected by low locality and high redundancy,
which can prevent GE to comply with the basic principle that offspring should inherit
some traits from their parents. Indeed, many studies previously investigated the
locality and redundancy of GE as originally proposed in [1]. In this paper, we extend
those results by considering redundancy and locality during the evolution, rather than
statically, hence trying to understand if and how they are influenced by the selective
pressure determined by the fitness. Moreover, we consider not only the original GE
formulation, but three other variants proposed later (BGE, piGE, and SGE). We
experimentally find that there is an interaction between locality/redundancy and other
evolution-related measures, namely diversity and growth of individual size. In
particular, the combined action of the crossover operator and the genotype-phenotype
mapper makes SGE less redundant at the beginning of the evolution, but with very
high redundancy after some generations, due to the low phenotype diversity.
Title: On Evolutionary Approximation of Sigmoid Function for HW/SW Embedded
Systems
Authors: Milos Minarik, Lukas Sekanina
Abstract: Providing machine learning capabilities on low cost electronic devices is a
challenging goal especially in the context of the Internet of Things paradigm. In order
to deliver high performance machine intelligence on low power devices, suitable
hardware accelerators have to be introduced. In this paper, we developed a method
enabling to evolve a hardware implementation together with a corresponding software
controller for key components of smart embedded systems. The proposed approach is
based on a multi-objective design space exploration conducted by means of extended
linear genetic programming. The approach was evaluated in the task of approximate
sigmoid function design which is an important component of hardware
implementations of neural networks. During these experiments, we automatically rediscovered some approximate sigmoid functions known from the literature. The
method was implemented as an extension of an existing platform supporting
concurrent evolution of hardware and software of embedded systems.
Accepted papers will be presented orally or as posters at the Conference and will be printed in the proceedings published by Springer Verlag in the Lecture Notes in Computer Science (LNCS) series.
Previous editions of EuroGP were published in the following Springer Verlag LNCS volumes
The papers which receive the best reviews will be nominated for the Best Paper Award.
Submissions must be original and not published elsewhere. They will be peer reviewed by at least three members of the program committee. The reviewing process will be double-blind, so please omit information about the authors in the submitted paper. Submit your manuscript in Springer LNCS format.
Page limit: 16 pages (this number has been increased for all conferences in Evostar 2017)
The authors of accepted papers will have to improve their paper on the basis
of the reviewers' comments and will be asked to send a camera ready version
of their manuscripts. At least one author of each accepted work has to register
for the conference, attend the conference and present the work.
Submission link: https://myreview.saclay.inria.fr/eurogp17