EvoBIO 2014 in Granada, Spain -- 23rd - 25th April, 2014

Detailed Programme

The European Conference on Evolutionary Computation, Machine Learning and Data Mining in Computational Biology

Join us in Granada for EvoBIO, a multidisciplinary conference that brings together researchers in Bioinformatics and Computational Biology who apply advanced techniques from Evolutionary Computation, Machine Learning, and Data Mining to address important problems in biology, from the molecular and genomic dimension to the individual and population level. EvoBIO engages researchers who are developing or applying advanced artificial intelligence or machine learning algorithms to cutting-edge problems in bioinformatics and computational biology. Each accepted paper will be presented orally or as a poster at the conference and will be printed in the proceedings published by Springer Verlag in the LNCS series. Possible topics of interest include:

EvoBIO and the other EvoSTAR conferences have significant thematic overlap; often the work presented at one conference is of interest to attendees of another. Selected EvoBIO papers may be considered for presentation in a joint session with other EvoSTAR conferences (EuroGP, EvoCOP, and EvoMUSART).


For more information, visit our web page at http://www.evostar.org, follow us on Twitter @EVOBio2014, join our evobio2014 group on LinkedIn and our EvoBIO group on Facebook, or email evobio.conference(at)gmail.com.

Submission Details

EvoBIO is interested in papers in three article types:

  1. Full research articles (maximum 12 pages) describing new methodologies, approaches, and/or applications (oral or poster presentation).
  2. Short reports (maximum 8 pages, poster presentation) describing new methodologies, approaches, and/or applications, and System Demonstrations (maximum 8 pages) outlining the nature of the system and describing why the demonstration is likely to be of interest for the conference.
  3. Abstracts (maximum 4 pages) discussing work previously published in a journal: it is therefore essential that a reference to the previous article is clearly cited in the abstract (oral or poster presentation).

Submissions must be original and not published elsewhere. They will be peer reviewed by at least three members of the program committee. Submit your manuscript in Springer LNCS format.

Submission link: http://myreview.csregistry.org/evobio14

The authors of accepted papers will have to improve their paper on the basis of the reviewers' comments and will be asked to send a camera ready version of their manuscripts. At least one author of each accepted work has to register for the conference, attend the conference and present the work.

IMPORTANT DATES

Submission deadline: 11 November 2013 (extended from 1 November 2013)
Notification: 06 January 2014
Camera ready: 01 February 2014
EvoBIO: 23-25 April 2014

EvoBIO Programme Chairs

Programme Committee

EvoBIO Programme

Wednesday 23 April

Wed 1120-1300 Session 1: Genomics
Chair: Federico Divina


What Do We Learn from Network-Based Analysis of Genome-Wide Association Data? Marzieh Ayati, Sinan Erten, Mehmet Koyuturk
Network-based analyses are commonly used as powerful tools to interpret the findings of genome-wide association studies (GWAS) in a functional context. In particular, identification of disease-associated functional modules, i.e., highly connected protein-protein interaction (PPI) subnetworks with high aggregate disease association, has been shown to be promising in uncovering the functional relationships among disease-associated genes and proteins. An important issue in this regard is the scoring of subnetworks by integrating two quantities that are not readily compatible: disease association of individual gene products and network connectivity among proteins. Current scoring schemes either disregard the level of connectivity and focus on the aggregate disease association of connected proteins or use a linear combination of these two quantities. However, such scoring schemes may produce arbitrarily large subnetworks which are often not statistically significant, or require tuning of parameters that are used to weigh the contributions of network connectivity and disease association. Here, we propose a parameter-free scoring scheme that aims to score subnetworks by assessing the disease association of pairwise interactions and incorporating the statistical significance of network connectivity and disease association. We test the proposed scoring scheme on a GWAS dataset for type II diabetes (T2D). Our results suggest that subnetworks identified by commonly used methods may fail tests of statistical significance after correction for multiple hypothesis testing. In contrast, the proposed scoring scheme yields highly significant subnetworks, which contain biologically relevant proteins that cannot be identified by analysis of genome-wide association data alone.

Benefits Of Accurate Imputations In GWAS Shefali Setia, Peggy Peissig, Deanna Cross, Carol Waudby, Murray Brilliant, Catherine McCarty, Marylyn Ritchie
Imputation methods have been suggested as an efficient way to increase both utility and coverage in genome-wide association studies, especially when combining data generated from different genotyping arrays. We aim to demonstrate that imputation results are extremely accurate and that association analysis of imputed data does not inflate the results. Instead, imputation leads to an increase in the power of the dataset without introducing any systematic biases. The majority of common variants can be imputed with very high accuracy (r² > 0.9), and we validated the accuracy of imputations by comparing actual genotypes from low-throughput genotyping assays against imputed genotypes. Imputation was performed using IMPUTE2 and the 1000 Genomes cosmopolitan reference panel, which results in about 38 million SNPs. After quality control and filtering we performed case-control associations with 3,159,556 markers. We show a comparison of results from genotyped and imputed data and also determine how accurately ancestry is determined by imputations.
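As an illustration of the accuracy measure quoted above, the following is a minimal sketch of a per-variant r² between directly assayed genotypes and imputed dosages. It assumes genotypes are coded as 0/1/2 allele counts and that imputed values are expected dosages; the function name and the sample values are hypothetical and are not taken from the paper.

```python
def imputation_r2(genotyped, imputed):
    """Squared Pearson correlation between observed genotypes (0/1/2 allele
    counts) and imputed dosages for a single variant."""
    if len(genotyped) != len(imputed) or len(genotyped) < 2:
        raise ValueError("need two equal-length sequences with at least 2 samples")
    n = len(genotyped)
    mean_g = sum(genotyped) / n
    mean_i = sum(imputed) / n
    # Sums of cross-products and squares around the means
    cov = sum((g - mean_g) * (i - mean_i) for g, i in zip(genotyped, imputed))
    var_g = sum((g - mean_g) ** 2 for g in genotyped)
    var_i = sum((i - mean_i) ** 2 for i in imputed)
    if var_g == 0 or var_i == 0:
        return float("nan")  # no variation in this sample: r^2 is undefined
    return cov * cov / (var_g * var_i)


# Hypothetical data: eight samples, assayed genotypes vs. imputed dosages
observed = [0, 1, 2, 1, 0, 2, 1, 0]
dosages = [0.1, 0.9, 1.8, 1.2, 0.0, 2.0, 1.0, 0.2]
r2 = imputation_r2(observed, dosages)
print(f"r^2 = {r2:.3f}")
```

A variant would pass the threshold mentioned in the abstract when this value exceeds 0.9.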

Genotype Correlation Analysis Reveals Pathway-Based Functional Disequilibrium and Potential Epistasis in the Human Interactome William Bush, Jonathan Haines
Epistasis is thought to be a pervasive part of complex phenotypes due to the dynamics and complexity of biological systems, and a further understanding of epistasis in the context of biological pathways may provide insight into the etiology of complex disease. In this study, we use genotype data from the International HapMap Project to characterize the functional dependencies between alleles in the human interactome as defined by KEGG pathways. We performed chi-square tests to identify non-independence between functionally-related SNP pairs within parental Caucasian and Yoruba samples. We further refine this list by testing for skewed transmission of pseudo-haplotypes to offspring using a haplotype-based TDT test. From these analyses, we identify pathways enriched for functional disequilibrium, and a set of 863 SNP pairs (representing 453 gene pairs) showing consistent non-independence and transmission distortion. These results represent gene pairs with strong evidence of epistasis within the context of a biological function.
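The non-independence test mentioned above can be sketched as a Pearson chi-square statistic on the contingency table of two SNPs' genotypes. This is only an illustrative sketch under the assumption that genotypes are coded 0/1/2; the function name and data are hypothetical, the exact table layout used by the authors is not stated in the abstract, and the TDT refinement step is not shown.

```python
from collections import Counter


def chi_square_independence(geno_a, geno_b):
    """Return (statistic, degrees_of_freedom) for a Pearson chi-square test
    of independence between two SNPs' genotypes across the same samples."""
    if len(geno_a) != len(geno_b) or not geno_a:
        raise ValueError("need two non-empty, equal-length genotype lists")
    n = len(geno_a)
    joint = Counter(zip(geno_a, geno_b))  # observed cell counts
    rows = Counter(geno_a)                # marginal counts for SNP A
    cols = Counter(geno_b)                # marginal counts for SNP B
    stat = 0.0
    for a, row_total in rows.items():
        for b, col_total in cols.items():
            expected = row_total * col_total / n
            observed = joint.get((a, b), 0)
            stat += (observed - expected) ** 2 / expected
    dof = (len(rows) - 1) * (len(cols) - 1)
    return stat, dof


# Hypothetical SNP pair whose genotypes always co-occur
snp_a = [0, 0, 1, 1, 2, 2]
snp_b = [0, 0, 1, 1, 2, 2]
stat, dof = chi_square_independence(snp_a, snp_b)
print(stat, dof)
```

A large statistic relative to the chi-square distribution with the returned degrees of freedom indicates non-independence; with many SNP pairs, a multiple-testing correction would also be needed.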

An Integrated Analysis Of Genome-Wide DNA Methylation And Genetic Variants Underlying Etoposide-Induced Cytotoxicity In European And African Populations Ruowang Li, Dokyoon Kim, Scott Dudek, Marylyn Ritchie
Genetic variations among individuals account for a large portion of variability in drug response. The underlying mechanism of the variability is still not known, but it is expected to comprise a wide range of genetic factors that interact and communicate with each other. Here, we present an integrated genome-wide approach to uncover the interactions among genetic factors that can explain some of the inter-individual variation in drug response. The International HapMap consortium generated genotyping data on human lymphoblastoid cell lines of European descent (Centre d'Etude du Polymorphisme Humain population, CEU) and African descent (Yoruba population, YRI). Using genome-wide analysis, Huang et al. identified SNPs that are associated with the response of these cell lines to etoposide, a chemotherapeutic drug. Using the same lymphoblastoid cell lines, Fraser et al. generated genome-wide methylation profiles for gene promoter regions. We evaluated associations between the candidate SNPs generated by Huang et al. and genome-wide methylation sites. The analysis identified a set of methylation sites that are associated with etoposide-related SNPs. Using the set of methylation sites and the candidate SNPs, we built an integrated model to explain the etoposide response observed in CEU and YRI cell lines. This integrated method can be extended to combine any number of genomics data types to explain many phenotypes of interest.

Wed 1430-1610 Session 2: Proteins and Proteomics
Chair: William Bush


Determining Positions Associated With Drug Resistance On HIV-1 Proteins: A Computational Approach Gonzalo Nápoles, Isel Grau, Ricardo Pérez-García, Rafael Bello
The computational modeling of HIV-1 proteins has become a useful framework for understanding the behavior of the virus (e.g. mutational patterns, replication process or resistance mechanism). For instance, predicting drug resistance from genotype means solving a complicated sequence classification problem. In such problems, proper feature selection can be essential to increase the classifier's performance. Several sequence positions that have been previously associated with resistance are known, although we believe that other positions could be discovered. More explicitly, we observed that using positions reported in the literature for the reverse transcriptase protein, the final decision system exhibited inconsistent mutations. However, finding a minimal subset of features characterizing the whole sequence involves a challenging combinatorial problem. This research proposes a model based on Variable Mesh Optimization and Rough Sets Theory for computing those sequence positions associated with resistance, leading to more consistent decision systems. Finally, our model is validated across eleven well-known reverse transcriptase inhibitors.
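To make the notion of an inconsistent decision system concrete, here is a minimal sketch in the rough-set sense: two sequences are inconsistent with respect to a set of positions if they carry the same residues at those positions but have different resistance labels. The sequences, labels, and function name are hypothetical, and the Variable Mesh Optimization search used by the authors to choose positions is not shown.

```python
from collections import defaultdict


def inconsistent_objects(sequences, labels, positions):
    """Return indices of sequences that share their residues at `positions`
    with at least one sequence carrying a different label."""
    labels_by_key = defaultdict(set)
    keys = []
    for seq, label in zip(sequences, labels):
        key = tuple(seq[p] for p in positions)  # projection onto chosen positions
        keys.append(key)
        labels_by_key[key].add(label)
    return [i for i, key in enumerate(keys) if len(labels_by_key[key]) > 1]


# Hypothetical toy sequences with resistance labels
seqs = ["MKVL", "MKIL", "MRVL", "MRVI"]
labs = ["resistant", "susceptible", "resistant", "susceptible"]

print(inconsistent_objects(seqs, labs, [0, 1]))  # positions that cannot separate the classes
print(inconsistent_objects(seqs, labs, [2, 3]))  # positions that separate them fully
```

A feature subset that yields an empty result keeps the decision system consistent, which is the property a position-selection method would try to preserve while keeping the subset small.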

GPMS: A Genetic Programming Based Approach to Multiple Alignment of Liquid Chromatography-Mass Spectrometry Data Soha Ahmed, Mengjie Zhang, Lifeng Peng
Alignment of samples from liquid chromatography-mass spectrometry (LC-MS) measurements has a significant role in the detection of biomarkers and in metabolomic studies. Machine drift causes differences between LC-MS measurements, and an accurate alignment of the shifts introduced to the same peptide or metabolite is needed. In this paper, we propose the use of genetic programming (GP) for multiple alignment of LC-MS data. The proposed approach consists of two main phases. The first phase is peak matching, where the peaks from different LC-MS maps (peak lists) are matched to allow the calculation of the retention time deviation. The second phase is to use GP for multiple alignment of the peak lists with respect to a reference. In this paper, GP is designed to perform multiple-output regression by using a special node in the tree which divides the output of the tree into multiple outputs. Finally, the peaks that show the maximum correlation after dewarping the retention times are selected to form a consensus aligned map. The proposed approach is tested on one proteomics and two metabolomics LC-MS datasets with different numbers of samples. The method is compared to several benchmark methods, and the results show that the proposed approach outperforms these methods in three fractions of the proteomics dataset and the metabolomics dataset with a larger number of maps. Moreover, the results on the rest of the datasets are highly competitive with the other methods.
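The first phase described above, peak matching followed by computation of the retention time deviation, might look roughly like the sketch below. The abstract does not state how peaks are matched, so the greedy nearest-retention-time rule, the tolerance parameters `mz_tol` and `rt_window`, and the peak values are all assumptions made for illustration; the GP regression phase is not shown.

```python
def match_peaks(reference, sample, mz_tol=0.01, rt_window=60.0):
    """Greedily pair each reference peak with the closest unused sample peak.

    Peaks are (mz, retention_time) tuples. A pair is allowed only if the m/z
    values differ by at most `mz_tol` and the retention times by at most
    `rt_window`. Returns a list of (reference_peak, sample_peak, rt_deviation).
    """
    matches = []
    used = set()
    for ref_mz, ref_rt in reference:
        best = None
        for idx, (mz, rt) in enumerate(sample):
            if idx in used:
                continue
            if abs(mz - ref_mz) <= mz_tol and abs(rt - ref_rt) <= rt_window:
                # Keep the candidate whose retention time is closest
                if best is None or abs(rt - ref_rt) < abs(sample[best][1] - ref_rt):
                    best = idx
        if best is not None:
            used.add(best)
            deviation = sample[best][1] - ref_rt
            matches.append(((ref_mz, ref_rt), sample[best], deviation))
    return matches


# Hypothetical peak lists: (m/z, retention time in seconds)
reference_map = [(500.25, 120.0), (622.31, 340.0), (710.40, 515.0)]
sample_map = [(500.251, 126.5), (622.312, 347.0), (800.10, 400.0)]

matches = match_peaks(reference_map, sample_map)
for ref_peak, sample_peak, deviation in matches:
    print(ref_peak, sample_peak, deviation)
```

The resulting deviations would then serve as the regression targets for an alignment model that maps sample retention times onto the reference.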

Wed 1745-1900 EvoBIO poster

Replication of SCN5A Associations with Electrocardiographic Traits in African Americans from Clinical and Epidemiologic Studies Janina Jeff, Kristin Brown-Gentry, Robert Goodloe, Marylyn Ritchie, Joshua Denny, Abel Kho, Loren Armstrong, Bob McClellan Jr, Ping Mayo, Melissa Allen, Hailing Jin, Niloufar B. Gillani, Nathalie Schnetz-Boutaud, Holli H. Dilks, Melissa A. Basford, Jennifer A. Pacheco, Gail P. Jarvik, Rex L. Chisholm, Dan M. Roden, M. Geoffrey Hayes, Dana C. Crawford
The Nav1.5 sodium channel α-subunit is the predominant α-subunit expressed in the heart and is associated with cardiac arrhythmias. We tested five previously identified SCN5A variants (rs7374138, rs7637849, rs7637849, rs7629265, and rs11129796) for an association with PR interval and QRS duration in two unique study populations: the Third National Health and Nutrition Examination Survey (NHANES III, n=552) accessed by the Epidemiologic Architecture for Genes Linked to Environment (EAGLE) as part of the Population Architecture using Genomics and Epidemiology (PAGE) I study, and a combined dataset (n=455) from two biobanks linked to electronic medical records from Vanderbilt University (BioVU) and Northwestern University (NUgene) as part of the electronic Medical Records & Genomics (eMERGE) network. A meta-analysis including all three study populations (n~4,000) suggests that eight SCN5A associations were significant for both QRS duration and PR interval (p<5.0E-3). There was little evidence for heterogeneity across the study populations for either trait. These results suggest that published SCN5A associations replicate across different study designs in a meta-analysis and represent an important first step in the use of multiple study designs for genetic studies and for the identification and characterization of genetic variants associated with ECG traits in African-descent populations.