Big Data: Technologies and algorithms to deal with challenges
Dpto. de Ciencias de la Computación e Inteligencia Artificial
Universidad de Granada
Slides of the talk
Big data applications are increasingly becoming the main focus of attention because of the enormous increase in data generation and storage that has taken place in recent years in science, business, ... This situation becomes a challenge when huge amounts of data must be processed to extract knowledge, because standard data mining techniques are not adapted to the new space and time requirements. We must consider new paradigms to develop scalable algorithms. In this talk we will briefly introduce the technologies that have emerged strongly in recent years (Hadoop ecosystem, Spark, …) and libraries such as Mahout, MLlib, … We will discuss some applications, and we will focus on the steps taken to create the learning algorithm that won the ECBDL'14 big data competition, which involved processing an extremely imbalanced big data bioinformatics problem.
About the Author
Francisco Herrera is a Professor in the Department of Computer Science and Artificial Intelligence at the University of Granada, Spain. He has been the supervisor of 36 Ph.D. students. He has published more than 290 journal papers (H-index 99) that have received more than 35000 citations (Google Scholar). He is co-author of the books "Genetic Fuzzy Systems" (World Scientific, 2001) and "Data Preprocessing in Data Mining" (Springer, 2015). He currently acts as Editor in Chief of the international journals "Information Fusion" (Elsevier) and "Progress in Artificial Intelligence" (Springer). He serves as an editorial board member of a dozen journals, among others: International Journal of Computational Intelligence Systems, IEEE Transactions on Fuzzy Systems, IEEE Transactions on Cybernetics, Information Sciences, Knowledge and Information Systems, Fuzzy Sets and Systems, Applied Intelligence, Knowledge-Based Systems, and Swarm and Evolutionary Computation. He is a Fellow of the European Coordinating Committee for Artificial Intelligence and the International Fuzzy Systems Association. He has received many awards and honors for his personal work and for his publications in journals and conferences. His areas of interest include, among others, data science, data preprocessing, cloud computing and big data.
Text Classification Using Novel “Anti-Bayesian” Techniques
Chancellor’s Professor; Fellow: IEEE; Fellow: IAPR
Carleton University. Ottawa
The problem of Text Classification (TC) has been studied for decades, and this problem is particularly interesting because the features are derived from syntactic or semantic indicators, while the classification, in and of itself, is based on statistical Pattern Recognition (PR) strategies. Thus, all the recorded TC schemes work using the fundamental paradigm that once the statistical features are inferred from the syntactic/semantic indicators, the classifiers themselves are the well-established ones such as the Bayesian, the Naïve Bayesian, the SVM, etc., and those that are neural or fuzzy. In this talk, we shall demonstrate that by virtue of the skewed distributions of the features, one could advantageously work with information latent in certain “non-central” quantiles (i.e., those distant from the mean) of the distributions. We, indeed, demonstrate that such classifiers exist and are attainable, and show that the design and implementation of such schemes work with the recently-introduced paradigm of Quantile Statistics (QS)-based classifiers. These classifiers, referred to as Classification by Moments of Quantile Statistics (CMQS), are essentially “Anti”-Bayesian in their modus operandi. As this is a Plenary/Keynote talk, we will concentrate on surveying the new “Anti”-Bayesian paradigm. We shall show that by using it, we can obtain optimal or near-optimal results by working with very few (sometimes as few as two) points distant from the mean.
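The idea of classifying with points distant from the mean can be illustrated with a minimal sketch. It is not the CMQS implementation from the talk: it assumes a single feature, two classes, and that the caller knows which class tends to have the smaller values. The function names, the choice p = 1/3 and the midpoint rule are assumptions made for illustration only. Each class is represented by one quantile on the side facing the other class, and a sample is assigned to the class whose quantile point is nearer (equivalently, by comparing it with the midpoint, as long as the two points do not cross).

```python
import numpy as np

def fit_quantile_points(low_class, high_class, p=1/3):
    """Return one reference point per class (illustrative assumption).

    low_class:  training values of the class with the smaller values.
    high_class: training values of the class with the larger values.
    The low class is represented by its (1 - p) quantile and the high
    class by its p quantile, i.e. by points away from each class mean.
    """
    low = np.asarray(low_class, dtype=float)
    high = np.asarray(high_class, dtype=float)
    q_low = float(np.quantile(low, 1.0 - p))
    q_high = float(np.quantile(high, p))
    return q_low, q_high

def classify(x, q_low, q_high):
    """Assign x to the class whose quantile point is nearer.

    Comparing x with the midpoint of the two points is the same rule
    whenever q_low <= q_high.
    """
    threshold = 0.5 * (q_low + q_high)
    return "low" if x < threshold else "high"

# Usage with hypothetical data: two well-separated classes.
q_low, q_high = fit_quantile_points(np.arange(0, 10), np.arange(10, 20))
print(classify(2.0, q_low, q_high), classify(17.0, q_low, q_high))
```

Only two numbers are stored per decision, which mirrors the claim above that very few points can be enough; how close such a rule comes to the Bayesian optimum depends on the distributions involved.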
About the Author
Dr. John Oommen was born in Coonoor, India on September 9, 1953. He obtained his B.Tech. degree from the Indian Institute of Technology, Madras, India in 1975. He obtained his M.E. from the Indian Institute of Science in Bangalore, India in 1977. He then went on for his M.S. and Ph.D., which he obtained from Purdue University, in West Lafayette, Indiana in 1979 and 1982 respectively. He joined the School of Computer Science at Carleton University in Ottawa, Canada, in the 1981-82 academic year. He is still at Carleton and holds the rank of Full Professor. Since July 2006, he has held the honorary rank of Chancellor's Professor, which is a lifetime award from Carleton University. His research interests include Automata Learning, Adaptive Data Structures, Statistical and Syntactic Pattern Recognition, Stochastic Algorithms and Partitioning Algorithms. He is the author of more than 410 refereed journal and conference publications, and is a Fellow of the IEEE and a Fellow of the IAPR. Dr. Oommen has also served on the Editorial Board of the IEEE Transactions on Systems, Man and Cybernetics, and Pattern Recognition.
Collective information processing in fish schools: from data to computational models
Centre de Recherches sur la Cognition Animale,
CNRS, UMR 5169, Université Paul Sabatier,
118 route de Narbonne, 31062 Toulouse, France
Swarms of insects, schools of fish and flocks of birds display an impressive variety of collective behaviors that emerge from local interactions among group members. These puzzling phenomena raise a variety of questions about the interaction rules that govern the coordination of individuals’ motions and the emergence of large-scale patterns. While numerous models have been proposed, there is still a strong need for detailed experimental studies to foster the biological understanding of such collective motion. I will present the methods that we used to characterize interactions among individuals and build models for animal group motion from data gathered at the individual scale. Using video tracks of fish shoals in a tank, we determined the stimulus/response function governing an individual’s moving decisions from an incremental analysis at the local scale. We found that both attraction and alignment interactions are present and act upon the fish turning speed, yielding a novel schooling model whose parameters are all estimated from data. We also found that the magnitude of these interactions changes as a function of the swimming speed of the fish and the group size. As a consequence, groups of fish adopt different shapes and motions: group polarization increases with swimming speed while it decreases as group size increases. The phase diagram of the model also revealed that the relative weights of attraction and alignment interactions play a key role in the emergent collective states at the school level. Of particular interest is the existence of a transition region in which the school exhibits multistability and intermittency between schooling and milling for the same combination of individual parameters. In this region the school becomes highly sensitive to any kind of perturbation that can affect the behavior of just a single fish.
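Group polarization, mentioned above, is commonly quantified as the length of the average unit heading vector of the group. The sketch below assumes this standard definition; the abstract does not state the exact measure used in the study, and the function name is hypothetical. A value near 1 means all fish point the same way (schooling), while a value near 0 means headings cancel out, as they do in a milling or disordered group.

```python
import numpy as np

def polarization(headings):
    """Length of the mean unit heading vector, in [0, 1].

    headings: heading angles of the individuals, in radians.
    """
    h = np.asarray(headings, dtype=float)
    # Average the unit vectors (cos h, sin h) and take the norm.
    return float(np.hypot(np.cos(h).mean(), np.sin(h).mean()))

# Usage with hypothetical headings.
aligned = [0.30, 0.30, 0.30, 0.30]               # everyone swims the same way
spread = [0.0, np.pi / 2, np.pi, 3 * np.pi / 2]  # headings cancel out
print(polarization(aligned), polarization(spread))
```

Polarization alone does not distinguish milling from a disordered group; a rotation measure based on positions as well as headings would be needed for that.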
About the Author
Guy Theraulaz is a senior research fellow at the National Center for Scientific Research (CNRS) and an expert in the study of collective animal behaviors. He is also a leading researcher in the field of swarm intelligence, primarily studying social insects but also distributed algorithms, e.g. for collective robotics, directly inspired by nature. His research focuses on the understanding of a broad spectrum of collective behaviors in animal societies by quantifying and then modeling the individual-level behaviors and interactions, thereby elucidating the mechanisms generating the emergent, group-level properties. He was one of the main figures in the development of quantitative social ethology and collective intelligence in France. He has published many papers on nest construction in ant and wasp colonies, collective decision-making in ants and cockroaches, and collective motion in fish schools and pedestrian crowds. He has also coauthored five books, among them Swarm Intelligence: From Natural to Artificial Systems (Oxford University Press, 1999) and Self-organization in Biological Systems (Princeton University Press, 2001), which are now considered reference textbooks.
Trading and Poker: Using computers to take intelligent decisions
To be a successful poker player and/or a trader in stock markets, it is necessary to acquire a specific set of skills. Although both poker and trading involve a high degree of chance, it is crucial to use statistical models to analyse, interpret and quantify patterns that repeat themselves. For example, in trading we analyse patterns related to price, volume, seasonal and sentiment trends, while in poker we study patterns related to betting, sequences, ranges on different boards and psychological tendencies such as ‘tells’. Therefore, the collection and generation of intelligence through complex software tools is a must. For example, we can use tools such as Tradestation, ProRealTime and Ninja Trader for trading, and Poker Tracker, Holdem Resources and Flopzilla in the world of poker.
The key to success in poker and trading is to gain a small edge over the rest of the players and be able to exploit it. Once we have developed our own method and strategy with a positive mathematical expectation, we need to put it into practice many times. Repeated over many trials, these small advantages make a profit highly probable, an important factor considering that the variance of results in these games is significant. In addition, proper management and assessment of risk is also fundamental to success.
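The interplay between a small edge, a large variance and the number of repetitions can be made concrete with a short calculation. The sketch below assumes that results are independent and identically distributed and uses the normal (central limit) approximation; the function name and the numbers in the usage example are hypothetical and are not taken from the talk.

```python
import math

def prob_ahead(edge, std_dev, n_trials):
    """Approximate probability of a positive total result after n_trials.

    edge:     expected profit per trial (the mathematical expectation).
    std_dev:  standard deviation of the result of a single trial.
    Assumes independent trials and a normal approximation of the total,
    whose mean is n * edge and standard deviation is sqrt(n) * std_dev.
    """
    z = edge * math.sqrt(n_trials) / std_dev
    # Standard normal cumulative distribution function evaluated at z.
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Hypothetical edge of 0.05 units per trial with a standard deviation of 1.
for n in (100, 1000, 10000):
    print(n, round(prob_ahead(0.05, 1.0, n), 4))
```

Because the expected total grows with n while its standard deviation grows only with the square root of n, the same small edge that is barely visible after a hundred trials becomes close to decisive after many thousands.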
One of the characteristics that make both poker and trading such fantastic games is that they reward the best players in the long term, while still giving almost any player a chance to succeed in the short term. In this talk I will review the main tools and methodologies used by players and traders and will illustrate, with examples, the differences and similarities.
About the Author
Jorge Ufano was born in La Coruña, Spain on January 5, 1982. He obtained his Economics degree from the San Pablo CEU University, Madrid, Spain in 2005. He obtained his Master's degree in Quantitative Finance from International Financial Analysts (AFI), Madrid, Spain in 2008. He is an investment and portfolio manager, and has also been a stock market teacher at Clasesdebolsa.com since 2007. Since 2010 he has been a professional poker player who has played major poker events around the world. His research interests include Financial Risk Analysis, Game Theory, Statistical Pattern Recognition and Algorithms applied to Poker and Trading.