Machine Learning
Similarity of Neural Network Representations Revisited (1905.00414v1)
Simon Kornblith, Mohammad Norouzi, Honglak Lee, Geoffrey Hinton
2019-05-01
Recent work has sought to understand the behavior of neural networks by comparing representations between layers and between different trained models. We examine methods for comparing neural network representations based on canonical correlation analysis (CCA). We show that CCA belongs to a family of statistics for measuring multivariate similarity, but that neither CCA nor any other statistic that is invariant to invertible linear transformation can measure meaningful similarities between representations of higher dimension than the number of data points. We introduce a similarity index that measures the relationship between representational similarity matrices and does not suffer from this limitation. This similarity index is equivalent to centered kernel alignment (CKA) and is also closely connected to CCA. Unlike CCA, CKA can reliably identify correspondences between representations in networks trained from different initializations.
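A minimal sketch of the linear-kernel form of CKA, assuming the commonly used definition ||Y^T X||_F^2 / (||X^T X||_F ||Y^T Y||_F) computed on column-centered activation matrices. The abstract itself does not state a formula, and the function names below are illustrative rather than taken from the authors' code. Rows are examples and columns are features (neurons).

```python
import math


def center_columns(X):
    """Subtract each column's mean so every feature has zero mean."""
    n = len(X)
    means = [sum(col) / n for col in zip(*X)]
    return [[v - m for v, m in zip(row, means)] for row in X]


def cross_frobenius_sq(A, B):
    """Squared Frobenius norm of A^T B for matrices with the same number of rows."""
    total = 0.0
    for col_a in zip(*A):
        for col_b in zip(*B):
            dot = sum(a * b for a, b in zip(col_a, col_b))
            total += dot * dot
    return total


def linear_cka(X, Y):
    """Linear CKA between two representations of the same examples.

    Assumes neither representation is constant across examples,
    otherwise the denominator is zero.
    """
    Xc, Yc = center_columns(X), center_columns(Y)
    numerator = cross_frobenius_sq(Yc, Xc)
    denominator = math.sqrt(cross_frobenius_sq(Xc, Xc) * cross_frobenius_sq(Yc, Yc))
    return numerator / denominator


# Hypothetical activations: 4 examples, 2 features each.
X = [[1.0, 2.0], [3.0, 1.0], [0.0, 5.0], [4.0, 4.0]]
# Y is X with its columns swapped and scaled by 2 (an orthogonal map plus isotropic scaling).
Y = [[4.0, 2.0], [2.0, 6.0], [10.0, 0.0], [8.0, 8.0]]
score = linear_cka(X, Y)
```

Because Y differs from X only by a column permutation and a uniform scale, the score should equal 1 up to floating-point error. The two representations may have different numbers of features, as long as they describe the same examples in the same order.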
Dynamic Prediction of Origin-Destination Flows Using Fusion Line Graph Convolutional Networks (1905.00406v1)
Xi Xiong, Kaan Ozbay, Li Jin, Chen Feng
2019-05-01
Modern intelligent transportation systems provide data that allow real-time demand prediction, which is essential for planning and operations. The main challenge in predicting Origin-Destination (O-D) flow matrices is that demands cannot be directly measured by traffic sensors; instead, they have to be inferred from aggregate traffic flow data on traffic links. Specifically, spatial correlation, congestion and time-dependent factors need to be considered in general transportation networks. In this paper we propose a novel O-D prediction framework based on Fusion Line Graph Convolutional Networks (FL-GCNs). We use FL-GCN to recognize spatial and temporal patterns simultaneously. The underlying road network topology is transformed into a corresponding line graph. This structure provides a general framework for predicting spatial-temporal O-D information from link traffic flows. Data from a New Jersey Turnpike network is used to evaluate the proposed model. The results show that FL-GCN can recognize spatial and temporal patterns. We also compare FL-GCN with a Kalman filter; the results show that our model outperforms the Kalman filter by 17.87% in predicting all O-D pairs.
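The line-graph transformation mentioned in the abstract can be sketched as follows. This assumes the standard directed line-graph construction, in which each road link becomes a node and link (u, v) connects to link (v, w). The abstract does not specify the exact construction FL-GCN uses, so the function name and the toy network are hypothetical.

```python
def line_graph(links):
    """Build a directed line graph from road links.

    links: list of (tail, head) pairs, one per directed road link.
    Returns a dict mapping each link to the links that can follow it,
    i.e. those whose tail is this link's head.
    """
    by_tail = {}
    for link in links:
        by_tail.setdefault(link[0], []).append(link)
    return {
        link: [nxt for nxt in by_tail.get(link[1], []) if nxt != link]
        for link in links
    }


# Toy network (hypothetical): junctions A, B, C, D.
links = [("A", "B"), ("B", "C"), ("B", "D"), ("C", "A")]
adjacency = line_graph(links)
```

In this representation, per-link traffic flows become node features of the line graph, which is the form a graph convolution operates on.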
Fast AutoAugment (1905.00397v1)
Sungbin Lim, Ildoo Kim, Taesup Kim, Chiheon Kim, Sungwoong Kim
2019-05-01
Data augmentation is an indispensable technique to improve generalization and also to deal with imbalanced datasets. Recently, AutoAugment has been proposed to automatically search augmentation policies from a dataset and has significantly improved performance on many image recognition tasks. However, its search method requires thousands of GPU hours to train even in a reduced setting. In this paper, we propose the Fast AutoAugment algorithm, which learns augmentation policies using a more efficient search strategy based on density matching. In comparison to AutoAugment, the proposed algorithm speeds up the search time by orders of magnitude while maintaining comparable performance on image recognition tasks with various models and datasets including CIFAR-10, CIFAR-100, and ImageNet.
Detecting Adversarial Examples through Nonlinear Dimensionality Reduction (1904.13094v2)
Francesco Crecchi, Davide Bacciu, Battista Biggio
2019-04-30
Deep neural networks are vulnerable to adversarial examples, i.e., carefully-perturbed inputs aimed to mislead classification. This work proposes a detection method based on combining non-linear dimensionality reduction and density estimation techniques. Our empirical findings show that the proposed approach is able to effectively detect adversarial examples crafted by non-adaptive attackers, i.e., not specifically tuned to bypass the detection method. Given our promising results, we plan to extend our analysis to adaptive attackers in future work.
Scalable Population Synthesis with Deep Generative Modeling (1808.06910v2)
Stanislav S. Borysov, Jeppe Rich, Francisco C. Pereira
2018-08-21
Population synthesis is concerned with the generation of synthetic yet realistic representations of populations. It is a fundamental problem in the modeling of transport where the synthetic populations of micro-agents represent a key input to most agent-based models. In this paper, a new methodological framework for how to 'grow' pools of micro-agents is presented. The model framework adopts a deep generative modeling approach from machine learning based on a Variational Autoencoder (VAE). Compared to the previous population synthesis approaches, including Iterative Proportional Fitting (IPF), Gibbs sampling and traditional generative models such as Bayesian Networks or Hidden Markov Models, the proposed method allows fitting the full joint distribution for high dimensions. The proposed methodology is compared with a conventional Gibbs sampler and a Bayesian Network by using a large-scale Danish trip diary. It is shown that, while these two methods outperform the VAE in the low-dimensional case, they both suffer from scalability issues when the number of modeled attributes increases. It is also shown that the Gibbs sampler essentially replicates the agents from the original sample when the required conditional distributions are estimated as frequency tables. In contrast, the VAE allows addressing the problem of sampling zeros by generating agents that are virtually different from those in the original data but have similar statistical properties. The presented approach can support agent-based modeling at all levels by enabling richer synthetic populations with smaller zones and more detailed individual characteristics.
Quantum Generalized Linear Models (1905.00365v1)
Colleen M. Farrelly, Srikanth Namuduri, Uchenna Chukwu
2019-05-01
Generalized linear models (GLM) are link function based statistical models. Many supervised learning algorithms are extensions of GLMs and have link functions built into the algorithm to model different outcome distributions. There are two major drawbacks when using this approach in applications using real world datasets. One is that none of the link functions available in the popular packages is a good fit for the data. Second, it is computationally inefficient and impractical to test all the possible distributions to find the optimum one. In addition, many GLMs and their machine learning extensions struggle on problems of overdispersion in Tweedie distributions. In this paper we propose a quantum extension to GLM that overcomes these drawbacks. A quantum gate with non-Gaussian transformation can be used to continuously deform the outcome distribution from known results. In doing so, we eliminate the need for a link function. Further, by using an algorithm that superposes all possible distributions to collapse to fit a dataset, we optimize the model in a computationally efficient way. We provide an initial proof-of-concept by testing this approach on both a simulation of overdispersed data and then on a benchmark dataset, which is quite overdispersed, and achieved state of the art results. This is a game changer in several applied fields, such as part failure modeling, medical research, actuarial science, finance and many other fields where Tweedie regression and overdispersion are ubiquitous.
Information-Theoretic Considerations in Batch Reinforcement Learning (1905.00360v1)
Jinglin Chen, Nan Jiang
2019-05-01
Value-function approximation methods that operate in batch mode have foundational importance to reinforcement learning (RL). Finite sample guarantees for these methods often crucially rely on two types of assumptions: (1) mild distribution shift, and (2) representation conditions that are stronger than realizability. However, the necessity ("why do we need them?") and the naturalness ("when do they hold?") of such assumptions have largely eluded the literature. In this paper, we revisit these assumptions and provide theoretical results towards answering the above questions, and make steps towards a deeper understanding of value-function approximation.
High-Dimensional Bayesian Optimization with Manifold Gaussian Processes (1902.10675v2)
Riccardo Moriconi, K. S. Sesh Kumar, Marc P. Deisenroth
2019-02-27
Bayesian optimization (BO) is a powerful approach for seeking the global optimum of expensive black-box functions and has proven successful for fine-tuning hyper-parameters of machine learning models. The Bayesian optimization routine involves learning a response surface and maximizing a score to select the most valuable inputs to be queried at the next iteration. These key steps are subject to the curse of dimensionality so that Bayesian optimization does not scale beyond 10-20 parameters. In this work, we address this issue and propose a high-dimensional BO method that learns a nonlinear low-dimensional manifold of the input space. We achieve this with a multi-layer neural network embedded in the covariance function of a Gaussian process. This approach applies unsupervised dimensionality reduction as a byproduct of a supervised regression solution. This also allows exploiting data efficiency of Gaussian process models in a Bayesian framework. We also introduce a nonlinear mapping from the manifold to the high-dimensional space based on multi-output Gaussian processes and jointly train it end-to-end via marginal likelihood maximization. We show this intrinsically low-dimensional optimization outperforms recent baselines in high-dimensional BO literature on a set of benchmark functions in 60 dimensions.
Active Manifolds: A non-linear analogue to Active Subspaces (1904.13386v2)
Robert A. Bridges, Anthony D. Gruber, Christopher Felder, Miki Verma, Chelsey Hoff
2019-04-30
We present an approach to analyze functions that addresses limitations present in the Active Subspaces (AS) method of Constantine et al. (2015; 2014). Under appropriate hypotheses, our Active Manifolds (AM) method identifies a 1-D curve in the domain (the active manifold) on which nearly all values of the unknown function are attained, and which can be exploited for approximation or analysis, especially when the input dimension is large (high-dimensional input space). We provide theorems justifying our AM technique and an algorithm permitting functional approximation and sensitivity analysis. Using accessible, low-dimensional functions as initial examples, we show AM reduces approximation error by an order of magnitude compared to AS, at the expense of more computation. Following this, we revisit the sensitivity analysis by Glaws et al. (2017), who apply AS to analyze a magnetohydrodynamic power generator model, and compare the performance of AM on the same data. Our analysis provides detailed information not captured by AS, exhibiting the influence of each parameter individually along an active manifold. Overall, AM represents a novel technique for analyzing functional models with benefits including: reducing high-dimensional analysis to a 1-D analogue, permitting more accurate regression than AS (at more computational expense), enabling more informative sensitivity analysis, and granting accessible visualizations (2-D plots) of parameter sensitivity along the AM.
Optimization and Abstraction: A Synergistic Approach for Analyzing Neural Network Robustness (1904.09959v2)
Greg Anderson, Shankara Pailoor, Isil Dillig, Swarat Chaudhuri
2019-04-22
In recent years, the notion of local robustness (or robustness for short) has emerged as a desirable property of deep neural networks. Intuitively, robustness means that small perturbations to an input do not cause the network to perform misclassifications. In this paper, we present a novel algorithm for verifying robustness properties of neural networks. Our method synergistically combines gradient-based optimization methods for counterexample search with abstraction-based proof search to obtain a sound and (δ-)complete decision procedure. Our method also employs a data-driven approach to learn a verification policy that guides abstract interpretation during proof search. We have implemented the proposed approach in a tool called Charon and experimentally evaluated it on hundreds of benchmarks. Our experiments show that the proposed approach significantly outperforms three state-of-the-art tools, namely AI², Reluplex, and ReluVal.
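To make the idea of abstraction-based proof search concrete, here is a minimal sketch using the plain interval domain on a small fully connected ReLU network. This is a simplified stand-in chosen for illustration, not Charon's algorithm: the abstract does not say which abstract domains the tool uses, and the learned verification policy and the gradient-based counterexample search are omitted. All names and the toy network are hypothetical.

```python
def affine_bounds(lo, hi, W, b):
    """Propagate an input box [lo, hi] through y = W x + b."""
    out_lo, out_hi = [], []
    for row, bias in zip(W, b):
        low = high = bias
        for w, a, c in zip(row, lo, hi):
            # A positive weight maps lower to lower; a negative weight swaps the ends.
            if w >= 0:
                low += w * a
                high += w * c
            else:
                low += w * c
                high += w * a
        out_lo.append(low)
        out_hi.append(high)
    return out_lo, out_hi


def relu_bounds(lo, hi):
    """ReLU is monotone, so it can be applied to each bound separately."""
    return [max(0.0, v) for v in lo], [max(0.0, v) for v in hi]


def certify(x, eps, layers, label):
    """Return True if interval analysis proves every input within an
    L-infinity ball of radius eps around x is classified as `label`.
    False means 'not proven', not 'not robust'."""
    lo = [v - eps for v in x]
    hi = [v + eps for v in x]
    for i, (W, b) in enumerate(layers):
        lo, hi = affine_bounds(lo, hi, W, b)
        if i < len(layers) - 1:  # no ReLU after the output layer
            lo, hi = relu_bounds(lo, hi)
    return all(lo[label] > hi[j] for j in range(len(lo)) if j != label)


# Toy two-layer network (hypothetical weights).
layers = [
    ([[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]),
    ([[1.0, -1.0], [-1.0, 1.0]], [0.0, 0.0]),
]
small = certify([2.0, 0.5], 0.1, layers, label=0)
large = certify([2.0, 0.5], 1.0, layers, label=0)
```

The analysis is sound but incomplete: a True result is a proof of robustness, while a False result only means the interval abstraction was too coarse or a counterexample exists. That gap is the reason a verifier would pair proof search with a separate counterexample search.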