

  • A graph similarity for deep learning
  • An Unsupervised Information-Theoretic Perceptual Quality Metric
  • Self-Supervised MultiModal Versatile Networks
  • Benchmarking Deep Inverse Models over time, and the Neural-Adjoint method
  • Off-Policy Evaluation and Learning for External Validity under a Covariate Shift
  • Neural Methods for Point-wise Dependency Estimation
  • Fast and Flexible Temporal Point Processes with Triangular Maps
  • Backpropagating Linearly Improves Transferability of Adversarial Examples
  • PyGlove: Symbolic Programming for Automated Machine Learning
  • Fourier Sparse Leverage Scores and Approximate Kernel Learning
  • Improved Algorithms for Online Submodular Maximization via First-order Regret Bounds
  • Synbols: Probing Learning Algorithms with Synthetic Datasets
  • Adversarially Robust Streaming Algorithms via Differential Privacy
  • Trading Personalization for Accuracy: Data Debugging in Collaborative Filtering
  • Cascaded Text Generation with Markov Transformers
  • Improving Local Identifiability in Probabilistic Box Embeddings
  • Permute-and-Flip: A new mechanism for differentially private selection
  • Deep reconstruction of strange attractors from time series
  • Reciprocal Adversarial Learning via Characteristic Functions
  • Statistical Guarantees of Distributed Nearest Neighbor Classification
  • Stein Self-Repulsive Dynamics: Benefits From Past Samples
  • The Statistical Complexity of Early-Stopped Mirror Descent
  • Algorithmic recourse under imperfect causal knowledge: a probabilistic approach
  • Quantitative Propagation of Chaos for SGD in Wide Neural Networks
  • A Causal View on Robustness of Neural Networks
  • Minimax Classification with 0-1 Loss and Performance Guarantees
  • How to Learn a Useful Critic? Model-based Action-Gradient-Estimator Policy Optimization
  • Coresets for Regressions with Panel Data
  • Learning Composable Energy Surrogates for PDE Order Reduction
  • Efficient Contextual Bandits with Continuous Actions
  • Achieving Equalized Odds by Resampling Sensitive Attributes
  • Multi-Robot Collision Avoidance under Uncertainty with Probabilistic Safety Barrier Certificates
  • Hard Shape-Constrained Kernel Machines
  • A Closer Look at the Training Strategy for Modern Meta-Learning
  • On the Value of Out-of-Distribution Testing: An Example of Goodhart’s Law
  • Generalised Bayesian Filtering via Sequential Monte Carlo
  • Deterministic Approximation for Submodular Maximization over a Matroid in Nearly Linear Time
  • Flows for simultaneous manifold learning and density estimation
  • Simultaneous Preference and Metric Learning from Paired Comparisons
  • Efficient Variational Inference for Sparse Deep Learning with Theoretical Guarantee
  • Learning Manifold Implicitly via Explicit Heat-Kernel Learning
  • Deep Relational Topic Modeling via Graph Poisson Gamma Belief Network
  • One-bit Supervision for Image Classification
  • What is being transferred in transfer learning?
  • Submodular Maximization Through Barrier Functions
  • Neural Networks with Recurrent Generative Feedback
  • Learning to Extrapolate Knowledge: Transductive Few-shot Out-of-Graph Link Prediction
  • Exploiting weakly supervised visual patterns to learn from partial annotations
  • Improving Inference for Neural Image Compression
  • Neuron Merging: Compensating for Pruned Neurons
  • FixMatch: Simplifying Semi-Supervised Learning with Consistency and Confidence
  • Reinforcement Learning with Combinatorial Actions: An Application to Vehicle Routing
  • Towards Playing Full MOBA Games with Deep Reinforcement Learning
  • Rankmax: An Adaptive Projection Alternative to the Softmax Function
  • Online Agnostic Boosting via Regret Minimization
  • Causal Intervention for Weakly-Supervised Semantic Segmentation
  • Belief Propagation Neural Networks
  • Over-parameterized Adversarial Training: An Analysis Overcoming the Curse of Dimensionality
  • Post-training Iterative Hierarchical Data Augmentation for Deep Networks
  • Debugging Tests for Model Explanations
  • Robust compressed sensing using generative models
  • Fairness without Demographics through Adversarially Reweighted Learning
  • Stochastic Latent Actor-Critic: Deep Reinforcement Learning with a Latent Variable Model
  • Ridge Rider: Finding Diverse Solutions by Following Eigenvectors of the Hessian
  • The route to chaos in routing games: When is price of anarchy too optimistic?
  • Online Algorithm for Unsupervised Sequential Selection with Contextual Information
  • Adapting Neural Architectures Between Domains
  • What went wrong and when? Instance-wise feature importance for time-series black-box models
  • Towards Better Generalization of Adaptive Gradient Methods
  • Learning Guidance Rewards with Trajectory-space Smoothing
  • Variance Reduction via Accelerated Dual Averaging for Finite-Sum Optimization
  • Tree! I am no Tree! I am a low dimensional Hyperbolic Embedding
  • Deep Structural Causal Models for Tractable Counterfactual Inference
  • Convolutional Generation of Textured 3D Meshes
  • A Statistical Framework for Low-bitwidth Training of Deep Neural Networks
  • Better Set Representations For Relational Reasoning
  • AutoSync: Learning to Synchronize for Data-Parallel Distributed Deep Learning
  • A Combinatorial Perspective on Transfer Learning
  • Hardness of Learning Neural Networks with Natural Weights
  • Higher-Order Spectral Clustering of Directed Graphs
  • Primal-Dual Mesh Convolutional Neural Networks
  • The Advantage of Conditional Meta-Learning for Biased Regularization and Fine Tuning
  • Watch out! Motion is Blurring the Vision of Your Deep Neural Networks
  • Sinkhorn Barycenter via Functional Gradient Descent
  • Coresets for Near-Convex Functions
  • Bayesian Deep Ensembles via the Neural Tangent Kernel
  • Improved Schemes for Episodic Memory-based Lifelong Learning
  • Adaptive Sampling for Stochastic Risk-Averse Learning
  • Deep Wiener Deconvolution: Wiener Meets Deep Learning for Image Deblurring
  • Discovering Reinforcement Learning Algorithms
  • Taming Discrete Integration via the Boon of Dimensionality
  • Blind Video Temporal Consistency via Deep Video Prior
  • Simplify and Robustify Negative Sampling for Implicit Collaborative Filtering
  • Model Selection for Production System via Automated Online Experiments
  • On the Almost Sure Convergence of Stochastic Gradient Descent in Non-Convex Problems
  • Automatic Perturbation Analysis for Scalable Certified Robustness and Beyond
  • Adaptation Properties Allow Identification of Optimized Neural Codes
  • Global Convergence and Variance Reduction for a Class of Nonconvex-Nonconcave Minimax Problems
  • Model-Based Multi-Agent RL in Zero-Sum Markov Games with Near-Optimal Sample Complexity
  • Conservative Q-Learning for Offline Reinforcement Learning
  • Online Influence Maximization under Linear Threshold Model
  • Ensembling geophysical models with Bayesian Neural Networks
  • Delving into the Cyclic Mechanism in Semi-supervised Video Object Segmentation
  • Asymmetric Shapley values: incorporating causal knowledge into model-agnostic explainability
  • Understanding Deep Architecture with Reasoning Layer
  • Planning in Markov Decision Processes with Gap-Dependent Sample Complexity
  • Provably Good Batch Off-Policy Reinforcement Learning Without Great Exploration
  • Detection as Regression: Certified Object Detection with Median Smoothing
  • Contextual Reserve Price Optimization in Auctions via Mixed Integer Programming
  • ExpandNets: Linear Over-parameterization to Train Compact Convolutional Networks
  • FleXOR: Trainable Fractional Quantization
  • The Implications of Local Correlation on Learning Some Deep Functions
  • Learning to search efficiently for causally near-optimal treatments
  • A Game Theoretic Analysis of Additive Adversarial Attacks and Defenses
  • Posterior Network: Uncertainty Estimation without OOD Samples via Density-Based Pseudo-Counts
  • Recurrent Quantum Neural Networks
  • No-Regret Learning and Mixed Nash Equilibria: They Do Not Mix
  • A Unifying View of Optimism in Episodic Reinforcement Learning
  • Continuous Submodular Maximization: Beyond DR-Submodularity
  • An Asymptotically Optimal Primal-Dual Incremental Algorithm for Contextual Linear Bandits
  • Assessing SATNet’s Ability to Solve the Symbol Grounding Problem
  • A Bayesian Nonparametrics View into Deep Representations
  • On the Similarity between the Laplace and Neural Tangent Kernels
  • A causal view of compositional zero-shot recognition
  • HiPPO: Recurrent Memory with Optimal Polynomial Projections
  • Auto Learning Attention
  • CASTLE: Regularization via Auxiliary Causal Graph Discovery
  • Long-Tailed Classification by Keeping the Good and Removing the Bad Momentum Causal Effect
  • Explainable Voting
  • Deep Archimedean Copulas
  • Re-Examining Linear Embeddings for High-Dimensional Bayesian Optimization
  • UnModNet: Learning to Unwrap a Modulo Image for High Dynamic Range Imaging
  • Thunder: a Fast Coordinate Selection Solver for Sparse Learning
  • Neural Networks Fail to Learn Periodic Functions and How to Fix It
  • Distribution Matching for Crowd Counting
  • Correspondence learning via linearly-invariant embedding
  • Learning to Dispatch for Job Shop Scheduling via Deep Reinforcement Learning
  • On Adaptive Attacks to Adversarial Example Defenses
  • Sinkhorn Natural Gradient for Generative Models
  • Online Sinkhorn: Optimal Transport distances from sample streams
  • Ultrahyperbolic Representation Learning
  • Locally-Adaptive Nonparametric Online Learning
  • Compositional Generalization via Neural-Symbolic Stack Machines
  • Graphon Neural Networks and the Transferability of Graph Neural Networks
  • Unreasonable Effectiveness of Greedy Algorithms in Multi-Armed Bandit with Many Arms
  • Gamma-Models: Generative Temporal Difference Learning for Infinite-Horizon Prediction
  • Deep Transformers with Latent Depth
  • Neural Mesh Flow: 3D Manifold Mesh Generation via Diffeomorphic Flows
  • Statistical control for spatio-temporal MEG/EEG source imaging with desparsified multi-task Lasso
  • A Scalable MIP-based Method for Learning Optimal Multivariate Decision Trees
  • Efficient Exact Verification of Binarized Neural Networks
  • Ultra-Low Precision 4-bit Training of Deep Neural Networks
  • Bridging the Gap between Sample-based and One-shot Neural Architecture Search with BONAS
  • On Numerosity of Deep Neural Networks
  • Outlier Robust Mean Estimation with Subgaussian Rates via Stability
  • Self-Supervised Relationship Probing
  • Information Theoretic Counterfactual Learning from Missing-Not-At-Random Feedback
  • Prophet Attention: Predicting Attention with Future Attention
  • Language Models are Few-Shot Learners
  • Margins are Insufficient for Explaining Gradient Boosting
  • Fourier-transform-based attribution priors improve the interpretability and stability of deep learning models for genomics
  • MomentumRNN: Integrating Momentum into Recurrent Neural Networks
  • Marginal Utility for Planning in Continuous or Large Discrete Action Spaces
  • Projected Stein Variational Gradient Descent
  • Minimax Lower Bounds for Transfer Learning with Linear and One-hidden Layer Neural Networks
  • SE(3)-Transformers: 3D Roto-Translation Equivariant Attention Networks
  • On the equivalence of molecular graph convolution and molecular wave function with poor basis set
  • The Power of Predictions in Online Control
  • Learning Affordance Landscapes for Interaction Exploration in 3D Environments
  • Cooperative Multi-player Bandit Optimization
  • Tight First- and Second-Order Regret Bounds for Adversarial Linear Bandits
  • Just Pick a Sign: Optimizing Deep Multitask Models with Gradient Sign Dropout
  • A Loss Function for Generative Neural Networks Based on Watson’s Perceptual Model
  • Dynamic Fusion of Eye Movement Data and Verbal Narrations in Knowledge-rich Domains
  • Scalable Multi-Agent Reinforcement Learning for Networked Systems with Average Reward
  • Optimizing Neural Networks via Koopman Operator Theory
  • SVGD as a kernelized Wasserstein gradient flow of the chi-squared divergence
  • Adversarial Robustness of Supervised Sparse Coding
  • Differentiable Meta-Learning of Bandit Policies
  • Biologically Inspired Mechanisms for Adversarial Robustness
  • Statistical-Query Lower Bounds via Functional Gradients
  • Near-Optimal Reinforcement Learning with Self-Play
  • Network Diffusions via Neural Mean-Field Dynamics
  • Self-Distillation as Instance-Specific Label Smoothing
  • Towards Problem-dependent Optimal Learning Rates
  • Cross-lingual Retrieval for Iterative Self-Supervised Training
  • Rethinking pooling in graph neural networks
  • Pointer Graph Networks
  • Gradient Regularized V-Learning for Dynamic Treatment Regimes
  • Faster Wasserstein Distance Estimation with the Sinkhorn Divergence
  • Forethought and Hindsight in Credit Assignment
  • Robust Recursive Partitioning for Heterogeneous Treatment Effects with Uncertainty Quantification
  • Rescuing neural spike train models from bad MLE
  • Lower Bounds and Optimal Algorithms for Personalized Federated Learning
  • Black-Box Certification with Randomized Smoothing: A Functional Optimization Based Framework
  • Deep Imitation Learning for Bimanual Robotic Manipulation
  • Stationary Activations for Uncertainty Calibration in Deep Learning
  • Ensemble Distillation for Robust Model Fusion in Federated Learning
  • Falcon: Fast Spectral Inference on Encrypted Data
  • On Power Laws in Deep Ensembles
  • Practical Quasi-Newton Methods for Training Deep Neural Networks
  • Approximation Based Variance Reduction for Reparameterization Gradients
  • Inference Stage Optimization for Cross-scenario 3D Human Pose Estimation
  • Consistent feature selection for analytic deep neural networks
  • Glance and Focus: a Dynamic Approach to Reducing Spatial Redundancy in Image Classification
  • Information Maximization for Few-Shot Learning
  • Inverse Reinforcement Learning from a Gradient-based Learner
  • Bayesian Multi-type Mean Field Multi-agent Imitation Learning
  • Bayesian Robust Optimization for Imitation Learning
  • Multiview Neural Surface Reconstruction by Disentangling Geometry and Appearance
  • Riemannian Continuous Normalizing Flows
  • Attention-Gated Brain Propagation: How the brain can implement reward-based error backpropagation
  • Asymptotic Guarantees for Generative Modeling Based on the Smooth Wasserstein Distance
  • Online Robust Regression via SGD on the l1 loss
  • PRANK: motion Prediction based on RANKing
  • Fighting Copycat Agents in Behavioral Cloning from Observation Histories
  • Tight Nonparametric Convergence Rates for Stochastic Gradient Descent under the Noiseless Linear Model
  • Structured Prediction for Conditional Meta-Learning
  • Optimal Lottery Tickets via Subset Sum: Logarithmic Over-Parameterization is Sufficient
  • The Hateful Memes Challenge: Detecting Hate Speech in Multimodal Memes
  • Stochasticity of Deterministic Gradient Descent: Large Learning Rate for Multiscale Objective Function
  • Identifying Learning Rules From Neural Network Observables
  • Optimal Approximation - Smoothness Tradeoffs for Soft-Max Functions
  • Weakly-Supervised Reinforcement Learning for Controllable Behavior
  • Improving Policy-Constrained Kidney Exchange via Pre-Screening
  • Learning abstract structure for drawing by efficient motor program induction
  • Why Do Deep Residual Networks Generalize Better than Deep Feedforward Networks? — A Neural Tangent Kernel Perspective
  • Dual Instrumental Variable Regression
  • Stochastic Gradient Descent in Correlated Settings: A Study on Gaussian Processes
  • Interventional Few-Shot Learning
  • Minimax Value Interval for Off-Policy Evaluation and Policy Optimization
  • Biased Stochastic First-Order Methods for Conditional Stochastic Optimization and Applications in Meta Learning
  • ShiftAddNet: A Hardware-Inspired Deep Network
  • Network-to-Network Translation with Conditional Invertible Neural Networks
  • Intra-Processing Methods for Debiasing Neural Networks
  • Finding Second-Order Stationary Points Efficiently in Smooth Nonconvex Linearly Constrained Optimization Problems
  • Model-based Policy Optimization with Unsupervised Model Adaptation
  • Implicit Regularization and Convergence for Weight Normalization
  • Geometric All-way Boolean Tensor Decomposition
  • Modular Meta-Learning with Shrinkage
  • A/B Testing in Dense Large-Scale Networks: Design and Inference
  • What Neural Networks Memorize and Why: Discovering the Long Tail via Influence Estimation
  • Partially View-aligned Clustering
  • Partial Optimal Transport with applications on Positive-Unlabeled Learning
  • Toward the Fundamental Limits of Imitation Learning
  • Logarithmic Pruning is All You Need
  • Hold me tight! Influence of discriminative features on deep network boundaries
  • Learning from Mixtures of Private and Public Populations
  • Adversarial Weight Perturbation Helps Robust Generalization
  • Stateful Posted Pricing with Vanishing Regret via Dynamic Deterministic Markov Decision Processes
  • Adversarial Self-Supervised Contrastive Learning
  • Normalizing Kalman Filters for Multivariate Time Series Analysis
  • Learning to summarize with human feedback
  • Fourier Spectrum Discrepancies in Deep Network Generated Images
  • Lamina-specific neuronal properties promote robust, stable signal propagation in feedforward networks
  • Learning Dynamic Belief Graphs to Generalize on Text-Based Games
  • Triple descent and the two kinds of overfitting: where & why do they appear?
  • Multimodal Graph Networks for Compositional Generalization in Visual Question Answering
  • Learning Graph Structure With A Finite-State Automaton Layer
  • A Universal Approximation Theorem of Deep Neural Networks for Expressing Probability Distributions
  • Unsupervised object-centric video generation and decomposition in 3D
  • Domain Generalization for Medical Imaging Classification with Linear-Dependency Regularization
  • Multi-label classification: do Hamming loss and subset accuracy really conflict with each other?
  • A Novel Automated Curriculum Strategy to Solve Hard Sokoban Planning Instances
  • Causal analysis of Covid-19 Spread in Germany
  • Locally private non-asymptotic testing of discrete distributions is faster using interactive mechanisms
  • Adaptive Gradient Quantization for Data-Parallel SGD
  • Finite Continuum-Armed Bandits
  • Removing Bias in Multi-modal Classifiers: Regularization by Maximizing Functional Entropies
  • Compact task representations as a normative model for higher-order brain activity
  • Robust-Adaptive Control of Linear Systems: beyond Quadratic Costs
  • Co-exposure Maximization in Online Social Networks
  • UCLID-Net: Single View Reconstruction in Object Space
  • Reinforcement Learning for Control with Multiple Frequencies
  • Complex Dynamics in Simple Neural Networks: Understanding Gradient Flow in Phase Retrieval
  • Neural Message Passing for Multi-Relational Ordered and Recursive Hypergraphs
  • A Unified View of Label Shift Estimation
  • Optimal Private Median Estimation under Minimal Distributional Assumptions
  • Breaking the Communication-Privacy-Accuracy Trilemma
  • Audeo: Audio Generation for a Silent Performance Video
  • Ode to an ODE
  • Self-Distillation Amplifies Regularization in Hilbert Space
  • Coupling-based Invertible Neural Networks Are Universal Diffeomorphism Approximators
  • Community detection using fast low-cardinality semidefinite programming
  • Modeling Noisy Annotations for Crowd Counting
  • An operator view of policy gradient methods
  • Demystifying Contrastive Self-Supervised Learning: Invariances, Augmentations and Dataset Biases
  • Online MAP Inference of Determinantal Point Processes
  • Video Object Segmentation with Adaptive Feature Bank and Uncertain-Region Refinement
  • Inferring learning rules from animal decision-making
  • Input-Aware Dynamic Backdoor Attack
  • How hard is to distinguish graphs with graph neural networks?
  • Minimax Regret of Switching-Constrained Online Convex Optimization: No Phase Transition
  • Dual Manifold Adversarial Robustness: Defense against Lp and non-Lp Adversarial Attacks
  • Cross-Scale Internal Graph Neural Network for Image Super-Resolution
  • Unsupervised Representation Learning by Invariance Propagation
  • Restoring Negative Information in Few-Shot Object Detection
  • Do Adversarially Robust ImageNet Models Transfer Better?
  • Robust Correction of Sampling Bias using Cumulative Distribution Functions
  • Personalized Federated Learning with Theoretical Guarantees: A Model-Agnostic Meta-Learning Approach
  • Pixel-Level Cycle Association: A New Perspective for Domain Adaptive Semantic Segmentation
  • Classification with Valid and Adaptive Coverage
  • Learning Global Transparent Models consistent with Local Contrastive Explanations
  • Learning to Approximate a Bregman Divergence
  • Diverse Image Captioning with Context-Object Split Latent Spaces
  Le

