Designing architectures for deep neural networks requires expert knowledge and substantial computation time. We propose a technique to accelerate model selection by learning an auxiliary HyperNet that generates the weights of a main model conditioned on that model’s architecture. By comparing the relative validation performance of networks with HyperNet-generated weights, we can effectively search over a wide range of architectures at the cost of a single training run. To facilitate this search, we develop a flexible mechanism based on memory read-writes that allows us to define a wide range of network connectivity patterns, with ResNet, DenseNet, and FractalNet blocks as special cases. We validate our method (SMASH) on CIFAR-100 and CIFAR-10, achieving competitive performance with similarly-sized hand-designed networks.
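The core idea can be illustrated with a toy sketch (not the paper's implementation): a small linear "HyperNet" maps an architecture descriptor vector to the flattened weights of a one-layer main model, and candidate architectures are then ranked by their validation loss under those generated weights. The descriptor encoding, sizes, and linear mapping here are all illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

D_ARCH = 4           # length of the architecture-descriptor vector (assumed encoding)
D_IN, D_OUT = 3, 2   # input/output sizes of the toy main model

# HyperNet parameters: a single matrix mapping descriptor -> flattened main-model weights.
# In practice this would be a trained network; here it is fixed random for illustration.
H = rng.normal(scale=0.1, size=(D_IN * D_OUT, D_ARCH))

def generate_weights(arch_descriptor):
    """Produce main-model weights W conditioned on the architecture descriptor."""
    return (H @ arch_descriptor).reshape(D_OUT, D_IN)

def main_model(x, arch_descriptor):
    """Run the main model using HyperNet-generated (not directly trained) weights."""
    W = generate_weights(arch_descriptor)
    return W @ x

# One-shot search step: compare candidate architectures by relative validation loss
# under generated weights, without training each candidate from scratch.
x_val = rng.normal(size=D_IN)
y_val = rng.normal(size=D_OUT)
candidates = [rng.normal(size=D_ARCH) for _ in range(3)]
losses = [float(np.sum((main_model(x_val, c) - y_val) ** 2)) for c in candidates]
best = int(np.argmin(losses))
```

In SMASH the descriptor is a richer encoding of the memory read-write connectivity pattern and the HyperNet is trained end-to-end; this sketch only shows the weight-generation-then-rank loop.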
"SMASH: One-Shot Model Architecture Search through HyperNetworks." Submitted to ICLR 2018
"Neural Photo Editing with Introspective Adversarial Networks." ICLR 2017
"Generative and Discriminative Voxel Modeling with Convolutional Neural Networks." 3D Deep Learning Workshop, NIPS 2016
"FreezeOut: Accelerate Training by Progressively Freezing Layers." Optimization workshop, NIPS 2017
"ConvNet-Based Optical Recognition for Engineering Drawings." ASME IDETC/CIE 2017
"Context-Aware Content Generation for Virtual Environments." ASME IDETC/CIE 2016
Andrew Brock's background is in computational heat transfer and fluid dynamics for hybrid rocketry, control systems engineering for full-body haptic feedback, and social dance instruction. He holds an MSME from Cal Poly SLO.