Quantum annealers aim at solving non-convex optimization problems by exploiting cooperative tunneling effects to escape local minima. The underlying idea consists in designing a classical energy function whose ground states are the sought optimal solutions of the original optimization problem and adding a controllable quantum transverse field to generate tunneling processes. A key challenge is to identify classes of non-convex optimization problems for which this scheme can induce an exponential speed up of quantum annealing compared to thermal annealing. We show that this happens for a wide class of optimization problems which are also central to machine learning. Their energy landscape is exponentially dominated by local minima that trap classical thermal annealers, while quantum annealing converges efficiently to rare dense regions of optimal solutions. For non convex problems, these will not in general correspond to configuration dominating the Gibbs measure.