Category: Uncategorized

  • Guided Discrete Diffusion for Constraint Satisfaction Problems

    Justin Jung, January 10, 2025. Introduction: AI for constraint satisfaction problems is an important field that has been researched for more than half a century. Sudoku, a puzzle where no row, column, or block can have two of the same number, is a popular benchmark…

  • Geometric intuitions behind generalization

    Timothy Hanson, January 13, 2025. Deep learning models generalize surprisingly well despite being overparameterized. Traditional measures of capacity, such as VC-dimension or Rademacher complexity, suggest that overparameterization should lead to overfitting – but it usually doesn’t. Prominent explanations for this phenomenon include that stochastic gradient…

  • Response to “Machines of Loving Grace” by Dario Amodei

    “Machines of Loving Grace” is a very well-written, thoughtful, and interesting bit of prognostication on the future of AI. (Footnote 1: Apparently, Amodei means “one whom god loves”, which makes the blog post title apt, even though the content has the directionality reversed: he (or we) love AI.) While long, it is not vacuously so; the…

  • Thoughts on Wolfram’s “What’s Really Going On in Machine Learning?”

    Preface: At a conference this past weekend I had the luck of meeting & having a good discussion with Stephen Wolfram. This mathematician / physicist / scientist has been a personal hero for many years – indeed, after A New Kind of Science came out, as part of a college course I made a VLSI…

  • Active learning for program induction

    Timothy Hanson & Justin Jung, May 10, 2024. Abstract: This post goes into more detail on what is meant by active learning and how it relates to program induction. We discuss the use of a simulator for running a program (\(\sim\) compressed model), and…

  • Fast linear transforms using Butterfly Factorizations

    This post discusses an older paper that shows how a clever idea (butterfly factorizations) can be learned in a typical deep-learning pipeline. It concludes with some conjecture about how these fast algorithms could be applied to another slow, data-intensive algorithm: attention. Paper: Learning fast algorithms for linear transforms with butterfly factorizations. Tri Dao (author of FlashAttention), Albert…

  • Are Transformers all you need?

    Timothy Hanson, November 8, 2023. Abstract: This post discusses active learning and reasoning, and the strengths and limitations of using transformers for them. After setting up the problem context, we conclude that for transformers to serve as world-models for these purposes, they will need…

  • Hello world!

    Hello hello — this is a blog where we’ll post research progress, thoughts, things too small to be preprints, company news, etc. Looking forward to sharing 🙂