When I was leafing through a cookbook yesterday, it finally hit me why I love recipes: they are algorithms that are robust to small variations and mistakes and, more often than not, benefit from them. They have further nice properties. They are empirical algorithms, obtained not from a theory but by searching through the space of algorithms. We have no idea why a particular recipe tastes the way it does; we have no theory for it, and no recipe has been derived from a theory. Rather, some people have recently started trying to work out the theory from the empirical recipes handed down to us by our ancestors.
This coincides with a theme I loved at NIPS 2016. Normally, in machine learning, we derive algorithms from theories: we have mathematical models and variables we want to learn about, and in order to estimate them we derive algorithms using the mathematical model (the theory). But if, in practice, the only real entity is the algorithm and not the model, why bother with the model in the first place when we can search in the space of algorithms? Recipes are fascinating examples of this attitude. Instead of being derived from The Scientific Theory of Food, Taste and Molecular Chemistry by a Professor in the Department of Molecular Food Chemistry, recipes are extremely effective heuristics developed through trial and error, and they have beaten the theory by presenting us with tastes that we would not have found by developing models of chemicals. Anybody who enjoys the taste of food doesn't need to care about the underlying chemistry (I don’t!), because as soon as you have the recipe, the chemistry is an additional curiosity, something you “would love to learn” if you have no other thrills in your life (but trial and error in the kitchen is far more enjoyable than food chemistry, sorry). Interestingly enough, this idea has started to be taken seriously in the ML community. Instead of developing poor and inexpressive mathematical models with our poor imagination, why not learn “the code” or “the algorithm” directly, at the expense of ending up with an inexplicable, black-box model? As long as this satisfies us in terms of solving the problem, why would we care about the underlying complex and inexplicable mathematical model?
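To make "searching in the space of algorithms" a bit more concrete, here is a minimal toy sketch, not taken from any of the NIPS work. It assumes a made-up set of four primitive steps (`inc`, `dec`, `double`, `square`) and simply enumerates short sequences of them until one reproduces a handful of input/output examples. All names here are hypothetical and chosen for illustration only; real program-learning systems search far richer spaces with far smarter methods.

```python
from itertools import product

# Hypothetical primitive "ingredients": tiny integer-to-integer steps.
PRIMITIVES = {
    "inc": lambda x: x + 1,
    "dec": lambda x: x - 1,
    "double": lambda x: x * 2,
    "square": lambda x: x * x,
}


def run(program, x):
    """Apply the named steps of `program` to x, left to right."""
    for name in program:
        x = PRIMITIVES[name](x)
    return x


def search(examples, max_len=3):
    """Return the first shortest program consistent with all examples, or None."""
    for length in range(1, max_len + 1):
        for program in product(PRIMITIVES, repeat=length):
            if all(run(program, x) == y for x, y in examples):
                return program
    return None


# Input/output pairs produced by a rule the search never sees
# (here it happens to be (x + 1) * 2).
examples = [(0, 2), (1, 4), (2, 6), (5, 12)]
found = search(examples)
print(found)
```

Like a recipe, the result is just a sequence of steps that works on the cases we tried; nothing in the search explains why it works.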
Before leaving interested readers with some relevant links, I would like to note that this kind of approach has led to a discussion about interpretability. People are annoyed by this kind of thinking because in the end you have an algorithm that works but no idea why it works, exactly as with recipes. I also discussed this at length with a friend of mine at NIPS: I think interpretability is not one of the goals we need. Any interpretation is likely to be a deception for any human being; interpretations are for people who love niceness, beauty, nice interpretations, and similar constructs. I don’t. But I do not want to elaborate on this point with more psychology, at least not now.
We think our actions are “interpretable” because we fit arbitrary interpretations to them and we complain because machines don’t. #NIPS2016
— Deniz (@odakyildiz) December 8, 2016
Here are some links about algorithms that can learn algorithms — you can browse further through the connections implied by these links.