The Deep Learning paper that the “triad of deep learning” — LeCun, Bengio, and Hinton — published in Nature* is a decade old now.
Back then, OpenAI felt like one (well-backed) startup among others. DeepMind was already around, but not yet fully integrated into Google.
Today, AI is practically a common good. Back then, it was mostly scholars and tech nerds who knew and cared about it. Today, even kids know what AI is and interact with it (for worse or even worse).
It’s a fast-paced field, and I’m fortunate to have joined it only slightly after that “back then” — eight years ago, when momentum was building but classic ML was still taught at universities: clustering, k-means, SVMs. It also coincided with the year the community began to understand that attention (and linear layers) is all we would need. It was, in other words, a great time to start learning about machine learning.
As the year draws to a close, it feels like the right time to zoom out. Every month, I reflect on small, practical lessons and publish them. Roughly every half-year, I then look for the larger themes underneath: the patterns that keep recurring even when projects change.
This time, four threads show up everywhere in my notes:
- Deep Work (my all-time favorite)
- Over-identification with one’s work
- Sports (and movement in general)
- Blogging
Deep Work
Deep Work seems to be my favorite theme — and in machine learning it shows up everywhere.
Machine learning work can have several focal points, but most days revolve around some combination of:
- theory (math, proofs, careful reasoning),
- coding (pipelines, training loops, debugging),
- writing (project reports, papers, documentation).
All of them require sustained focus for extended periods.
Proofs don’t emerge from five-minute fragments. Coding, needless to say, punishes interruptions: if you’re deep in a bug and someone pulls you out, you don’t just “resume” — you have to reconstruct, which just burns time**.
Writing, too, is fragile. Crafting good sentences takes attention, and attention is the first thing to disappear when your day becomes a sequence of small message pings.
I’m fortunate enough to work in an environment that allows multiple hours of deep work, several times a week. This is not the norm — honestly, it might be the exception. But it’s incredibly fulfilling. I can dive into a problem for hours and come out exhausted afterwards.
Exhausted, but satisfied.
For me, deep work has always meant two things, and I already highlighted this half a year ago:
- The skill: being able to concentrate deeply for long stretches.
- The environment: having conditions that allow and protect that concentration.
Of the two, the skill is usually the easier one to acquire (or re-acquire) if you don’t have it. The environment is harder to change. You can train focus, but you can’t single-handedly delete meetings from your calendar or change your company’s culture overnight.
Still, it helps to name the two parts. If you’re struggling with deep work, it might not be a lack of discipline. Sometimes, as my experience tells me, it’s simply that your environment doesn’t permit what you’re trying to do.
Over-identification with one’s work
Do you like your job?
Let’s hope so, because a big fraction of your waking hours is spent doing it. But even if you generally like your job, there will be times when you like it more — and times when you like it less.
Like all people, I’ve had both.
There were periods when I felt a jolt of energy just from the fact that I was “doing something with ML.”
Wow!
And then there were periods where lack of progress — or a setback because an idea simply didn’t work — dragged me down hard.
Not-wow.
Over the years, I’ve come to believe that deriving too much identity from the job is generally not a smart strategy. Work on and with ML is full of variance: experiments fail, baselines beat your fancy ideas, reviewers misunderstand, deadlines compress, data breaks, priorities shift. If your sense of self rises and falls with the latest training run, you might as well ride a roller coaster at Disneyland.
A simple analogy: imagine you’re a gymnast. You train for years. You’re flexible, strong, in control of your movements. Then you break your ankle. Suddenly, you can’t do even the simplest jumps. You can’t train the way you have in the years before. If you’re only an athlete — if that’s your whole identity — it will feel like losing yourself.
Thankfully, most people are more than their profession. Even if they forget it sometimes.
The same applies to ML. You can be an ML engineer, or a researcher, or a “theory person” — and also be a friend, a partner, a sibling, a teammate, a reader, a runner, a writer. When one part goes through a low, the others hold you steady.
This is not “I don’t care about my job”. It’s about caring without collapsing into it.
Sports, or movement in general
Granted, this is a no-brainer.
Jobs in ML are not known for involving a lot of movement. The only miles you cover are finger-miles on the keyboard; the rest of the body sits still.
I need not go into what happens if you leave it at that.
The good news: counteracting it is easier than ever. There are many boring but effective options now:
- height-adjustable desks
- meetings spent walking (especially when cameras are off anyway)
- walking pads under the desk
- short mobility routines (ideally, between deep work blocks)
Over the years, movement has become an integral part of my workday. It helps me start the day in a smoother state — not stiff, not slouched, not already “compressed.” And it helps me de-exhaust after deep work. Deep concentration is mentally tiring, but it also has physical effects: shoulders creep up, the neck falls forward, breathing becomes shallow.
Moving resets that.
I don’t treat it as “fitness.” I treat it as insurance that allows me to do my job for years to come.
Blogging
Daniel Bourke.***
If you’ve been reading ML content on Towards Data Science for a long time (at least five or six years), that name might sound familiar. He published a lot of ML articles back when TDS was still hosted on Medium, and his distinctive writing style brought ML to a wider audience.
His example inspired me to start blogging as well — also for TDS. I began around the turn of 2019/2020.
At first, writing these articles was simple: write an article, publish it, move on. But over time, it became something else: a practice. Writing forces you to be precise when putting thoughts on paper. If you can’t explain something in a way that holds together, you probably don’t understand it as well as you think you do.
Over the years, I covered machine learning roadmaps, wrote tutorials (like how to handle TFRecords), and, yes, kept circling back to deep work — because it keeps proving itself important for ML practitioners.
And blogging has been rewarding in two ways.
It’s been rewarding in monetary terms (to the point where, over the years, it helped finance the computer I’m using to write this). But more importantly, it has been rewarding as a practice in writing. I see blogging as a way of training my ability to translate: taking something technical and putting it into words that a different audience can actually carry away.
In a field that moves quickly and loves novelty, that translation skill is oddly stable. Models change. Frameworks change (Theano, anybody?). But the ability to think clearly and write clearly compounds.
Closing thoughts
Looking back after eight years of “doing ML”, none of these themes turn out to be about a specific model or a specific trick.
They are about:
- Deep work, which makes progress possible.
- Not over-identifying, which makes setbacks survivable.
- Movement, which keeps your body from silently degrading.
- Blogging, which turns experience into something shareable — and trains clarity.
The funny thing is: these are all “boring” lessons.
But they’re the ones that keep showing up.
References
* The Deep Learning Nature article by LeCun, Bengio, and Hinton: https://www.nature.com/articles/nature14539; the annotated reference section is itself worth a read.
** See a quite accessible digest by the American Psychological Association at https://www.apa.org/topics/research/multitasking.
*** Daniel Bourke’s homepage with his posts on machine learning: https://www.mrdbourke.com/tag/machine-learning.