Why tinyML?

MARCH 26, 2021 — In about 2004 this reporter asked a top IBM software development leader what cloud computing looked like to him. “It looks like a mainframe,” he said with only half a smile. True enough, cloud is a centralized way of computing, which is beginning to raise questions.

One of them is: Will machine learning be forever in the “glass room”? That is the old-time descriptor for the home sweet home of the immortal mainframe era, where numbers got crunched and good ideas went to die.

Today, technologists are working to bring machine learning out of the closet and into the real world in the form of Edge computing.

For that to happen, machine-made observations and decisions will have to succeed on individual chips in devices and on boards, far from the cloud data center, where abundant electrical power allows seemingly limitless compute.

Machine learning at the edge, which is often more project than reality today, will also have to become productized. It will have to work within much tighter constraints. That is the motivation behind tinyML, which, thank goodness, is more a way of doing things than it is a standard or product.

Issues facing tinyML as it struggles to leave the cocoon are worth consideration. As with client-server and other computing paradigm shifts, the outcome will rely on how teams on the cloud and on the edge deal with the details of implementation.

A panel at this week’s tinyML Summit 2021 afforded an opportunity for such consideration. Here I share some comments and impressions from that panel, which featured expert implementers working to make it happen.

The lively panel discussion entitled “tinyML inference SW – Where do we go from here?” was moderated by Ian Bratt, Distinguished Engineer & Fellow, Arm. Joining Bratt were Chris Lattner, President, Engineering and Product, SiFive; Tianqi Chen, CTO, OctoML; Raziel Alvarez, Technical Lead for PyTorch at Facebook AI; and Pete Warden, Technical Lead, Google. (A link to the panel recording on YouTube is found at the bottom of this page.)

A familiar view emerged, one that showed the creators of the trained machine learning model handing off their work, hoping a dedicated engineer can make the code run in the end. That conjures the old saw about ‘throwing it over the wall’ and hoping system programmers can do the finish carpentry.

The tableau suggested the objectives of the researchers in a sort of ivory tower of cloud machine learning were somewhat at odds with the objectives of the front-line inference engineers at the edge where cycle economy is paramount and power consumption is crucial.

That echoes yet another developer saw that goes ‘it worked on my machine’ – one of the classic crunch time excuses over the history of computing.

Other issues:

-It may take top gun skills to make a trained model work in practice. “Somebody has to do magic to get it into production,” said Raziel Alvarez.

-People are able to deploy these models, but the effort is considerable. The many different cogs in machine learning (for example, the link between a CPU and a GPU) have to be managed deftly. In practice that means “people have to go outside their [practice] boundaries,” said T.Q. Chen.

-Teams hope to deploy inference on a variety of hardware, but each hardware version and type requires special tuning. And low-level hardware changes can set off a cascading chain of changes all the way up the machine learning development stack. “As soon as you get heterogeneous hardware, models tend to break,” said Pete Warden.

Hmmm, maybe that is enough on the ‘challenges.’ Obviously, people go at this to succeed, not to dwell on obstacles. But obstacles go with the move to production for machine learning inference. As one tinyML Summit 2021 panelist said of recent history, “we have found a lot of what doesn’t work – we know what we don’t know.”

It will be interesting to see if and how the machine learning technology moves to the edge from the cloud. In architecture, the devil isn’t in the details, but in building, it is. What is likely is that the leap from science project to useful product will depend on the future work of the participants at tinyML Summit 2021 and other conferences to come. – Jack Vaughan