The explosion in the use of machine learning to power, well, just about everything has made GPUs one of the go-to ways to handle all the processing those operations require.
But getting access to those GPUs, whether using the cards themselves or possibly through something like AWS, might still be too difficult or too expensive for some companies or research teams. So Davit Buniatyan and his co-founders decided to start Snark AI, which helps companies rent idle GPUs from a distributed network of companies that just have them sitting there, rather than through a service like Amazon. While the larger cloud providers offer similar access to GPUs, Buniatyan’s hope is that companies and developers will find it attractive to tap a different network if Snark AI can lower that barrier to entry. The company is launching out of Y Combinator’s Summer 2018 class.
“We bet on that there will always be a gap between mining and AWS or Google Cloud prices,” Buniatyan said. “If the mining will be [more profitable than the cost of running a GPU], anyone can get into AWS and do mining and be profitable. We’re building a distributed cloud computing platform for clients that can easily access the resources there but are not used.”
The startup works with companies that have a lot of spare GPUs sitting unused, such as gaming cloud companies or crypto mining companies. Teams that need GPUs for training their machine learning models get access to the raw hardware, while teams that just need those GPUs to handle inference get access to them through a set of APIs. There’s a distinction between the two because they are two sides of machine learning: the former builds the model that the latter uses to execute some task, like image or speech recognition. When the GPUs are idle, they run mining to pay the hardware providers, and Snark AI also offers the capability to both mine and run deep learning inference on a piece of hardware simultaneously, Buniatyan said.
Snark AI matches the proper amount of GPU power to whatever a team needs, and then deploys the work across a network of distributed idle cards that companies have in various data centers. It’s one way to potentially reduce the cost of a GPU over time: the card may be a substantial investment initially, but it can earn a return during the time it would otherwise sit unused. If that’s the case, it may also encourage more companies to sign up with a network like this, whether Snark AI or otherwise, and deploy similar cards.
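The matching described above can be pictured with a small sketch. This is not Snark AI’s actual scheduler, which the company has not detailed; it is a hypothetical illustration that assumes a simple pool of cards where any card without a machine learning job falls back to mining. The names `Gpu`, `allocate` and `release` are invented for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Gpu:
    """One card contributed by a hardware provider (hypothetical model)."""
    gpu_id: str
    provider: str
    job: Optional[str] = None  # None means no ML job is assigned

    @property
    def workload(self) -> str:
        # Assumption: idle cards mine so the provider still gets paid.
        return "mining" if self.job is None else self.job


def allocate(pool: List[Gpu], job: str, gpus_needed: int) -> List[Gpu]:
    """Assign `gpus_needed` idle cards to `job`; raise if too few are idle."""
    idle = [gpu for gpu in pool if gpu.job is None]
    if gpus_needed > len(idle):
        raise ValueError(f"{gpus_needed} GPUs requested, only {len(idle)} idle")
    chosen = idle[:gpus_needed]
    for gpu in chosen:
        gpu.job = job
    return chosen


def release(pool: List[Gpu], job: str) -> None:
    """Free every card running `job`, which returns those cards to mining."""
    for gpu in pool:
        if gpu.job == job:
            gpu.job = None


if __name__ == "__main__":
    pool = [
        Gpu("a0", "gaming-cloud"),
        Gpu("a1", "gaming-cloud"),
        Gpu("b0", "crypto-miner"),
    ]
    allocate(pool, "train-image-model", 2)
    print([gpu.workload for gpu in pool])
    release(pool, "train-image-model")
    print([gpu.workload for gpu in pool])
```

A real network would also have to weigh card type, data center location and price, none of which this sketch attempts to model.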
There’s also an emerging trend of specialized chips that focus on machine learning or inference, which look to reduce the cost, power consumption or space requirements of machine learning tasks. The startups in that ecosystem, like Cerebras Systems, Mythic, Graphcore or any of the other well-funded companies, all potentially have a shot at unseating GPUs for machine learning tasks. There’s also the emergence of ASICs, customized chips that are better suited to tasks like crypto mining, which could fracture an ecosystem like this, especially if the larger cloud providers decide to build or deploy something similar (such as Google’s TPU). But this also means that there’s room to potentially create some new interface layer that can snap up all the leftover hardware for tasks that companies need done but that don’t necessarily require bleeding-edge technology like that from those startups.
There’s always going to be the same argument that was made for Dropbox prior to its significant focus on enterprises and collaboration: the price of a service falls dramatically as it becomes more commoditized. That might be especially true for companies like Amazon and Google, which have already run that playbook, and could leverage their dominance in cloud computing to put a significant amount of pressure on a third-party network like Snark AI. Google also has the ability to build proprietary hardware like the TPU for specialized operations. But Buniatyan said the company’s focus on being able to juggle inference and mining, in addition to keeping the cost of idle GPUs low for companies that are just looking to deploy, should keep it viable, even amid a changing ecosystem that’s focusing on machine learning.