
Solving ML’s Biggest Problems with Human-In-The-Loop Machine Learning

by Angela Guess

Lukas Biewald, CEO of CrowdFlower, recently wrote in Inside Big Data, “machine learning is getting easier because more and more of big players in the space are open-sourcing their algorithms. In the past year alone, IBM, Facebook, Google, and Microsoft have done so. Having those open-sourced algorithms means businesses spend far less time (and money) creating and fine-tuning their own models. Take all those things together and you can see why machine learning is near the peak of the technology hype cycle. There’s promise and ML is accessible. But there are two major issues that more and more companies are butting up against with regard to the promise of machine learning: accuracy and training data. And interestingly, both are solved with people. We call this human-in-the-loop machine learning.”

Biewald goes on, “First off, let’s talk about training data. There’s a reason that those big players I mentioned above open-sourced their algorithms without worrying too much about giving away any secrets: it’s because the actual secret sauce isn’t the algorithm, it’s the data. Just think about Google. They can release TensorFlow without a worry that someone else will come along and create a better search engine because there are over a trillion searches on Google each year. Those searches are training data and that training data comes from people; no algorithm can learn without data. After all, it’s not that machine learning models are smarter than people, it’s that they can parse and learn from near unfathomable amounts of data. But those models can’t figure out what to do with new data or how to make judgments on it without training data, created by humans, to actually inform their learning process.”

Read more here.

photo credit: Flickr/ Loozrboy
