“The whole point of Heterogeneous Computing is to have the right tools available, so you can use the right processor, in the right place, at the right time,” said Pat McGarry, the VP of Engineering at Ryft, in a recent DATAVERSITY® interview. Such a statement certainly sounds both pertinent and beneficial, but what does it mean for an enterprise in practice? Enterprises keep pushing for the best possible performance at the lowest cost while trying to fully leverage all of their data assets (whether big, small, or smart), in an environment where commodity x86-based, scale-out server architectures increasingly hit bottlenecks. That push, commented McGarry, has “reached a tipping point.”
Heterogeneous (also known as Hybrid) Computing is a term that many in the data center and server architecture world know well, as that proverbial tipping point has been on their radar for some time. Only now, though, is it starting to reach board rooms and the awareness of C-level executives. IT costs are on the rise: Big Data and its associated technologies and trends, such as Machine Learning, Artificial Intelligence, Advanced Analytics, the Internet of Things (IoT), Real-Time Processing, Edge Computing, the Cloud, and numerous others, have forced enterprises of all sizes to continually upgrade their server hardware. Some larger organizations now run thousands upon thousands of server nodes in server farms that span the globe.
Certainly, the overall cost of computing power has plummeted in the past decade or so: commodity servers running multi-core, sequential CPUs (and now GPUs) with gigabytes of RAM have never been cheaper. But the need to constantly add more machines, with an ever-decreasing return on investment, especially in terms of actual computing power, is causing enterprises to rethink their architectural philosophies and look at other options. According to McGarry:
“The real reason why [people are looking at other options] is because right now, today the state of the industry is to use large x86-based clusters running various cluster technologies, running software like Hadoop, Spark, or Storm, and a few others. And the problem with that is a classic hammer and nail problem. These tools that we have now, these clustering technologies based on x86 monolithic architectures, are based literally on seventy-year-old von Neumann technology. Seventy. Now, think about that for a second.”
These traditional data center technologies, used in server farms everywhere in the world, have reached the point of diminishing returns, even when scaled out to thousands or tens of thousands of machines. Those diminishing returns cause IT costs to skyrocket for enterprises seeking to reliably process their streaming data in real time at the edge of their networks, whether that edge is a retail storefront, manufacturing sensors, mobile phones, or RFID tags. Add that IoT “edge” data to their other Big Data assets and traditional transactional/legacy data, and the modern enterprise is stuck in a data quagmire. “That’s the whole problem in a nutshell,” said McGarry. “Sequential architectures drive the way we do business, as opposed to the business driving our technical architectures.”
Architecture 101 Quick Study
Enterprises are realizing “in a kind of backwards fashion,” said McGarry, that they can no longer get the needed performance from their existing architectures, so they’ve started to look elsewhere. The fundamental answer, “it turns out, is a simple one: we can make all this happen by using Hybrid or Heterogeneous Computing architectures. Don’t throw away your existing cluster, but add more to it.” Such architectures involve a few key concepts:
- John von Neumann Architectures: Pioneered by John von Neumann in a 1945 paper describing the sequential, stored-program model, these are the traditional CPUs used in computers all over the world. They also include newer GPUs, which add considerably more cores and other technological advances but are in essence the same. They employ a fixed hardware structure, with sequential instruction sets and fixed bus widths, and rely on “a type of software parallelism” usually expressed through languages such as Java or Python (a minimal sketch of that software parallelism follows this list). They offer considerable flexibility in terms of actual computing tasks and ease of programming, and thus have been the go-to processor technology for decades.
- FPGAs: An acronym for Field Programmable Gate Array, these processor types follow a completely different philosophy. An FPGA is literally “a sea of gates, a whole sea, millions of logic gates,” said McGarry. Whereas a CPU/GPU already has all its gates permanently connected and thus can’t be changed, an FPGA allows the constant rewiring and mixing of the gates at the hardware level to do anything the programmer wants. FPGAs are not constrained by bus widths, provide much lower latency than CPU/GPU architectures, and offer true parallelism at the hardware level, not glorified software parallelism. They are a completely different processing architecture.
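To make the contrast concrete, here is a minimal Python sketch of the “software parallelism” a von Neumann machine relies on; the log records and the eight-way split are illustrative assumptions. Each worker is still a sequential instruction stream scheduled onto a core, whereas an FPGA would evaluate the same matching logic concurrently in hardware gates.

```python
# A minimal sketch (illustrative data, hypothetical 8-way split) of the
# "software parallelism" the von Neumann model relies on: each worker below is
# still a sequential fetch-decode-execute instruction stream scheduled onto a
# CPU core. An FPGA, by contrast, wires the matching logic into gates so every
# comparison proceeds concurrently in hardware.
from concurrent.futures import ProcessPoolExecutor

def count_matches(chunk, term):
    # Sequential scan: one comparison after another.
    return sum(1 for record in chunk if term in record)

if __name__ == "__main__":
    records = ["error: disk full", "ok", "error: timeout", "ok"] * 250_000
    chunks = [records[i::8] for i in range(8)]  # split the data across 8 workers

    with ProcessPoolExecutor(max_workers=8) as pool:
        total = sum(pool.map(count_matches, chunks, ["error"] * len(chunks)))
    print(total)  # number of records containing "error"
```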
So why aren’t FPGAs widely used if they are seemingly so much better than von Neumann architectures? The discussion is complicated, but can be distilled down to a few easier-to-digest points:
- Ease of Use: Back in the 1970s and ’80s, when these architectures were coming online and computers were becoming more popular, FPGAs were simply too complicated to program. According to a 2015 article by McGarry, “FPGAs required very specific programming expertise that was rare and, therefore, expensive.” CPUs were much easier to program and, at the time, could easily handle the processing power required of them. Such was the case until recently. Open APIs, the coupling of FPGA and x86 fabrics, and simplified (especially pre-defined) algorithms have made FPGAs much more usable.
- Education: CPUs were the primary architecture in most computers, so colleges and trade schools educated their students in von Neumann programming styles. FPGAs were still used, especially in the government and financial sectors, but learning to program them was a different path of study. “It requires a different type of program mentality,” said McGarry. “People that are good in programming software are not good at programming FPGAs in general. It’s a parallel way of thinking, not a sequential way of thinking — it’s completely different.” So, outside of a few specialized areas, FPGAs have not seen wide use.
- Use Case Need: CPU-based technologies have worked well for decades. Major chip manufacturers have been able to ride Moore’s Law for more than 50 years with great success, though that success is now deteriorating. The finite instruction sets inherent to sequential processing architectures have reached the point where even massive scale-out hits bottlenecks for many organizations. FPGAs, by contrast, have been shown to be at least 100x faster in specific, data-centric analytics applications like fuzzy search and term frequency (a sketch of one such workload follows this list). “You buy about 200 to 400 times faster in most cases,” said McGarry. “We’re actually very conservative in our marketing literature. On our website, we only talk about a 100x increase.”
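For readers unfamiliar with the workload, here is a minimal Python sketch of fuzzy search, the kind of data-centric analytics task cited above; the sample records are illustrative. On a CPU, the dynamic-programming comparison runs one cell at a time, which is exactly the sequential bottleneck an FPGA can collapse into parallel gate evaluations.

```python
# A minimal sketch of fuzzy (approximate) search, one of the data-centric
# analytics workloads cited above; the sample records are illustrative.
# On a CPU this dynamic program fills its table cell by cell, sequentially.

def edit_distance(a, b):
    """Levenshtein distance via classic dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        curr = [i]
        for j, cb in enumerate(b, 1):
            curr.append(min(prev[j] + 1,                 # deletion
                            curr[j - 1] + 1,             # insertion
                            prev[j - 1] + (ca != cb)))   # substitution
        prev = curr
    return prev[-1]

def fuzzy_search(records, term, max_edits=2):
    """Return the records containing a word within max_edits of term."""
    return [r for r in records
            if any(edit_distance(w, term) <= max_edits for w in r.split())]

hits = fuzzy_search(["transaction complete", "transacton complete", "refund"],
                    "transaction")
print(hits)  # both spellings match, despite the typo in the second record
```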
Business Solutions with a Purpose
Heterogeneous Computing is about focusing on purpose. Whether a general-purpose task is best suited to CPUs/GPUs or a purpose-built problem is best suited to FPGAs, the point is to build an environment that allows flexibility around the business solution.
“There are lots of vendors making FPGAs now,” commented McGarry. “The two biggest are Xilinx and Altera, but as you may know, Altera was recently bought by Intel for about $16.7 billion, so clearly Intel has figured out that FPGAs are important.”
Such a focus by major chip manufacturers is allowing enterprises that previously couldn’t even consider working with FPGAs, much less within a highly complex Heterogeneous Computing landscape, to move into this new space, said McGarry:
“You have a heterogeneous environment now which is everything and its mother all together: x86s and GPUs, along with FPGAs and then even other processor types like ARM processors, or maybe even others. All those things can equally fit in a heterogeneous architecture and the key becomes abstracting all those interfaces away programmatically with a business-level API. That’s the crux of the situation. That makes it all work, when you have all this technology to do the work for you, all the tools in your toolbox, with ease of use, then you’re solving whatever business problem you need to solve.”
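What such a business-level API might look like is sketched below in Python. The backend functions and routing table are hypothetical illustrations of the abstraction idea, not any vendor’s actual interface: the caller names the business operation, and the dispatch layer chooses the processor type.

```python
# An illustrative sketch of a "business-level API" over a heterogeneous
# cluster. The backend functions and routing table are hypothetical, not any
# vendor's actual interface; they stand in for x86, GPU, and FPGA resources.

def run_on_x86(op, data):    # general-purpose fallback
    return f"x86 ran {op} on {len(data)} records"

def run_on_gpu(op, data):    # dense, numeric, batch-friendly work
    return f"GPU ran {op} on {len(data)} records"

def run_on_fpga(op, data):   # pattern matching and streaming analytics
    return f"FPGA ran {op} on {len(data)} records"

# Which processor type suits which business operation.
ROUTES = {
    "fuzzy_search": run_on_fpga,
    "matrix_multiply": run_on_gpu,
}

def analyze(op, data):
    """Business-level entry point: callers never name a processor type."""
    backend = ROUTES.get(op, run_on_x86)  # default to general-purpose x86
    return backend(op, data)

print(analyze("fuzzy_search", ["a", "b", "c"]))  # routed to the FPGA backend
print(analyze("aggregate", ["a", "b", "c"]))     # falls through to x86
```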
Apache Spark is at the forefront of this evolution, since it is “probably the current best of breed cluster management platform,” said McGarry. Its preferred-node architecture allows specific business algorithms to be directed to particular nodes in a cluster:
“Therefore, instead of being one size fits all of only x86 nodes, you can have your x86 nodes along with a Ryft FPGA node and even a GPU node, so you can segment your problem space automatically by the Spark cluster, and thus send things that Ryft does better to Ryft, and so forth.”
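A conceptual PySpark sketch of that segmentation follows. The records are illustrative and the send_to_ryft() hand-off is hypothetical; in a real deployment, Spark’s scheduler and preferred locations would place the FPGA-suited partitions on the Ryft node rather than this simplified split. The point is only to show a single job being divided by which processor type suits each piece.

```python
# A conceptual PySpark sketch of segmenting the problem space by processor
# type. The records are illustrative and send_to_ryft() is hypothetical; in a
# real deployment, Spark's scheduler and preferred locations would place the
# FPGA-suited work on the Ryft node instead of this simplified split.
from pyspark import SparkContext

sc = SparkContext(appName="heterogeneous-demo")
records = sc.parallelize(["error: timeout", "ok", "error: disk full", "ok"] * 1000)

# General-purpose aggregation stays on the x86 executors...
ok_count = records.filter(lambda r: r == "ok").count()

# ...while the search-heavy subset is gathered for the FPGA-suited path.
error_records = records.filter(lambda r: r.startswith("error")).collect()
# send_to_ryft(error_records)  # hypothetical hand-off to the FPGA node

print(ok_count, len(error_records))
sc.stop()
```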
Such revelations have not been lost on many larger enterprises. The US government has been using FPGA architectures for more than thirty years. More recently, the financial industry has employed such nodes for high-frequency trading, market movement tracking, fraud detection, and more. Microsoft employs FPGA technologies for Bing Search. Other use cases include Machine Learning applications, advanced visualization, anomaly detection, and streaming analytics across a variety of industries, from healthcare to energy, manufacturing to online retail.
“The beauty of Heterogeneous Computing,” said McGarry, “is that you can do your analytics directly where the data is created, or if you choose, you can cleanse the data at the edge to reduce the amount of it, and then send it to your data center for analysis.” Real-time processing can become a reality since the ever-present bottlenecks are reduced through the use of the right technologies for the job.
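As a concrete illustration of that edge pattern, here is a minimal Python sketch with illustrative sensor values and thresholds: raw readings are cleansed where they are created, reduced to a small summary, and only the summary is sent on to the data center.

```python
# A minimal sketch of the edge pattern described above, with illustrative
# sensor values and thresholds: cleanse the raw stream where it is created,
# reduce it to a small summary, and send only the summary to the data center.
import json
import statistics

def cleanse_at_edge(readings, low=0.0, high=150.0):
    """Drop out-of-range sensor glitches before they leave the edge."""
    return [r for r in readings if low <= r <= high]

def summarize(readings):
    """Reduce many raw points to a compact payload for central analysis."""
    return {
        "count": len(readings),
        "mean": round(statistics.mean(readings), 2),
        "max": max(readings),
    }

raw = [72.1, 73.4, -999.0, 71.8, 500.2, 74.0]  # -999.0 and 500.2 are glitches
payload = summarize(cleanse_at_edge(raw))
print(json.dumps(payload))  # only this small payload crosses the network
```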
True Heterogeneous Computing is only just beginning to be realized. It will take more investment and time from the larger hardware manufacturers to get the technologies into widespread use, but the technical work of bringing the different paradigms together is moving forward. As this process continues, “I think we will see more movement away from the low-level technical development to solving business-level problems with business-level APIs. We are already doing that,” said McGarry. “There is a convergence. You’ll see the silicon converge and you’ll see the APIs converge. That’s what is coming in the future.”