GOAI: The GPU Open Analytics Initiative Updates
The GPU Open Analytics Initiative (GOAI) aims to provide common data frameworks that let developers and statistical researchers accelerate data science on GPUs. Continuum Analytics, a creator of Python tools for GPUs; H2O.ai, which provides machine learning algorithms that run on GPUs; and MapD Technologies, which builds GPU-powered analytics software, are working together on this initiative. Because the platform is open source, it also helps foster a community around data science and deep learning workloads running on GPUs. A Python-based API will also be introduced soon to address certain complexities. Continuum Analytics, H2O.ai and MapD Technologies are the founding members of GOAI, which was unveiled at Nvidia’s annual GPU Technology Conference in San Jose, California.
The main objective of GOAI is to let different applications access and work with the same data in a GPU environment. To that end, the GPU Data Frame provides a common API that enables the efficient interchange of data between processes running on the GPU.
The idea is also to provide end-to-end computing on top of GPUs.
The First Project
The first project that the GOAI members are working on is called the GPU Data Frame, which resembles the Apache Arrow project in certain ways. Spark has a concept of a data frame, and so does the Python pandas library commonly used in analytics. A common data format of this kind would allow the different parts of a composite application that mashes up database, analytics, and machine learning functions to access the same data as it resides in GPU memory. With it, Continuum Analytics' Anaconda is mobilizing the Open Data Science movement by helping teams avoid transferring data between CPUs and GPUs and move nimbly toward their larger business goals.
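The idea of several stages sharing one columnar buffer instead of copying it can be sketched in a few lines of plain Python. This is only a CPU-memory analogy built on the standard library; it is not the GPU Data Frame API, and the names ColumnarFrame, scale_in_place and column_mean are hypothetical, chosen purely for illustration.

```python
from array import array


class ColumnarFrame:
    """Hypothetical stand-in for a shared data frame (CPU memory only)."""

    def __init__(self, **columns):
        # Each column lives in one contiguous typed buffer of doubles.
        self._columns = {name: array("d", values) for name, values in columns.items()}

    def view(self, name):
        # A memoryview exposes the column's buffer without copying it.
        return memoryview(self._columns[name])


def scale_in_place(column_view, factor):
    """'Analytics' stage: writes through the shared view."""
    for i in range(len(column_view)):
        column_view[i] = column_view[i] * factor


def column_mean(column_view):
    """'Machine learning' stage: reads the same underlying buffer."""
    return sum(column_view) / len(column_view)


frame = ColumnarFrame(price=[1.0, 2.0, 3.0, 4.0])
shared = frame.view("price")

# The first stage modifies the data in place...
scale_in_place(shared, 10.0)

# ...and the second stage sees the change through its own view, with no copy.
mean_price = column_mean(frame.view("price"))
print(mean_price)
```

Both stages work on the same buffer, so nothing is serialized or duplicated between them. The GPU Data Frame applies the same principle to data held in GPU memory, where avoiding transfers back to the CPU is the main saving.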
A working demo of the GPU pipeline was shown at the GPU Technology Conference. It is notable because it shows how the various participating frameworks can be integrated with GPU databases.
Opening Up the Database
Although the machine learning frameworks and popular analytics tools are all open source, the company's investors were, understandably, not keen on making the MapD database open source. MapD Technologies is a next-generation analytics software company. Its technology harnesses the massive parallelism of modern graphics processing units (GPUs) to power fast SQL queries and visualization of large data sets. The MapD analytics platform includes the MapD Core database and the MapD Immerse visualization client. According to MapD, these products give analysts and data scientists a faster time to insight than is possible with traditional CPU-based solutions. MapD software runs on premises and on all leading cloud providers.
About the Author
DataFactZ is a professional services company that provides consulting and implementation expertise to solve the complex data issues facing many organizations in the modern business environment. As a highly specialized system and data integration company, we are uniquely focused on solving complex data issues in the data warehousing and business intelligence markets.