Spark speeds up SQL on Hadoop in Splice Machine 2.0
Apache Spark rose to fame some time ago as an in-memory data-processing framework often used with Hadoop. It is now fast becoming a nucleus for building data-processing products. Splice Machine, a SQL RDBMS on Hadoop, has newly released version 2.0, which uses Spark as one of two processing engines: incoming work is divided depending on whether it is an OLTP or an OLAP workload.
About the platform
Splice Machine has a strong background. It first made a name for itself as a replacement for conventional ACID RDBMS solutions such as Oracle on multi-terabyte workloads. The company claims, for instance, that one former Oracle customer's workload ran an order of magnitude faster on Splice Machine. Hadoop's native scale-out architecture let the solution grow with the size of those workloads, at a cost much lower than that of a conventional RDBMS.
What it can do
The big new innovation is a pair of abilities: the platform can now run OLTP and OLAP workloads side by side, against the same data and within the same architecture, but on different processing engines. This makes it much easier to drive business decisions from the data. The architecture routes each incoming query to the appropriate computational engine: transactional queries run on HBase, while analytical queries are processed by Spark. As a result, the memory and CPU usage of the two types of queries are isolated from each other.
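To make the routing idea concrete, here is a minimal, hypothetical sketch of how a dual-engine architecture might classify queries. This is not Splice Machine's actual optimizer; the function name, the hint list, and the classification rule are all illustrative assumptions. Real systems use a cost-based optimizer, not simple keyword matching.

```python
# Toy model (NOT Splice Machine's real code): route short, key-based OLTP
# statements to HBase and scan-heavy analytical statements to Spark.

# Constructs that usually signal analytical (OLAP) work.
OLAP_HINTS = ("group by", "join", "sum(", "avg(", "count(")

def route_query(sql: str) -> str:
    """Return the engine this toy model would route a query to."""
    s = sql.lower()
    if any(hint in s for hint in OLAP_HINTS):
        return "spark"   # OLAP: aggregation/joins, isolated memory and CPU
    return "hbase"       # OLTP: point reads and writes

print(route_query("SELECT name FROM users WHERE id = 42"))
print(route_query("SELECT region, SUM(sales) FROM orders GROUP BY region"))
```

The point of separating the engines, as described above, is that a long analytical scan running in Spark cannot starve short transactional requests served by HBase of memory or CPU.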
The new techniques
According to many people working with the platform, the decision to add Spark to Splice Machine may have been inevitable. The platform's original aim was to give data scientists and other users an easy way to perform kinds of data processing that would otherwise require a great deal of code. Splice has been used to rewrite a number of important data-transformation products, including one belonging to IBM, and in doing so added entirely new functionality, so merely enhancing the existing product would not have been enough. The platform now faces stiff competition in a field that is constantly growing and improving, with a wealth of alternatives such as NoSQL, NewSQL, and purpose-built in-memory processing options.
About the Author
DataFactZ is a professional services company that provides consulting and implementation expertise to solve the complex data issues facing many organizations in the modern business environment. As a highly specialized system and data integration company, we are uniquely focused on solving complex data issues in the data warehousing and business intelligence markets.