Microsoft rolls out open source toolkit for machine learning
Google recently released an open source machine learning framework. Now Microsoft has released a similar project of its own: DMTK, the Distributed Machine Learning Toolkit. It simplifies machine learning work on distributed systems by allowing models to be trained across multiple nodes at once, an important capability for large-scale machine learning.
About the new toolkit
In introducing the framework, Microsoft noted that bigger models are generally more accurate, but that training them is a challenge for ordinary machine learning researchers and practitioners. The core of DMTK is a C++ SDK built on a client-server architecture. According to Microsoft's documentation, a number of server instances run on different machines and are responsible for maintaining the global parameters of the model. The training routines, acting as clients, access and update those parameters through client APIs, which in turn call the underlying communication facilities.
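The client-server arrangement described above can be illustrated with a small sketch. This is not the toolkit's C++ API: the ParameterServer class, its get and add methods, and the worker and train functions are hypothetical names chosen for illustration, and threads on one machine stand in for separate nodes. Each worker pulls the current parameter, computes a gradient on its own shard of data, and pushes an update back to the shared server.

```python
import threading


class ParameterServer:
    """Hypothetical stand-in for a server that holds the global parameters."""

    def __init__(self, size):
        self._params = [0.0] * size
        self._lock = threading.Lock()

    def get(self):
        # Return a copy so that workers never mutate server state directly.
        with self._lock:
            return list(self._params)

    def add(self, delta):
        # Apply an additive update sent by a worker.
        with self._lock:
            for i, d in enumerate(delta):
                self._params[i] += d


def worker(server, shard, lr=0.05, steps=200):
    """Fit y = w * x on a local data shard with pull / compute / push steps."""
    for _ in range(steps):
        (w,) = server.get()  # pull the current global parameter
        grad = sum(2 * (w * x - y) * x for x, y in shard) / len(shard)
        server.add([-lr * grad])  # push the update back


def train(shards):
    """Run one worker per shard (threads stand in for machines here)."""
    server = ParameterServer(size=1)
    threads = [threading.Thread(target=worker, args=(server, s)) for s in shards]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return server.get()[0]


if __name__ == "__main__":
    # Both shards are drawn from y = 3x, so the learned w should be close to 3.
    shards = [[(1.0, 3.0), (2.0, 6.0)], [(1.0, 3.0), (3.0, 9.0)]]
    print(train(shards))
```

The point of the design is that workers never talk to each other directly. They only read from and write to the server, which is what lets the training code ignore where the other workers are running.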
What it wants
Microsoft's goal is to make it easier for data scientists to run machine learning jobs across multiple machine nodes. With DMTK, they should not have to worry about the nitty-gritty of managing threads or distributing large workloads, and the toolkit also handles interprocess communication for them. Two different libraries are currently available for that communication, and they may soon be usable interchangeably.
Some other features of the framework
DMTK ships with two major algorithms. The first, LightLDA, is intended for fast training of large topic models. Microsoft claims it can train models with trillions of parameters on a cluster of just eight nodes. The second is word embedding, offered in two variants: distributed word embedding and distributed multi-sense word embedding. These algorithms determine how different words relate to one another.
Microsoft kept the release of DMTK extremely low key. The only fanfare was a string of blog posts. According to the DMTK website, the toolkit has been available to all users since early November. Microsoft says this is only the starting point, however: it plans to offer much more with DMTK, and further algorithms are on the way.
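As a rough illustration of what it means for word embeddings to capture relationships between words, the sketch below compares toy vectors with cosine similarity. The vectors are made up by hand for this example and are not output from the toolkit, and the function names cosine_similarity and most_similar are hypothetical. A real embedding algorithm would learn the vectors from a large text corpus.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Hand-made toy vectors (an assumption for illustration, not learned values).
EMBEDDINGS = {
    "king": [0.90, 0.80, 0.10],
    "queen": [0.85, 0.82, 0.15],
    "banana": [0.10, 0.05, 0.95],
}


def most_similar(word, embeddings):
    """Return the other word whose vector is closest to the given word's."""
    target = embeddings[word]
    scored = [
        (other, cosine_similarity(target, vec))
        for other, vec in embeddings.items()
        if other != word
    ]
    return max(scored, key=lambda pair: pair[1])[0]


if __name__ == "__main__":
    print(most_similar("king", EMBEDDINGS))
```

Words that appear in similar contexts end up with vectors pointing in similar directions, so a simple geometric comparison like this is enough to surface related words.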
About the Author
DataFactZ is a professional services company that provides consulting and implementation expertise to solve the complex data issues facing many organizations in the modern business environment. As a highly specialized system and data integration company, we are uniquely focused on solving complex data issues in the data warehousing and business intelligence markets.