11 open source projects for machine learning

Machine learning is a hot topic in data analysis, and machine learning algorithms are widely used in everyday study and work. Many of these algorithms have already been implemented many times over in Python, Java, and other languages. The implementations are easy to find online, but much of that open source code is "dirty" or "messy".

Against this background, InfoWorld recently published a list of 11 of the most popular open source projects in machine learning. Most of them relate to spam filtering, face recognition, and recommendation engines. They are built on today's most popular languages and platforms, and they implement and extend many important machine learning algorithms. Among them, users can find topic models such as LDA as well as hidden Markov models (HMM), both of which are widely applied and much in demand among researchers.

  1. Scikit-learn

    Scikit-learn is a powerful machine learning toolkit for Python. It builds on existing Python packages (NumPy, SciPy, and Matplotlib) to provide convenient mathematical tools. The toolkit includes many simple and efficient tools that are well suited to data mining and data analysis.

    The project homepage links to the User Guide, which serves as an index to the machine learning methods the library covers, and to the Reference, where users can look up the usage of each class.
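    As a rough illustration of what working with the toolkit looks like, the following minimal sketch trains a classifier on the iris dataset bundled with scikit-learn. The choice of a k-nearest-neighbors model, the 25% test split, and the fixed random seed are assumptions made for this example, not recommendations from the original list.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Load a small built-in dataset: 150 iris flowers, 4 features, 3 classes.
X, y = load_iris(return_X_y=True)

# Hold out a quarter of the samples to measure accuracy on unseen data.
# random_state is fixed only to make this example reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0
)

# Scikit-learn classifiers share the same fit / predict / score interface,
# so another classifier could be swapped in here with no other changes.
model = KNeighborsClassifier(n_neighbors=3)
model.fit(X_train, y_train)

accuracy = model.score(X_test, y_test)
print(f"test accuracy: {accuracy:.2f}")
```

    Because the interface is shared across classifiers, replacing `KNeighborsClassifier` with, for example, `DecisionTreeClassifier` from `sklearn.tree` should require changing only the import and the constructor line.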

  2. Shogun

    Shogun, created in 1999, is one of the oldest open source machine learning libraries and is written in C++. Through its SWIG-generated bindings, Shogun can be used from mainstream languages such as Java, Python, and C#. Its focus is on large-scale kernel methods, especially support vector machines. It also includes linear methods such as linear discriminant analysis (LDA) and the linear programming machine (LPM), as well as algorithms for training hidden Markov models (HMM).

  3. Accord Framework/AForge.net

    Accord is a machine learning and signal processing framework for .NET that extends AForge.net. It includes a range of machine learning algorithms for image and audio processing, such as face detection and SIFT image stitching. Accord also supports real-time tracking of moving objects, and its machine learning library ranges from neural networks to decision trees.

  4. Mahout

    Mahout is a well-known open source project of the Apache Software Foundation. It provides implementations of many classic machine learning algorithms, including clustering, classification, and recommendation, with the aim of helping developers build intelligent applications more quickly and easily. It also provides a convenient interface for cloud services.
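    To make the "recommendation" category more concrete, here is a minimal sketch of one simple approach, item co-occurrence counting, written in plain Python. It is not Mahout's API; the function names (`build_cooccurrence`, `recommend`) and the interaction data are hypothetical and exist only for illustration.

```python
from collections import defaultdict
from itertools import combinations


def build_cooccurrence(user_items):
    """Count how often each pair of items appears in the same user's history."""
    counts = defaultdict(lambda: defaultdict(int))
    for items in user_items.values():
        for a, b in combinations(sorted(set(items)), 2):
            counts[a][b] += 1
            counts[b][a] += 1
    return counts


def recommend(user, user_items, counts, top_n=3):
    """Rank items the user has not seen by co-occurrence with items they have."""
    seen = set(user_items[user])
    scores = defaultdict(int)
    for item in seen:
        for other, n in counts[item].items():
            if other not in seen:
                scores[other] += n
    # Sort by score (highest first), then by name for a deterministic order.
    ranked = sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))
    return [item for item, _ in ranked[:top_n]]


# Hypothetical interaction data: user -> items they liked.
history = {
    "alice": ["book", "film", "game"],
    "bob": ["book", "film"],
    "carol": ["film", "game", "music"],
    "dave": ["book"],
}
counts = build_cooccurrence(history)
print(recommend("dave", history, counts))  # ['film', 'game']
```

    Here "dave" has only liked "book", which co-occurs with "film" twice and with "game" once, so "film" is ranked first. A production system would add weighting and run the counting step in a distributed fashion.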

  5. MLlib

    MLlib is Apache's own machine learning library for Spark and Hadoop. It is designed to run most common machine learning algorithms at large scale and high speed. MLlib runs on the JVM (it is written mainly in Scala, with a Java API) and can also be used from languages such as Python. Users can write their own code on top of MLlib, which allows a high degree of customization.

  6. H2O

    H2O is the flagship product of 0xdata and a core data analysis platform. Its core is written in Java, with interfaces for R and Python. Users can install the H2O package for R and then run it from within the R environment. H2O's algorithms are aimed at business problems such as fraud detection and trend prediction. At the time of writing, 0xdata is in a new round of financing.

  7. Cloudera Oryx

    Oryx is another open source machine learning project designed for Hadoop, provided by the creators of the Cloudera Hadoop distribution. Oryx allows machine learning models to be applied to real-time data streams, for tasks such as spam filtering.
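    The idea of a model that keeps learning from a live stream can be sketched in a few lines of plain Python. The example below is not Oryx's API: the `StreamingNaiveBayes` class, its smoothing scheme, and the sample messages are all assumptions made for illustration. It shows a naive Bayes spam filter whose counts are updated one message at a time.

```python
import math
from collections import Counter


class StreamingNaiveBayes:
    """Multinomial naive Bayes whose counts are updated one message at a time."""

    def __init__(self):
        self.word_counts = {"spam": Counter(), "ham": Counter()}
        self.doc_counts = Counter()
        self.vocab = set()

    def update(self, text, label):
        """Fold one labeled message from the stream into the model."""
        words = text.lower().split()
        self.word_counts[label].update(words)
        self.doc_counts[label] += 1
        self.vocab.update(words)

    def predict(self, text):
        """Return the label with the higher log-probability for this message."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, counts in self.word_counts.items():
            # Smoothed class prior, so an unseen class never yields log(0).
            score = math.log((self.doc_counts[label] + 1) / (total_docs + 2))
            total_words = sum(counts.values())
            # Add-one smoothing with one extra slot reserved for unknown words.
            denom = total_words + len(self.vocab) + 1
            for w in words:
                score += math.log((counts[w] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical stream of (message, label) pairs arriving over time.
stream = [
    ("win a free prize now", "spam"),
    ("meeting moved to friday", "ham"),
    ("free money claim your prize", "spam"),
    ("lunch on friday with the team", "ham"),
]
model = StreamingNaiveBayes()
for text, label in stream:
    model.update(text, label)

print(model.predict("claim your free prize"))   # spam
print(model.predict("team meeting on friday"))  # ham
```

    Since `update` only increments counters, the model can be queried at any point in the stream without retraining from scratch.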

  8. GoLearn

    GoLearn is an all-in-one machine learning library for Google's Go language, with the goals of simplicity and customizability. Go is a language developed at Google and is increasingly widely used. GoLearn's simplicity lies in how data is loaded and handled within the library, and its customizability in how the data structures can be extended in source code.

  9. Weka

    Weka is an open source data mining project developed in Java. As an open data mining platform, Weka integrates a large number of machine learning algorithms for data mining tasks, including data preprocessing, classification, regression, and clustering. Weka also provides data visualization and lets users interact with the program through a graphical interface written in Java.

  10. CUDA-Convnet

    CUDA is NVIDIA's well-known GPU computing platform. CUDA-Convnet is a GPU-accelerated machine learning library for neural network applications. It is written in C++ and uses NVIDIA's CUDA GPU processing technology.

    The project has since been reworked as CUDA-Convnet2, which supports multiple GPUs and Kepler-generation GPUs. The Vulpes project is similar, but is written in F# and targets the .NET platform.

  11. ConvNetJS

    ConvNetJS is a JavaScript deep learning library that trains models online, directly in the browser. It helps deep learning beginners understand the algorithms faster, and its simple demos give users an intuitive explanation of how they work.
