Introduction

As a data scientist, I have always hoped that the top technology companies would keep launching products in fields related to my work. If you watched Apple's latest iPhone X announcement, you saw that the iPhone X has some really cool features such as Face ID, Animoji, and Augmented Reality, all of which use machine learning. Being a hacker myself, I decided to explore how to build such a system. After some investigation, I found an interesting tool: CoreML, Apple's official machine learning framework for developers. It can be used on any Apple device, including iPhone, MacBook, Apple TV, and Apple Watch.
Another interesting revelation is that Apple designed a custom GPU for its latest iPhones, along with the A11 Bionic chip, which includes a Neural Engine optimized for machine learning. As the core computing hardware becomes more powerful, the iPhone will open up new avenues for machine learning, and CoreML will become increasingly important. After reading this article, you will understand what Apple CoreML is and why it is gaining momentum. We will also explore the implementation details of CoreML by developing a spam SMS classification app for iPhone, and we will end the article with an objective evaluation of the pros and cons of CoreML.

Article Directory:

01. What is CoreML?
02. Setting up the system
03. Case Study: Implementing a spam SMS classification app on iPhone
04. Advantages and disadvantages of CoreML
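01. What is CoreML?

This year, Apple promoted CoreML heavily at its annual Worldwide Developers Conference (WWDC), its counterpart to Google's I/O conference. To better understand the role of CoreML, we need some background.

Background on CoreML

Interestingly, this is not the first time Apple has released a mobile machine learning framework. Last year it released two libraries along the same lines:

1. Accelerate and Basic Neural Network Subroutines (BNNS), which use the CPU efficiently to run neural network predictions.
2. Metal Performance Shaders CNN (MPSCNN), which uses the GPU efficiently to run neural network predictions.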
The difference between these two libraries is that one is optimized for the CPU and the other for the GPU. This matters because during inference the CPU is sometimes faster than the GPU, while during training the GPU is almost always faster. But to improve performance, these frameworks sit very close to the underlying hardware, which makes them confusing for developers and difficult to program against.

Enter CoreML

CoreML provides an abstraction layer on top of the two libraries mentioned above and exposes a simple interface that achieves the same efficiency. Another benefit is that while our app is running, CoreML takes care of the context switching between the CPU and GPU. In other words, if we have a memory-intensive task such as text processing (natural language processing), CoreML will automatically run it on the CPU; if we have a computationally heavy task such as image recognition, it will use the GPU; and when an app contains both, it switches automatically so that both are used to the fullest.

What else does CoreML offer?

CoreML also comes with three libraries built on top of it:

1. Vision, for image analysis.
2. Foundation, for natural language processing.
3. GameplayKit, for evaluating learned decision trees.
All of these libraries are exposed through simple interfaces and can be used to complete a range of tasks. With them included, the final CoreML framework is layered as follows: the CPU and GPU libraries at the bottom, CoreML above them, the three domain libraries above CoreML, and your app on top.

Note that this design gives iOS apps a nice modular structure. You can use different layers for different tasks and combine them in several ways (for example, using NLP together with image classification in one app). Learn more: Vision, Foundation, and GameplayKit. OK, now that we have enough theory, it's time to put it into practice!

02. Setting up the system

To make full use of CoreML, you need the following:

1. OS: macOS (Sierra 10.12 or above)
2. Python 2.7 and pip: Click to download Python for Mac. Then open the terminal and install pip.
3. coremltools: This package converts your model from Python into a format that CoreML can understand. Install it from the terminal with pip install -U coremltools.
4. Xcode 9: This is the default software for building applications for Apple devices. Click here to download. Before downloading Xcode, you need to log in with your Apple ID. After logging in, you will be asked to verify your Apple ID: a notification is sent to the device registered to your Apple ID. Click "Allow" and enter the 6-digit code on the website. Once you complete this step, you will see a download option, and you can download Xcode from there.

Now that the system is set up, let's move on to the implementation!

03. Case Study: Implementing a spam SMS classification app on iPhone

In this case study, we will use CoreML's capabilities in two important ways: converting a trained machine learning model to CoreML format, and using that model in an app. Let's get started!

Convert your machine learning model to CoreML format

One of the strengths of CoreML, or should I say a wise decision by its creators, is its support for converting machine learning models trained in other popular frameworks such as sklearn, caffe, and xgboost. The data science community has little reason to hesitate over CoreML, since they can experiment in their favorite environment, train their models there, and then easily import and use them in iOS/macOS apps. The frameworks that CoreML supports out of the box are Caffe, Keras, XGBoost, scikit-learn, and LIBSVM.

What is mlmodel?

To keep the conversion process simple, Apple designed its own open format for representing machine learning models across frameworks: mlmodel. A model file contains a description of each layer of the model, its inputs, outputs, class labels, and any preprocessing the data requires. It also contains the learned parameters (weights and biases). The conversion process is as follows:

1. Train the model in your preferred framework.
2. Convert it to an .mlmodel file with the coremltools Python package.
3. Use the .mlmodel file in your app.
In this example, we will train a spam classifier in sklearn and then transfer the model to CoreML.

About the spam text message dataset

The SMS Spam Collection v.1 is a public set of labeled SMS messages collected for mobile spam research. It contains 5,574 real, non-encoded English text messages, each labeled as either legitimate (ham) or spam. You can download the dataset here.

Building a basic model

We use LinearSVC in sklearn to build the basic model, with the TF-IDF values of the text messages as the model's features. TF-IDF is a natural language processing technique that weights each word by how strongly it distinguishes a document from the rest of the collection, so that documents can be classified by their most characteristic words. If you want to learn more about NLP and TF-IDF, you can read this article.
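A minimal sketch of this step, not necessarily the exact code used for the app, might look like the following. It assumes the dataset has been saved as SMSSpamCollection.txt in its usual tab-separated layout (label first, then the message, with the labels ham and spam) and that scikit-learn is installed. The helper names load_sms_dataset and train_spam_classifier are hypothetical.

```python
"""Train a TF-IDF + LinearSVC spam classifier (illustrative sketch)."""


def load_sms_dataset(path):
    """Read a tab-separated 'label<TAB>message' file into two parallel lists."""
    labels, messages = [], []
    with open(path, encoding="utf-8", errors="replace") as handle:
        for line in handle:
            label, separator, text = line.rstrip("\n").partition("\t")
            # Skip blank lines and lines that lack the tab separator.
            if separator and text:
                labels.append(label)
                messages.append(text)
    return labels, messages


def train_spam_classifier(messages, labels):
    """Fit TF-IDF features and a linear SVM; return (vectorizer, model)."""
    # Imported here so the parsing helper stays usable without scikit-learn.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.svm import LinearSVC

    vectorizer = TfidfVectorizer()
    features = vectorizer.fit_transform(messages)  # sparse TF-IDF matrix
    model = LinearSVC()
    model.fit(features, labels)
    return vectorizer, model
```

Calling load_sms_dataset("SMSSpamCollection.txt") and passing the result to train_spam_classifier(messages, labels) returns the fitted vectorizer and classifier. A new message is then classified with model.predict(vectorizer.transform([text])), which yields the predicted label for that message.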
Now that our model is built, we can test it on a spam message. Interesting, our model works pretty well! Next, we add cross-validation and then convert the trained model to CoreML format.
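Cross-validation and conversion can be sketched as follows, under stated assumptions. The input and output names message and spam_or_not are illustrative choices, not names confirmed by the article; the file name SpamMessageClassifier.mlmodel matches the one used later in the Xcode project. The conversion relies on the scikit-learn converter in coremltools, and which scikit-learn versions it accepts depends on the coremltools release you have installed.

```python
"""Cross-validate the spam classifier, then export it in CoreML format."""

MODEL_FILENAME = "SpamMessageClassifier.mlmodel"
INPUT_NAME = "message"        # assumed name for the TF-IDF input feature
OUTPUT_NAME = "spam_or_not"   # assumed name for the predicted label


def cross_validate(messages, labels, folds=5):
    """Return one accuracy score per fold for TF-IDF + LinearSVC."""
    # Imported lazily so this module loads even where scikit-learn is absent.
    from sklearn.feature_extraction.text import TfidfVectorizer
    from sklearn.model_selection import cross_val_score
    from sklearn.pipeline import make_pipeline
    from sklearn.svm import LinearSVC

    # Wrapping both steps in a pipeline refits TF-IDF on each training fold,
    # so no statistics leak from the held-out messages.
    pipeline = make_pipeline(TfidfVectorizer(), LinearSVC())
    return cross_val_score(pipeline, messages, labels, cv=folds)


def convert_to_coreml(fitted_classifier, path=MODEL_FILENAME):
    """Convert a fitted LinearSVC to CoreML format and save it to `path`."""
    import coremltools  # lazy import: only needed for the export step

    # Only the classifier is converted here; the TF-IDF step has to be
    # recomputed inside the app before calling the model.
    coreml_model = coremltools.converters.sklearn.convert(
        fitted_classifier, INPUT_NAME, OUTPUT_NAME
    )
    # Metadata shown by Xcode when the .mlmodel file is opened.
    coreml_model.short_description = "Classify whether a message is spam or not"
    coreml_model.input_description[INPUT_NAME] = "TF-IDF vector of the message"
    coreml_model.output_description[OUTPUT_NAME] = "Whether the message is spam"
    coreml_model.save(path)
    return coreml_model
```

Here cross_validate takes the raw messages and labels, while convert_to_coreml takes the fitted LinearSVC on its own, since the model the app receives expects an already computed TF-IDF vector as input.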
How does this work? First, we import the coremltools Python package. Then we select a converter for the model. In this case we use converters.sklearn, because the model to be converted was built with sklearn. Next, we pass the model object, the input variable name, and the output variable name to .convert(). We then set the model's metadata to describe the input and output in more detail, and finally call .save() to write the converted model to a file in CoreML format.

Double-click the model file and it will open in Xcode. The model file shows a lot of information: the type of model, its inputs and outputs, their types, and so on. You can compare these descriptions with the ones provided when converting to .mlmodel. It's that simple to import your own model into CoreML. Now that your model is in the Apple ecosystem, the real fun begins!

Note: The full code for this step can be found here. Learn more about coremltools here and the different types of converters provided here.

Using this model in our app

Now that we have trained our model and converted it to CoreML format, let's use it to develop an iPhone spam classification app! We will run the app on the simulator. The simulator is software that displays the app's interface and behavior as if it were actually running on an iPhone. This saves a lot of time, because we can test and debug the code before running the app on a real iPhone.

Download the project

I have made a simple basic UI for our app and put it on GitHub. Clone the repository and open the project from the terminal.
This opens our project in Xcode. Three areas of the Xcode window matter for what follows: the play button in the upper left corner, the Project Navigator, and the code editor.
Let's run the app and see what happens. Click the play button in the upper left corner to run the app in the simulator. Type some text in the box and click the Predict button. What happened? So far, our app does nothing but echo the text typed into the box.

Add a trained model to your app

This is pretty simple: drag the SpamMessageClassifier.mlmodel file into the Project Navigator in Xcode.
Compile the model

Before we can use our model for inference, we need to tell Xcode to compile the model during the build phase. Here are the steps:

1. Select the file with the blue icon in the Project Navigator. The project settings open on the right-hand side.
2. Click Compile Sources and select the + icon.
3. In the window that appears, select the SpamMessageClassifier.mlmodel file and click Add.

Now every time you run the app, Xcode will compile our machine learning model so that it can be used to make predictions.

Creating the model in code

Apps for Apple devices are usually written in Swift. You don't need to learn Swift to follow along, but if you are interested in learning more later, you can follow this tutorial.

Select ViewController.swift in the Project Navigator. This file contains most of the code that controls the app's functionality. The predictSpam() function on line 24 does most of the work. Delete line 25 and add the prediction code to the function in its place.
The prediction code checks whether the user has entered any text in the box. If so, it calls the tfidf() function to compute the TF-IDF representation of the text. It then creates an instance of SpamMessageClassifier and calls its .prediction() function, the counterpart of the .predict() function in sklearn. Finally, it displays the appropriate message based on the prediction.

But why is tfidf() needed? Remember that we trained the model on the TF-IDF representation of the text, so the model needs its input in the same form. Once we have the message typed into the text box, we call tfidf() to transform it in the same way. The tfidf() function is placed right after the predictSpam() function.
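The Swift implementation of tfidf() belongs to the app project. As a language-neutral illustration, the computation it has to reproduce can be sketched in Python. The sketch assumes the model was trained with the default settings of scikit-learn's TfidfVectorizer (lowercasing, tokens of two or more word characters, raw term counts, smoothed IDF, L2 normalization, alphabetically ordered vocabulary). The function names are hypothetical.

```python
"""Pure-Python TF-IDF that mirrors scikit-learn's default settings (sketch)."""
import math
import re
from collections import Counter

# Tokens are runs of two or more word characters, as in TfidfVectorizer.
TOKEN_PATTERN = re.compile(r"\b\w\w+\b")


def tokenize(text):
    """Lowercase the text and split it into tokens."""
    return TOKEN_PATTERN.findall(text.lower())


def fit_idf(corpus):
    """Build the sorted vocabulary and smoothed IDF weights from training text."""
    document_count = len(corpus)
    document_frequency = Counter()
    for document in corpus:
        # Count each term once per document.
        document_frequency.update(set(tokenize(document)))
    vocabulary = sorted(document_frequency)
    idf = {
        term: math.log((1 + document_count) / (1 + document_frequency[term])) + 1.0
        for term in vocabulary
    }
    return vocabulary, idf


def tfidf_vector(text, vocabulary, idf):
    """Return the L2-normalized TF-IDF vector of `text` over `vocabulary`."""
    counts = Counter(tokenize(text))
    weights = [counts[term] * idf[term] for term in vocabulary]
    norm = math.sqrt(sum(weight * weight for weight in weights))
    # Text with no known words maps to the all-zero vector.
    return [weight / norm for weight in weights] if norm > 0 else weights
```

The vocabulary and IDF weights must come from the same messages the classifier was trained on, which is why the app needs access to the training dataset: a vector built from a different vocabulary would not line up with the model's learned weights.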
The tfidf() function computes the TF-IDF representation of the text entered in the box. To do this, it reads the original SMSSpamCollection.txt dataset, so that the result has the same form as the features used in training. Once you save the project and run the simulator again, your app will run fine.

04. Advantages and disadvantages of CoreML

Like every evolving library, CoreML has pros and cons. Let's spell them out.

Advantages:
Because it can use the CPU, you can run it on the iOS Simulator (which does not support the GPU).

It supports many models, because it can import models from other mainstream machine learning frameworks.

Disadvantages:
Conclusion

In this article, we learned about CoreML and applied it to develop an iPhone machine learning app. CoreML is a relatively new library, so it has its own advantages and disadvantages. One very useful advantage is that it runs on the local device, which makes it faster and keeps data private. At the same time, it is not yet comprehensive and does not serve the needs of data scientists well enough. I hope that subsequent versions will improve. If you get stuck at any point, all the code for this article is available on GitHub.