CreditEase Zheng Yun: Sharing on the Practice of Big Data Financial Cloud

CreditEase has accumulated nine years of data, including data from partners, data authorized by users, and some public data captured from the Internet. We therefore hope to use big data technology to tap the potential of this data, especially its value for Internet finance, and to provide better services to customers.

Zheng Yun, Technical Director of CreditEase, is responsible for the R&D of several innovative Internet financial products driven by big data. Before joining CreditEase, he worked as an R&D manager at Hulu, an online video company in the United States, where he was responsible for the technical R&D of video playback and website hosting. He also worked in R&D at Microsoft. Zheng Yun graduated from the Department of Automation at Tsinghua University with a master's degree.

LAIN Platform

Zheng Yun said that whether it is a cloud platform or any other system, the platform must be stable and rest on solid pillars. The first pillar is the big data infrastructure, and the second is the LAIN platform, which is based on Docker. Data modeling differs from one business to another. But the development environment, testing (both automated and conventional), release, technical services such as log collection and monitoring, and the distributed architecture, operating system, network, and security are in fact common to all of them. We therefore put these together into one platform, which is our cloud platform, or what is commonly called a PaaS system.

Docker has been a very popular technology in the past two years, especially at the beginning of this year. First, it is an open source container engine. Second, it goes a step further in solving the virtualization problem. With Docker, we can put each module into its own container, and the containers are isolated from one another. The modules are then connected as microservices, which is very flexible. Performance is also very good, and the additional overhead is almost zero.

At the core is Docker, around which there are three main technologies, the so-called troika. The first is Docker Swarm, a container management and scheduling tool provided officially by Docker; because it is official, it has the advantage of native integration. The second is etcd, a well-known, lightweight, distributed, consistent key-value store, which we mainly use for configuration storage, such as service registration and service discovery. The third is Calico, a set of networking technologies open sourced by a communications company. It is a Layer 3 SDN that can replace the traditional Docker approach of bridging or port mapping.
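To make the idea of service registration and discovery concrete, here is a minimal sketch in Python. It is not the LAIN implementation and does not call etcd; the `ServiceRegistry` class, its method names, and the addresses are all hypothetical. It uses an in-memory dictionary as a stand-in for a key-value store whose keys expire, which is the general pattern such a store makes possible: each instance registers itself with a time-to-live and must renew it, and clients only see instances whose registration is still valid.

```python
import time


class ServiceRegistry:
    """In-memory stand-in for a key-value store with expiring keys (hypothetical)."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # (service, address) -> expiry timestamp

    def register(self, service, address, ttl=10.0):
        """Record an instance; it must re-register before ttl seconds pass."""
        self._entries[(service, address)] = self._clock() + ttl

    def deregister(self, service, address):
        """Remove an instance explicitly, for example on clean shutdown."""
        self._entries.pop((service, address), None)

    def discover(self, service):
        """Return the addresses of instances whose registration has not expired."""
        now = self._clock()
        return sorted(
            addr
            for (name, addr), expiry in self._entries.items()
            if name == service and expiry > now
        )


if __name__ == "__main__":
    registry = ServiceRegistry()
    registry.register("scoring", "10.0.0.1:8000", ttl=10)
    registry.register("scoring", "10.0.0.2:8000", ttl=10)
    print(registry.discover("scoring"))
```

An instance that crashes simply stops renewing its registration, so it drops out of `discover` once its time-to-live runs out, without anyone having to remove it by hand.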

Knowledge Graph

What is a knowledge graph? Compared with traditional documents or structured data, its distinguishing feature is that it is organized around entities. It was first proposed by Google for search optimization. We use it mainly for data modeling related to risk control. It also supports personalized question and answer, in which questions generated from a customer's information are used to fight fraud.

First, on the web side, we use our distributed query system to collect public data and some user-authorized data, and store it in HDFS. We then use Sqoop to import our business data into HBase. From there we extract and structure the data to form a knowledge graph. The fields of the knowledge graph that are queried most often are loaded into ElasticSearch and made available to all front ends. At the same time, the knowledge graph can also serve as a data source for rule engines and machine learning.
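The "extract and structure" step can be illustrated with a small Python sketch. It is only an assumption about what such a step might look like, not CreditEase's pipeline: the record fields (`id_number`, `phone`, `company`), the relation names, and both functions are hypothetical, and a plain dictionary stands in for the search index over commonly queried fields.

```python
from collections import defaultdict


def build_graph(records):
    """Turn flat records into (subject, predicate, object) triples between entities."""
    triples = set()
    for rec in records:
        person = ("person", rec["id_number"])
        triples.add((person, "has_phone", ("phone", rec["phone"])))
        triples.add((person, "works_at", ("company", rec["company"])))
    return triples


def build_index(triples):
    """Index triples by object entity, a stand-in for a search index."""
    index = defaultdict(set)
    for subject, predicate, obj in triples:
        index[obj].add((predicate, subject))
    return index


if __name__ == "__main__":
    records = [
        {"id_number": "A1", "phone": "555-0100", "company": "Acme"},
        {"id_number": "B2", "phone": "555-0100", "company": "Globex"},
    ]
    index = build_index(build_graph(records))
    # Everyone linked to the same phone number
    print(sorted(index[("phone", "555-0100")]))
```

Because people, phones, and companies are entities rather than column values, a question such as "who else uses this phone number?" becomes a single lookup instead of a join across tables.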

How to solve the anti-fraud problem in real-time credit granting

Real-time credit granting must first solve the problem of fraud. We approach anti-fraud from three aspects. The first is identity: we must make sure that the applicant really is who he claims to be. We check whether his platform account is real and whether his personal identity information is real, and then confirm the authenticity of his information through personalized questions and answers. The second is behavioral data: for example, whether there are signs of fraud in his business activities, or whether he has visited intermediary forums on the Internet and taken part in their activities. The third is the relationship level. For example, in this picture the black circles are blacklisted customers and the red circles are customers with overdue payments. Finally, a comprehensive credit score is calculated from the various data, and approval and risk assessment are decided based on that score.
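The relationship-level check and the final scoring step can be sketched in Python as follows. The talk does not say how the score is actually computed, so everything here is an illustrative assumption: the function names, the graph representation, and especially the weights and penalties are hypothetical. The sketch uses breadth-first search to find how many hops separate an applicant from the nearest blacklisted or overdue customer, then folds the three aspects into a single number.

```python
from collections import deque


def hops_to_flagged(graph, start, flagged):
    """Breadth-first search: hops from start to the nearest flagged node, or None."""
    if start in flagged:
        return 0
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        node, dist = queue.popleft()
        for neighbor in graph.get(node, ()):
            if neighbor in seen:
                continue
            if neighbor in flagged:
                return dist + 1
            seen.add(neighbor)
            queue.append((neighbor, dist + 1))
    return None


def credit_score(identity_ok, behavior_flags, hops_blacklist, hops_overdue):
    """Combine the three aspects into a 0-100 score (all weights are illustrative)."""
    if not identity_ok:
        return 0  # failing the identity check rejects the application outright
    score = 100
    score -= 15 * behavior_flags  # each suspicious behavior costs a fixed penalty
    if hops_blacklist is not None:
        score -= 40 // max(hops_blacklist, 1)  # closer to the blacklist, larger penalty
    if hops_overdue is not None:
        score -= 20 // max(hops_overdue, 1)
    return max(score, 0)


if __name__ == "__main__":
    graph = {
        "applicant": ["friend"],
        "friend": ["applicant", "fraudster"],
        "fraudster": ["friend"],
    }
    hops = hops_to_flagged(graph, "applicant", {"fraudster"})
    print(hops, credit_score(True, 1, hops, None))
```

In practice the weights would be fitted from data or set by a rule engine rather than hard-coded; the point of the sketch is only that proximity in the relationship graph can be turned into a numeric input to the score.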

Data-driven methodology

From a methodological perspective, being data-driven first requires massive amounts of data. Next, the data has to be classified, then analyzed, and finally used to drive our product decisions.

After the data is classified, we can analyze it further. One goal is to explain phenomena based on existing data, that is, to understand why things are the way they are. The second, and more important, goal is to use data to guide future optimization, which is also what many companies are pursuing.

To summarize the entire speech: first of all, our financial cloud needs underlying pillars, which are the big data infrastructure and the cloud platform just mentioned. On top of them, we quickly build core modules such as anti-fraud and real-time credit through applications like Yisou. We use this platform to continuously optimize both the products and the core modules through products on both ends, commercial loans and financial products, and in this way form a complete framework for the entire platform. On this framework, we hope to provide better services to our users. We are also connecting data with partners to provide some service-oriented scenarios.
