Code practice for handling i18n international telephone area codes


Preface

Last week I was busy internationalizing (i18n) our product. An important part of that work is phone numbers, since we use the phone number as the main account identifier, and a key part of a phone number is its area code.

[Image: login interface of our product]

The picture above is the login interface of our product. In front of the regular phone number there is an area code, which identifies the country or region the number belongs to. For the concept of area codes, please refer to Wikipedia.

At this point some readers may wonder what is so difficult about this. Isn't it just a list? In fact there are several problems.

  • Since the product supports multiple languages, the country names shown differ by language. For example, China is "中国" in Simplified Chinese, "China" in English, and "중화인민공화국" in Korean. The sort order also differs from language to language.
  • Maintaining a table of names for every country in every language would be too much work, and most companies probably can't do it.

So we do this work on the client, and iOS has already done part of it for us: given a country code, we can get the localized name of that country or region for the current locale.

//Get the current locale
NSLocale *locale = [NSLocale currentLocale];

//Get all country codes
NSArray *countryArray = [NSLocale ISOCountryCodes];

for (NSString *countryCode in countryArray)
{
    //Get the localized name of the specified country based on the current locale and country short code
    NSString *localName = [locale displayNameForKey:NSLocaleCountryCode value:countryCode];
}
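
On iOS 10 / macOS 10.12 and later, NSLocale also offers localizedStringForCountryCode:, which performs the same lookup without the key constant. The following is only a sketch, assuming a deployment target that has this method; the helper name MMLocalizedCountryNames is made up for illustration and is not part of the original demo.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: maps every ISO country code to its display name in the given locale.
static NSDictionary<NSString *, NSString *> *MMLocalizedCountryNames(NSLocale *locale)
{
    NSMutableDictionary<NSString *, NSString *> *names = [NSMutableDictionary dictionary];

    for (NSString *countryCode in [NSLocale ISOCountryCodes])
    {
        // The lookup may return nil when the locale has no name for a code, so guard before storing.
        NSString *name = [locale localizedStringForCountryCode:countryCode];

        if (name.length > 0)
        {
            names[countryCode] = name;
        }
    }

    return [names copy];
}
```

Calling it with [NSLocale localeWithLocaleIdentifier:@"ja_JP"] instead of the current locale gives the names for a specific language, which is handy when testing.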

Let's do a simple test

NSArray *countryArray = [NSLocale ISOCountryCodes];
NSArray *languageArray = @[@"zh_CN",@"en_US",@"ja_JP"];

for (NSString *language in languageArray)
{
    NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:language];

    //Only print the first five countries
    for (int i = 0; i < 5; ++i)
    {
        NSString *countryCode = countryArray[i];

        NSString *displayName = [locale displayNameForKey:NSLocaleCountryCode value:countryCode];

        NSLog(@"%@\t%@\t%@",language,countryCode,displayName);
    }
}

Result

zh_CN AD 安道尔
zh_CN AE 阿拉伯联合酋长国
zh_CN AF 阿富汗
zh_CN AG 安提瓜和巴布达
zh_CN AI 安圭拉

en_US AD Andorra
en_US AE United Arab Emirates
en_US AF Afghanistan
en_US AG Antigua and Barbuda
en_US AI Anguilla

ja_JP AD アンドラ
ja_JP AE アラブ首長国連邦
ja_JP AF アフガニスタン
ja_JP AG アンティグア・バーブーダ
ja_JP AI アンギラ

That covers the part iOS does for us; the rest we have to do ourselves. We need a table that maps each region to its area code. This is easy too, since many such lists can be found online. The file content is as follows (diallingcode.json)

[
    {
        "name": "Afghanistan",
        "dial_code": "+93",
        "code": "AF"
    },
    {
        "name": "Albania",
        "dial_code": "+355",
        "code": "AL"
    },

    ...
    ...
    //Omitted in the middle
    ...
    ...

    {
        "name": "Virgin Islands, British",
        "dial_code": "+1 284",
        "code": "VG"
    },
    {
        "name": "Virgin Islands, US",
        "dial_code": "+1 340",
        "code": "VI"
    }
]

Maintaining such a table is very simple. We can store it locally or on the server. (The "name" field is not actually needed; it is only there to make the file readable.)

Research

Let's put the code aside for a moment and look at how other products handle this.

This is WeChat:

[Image: WeChat's country list, Chinese environment on the left and French environment on the right]

WeChat has two clear problems:

  • On the left is the Chinese environment. The grouping by pinyin is correct, but the order within a group is wrong: the countries starting with "阿" are not placed together.
  • On the right is the French environment. Names that start with accented Latin letters are not assigned to the correct group.

This is Twitter:

[Image: Twitter's country list]

Twitter's Chinese list is still odd, but it does not make WeChat's second mistake.

What about Facebook? Their engineers are smart (lazy): they don't provide an index at all.

Next, we will solve these problems.

Code

First, create a model to hold the country-related information.

 @interface MMCountry : NSObject

@property (nonatomic, strong) NSString *name; //Country name (localized version)
@property (nonatomic, strong) NSString *code; //Country code
@property (nonatomic, strong) NSString *latin; //Latin text of the country name (only basic Latin letters)
@property (nonatomic, strong) NSString *dial_code; //area code

@end

Then we read the area codes from the configuration file and build an index that uses the country code as the key

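//Read the file
NSString *path = [[NSBundle mainBundle] pathForResource:@"diallingcode" ofType:@"json"];
NSData *data = [NSData dataWithContentsOfFile:path];

if ( !data ) {
    //The file is missing from the bundle
    return;
}

NSError *error = nil;
NSArray *arrayCode = [NSJSONSerialization JSONObjectWithData:data options:0 error:&error];

//Check the return value first; error is only meaningful when parsing failed
if ( !arrayCode ) {
    NSLog(@"failed to parse diallingcode.json: %@", error);
    return;
}

//Index the entries by country code
NSMutableDictionary *dicCode = [@{} mutableCopy];

for (NSDictionary *item in arrayCode)
{
    MMCountry *c = [MMCountry new];

    c.code = item[@"code"];
    c.dial_code = item[@"dial_code"];

    [dicCode setObject:c forKey:c.code];
}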

Then get the localized names of these countries

NSLocale *locale = [NSLocale currentLocale];
NSArray *countryArray = [NSLocale ISOCountryCodes];

NSMutableDictionary *dicCountry = [@{} mutableCopy];

for (NSString *countryCode in countryArray) {

    if ( dicCode[countryCode] )
    {
        MMCountry *c = dicCode[countryCode];

        //Localized name, obtained the same way as in the earlier test
        c.name = [locale displayNameForKey:NSLocaleCountryCode value:countryCode];
        if ( [c.name isEqualToString:@"Taiwan"] )
        {
            c.name = @"Taiwan, China";
        }

        // Latinize the name
        c.latin = [self latinize:c.name];

        [dicCountry setObject:c forKey:c.code];
    }
    else
    {
        //Not found: the configuration file is incomplete and the missing entry can be added to it
        NSLog(@"missed %@ %@",[locale displayNameForKey:NSLocaleCountryCode value:countryCode],countryCode);
    }
}

Note that latinizing the name is what solves WeChat's second problem: names that start with non-basic Latin letters can then be grouped under basic Latin letters. The function is as follows

- (NSString*)latinize:(NSString*)str
{
    NSMutableString *source = [str mutableCopy];

    CFStringTransform((__bridge CFMutableStringRef)source, NULL, kCFStringTransformToLatin, NO);

    //This is how WeChat does it
    //CFStringTransform((__bridge CFMutableStringRef)source, NULL, kCFStringTransformMandarinLatin, NO);

    CFStringTransform((__bridge CFMutableStringRef)source, NULL, kCFStringTransformStripDiacritics, NO);

    return source;
}

There are two steps here

  1. First convert the text into Latin letters (kCFStringTransformToLatin)
  2. Then remove the diacritics from Latin letters (kCFStringTransformStripDiacritics)


This is where WeChat's second problem, the incorrect grouping, comes from. In the first step WeChat only converts Chinese characters (kCFStringTransformMandarinLatin, see the commented-out line) and leaves every other script untouched, so the second step cannot produce the correct basic Latin letters.


Let's rerun the earlier test, this time also printing the latinized name, to see what these two steps produce.

NSArray *countryArray = [NSLocale ISOCountryCodes];
NSArray *languageArray = @[@"zh_CN",@"en_US",@"ja_JP"];

for (NSString *language in languageArray)
{
    NSLocale *locale = [[NSLocale alloc] initWithLocaleIdentifier:language];

    for (int i = 0; i < 5; ++i)
    {
        NSString *countryCode = countryArray[i];

        NSString *displayName = [locale displayNameForKey:NSLocaleCountryCode value:countryCode];

        //Four %@ specifiers for four arguments
        NSLog(@"%@\t%@\t%@\t%@",language,countryCode,displayName,[self latinize:displayName]);
    }
}

Result

zh_CN AD 安道尔 | an dao er
zh_CN AE 阿拉伯联合酋长国 |
zh_CN AF 阿富汗 | a fu han
zh_CN AG 安提瓜和巴布达 |
zh_CN AI 安圭拉 | an gui la
en_US AD Andorra | Andorra
en_US AE United Arab Emirates | United Arab Emirates
en_US AF Afghanistan | Afghanistan
en_US AG Antigua & Barbuda | Antigua & Barbuda
en_US AI Anguilla | Anguilla
ja_JP AD アンドラ | andora
ja_JP AE アラブ首長国連邦 | arabu shou zhang guo lian ban
ja_JP AF アフガニスタン | afuganisutan
ja_JP AG アンティグア・バーブーダ | antigua・babuda
ja_JP AI アンギラ | angira

You can see that the system converts the same country's name into different Latin text depending on the language the name is written in.

Next, we group the data under the letters 'A'-'Z'

NSMutableDictionary *dicSort = [@{} mutableCopy];

for (MMCountry *c in dicCountry.allValues)
{
    //Skip entries without a latinized name
    if ( c.latin.length == 0 )
    {
        continue;
    }

    NSString *indexKey = [[c.latin substringToIndex:1] uppercaseString];

    //Use unichar and a separate name so the loop variable c is not shadowed
    unichar first = [indexKey characterAtIndex:0];

    //Only keep entries that fall under 'A'-'Z'
    if ( ( first < 'A' ) || ( first > 'Z' ) )
    {
        continue;
    }

    NSMutableArray *array = dicSort[indexKey];

    if ( !array )
    {
        array = [NSMutableArray array];

        dicSort[indexKey] = array;
    }

    [array addObject:c];
}

Then sort the entries within each group

for ( NSString *key in dicSort.allKeys )
{
    NSArray *array = dicSort[key];

    //Sort by the localized name using the system's locale-aware comparison
    array = [array sortedArrayUsingComparator:^NSComparisonResult(MMCountry *obj1, MMCountry *obj2) {

        return [obj1.name localizedStandardCompare:obj2.name];
    }];

    //Sorting by the latinized text instead
    // array = [array sortedArrayUsingComparator:^NSComparisonResult(MMCountry *obj1, MMCountry *obj2) {
    //
    //     return [obj1.latin compare:obj2.latin];
    // }];

    dicSort[key] = array;
}

In this way, dicSort is the result set we finally get
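
Since dicSort is a dictionary, its groups have no defined order of their own. To drive a table view index, the keys would typically be sorted once up front. A minimal sketch, assuming the keys are the single uppercase letters built above; MMSortedSectionKeys is a hypothetical helper, not part of the original demo.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: returns the group keys ("A", "B", ...) of dicSort in alphabetical order.
static NSArray<NSString *> *MMSortedSectionKeys(NSDictionary<NSString *, NSArray *> *dicSort)
{
    // The keys are single uppercase ASCII letters, so a plain compare: is enough here.
    return [dicSort.allKeys sortedArrayUsingSelector:@selector(compare:)];
}
```

The returned array can serve both as the section titles and as the section index titles of the table view, with dicSort[key] supplying the rows of each section.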


This is where WeChat's first problem, the wrong order within a group, comes from. WeChat sorts by the latinized text (see the commented-out code), so countries that begin with the same Chinese character do not end up next to each other. Look at the earlier picture and pick three countries, for example Albania, Ireland and Aruba. Their pinyin is aerbaniya, aierlan and aluba. Sorted by pinyin, that is exactly the order they come out in, which puts Ireland between the two names that begin with "阿". The correct way is to sort with localizedStandardCompare, the localized comparison function that iOS already provides.
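
The ordering difference can be tried out in isolation. The sketch below is only an illustration: the Chinese names and the unspaced pinyin strings are typed in by hand, the helper names are hypothetical, and it uses compare:options:range:locale: with an explicit zh_CN locale in place of localizedStandardCompare so that the outcome does not depend on the device's current locale.

```objc
#import <Foundation/Foundation.h>

// Hypothetical helper: orders names by a hand-written, unspaced pinyin string.
static NSArray<NSString *> *MMSortByPinyin(NSDictionary<NSString *, NSString *> *pinyinByName)
{
    return [pinyinByName.allKeys sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
        return [pinyinByName[a] compare:pinyinByName[b]];
    }];
}

// Hypothetical helper: orders names with the collation rules of an explicit locale.
static NSArray<NSString *> *MMSortByName(NSArray<NSString *> *names, NSString *localeIdentifier)
{
    NSLocale *locale = [NSLocale localeWithLocaleIdentifier:localeIdentifier];

    return [names sortedArrayUsingComparator:^NSComparisonResult(NSString *a, NSString *b) {
        return [a compare:b options:0 range:NSMakeRange(0, a.length) locale:locale];
    }];
}

// Logs both orders for the three countries discussed above.
static void MMDemoSortOrders(void)
{
    NSDictionary<NSString *, NSString *> *pinyinByName = @{
        @"阿尔巴尼亚": @"aerbaniya", // Albania
        @"爱尔兰": @"aierlan",       // Ireland
        @"阿鲁巴": @"aluba",         // Aruba
    };

    NSLog(@"by pinyin: %@", [MMSortByPinyin(pinyinByName) componentsJoinedByString:@" "]);
    NSLog(@"by name:   %@", [MMSortByName(pinyinByName.allKeys, @"zh_CN") componentsJoinedByString:@" "]);
}
```

Sorting by the pinyin string places 爱尔兰 between the two names that start with "阿". The locale-aware comparison is expected to keep those two names adjacent, though the exact result depends on the collation data shipped with the system.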


Let's take a look at the final result:

[Image: the final country list in our product]

Is it better than WeChat?

Discussion

The code is written, but the problem is not fully solved. A key question remains: why must the index be 'A'-'Z' at all? For example, this is Twitter in the Japanese and Korean environments:

[Image: Twitter's country list in the Japanese and Korean environments]

The best solution is really to build the index according to the conventions of each language. (PS: given Twitter's poor results in the Chinese environment, I am not sure whether its Japanese and Korean results are correct (¯﹃¯).)
Of course, if you really want to do this, the change is small: only the indexing step needs a slight modification.
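
One possible way to get a language-specific index on iOS is UIKit's UILocalizedIndexedCollation, which supplies the section titles and the grouping rules for the user's language. This is a swapped-in technique, not what the demo in this article does; the sketch assumes UIKit is available, and MMSectionsForObjects is a hypothetical helper.

```objc
#import <UIKit/UIKit.h>

// Hypothetical helper: groups objects into the index sections of the user's language.
// `selector` must name a method that takes no arguments and returns an NSString, e.g. @selector(name).
static NSArray<NSArray *> *MMSectionsForObjects(NSArray *objects, SEL selector)
{
    UILocalizedIndexedCollation *collation = [UILocalizedIndexedCollation currentCollation];

    // One bucket per section title offered by the collation for the current locale.
    NSMutableArray<NSMutableArray *> *buckets = [NSMutableArray array];
    for (NSUInteger i = 0; i < collation.sectionTitles.count; ++i)
    {
        [buckets addObject:[NSMutableArray array]];
    }

    // Ask the collation which section each object belongs to.
    for (id object in objects)
    {
        NSInteger section = [collation sectionForObject:object collationStringSelector:selector];
        [buckets[section] addObject:object];
    }

    // Sort each bucket with the same collation so that grouping and order agree.
    NSMutableArray<NSArray *> *sections = [NSMutableArray array];
    for (NSArray *bucket in buckets)
    {
        [sections addObject:[collation sortedArrayFromArray:bucket collationStringSelector:selector]];
    }

    return [sections copy];
}
```

With the MMCountry model above, the call would be MMSectionsForObjects(dicCountry.allValues, @selector(name)), and collation.sectionIndexTitles would replace the fixed 'A'-'Z' index.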

Summary

The demo in this article can be found here

As mentioned in the discussion, the approach in this article is not the final solution. For a better experience you need to study the conventions of each country and language in depth. Internationalization is therefore not just a technical issue, but also a social one.
