SMS language

Crossing SMS language barriers with the National Language Identifier for Turkish, Spanish and Portuguese.

If you send SMS messages for your business, you may have noticed that characters specific to your country’s alphabet sharply reduce the number of characters that fit in a message. Longer messages are then split into two or more separate messages, which can double or even triple the cost of reaching your customer.

With the standard encoding for GSM messages, the 7-bit default alphabet, you can fit 160 characters in a single SMS message. The allowed characters are listed as the Basic Character Set (left) and the Basic Character Set Extension (right). Each character from the extension takes up the space of two characters.

Figure: 7-bit default alphabet

If you include even a single character that is not supported in the default alphabet, the whole message is encoded in a different standard (16-bit Unicode), and the maximum drops to only 70 characters per message! A message longer than 70 characters is divided into parts, and every part is encoded the same way, even if it contains only characters from the basic GSM alphabet. Because each part of a split message also carries a small concatenation header, a part holds slightly fewer characters than a standalone message (67 instead of 70 for Unicode). The same rules apply to every subsequent message part.
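The counting rules above can be sketched in a few lines of Python. This is a minimal illustration, not the platform's own logic: the function name `sms_parts` is hypothetical, and it assumes the usual concatenation overhead, which leaves 153 GSM characters or 67 Unicode characters per part in a multipart message.

```python
import math

# GSM 7-bit Basic Character Set: each character costs one unit.
GSM_BASIC = set(
    "@£$¥èéùìòÇ\nØø\rÅåΔ_ΦΓΛΩΠΨΣΘΞÆæßÉ"
    " !\"#¤%&'()*+,-./0123456789:;<=>?"
    "¡ABCDEFGHIJKLMNOPQRSTUVWXYZÄÖÑÜ§"
    "¿abcdefghijklmnopqrstuvwxyzäöñüà"
)
# Basic Character Set Extension: each character costs two units
# (an escape code followed by the character itself).
GSM_EXTENSION = set("^{}\\[~]|€\f")


def sms_parts(text):
    """Return (encoding, number_of_parts) for a message text.

    Simplification: a two-unit extension character that would straddle
    a part boundary is not treated specially.
    """
    if all(ch in GSM_BASIC or ch in GSM_EXTENSION for ch in text):
        units = sum(2 if ch in GSM_EXTENSION else 1 for ch in text)
        encoding, single_limit, part_limit = "GSM-7", 160, 153
    else:
        # One unsupported character switches the whole message to Unicode.
        units = len(text.encode("utf-16-be")) // 2
        encoding, single_limit, part_limit = "UCS-2", 70, 67
    if units <= single_limit:
        return encoding, 1
    return encoding, math.ceil(units / part_limit)
```

For example, `sms_parts("a" * 160)` stays within one GSM-7 message, while replacing a single letter with `ş` switches the same text to Unicode and splits it into several parts.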

Message encoding

We will not get into message encoding details in this tutorial. If you wish to learn more about the subject, visit this page.

If you need to handle special characters in your SMS messages, two main approaches bring the character capacity closer to the standard SMS size:

  1. Transliteration
  2. National Language Identifier (NLI)

Transliteration

Transliteration is a technique where the system replaces unsupported characters with related or similar characters from the default alphabet. Transliteration is covered in detail in this tutorial.
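To make the idea concrete, here is a minimal Python sketch of transliteration for Turkish text. The mapping table and the function name `transliterate_turkish` are hypothetical and only for illustration; the actual service may use a different, more complete table. Characters such as Ç, ö and ü are left alone because they already belong to the default alphabet.

```python
# Hypothetical mapping, for illustration only: Turkish letters that are
# missing from the GSM default alphabet, paired with a similar letter
# that is present in it.
TURKISH_TO_GSM = str.maketrans({
    "ı": "i", "İ": "I",
    "ğ": "g", "Ğ": "G",
    "ş": "s", "Ş": "S",
    "ç": "c",  # lowercase ç is not in the default alphabet, uppercase Ç is
})


def transliterate_turkish(text):
    """Replace unsupported Turkish letters with similar supported ones."""
    return text.translate(TURKISH_TO_GSM)
```

With this table, `transliterate_turkish("Artık Türkçe")` gives `"Artik Türkce"`: the text now fits the 7-bit default alphabet, at the cost of no longer matching the original spelling.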

National Language Identifier

National Language Identifier (NLI) is an encoding technology that allows an SMS containing language-specific characters, which would normally be treated as 16-bit Unicode, to be delivered as the original text while deducting only 5 characters from the maximum SMS length, leaving 155 characters per message. The 5 reserved characters are used in the background to tell the receiver’s device which language was selected and how to display it properly on screen.

To send language-specific characters, send a Fully featured textual message and set the languageCode parameter. The supported languages are:

Language code    Language
TR               Turkish
ES               Spanish
PT               Portuguese

In this example, a message containing Turkish characters is sent.

POST /sms/1/text/advanced HTTP/1.1
Host: api.infobip.com
Authorization: Basic QWxhZGRpbjpvcGVuIHNlc2FtZQ==
Content-Type: application/json

{
   "messages":[
      {
         "from":"InfoSMS",
         "destinations":[
            {
               "to":"41793026727"
            }
         ],
         "text":"Artık Ulusal Dil Tanımlayıcısı ile Türkçe karakterli smslerinizi rahatlıkla iletebilirsiniz.",
         "language":{
            "languageCode":"TR"
         }
      }
   ]
}
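The same request can be assembled from code. The following Python sketch uses only the standard library and builds the request without sending it. The function name `build_nli_request` is hypothetical, and the `https://` scheme is an assumption, since the example above shows only the host and path.

```python
import base64
import json
import urllib.request

# Language codes listed in the table of supported languages.
SUPPORTED_LANGUAGE_CODES = {"TR", "ES", "PT"}


def build_nli_request(username, password, sender, to, text, language_code):
    """Build (but do not send) the POST request shown above."""
    if language_code not in SUPPORTED_LANGUAGE_CODES:
        raise ValueError(f"unsupported language code: {language_code}")
    payload = {
        "messages": [
            {
                "from": sender,
                "destinations": [{"to": to}],
                "text": text,
                "language": {"languageCode": language_code},
            }
        ]
    }
    # Basic authentication: base64 of "username:password".
    token = base64.b64encode(f"{username}:{password}".encode("utf-8"))
    return urllib.request.Request(
        # Assumption: the API is reached over HTTPS on the host shown above.
        "https://api.infobip.com/sms/1/text/advanced",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": "Basic " + token.decode("ascii"),
            "Content-Type": "application/json",
        },
        method="POST",
    )
```

Passing the returned object to `urllib.request.urlopen` would send the message. Rejecting unknown language codes before building the request keeps typos such as `"TK"` from reaching the API.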
  

Here is a list of supported characters for each of the supported languages:

Figure: Turkish character set

Figure: Portuguese character set

Figure: Spanish character set

Preview messages before sending!

Nonstandard characters may cause messages to be encoded in Unicode, which considerably reduces the number of available characters per message. We recommend using the SMS preview method to explore all options before sending.

Important:

Some networks may not support the Language feature, so we cannot guarantee that this functionality will work for all destinations. For example, if a message in the Turkish language is sent over a Chinese provider, it might not display properly on the recipient’s device.