Software Implementation Solutions of A Lightweight Block Cipher to Secure Restricted IoT Environment: A Review

Abstract


INTRODUCTION
Security plays a significant role in protecting and authenticating information in communication systems and in pervasive applications such as the Internet of Things (IoT). IoT technology is currently employed in a variety of applications, such as smart infrastructure (smart homes, smart cities, and smart grids), wearable technology, and smart automobiles.
It was estimated that by the end of 2020 more than 18 billion IoT devices would be on the market and connected through the cloud, more than half of them for industrial uses [1]. Because so many devices are connected through the Internet, a successful attack can cause serious losses, such as the disclosure of sensitive personal and economic information, particularly when users lack knowledge of how to operate these devices and of the risks that misuse poses to personal information.
Users need to keep their data private when using these applications. This has led manufacturers to treat security as a basic requirement of these devices, especially when they are used in sensitive applications (such as identification, credit cards, and confidential patient data). Ubiquitous computing relies on large networks of resource-constrained IoT devices that have extremely tight cost constraints. Under Moore's law, the speed and capability of computers can be expected to increase every two years while their cost is halved, which will increasingly enable such applications [2]. The demand for cryptographic components is significant and growing, since many of these applications will process sensitive information.

Challenges of Implementing Cryptographic Solutions in Resource-Constrained IoT Devices
Cryptographic algorithms are employed to ensure the secrecy, integrity, authentication, and authorization of data traveling via resource-constrained IoT devices, as well as to safeguard data stored on or transiting over the network. Figure 1 illustrates the role of cryptographic techniques in preventing an attacker from reaching IoT data and tampering with it. Due to resource limits, implementing standard cryptography in these IoT devices is difficult for three reasons: (i) the mathematical operations are heavy and complex; (ii) the operations use a large amount of memory; and (iii) traditional cryptography is expensive to implement on low-resource devices (circuit size), which also imposes challenges on software implementation design. To overcome these difficulties, lightweight ciphers were introduced [2, 3, 4].

Lightweight cryptography (LWC)
Lightweight cryptography is a branch of modern cryptography aimed at providing security solutions for devices such as mobile phones, RFID tags, sensor networks, smart cards, and IoT devices [5].
In 2015, the National Institute of Standards and Technology (NIST) announced a contest for a suitable lightweight cryptographic algorithm for use in resource-constrained environments [6]. NIST stated that symmetric ciphers may be used to provide high-level security in low-end devices such as RFID tags, motes, smart cards, industrial sensors, wireless sensors, mobile or user equipment, healthcare devices (such as hubs), and other battery-powered devices such as wearables [6]. Three criteria were used to accept or reject algorithms during the evaluation process: cryptographic security, performance, and implementation cost. Other features were examined in the second round, including functionality, underlying components, design methods, and supported key and tag sizes [6]. In addition, CRYPTREC established the Lightweight Cryptography Working Group in 2013, which published a comprehensive technical report on LWC in 2017.
To secure resource-constrained devices, many algorithms have been presented in this domain as either software-based or hardware-based implementations of lightweight ciphers [7]. These ciphers aim to minimize the overall implementation cost of hardware-oriented and software-oriented cryptographic primitives in terms of aspects such as number of rounds, memory, key size, power consumption, throughput, and gate equivalents (GE). This is accomplished by focusing on:
Reduced hardware costs (power consumption, physical area (GE), and energy consumption).
Improved software efficiency (memory consumption, processing power, throughput, and latency).
New algorithms in the literature either adapt well-known cryptographic algorithms to the constrained environment or are new ciphers designed from scratch to meet the requirements of securing resource-constrained devices.
A variety of LWC algorithms are available nowadays and have been surveyed in a number of scholarly publications [8]. In [2], the authors compared many algorithms adapted for embedded systems and designed for good hardware performance. In [9], the authors proposed a new block cipher, the Lightweight Data Encryption Standard (DESL), which is a modified version of the DES algorithm. It uses a single S-box instead of the eight S-boxes of DES to reduce the cost of implementation. The design of entirely new lightweight cryptographic algorithms, on the other hand, forms a broad branch of work on securing constrained devices and has produced a large variety of algorithms.

Lightweight Block Cipher
In recent years, many techniques have been created for highly constrained applications. Lightweight block ciphers, in particular, can work within the limitations of these applications. The choice of a lightweight block cipher is crucial since it affects the system's cost, area, speed, latency, and bandwidth requirements. Several considerations are made when building a new lightweight block cipher algorithm to lower the cost of resource consumption:

Block Cipher inner structure:
Based on their inner structure, block ciphers can be classified into Substitution-Permutation Networks (SPN), Feistel networks, Generalized Feistel Networks (GFN), Add-Rotate-XOR (ARX) designs, and hybrids. In an SPN, the plaintext is transformed and prepared for the next round by a series of successive substitution and permutation boxes. SPNs provide a higher level of protection, but they also consume more resources. Feistel networks apply a round function to half of the data in each block. Because the same, smaller round function serves both encryption and decryption, a Feistel cipher provides both at low cost, although many applications do not require decryption. A GFN splits a data block into sub-blocks and applies the Feistel function to each pair of sub-blocks. A GFN encrypts and decrypts using the same round function, making it a good choice for low-cost hardware implementation. ARX ciphers use no S-boxes, only simple operations: addition, rotation, and XOR. Compared to SPN and Feistel ciphers, the security properties of ARX designs have not been investigated as thoroughly, yet they produce small and fast implementations. Hybrid ciphers combine the structures discussed above in order to improve various efficiency measures.
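
To illustrate why a Feistel network provides decryption at almost no extra cost, the following minimal sketch runs the same routine for both directions and only reverses the order of the round keys. The round function `toy_round_function`, the round keys, and the 64-bit block size are assumptions chosen for illustration; they do not correspond to any cipher discussed in this review and offer no real security.

```python
from typing import Sequence

MASK32 = 0xFFFFFFFF


def rotl32(value: int, shift: int) -> int:
    """Rotate a 32-bit word left by `shift` bits."""
    return ((value << shift) | (value >> (32 - shift))) & MASK32


def toy_round_function(half: int, round_key: int) -> int:
    """Hypothetical F-function; note that it does not need to be invertible."""
    return ((rotl32(half, 5) & half) ^ rotl32(half, 1) ^ round_key) & MASK32


def feistel_encrypt(block: int, round_keys: Sequence[int]) -> int:
    """Process a 64-bit block with a balanced Feistel network."""
    left, right = (block >> 32) & MASK32, block & MASK32
    for round_key in round_keys:
        # Only one half passes through F; the other half is XORed with the result.
        left, right = right, left ^ toy_round_function(right, round_key)
    # Undo the final swap so that decryption can reuse this routine unchanged.
    return (right << 32) | left


def feistel_decrypt(block: int, round_keys: Sequence[int]) -> int:
    """Decrypt by running the same network with the round keys reversed."""
    return feistel_encrypt(block, list(reversed(round_keys)))


round_keys = [0x0F1E2D3C, 0x4B5A6978, 0x8796A5B4, 0xC3D2E1F0]  # arbitrary demo keys
plaintext = 0x0123456789ABCDEF
ciphertext = feistel_encrypt(plaintext, round_keys)
recovered = feistel_decrypt(ciphertext, round_keys)
```

Because decryption reuses the encryption code path, a software implementation needs only one copy of the round function in ROM, which is one reason Feistel designs are attractive on small microcontrollers.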

Targeted Implementation Environment (Hardware or Software)
Depending on the implementation environment, lightweight cryptographic algorithms can be classified as hardware-oriented or software-oriented. The main goal of a hardware implementation is to achieve a minimal gate-equivalent count by reducing the number of logic gates required. This lowers the cost and reduces power consumption. Hardware implementations are better suited to ultra-constrained devices like 4-bit microcontrollers that execute specified functions. Software implementations aim for small memory consumption, low processing power, and high throughput (bytes per cycle); these designs require a microprocessor to operate.
Cryptographic libraries for embedded devices provide software implementations. In comparison to hardware implementations, their key advantage is portability. The FELICS framework [10] provides software implementations on three constrained platforms (an 8-bit AVR processor, a 16-bit MSP processor, and a 32-bit ARM processor) to evaluate the performance of lightweight block or stream ciphers in terms of implementation size, RAM utilization, and time to complete a given operation. Table 1 presents a basic comparison of some common lightweight block ciphers. Since we are primarily interested in software implementation design, it shows the structure of the lightweight block ciphers selected for software implementation design. The details of these ciphers are as follows:
TEA / XTEA [11, 12] Wheeler and Needham presented the Tiny Encryption Algorithm (TEA) in 1994. It has a Feistel structure and a small amount of code, and it can be easily integrated into embedded systems. XTEA is an improved version that addresses the weaknesses discovered in TEA [12]; it has a more complex key schedule than TEA and is likewise based on a simple F-function composed of left and right shifts, XORs, and additions.
KASUMI / MISTY1 [13] MISTY1 was presented by M. Matsui in 1997 and has a Feistel structure. KASUMI also has a Feistel structure and is equivalent to MISTY1 with 8 rounds, with the exception that its key schedule rotates the bits of the master key and XORs round constants. It is used for security purposes in the Global System for Mobile Communications (GSM), UMTS, and GPRS [14].
AES [15] The Advanced Encryption Standard (AES) was created by Vincent Rijmen and Joan Daemen in 1998 and was adopted as the encryption standard by NIST in 2001. It accepts 128 bits of plaintext as input and produces 128 bits of ciphertext as output. It derives all the round keys from the original key using a key schedule. The number of rounds is determined by the key length: 10 for a 128-bit key, 12 for a 192-bit key, and 14 for a 256-bit key. It works on bytes of data rather than individual bits. The 128-bit (16-byte) cipher state is represented as a 4×4 matrix of bytes, and four operations are applied in the following order: SubBytes (substitution), ShiftRows (permutation), MixColumns (column mixing), and AddRoundKey (XOR with the round key). Decryption applies the inverse operations: AddRoundKey, inverse MixColumns, inverse ShiftRows, and inverse SubBytes. Although AES is not a lightweight cipher, many IoT devices use it.
Camellia [16] Camellia was designed by Nippon Telegraph and Telephone Corporation and Mitsubishi Electric Corporation in 2000. It is recognized by ISO/IEC, IETF, NESSIE, and CRYPTREC and offers a level of security similar to AES. It has a Feistel structure with two round variants: 18 rounds (with 128-bit keys) or 24 rounds (with 192- or 256-bit keys).
HIGHT [17, 18] Hong et al. presented this cipher in 2006. It has a GFS structure based on ARX operations. Its main operations are XOR, addition mod 2^8, and left bitwise rotation. The key schedule consists of two algorithms: WhiteningKeyGeneration (creates 8 whitening key bytes used in the first and last rounds) and SubkeyGeneration (generates 128 subkey bytes). The authors of [18] presented software and hardware implementations of the HIGHT block cipher for resource-constrained devices (8-bit AVR and 32-bit ARM Cortex-M3) and ASICs.
SEA [19] Standaert et al. presented this cipher in 2006. It has a Feistel structure that can be used in software on an 8-bit processor. Its F-function is made up of basic operations: bitwise XOR, a 3-bit S-box, word rotation, bit rotation, and addition modulo 2^b. This enables quick evaluation, minimal memory usage, and short code size.
CLEFIA [20] This cipher was proposed by Sony in 2007 and standardized in ISO/IEC 29192. It has a type-2 GFN structure. The 128-bit (16-byte) plaintext input P0 to P15 is grouped into 4-byte words. It uses a simple key schedule and small F-functions, with small S-boxes and basic permutations. CLEFIA uses whitening keys WK0 to WK3 at the start and end of encryption.
KLEIN [21] KLEIN was proposed by Zheng Gong et al. in 2011. It is based on an SPN, targets software efficiency on 8-bit processors, and favors byte-oriented operations such as matrix multiplication. Each round has four layers, in order: AddRoundKey, SubNibbles, RotateNibbles, and MixNibbles. The author of [22] chose the KLEIN cipher as the most lightweight security solution to test in an IoHT environment.
LBlock [23] This cipher was proposed by Wu and Zhang in 2011. It has a Feistel network structure and an efficient software implementation on 8-bit microcontrollers. Its round function consists of a substitution layer using 4-bit S-boxes (8 S-boxes applied in parallel) and a permutation layer (32-bit permutations with shift operations).
LED [24] The Lightweight Encryption Device (LED) was proposed by Guo et al. in 2011. It has an SPN structure with an AES-like design. Each round applies four functions: AddConstants, SubCells (applies the 4-bit S-box of the PRESENT cipher), ShiftRows, and MixColumnsSerial (using a Maximum Distance Separable (MDS) matrix).
TWINE [25] This cipher was presented by Tomoyasu Suzaki et al. in 2011. It has a Type-2 GFS structure with sixteen 4-bit branches. TWINE has efficient software implementations on various platforms. Its F-function consists only of a subkey addition and a nonlinear substitution layer that applies a single 4-bit S-box to nibbles, repeated 8 times in every round, followed by a diffusion layer that permutes the 4-bit blocks.
SPECK and SIMON [26, 27] SPECK and SIMON were presented by the U.S. National Security Agency (NSA) in 2013. SPECK is an ARX cipher and performs 22, 23, 26, 27, 28, 29, 32, 33, or 34 rounds. Each round consists of bitwise XOR, addition modulo 2^n, and left and right circular shifts, which makes it better suited to software implementation than SIMON.
SIMON uses a Feistel structure with simple arithmetic and logic operations; its round function consists of left circular shifts, bitwise XOR, and bitwise AND. A variant with a block size of 2n bits and a key size of mn bits is denoted 2n/mn.

ITUBEE [28] This cipher was presented by F. Karakoç et al. in 2013. It has a Feistel structure with no key schedule, making it suitable for 8-bit software-based platforms with limited resources. It inserts the round key between two applications of the round function F to strengthen the cipher against related-key attacks.
Chaskey [29] Nicky Mouha et al. presented the Chaskey cipher for 32-bit microcontrollers in 2014. It is an ARX design: a permutation-based MAC algorithm built on an Even-Mansour block cipher. The keys are combined with the state by XOR. There is no key schedule, because deriving the two subkeys requires only two shifts and two XORs.

Fantomas [30] Grosso et al. proposed this cipher in 2014. It uses LS-designs, which combine L-boxes (look-up tables) and bitslice S-boxes. Fantomas can be implemented efficiently on 8-bit MCUs.
Robin [30, 31] Robin was proposed by Grosso et al. in 2014. It has an SPN structure similar to Fantomas, but it uses involutions for its L-box and S-box (an 8×8-bit S-box and a 16×16-bit L-box) so that the same components serve both encryption and decryption.
FeW [32] This cipher was proposed by Kumar et al. in 2014. It is based on a Feistel-M structure, consisting of two Feistel branches of a 4-branch generalized Feistel structure, to improve security against cryptographic attacks. It uses the S-box of Hummingbird-2 and imitates the key expansion process of PRESENT.

PRIDE [33] Albrecht et al. proposed this cipher in 2014. It has an SPN structure and is easy to implement in software on 8-bit microcontrollers. It has a strong linear layer separated into three sub-layers and a bit-sliced S-box. The 128-bit master key is split into two keys, k0 and k1. Pre-whitening and post-whitening use k0 (64 bits), whereas the subkey for each round is derived from k1 (64 bits).

RECTANGLE [34] RECTANGLE was proposed by Zhang et al. in 2015. It is an ultra-lightweight SPN block cipher with a substitution layer consisting of 4-bit S-boxes applied in parallel and a permutation layer executed as 3 rotations. There are three operations in each round: (1) SubColumn, (2) AddRoundKey (bitwise XOR with the round key), and (3) ShiftRow (each row is rotated left by a different offset). It uses bit-slice techniques to obtain fast software speed.

SIMECK [35] Gangqiang Yang et al. first proposed this cipher in 2015. SIMECK is a Feistel block cipher that combines the best design elements of the SIMON and SPECK block ciphers. It uses simple operations (bitwise AND, rotation, and XOR) to encrypt or decrypt 2n-bit message blocks with a 4n-bit key. Changes in the rotations and the key schedule enable better hardware and software implementations. Efficient implementation methods for SIMECK were proposed in [36, 37]; these methods can be adapted to IoT applications.
RoadRunneR [38] RoadRunneR was presented by Adnan Baysal and Sühap Sahin in 2015. It is a bit-sliced Feistel block cipher targeted at software implementations on CPUs with an 8-bit architecture. It follows the LS-design, in which the cipher is composed of bit-sliced S-boxes and L-boxes (linear P-boxes). It uses 3 keys per round plus 2 whitening keys, one at the beginning and one at the end, which are XORed with the block. During encryption, the 64-bit block is divided into two 32-bit parts, and the left part is XORed with a whitening key at the beginning and end of the encryption.
SPARX [39, 40] In 2016, Daniel Dinu et al. presented the SPARX cipher, which is built on ARX operations and enhances its security with an SPN structure. Rather than storing the SPECKEY S-box in RAM, it computes it using simple operations. The authors propose the Long Trail Strategy (LTS) in place of the wide trail strategy (WTS); it advises the use of large and computationally expensive S-boxes combined with light linear layers, and the resulting security argument is termed the Long Trail Argument.
ANU [41] ANU was presented by G. Bansod et al. in 2016 as an ultra-lightweight block cipher with a Feistel network structure. Its key schedule is motivated by that of the PRESENT cipher. The round function has two operations: F1 (left circular shift by 3 bits) and F2 (right circular shift by 8 bits). The output of F1 is applied to the nonlinear S-box layer and then XORed with the least significant 32 bits of data, giving FX, which is XORed with F2 and with the round key. ANU is well suited to applications with tight constraints, such as IoT.
PICO [42] This cipher was developed by Bansod et al. in 2016. It is an ultra-lightweight SPN block cipher. Its encryption process involves three operations: AddRoundKey, SubColumn, and Bit_Shuffle. The PICO key schedule is based on the key scheduling architecture of the SPECK cipher; it uses a 128-bit key to derive 33 subkeys, K0 to K32, of 64 bits each, and K32 is used as the post-whitening key.
SKINNY [43] This cipher was proposed by Beierle et al. in 2016. The SKINNY family has an SPN structure. It supports three key lengths of n bits, 2n bits, or 3n bits, with n being the block size (64 or 128 bits). The number of rounds varies from 32 to 56 depending on the block and key size. It includes a light key schedule and a light diffusion layer.
SIT [44] This cipher was proposed by Muhammad Usman et al. in 2017. SIT (Secure IoT) is a hybrid approach that combines Feistel and SPN structures. The encryption process is composed of logical operations, left shifting, swapping, and substitution. Five different keys are used for five rounds of encryption to improve energy efficiency.
LiCi [45] Patil et al. proposed this cipher in 2017, and it has a Feistel structure. The 64-bit input is separated into two halves of 32 bits each, and the most significant half of the plaintext is passed through 8 S-boxes for substitution. It uses 4-bit S-boxes with simple operations (XOR and left and right circular shifts) for the encryption process. The LiCi key scheduling algorithm is inspired by the PRESENT cipher.
CHAM [46] This cipher was presented by Koo
BRIGHT [48] This cipher was proposed by Sehrawat and Gill in 2019. It is a 4-branch GFN-based block cipher for resource-constrained IoT devices. The number of rounds ranges from 32 to 37 depending on the block and key sizes. The block size is 64 or 128 bits, with a key size ranging from 80 to 256 bits. It uses three layers: the first performs pre-key whitening, the second is applied in every round and performs ARX operations, and the third performs the round permutation.

NLCA [59] This algorithm was proposed by Thabit et al. in 2021. Its structure combines a Feistel network (FN) and an SPN to enhance data transmission security in cloud services. XOR, XNOR, F-functions, swaps, and other transformations are used in each round. The performance of NLCA, including execution time and memory usage, was evaluated against popular cryptographic algorithms, including DES, AES, HIGHT, Blowfish, and LED, using a variety of parameters in the same cloud environment.
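
Among the ciphers listed above, SPECK has a round function compact enough to show in full, so it is a convenient illustration of the ARX style (modular addition, rotation, XOR). The following is a minimal sketch of the 128-bit block, 128-bit key variant with 32 rounds, assuming the published rotation amounts of 8 and 3 and 64-bit words. The function names are illustrative, and the code is neither optimized nor constant-time; any real use should be validated against the designers' test vectors.

```python
MASK64 = (1 << 64) - 1
ROUNDS = 32  # assumed round count for the 128-bit block, 128-bit key variant


def ror64(x: int, r: int) -> int:
    """Rotate a 64-bit word right by r bits."""
    return ((x >> r) | (x << (64 - r))) & MASK64


def rol64(x: int, r: int) -> int:
    """Rotate a 64-bit word left by r bits."""
    return ((x << r) | (x >> (64 - r))) & MASK64


def speck_round(x: int, y: int, k: int):
    """One ARX round: rotate, add modulo 2^64, XOR the key, rotate, XOR."""
    x = ((ror64(x, 8) + y) & MASK64) ^ k
    y = rol64(y, 3) ^ x
    return x, y


def speck_round_inverse(x: int, y: int, k: int):
    """Undo one round by reversing each step in the opposite order."""
    y = ror64(x ^ y, 3)
    x = rol64(((x ^ k) - y) & MASK64, 8)
    return x, y


def expand_key(l0: int, k0: int):
    """The key schedule reuses the round function, with the round index as 'key'."""
    round_keys = [k0]
    l, k = l0, k0
    for i in range(ROUNDS - 1):
        l, k = speck_round(l, k, i)
        round_keys.append(k)
    return round_keys


def encrypt(x: int, y: int, round_keys):
    for k in round_keys:
        x, y = speck_round(x, y, k)
    return x, y


def decrypt(x: int, y: int, round_keys):
    for k in reversed(round_keys):
        x, y = speck_round_inverse(x, y, k)
    return x, y


keys = expand_key(0x0F0E0D0C0B0A0908, 0x0706050403020100)
cipher_words = encrypt(0x6C61766975716520, 0x7469206564616D20, keys)
plain_words = decrypt(cipher_words[0], cipher_words[1], keys)
```

The whole cipher needs no tables: the only state is two words plus the round keys, which is why ARX designs tend to score well on code size and RAM in software benchmarks.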

Performance Evaluation (Hardware and Software Performance Metrics)
Several measures based on hardware or software implementations are used to evaluate the performance of lightweight ciphers. The best cipher is one that provides an adequate level of security while balancing performance and cost. This section summarizes these measures. Some metrics are common to both, whereas some are restricted to hardware implementations (e.g., CMOS technology and the GE metric) and others to software implementations (e.g., RAM size and ROM size).

Lightweight Ciphers Performance Evaluation Metrics:
Hardware technology: refers to the CMOS technology used to implement the lightweight cipher, characterized by its feature size in µm; 0.13 µm and 0.18 µm are the technologies most commonly used in lightweight cryptography research. The gate equivalent (GE) metric describes the complexity and area of a hardware implementation: the physical area required to run the cipher on a chip, measured in µm^2, is expressed relative to the area of a two-input NAND gate (1 GE = one 2-input NAND gate).
Power and energy consumption: power is measured in microwatts (µW) for hardware implementations and depends on the clock frequency. Energy consumption per bit can be calculated as follows for both hardware and software implementations [63]:

Energy [µJ] = (Latency [cycles/block] × Power [µW]) / Block size [bits]
where Latency is the number of clock cycles required to encrypt one block of data, Power is the power consumed by the hardware or software implementation in µW, and Block size is the number of bits processed in one encryption or decryption operation.
Power can be optimized by minimizing the memory footprint of the source code and simplifying the operations while maintaining a sufficient level of security.
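
As a worked illustration of the energy-per-bit formula, the sketch below evaluates it for a set of made-up figures. The formula counts latency in clock cycles, so obtaining a value in µJ requires converting cycles to seconds; the sketch therefore assumes a clock frequency parameter (`clock_hz`), which is not part of the formula as stated. All numeric values are hypothetical and are not measurements of any cipher.

```python
def energy_per_bit_uj(latency_cycles_per_block: float,
                      power_uw: float,
                      block_size_bits: int,
                      clock_hz: float) -> float:
    """Return the energy per processed bit in microjoules.

    Assumption: cycles are converted to seconds with the clock frequency,
    so that microwatts multiplied by seconds yields microjoules.
    """
    seconds_per_block = latency_cycles_per_block / clock_hz
    energy_per_block_uj = power_uw * seconds_per_block
    return energy_per_block_uj / block_size_bits


# Hypothetical figures: 1000 cycles per 64-bit block, 100 µW, 100 kHz clock.
example_energy = energy_per_bit_uj(
    latency_cycles_per_block=1000,
    power_uw=100.0,
    block_size_bits=64,
    clock_hz=100_000,
)
print(f"{example_energy:.6f} µJ per bit")
```

With these figures one block takes 0.01 s and consumes 1 µJ, so the energy per bit is 1/64 µJ. Halving the latency or doubling the block size at constant power would halve the energy per bit, which is the trade-off the formula is meant to expose.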

Related works on Performance Evaluation of Lightweight Block Ciphers
Research on performance evaluation follows three directions: software evaluation, hardware evaluation, and combined software and hardware evaluation. In this paper we focus on software-related evaluation. We discuss different approaches and technologies for the security and performance evaluation of lightweight ciphers according to the restricted environment (platform) and the target application, and we present their experimental results below:

Design Strategies of Lightweight Block Cipher Algorithm
The optimal design of lightweight algorithms is based on a trade-off between cost, performance, and security requirements, as shown in Figure 2 [62].
Cost vs. performance: a serial implementation is the lowest-cost approach but degrades performance because of the added loops, whereas the more calculations are processed simultaneously, the higher the performance.
Performance vs. security: a lower number of rounds gives lower latency, whereas a higher number of rounds gives a more secure cipher.
Security vs. cost: a longer key means an attack requires more time, whereas a shorter key requires fewer registers and less memory. The trade-off among these three criteria is subject to other conditions, including the application for which the cipher is designed and the implementation environment (hardware or software).

Fig. 2 Design strategies Trade-Offs in Lightweight Cryptography
Block cipher security depends on the principles of confusion and diffusion identified by Shannon [60]. In a cipher, nonlinear operations provide confusion and linear operations provide diffusion.
Confusion: In practice, confusion can be achieved with costly S-boxes such as the 8-bit S-box used by AES, whereas compact 4-bit S-boxes are used for low-cost hardware implementations. In lightweight cipher design, look-up tables (LUTs) are one way of representing S-boxes and can improve throughput in software implementations through fast memory retrieval. A bitslice implementation, which performs basic bitwise operations (such as XOR or AND) on words of w bits, is another way [30]. Although bitslice implementations can be very fast, they have a large memory overhead, which makes them suitable only for non-feedback modes of operation such as CTR mode [57]. ARX-based ciphers can achieve low-cost nonlinearity in software through modular addition; examples of such ciphers are SPECK [26] and SPARX [39].

Diffusion: Good diffusion can be achieved with bit permutations. In hardware, this is simply a bit-wise permutation such as the diffusion layer used by PRESENT [47]. In software, bit rotations within words and MDS matrices can achieve low-cost permutation [24].
Key schedule and round function: a simple key schedule to derive subkeys and a simple round function consisting of simple operations.
Block size and key length: a block size of 64 bits or less; according to NIST, the smallest key size is 112 bits [6]. Device characteristics play a key role in determining the block size.
Number of rounds: reducing the number of rounds lowers execution time. The number of rounds needed is inversely related to the strength of the confusion and diffusion layers. In [61], a hybrid cipher based on SPECK, Speck-R, integrates the ARX structure with a dynamic substitution layer; it reduces the number of rounds from 26 to 7 and the execution time by at least 18% compared with SPECK.
Using bitwise operations and simple operations like modular addition can decrease code size and RAM consumption.
The word size used in cipher operations should match the largest register size supported by the constrained architecture.
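
The two S-box representations mentioned under confusion can be compared with a small sketch. It applies the 4-bit PRESENT S-box to sixteen nibbles, first through a look-up table and then in bitsliced form, where bit i of every nibble is packed into one word (a "bit plane") and all nibbles are processed at once with AND, OR, and NOT. The bitsliced routine here is a naive sum-of-minterms evaluation chosen for clarity; it is an assumption for illustration, since practical bitslice code uses a minimized Boolean circuit for the S-box. The helper names are hypothetical.

```python
PRESENT_SBOX = [0xC, 0x5, 0x6, 0xB, 0x9, 0x0, 0xA, 0xD,
                0x3, 0xE, 0xF, 0x8, 0x4, 0x7, 0x1, 0x2]


def sbox_lut(nibbles):
    """Table-based S-box: one memory access per nibble."""
    return [PRESENT_SBOX[n] for n in nibbles]


def to_bitplanes(nibbles):
    """Pack bit i of every nibble into word i (lane j holds nibble j)."""
    planes = [0, 0, 0, 0]
    for lane, nibble in enumerate(nibbles):
        for bit in range(4):
            planes[bit] |= ((nibble >> bit) & 1) << lane
    return planes


def from_bitplanes(planes, count):
    """Unpack four bit planes back into a list of nibbles."""
    return [sum(((planes[bit] >> lane) & 1) << bit for bit in range(4))
            for lane in range(count)]


def sbox_bitsliced(planes, width):
    """Evaluate the S-box on all lanes at once using only bitwise operations."""
    mask = (1 << width) - 1
    out = [0, 0, 0, 0]
    for value, image in enumerate(PRESENT_SBOX):
        # Build a lane mask that is 1 wherever the input nibble equals `value`.
        match = mask
        for bit in range(4):
            plane = planes[bit] if (value >> bit) & 1 else ~planes[bit] & mask
            match &= plane
        # Set the output bits of S[value] in every matching lane.
        for bit in range(4):
            if (image >> bit) & 1:
                out[bit] |= match
    return out


nibbles = list(range(16))
lut_result = sbox_lut(nibbles)
bitsliced_result = from_bitplanes(sbox_bitsliced(to_bitplanes(nibbles), 16), 16)
```

The table version is compact when the S-box is small, whereas the bitsliced version avoids data-dependent memory accesses and processes as many S-boxes in parallel as the word has bits, at the cost of packing the data into bit planes first.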

Conclusion
Researchers continue to work to improve security levels and strengthen ciphers against both existing and new threats. This paper has discussed the most prominent security problems of restricted devices in the IoT environment.
Encryption is one of the most effective methods for providing end-to-end security. Lightweight cryptographic algorithms are essential for handling security in highly constrained environments such as the Internet of Things. Block ciphers are convenient and easy to implement in software because they operate on data in computer-sized blocks. Given considerations such as energy and memory utilization, especially on software platforms, this article is intended to help IoT security developers identify algorithms that match the needs of the constrained environment. The design strategies presented in this review suggest that a cipher with simple round functions and simple operations can achieve a high or acceptable level of security, but this depends on the requirements and specifications of the target application and the constraints of the device. It should be emphasized that performance must be evaluated on different resource-constrained devices to develop a more comprehensive understanding of lightweight ciphers.