Smartphone for palm oil fruit counting to reduce embezzlement in harvesting season

a Department of Electrical Engineering, Faculty of Engineering, Universitas Negeri Malang, Indonesia
b Graduate School, Faculty of Engineering, Universitas Negeri Malang, Indonesia
c Department of Physics, Faculty of Mathematics and Natural Sciences, Universitas Negeri Malang, Indonesia
d Department of Computer Science and Information Engineering, Southern Taiwan University of Science and Technology, Taiwan
e Centre of Electrical Energy System (CEES), Sekolah Kejuruteraan Elektrik, Fakulti Kejuruteraan, Universiti Teknologi Malaysia, Malaysia
1 aripriharta.ft@um.ac.id *; 2 adimfirmansah@gmail.com; 3 nandang.mufti.fmipa@um.ac.id; 4 grojium@gmail.com; 5 norzanah@utm.my

Harvest estimation is an essential parameter in the agricultural industry for planning transportation facilities and storage areas during the harvesting season. Companies therefore need to calculate crop yields quickly and accurately. This paper reports an experimental study of a smart application that counts oil palm fruit in the field quickly and accurately. The system uses a single-shot detector algorithm to count the number of fresh fruit bunches (FFB) on-site with a smartphone camera. Images of the cutting area (CA) at the top of the collection were gathered in various positions for the database. The algorithm matches the CA with the picture taken by the operator, so the application automatically calculates the number of harvested fruit per site in FFB units. The data are then sent to a cloud database via a wireless router in a warehouse or through a cellular network. The main advantage of this application is that it reduces the theft that usually occurs on the spot. The model performs very well for this agricultural application, with 94% to 99% accuracy.
Automatic crop yield counting provides many benefits, such as a faster harvest process. A faster harvest cycle lowers labor costs, especially for laborers paid on an hourly basis. An automatic calculation system also provides more precise and consistent results, and its low error rate reduces the costs caused by miscalculated crop yields. A fast harvesting process also allows the company to deliver fresher fruit. Finally, automation can reduce the embezzlement of palm oil crops, which harms the company.

Related Works
Research on increasing crop harvesting efficiency has received much attention in the last few decades, especially in detecting and counting fruit. One widely used approach is color-based computer vision. Previous research in [6] used image colors to detect maturity and count the number of green oranges. The results showed a successful detection rate of 75.3%, with an error rate of 27.3% during the day. Sunlight conditions during the day have a significant influence on detection results [7], [8]. The sun's rays create shadows that cause a standard segmentation procedure to divide the fruit surface into several fragments. Building on these results, [9] applied computer vision techniques at night, detecting fruit under an artificial light source with a 78% success rate.
Meanwhile, the authors of [7] used color-based computer vision techniques to detect oranges in outdoor environments and obtained an accuracy of 76.5%. The authors of [10] added a size parameter to improve fruit detection. Using color, shape, and size parameters together, the accuracy in their study increased to 90%.
Traditional computer vision techniques have been widely adopted in the agricultural industry. Although these methods work well under certain conditions, some cases require more robust detection because of changes in environmental conditions such as illumination and occlusion [11]. Traditional computer vision algorithms also have a high error rate when detecting fruit that is piled up. In recent years, the computing capabilities of electronic devices have continued to increase, and processors have evolved to include special-purpose units for graphics processing, the Graphics Processing Unit (GPU). In this era, Deep Learning (DL), a sub-field of Machine Learning (ML), has been widely developed. Both belong to the field of Artificial Intelligence (AI). DL uses a layered algorithm structure called an artificial neural network (ANN) [8]. DL makes it possible to build accurate detectors through training that automatically learns features from photos and uses them efficiently. In DL, visual features are generally extracted by a Convolutional Neural Network (CNN) [12].
In an earlier study, [13] adopted a Faster R-CNN model called Deep Fruit to detect fruit. After training, the model achieved an accuracy of 83.8% for detecting paprika. With a different model, the authors of [14] developed a CNN based on the Inception-ResNet architecture, called Deep Count, to count fruit. Synthetic images were added to the training data to produce a better model. When tested in a real environment, the model achieved an accuracy of 80% to 85%. [8] employed a single-shot detector to detect fruit in real time. That model also used synthetic images as training data and produced an accuracy of 0.9 with an Intersection over Union (IoU) of 0.64. All three studies used a computer or minicomputer to perform detection.
Unlike the above three investigations, [15] applied the Faster R-CNN model and a Single Shot Detector (SSD) to estimate crop yield. In their study, the trained Faster R-CNN model obtained an accuracy of 89%, while the SSD model obtained 82%. On the other hand, [16] applied DL on unmanned aerial vehicles (UAV) to count apples and oranges. They used a two-stage CNN: the first stage is a blob detector based on a fully convolutional network, and the second stage is a counting algorithm based on a convolutional network. Interestingly, this study not only implements DL but also combines it with linear regression to determine the final fruit count. The results showed that the applied model had an accuracy of 91.3% for detecting apples.
Based on the need for crop automation and the technological developments described above, we propose an automatic crop yield calculation system as a new technological tool for the agricultural sector. The proposed system applies DL to calculate yields. In contrast to previous studies that applied DL to calculate crop yields, our experiment uses a single-shot detector, the You Only Look Once (YOLO) lite version 3 model. The model is embedded in a smartphone application to detect objects. The system details are explained in the Method section.
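The Intersection over Union (IoU) value quoted for [8] measures how well a predicted bounding box overlaps a ground-truth box. The following is a minimal sketch of the standard IoU computation, assuming axis-aligned boxes given as (x_min, y_min, x_max, y_max); the function name and box format are illustrative choices and are not taken from any of the cited studies.

```python
def iou(box_a, box_b):
    """Intersection over Union of two axis-aligned boxes.

    Each box is (x_min, y_min, x_max, y_max); this format is an assumption
    made for illustration.
    """
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b

    # Width and height of the overlap rectangle (zero if the boxes are disjoint).
    inter_w = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = inter_w * inter_h

    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - inter

    # Guard against degenerate boxes with zero area.
    return inter / union if union > 0 else 0.0
```

An IoU of 1.0 means the boxes coincide and 0.0 means they do not overlap, so a reported value of 0.64 indicates that the predicted and reference boxes share roughly two thirds of their combined area.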

Method
The proposed system implements a single-shot detector, the YOLO lite version 3 model, as the CNN in a smartphone application that detects and counts palm oil fruit. The application accesses the camera to take pictures, detects palm oil fruit in real time, and draws a bounding box directly on each detected fruit. Besides being displayed in the application in real time, the number of fruits is also sent to the cloud database. The system supports two network connection modes: a cellular network that uses a Base Transceiver Station (BTS) and a Wi-Fi network that uses a wireless router. Data sent to the cloud database are displayed in a web interface that can be accessed from a computer or smartphone. The overall system block diagram is shown in Fig. 1. Fig. 2 shows the implementation scenario of the proposed system. Workers who oversee fruit counting carry a smartphone with the automatic counter application installed. Workers in the warehouse area can use the Wi-Fi network to send data to the cloud database, while workers in outdoor areas can use the available cellular network. The system can therefore also be used to calculate harvest yields outside the warehouse, provided a cellular network is available to send the data to the database.
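The counting and upload steps described above can be sketched as follows. This is only an illustration under stated assumptions: the `Detection` class, the `"ffb"` label, the 0.5 confidence threshold, and the payload field names `site_id` and `ffb_count` are all hypothetical, since the paper does not specify the application's data structures or the cloud database schema. The sketch builds the message only and does not perform any network request.

```python
import json
from dataclasses import dataclass


@dataclass
class Detection:
    """One detector output (hypothetical structure, for illustration only)."""
    label: str
    confidence: float  # between 0.0 and 1.0


def count_fruit(detections, label="ffb", min_confidence=0.5):
    """Count detections of the given label at or above a confidence threshold.

    The label name and threshold are assumed values, not taken from the paper.
    """
    return sum(
        1 for d in detections
        if d.label == label and d.confidence >= min_confidence
    )


def build_payload(site_id, count):
    """Serialize the count as JSON for upload (field names are hypothetical)."""
    return json.dumps({"site_id": site_id, "ffb_count": count}, sort_keys=True)


if __name__ == "__main__":
    frame = [Detection("ffb", 0.97), Detection("ffb", 0.30), Detection("leaf", 0.90)]
    print(build_payload("block-A", count_fruit(frame)))
```

Keeping the counting logic separate from the transport means the same payload could be sent over either the Wi-Fi or the cellular connection.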

Results and Discussion
The test was carried out on a smartphone with a MediaTek MT6753T chipset, which has an octa-core 1.5 GHz Cortex-A53 processor and a Mali-T720MP3 GPU. Fig. 3 shows a screenshot of the application while workers were counting palm oil fruit. Each detected fruit is marked with a red bounding box. The test results show that the model can detect palm oil fruit well.

Fig. 3. Detection application
The smartphone application scripts were modified to display the confidence level of each detected fruit. Fig. 4 shows the application after this modification, displaying the name and confidence level of each detected palm oil fruit. The experimental results show a confidence level (accuracy) of 94% to 99%. This high confidence level indicates that the model detects palm oil fruit with high accuracy.

Fig. 4. Confidence value
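As an illustration of the label shown in Fig. 4, the sketch below formats a class name with its confidence as a percentage and reports the range of confidences in one frame. The function names and the sample values are hypothetical; the paper does not show the application's actual script.

```python
def format_label(name, confidence):
    """Build the text drawn next to a bounding box, e.g. 'palm oil fruit 94%'."""
    return f"{name} {confidence * 100:.0f}%"


def confidence_range(confidences):
    """Return the lowest and highest confidence among the detections."""
    return min(confidences), max(confidences)


if __name__ == "__main__":
    sample = [0.94, 0.97, 0.99]  # made-up values within the reported range
    for c in sample:
        print(format_label("palm oil fruit", c))
    print(confidence_range(sample))
```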
The detection button starts and ends the palm oil fruit detection process. After the worker presses the detection button, the calculated number of palm oil fruit is automatically sent to the cloud database. The proposed system can accelerate the palm oil harvesting process. Manual counting generally requires approximately 2 minutes, and the time needed grows in direct proportion to the number of fruits. Manual counting of large numbers also has a high error rate. With the proposed system, the counting time for palm oil fruit decreased to approximately 30 seconds, with a high degree of accuracy.
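The reported timings imply a simple speed-up figure. The sketch below works through that arithmetic, assuming the approximate values of 2 minutes (120 s) for manual counting and 30 s for the application; the function names are illustrative.

```python
def speedup(manual_seconds, app_seconds):
    """How many times faster the application is than manual counting."""
    return manual_seconds / app_seconds


def time_reduction_percent(manual_seconds, app_seconds):
    """Percentage of counting time saved relative to manual counting."""
    return (1 - app_seconds / manual_seconds) * 100


if __name__ == "__main__":
    manual, app = 120.0, 30.0  # approximate values reported in the text
    print(speedup(manual, app))
    print(time_reduction_percent(manual, app))
```

With these approximate inputs the application is about four times faster, which corresponds to a reduction in counting time of about 75%.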
Overall, the proposed system has an accuracy of 94% to 99%. A comparison of the accuracy of the proposed system with previous research is shown in Table 1. The comparison shows that three of the compared systems reach an accuracy above 90%. The higher the accuracy, the lower the error rate in the fruit harvest calculation. Accuracy is significantly influenced by the network architecture, the dataset, the training procedure, and the form of the detected object [8], [17]. It can be improved through longer training of the model. Errors due to illumination can be minimized by providing a dataset with several illumination variations or by adding a processing layer [18]-[20]. However, using more layers increases the computation time needed to detect fruit.

Table 1. Accuracy comparison with previous research
Reference         Model                                               Accuracy (%)
[13]              Faster R-CNN                                        83.8
[14]              Inception-ResNet                                    80-85
[8]               Modified Yolo v2                                    85-95
[15]              Faster R-CNN                                        89
[15]              SSD                                                 82
[16]              Blob detector + count network + linear regression   91.3
Proposed system   Yolo lite v3                                        94-99

To obtain responses from users, we carried out a survey of feedback on the use of the application. The survey was distributed as an online Google survey and covered four aspects: user-friendliness, execution speed, accuracy, and benefits. Each question had two answer options. Fig. 5 shows the feedback from users. Based on our survey, 95% of users agreed that the developed application was user-friendly, and only 5% disagreed. This is because the application processes images directly from the user's smartphone camera: users only need to take pictures of the oil palm harvest, and the algorithm then processes the image data automatically. Users' responses on execution speed were fairly good, with 70% stating that the application is fast and 30% stating that it is slow. Execution is carried out in the cloud, and the speed constraints usually arise from uploading and downloading data to and from the cloud, depending on the type of smartphone, the signal strength, and the user's internet quota. On the accuracy aspect, almost 85% of users responded that the image matching algorithm is accurate, and 15% responded that it is not. Since the algorithm has been shown to perform well in the performance tests, the responses of these 15% of users are probably due to the user's camera resolution or poor photo technique. The last aspect is benefit: 88% of users stated that the application offers many benefits, while 12% did not share this view. Overall, these results show that the developed application is acceptable to users.

Conclusion
The single-shot detector YOLO lite version 3 model has been successfully implemented in a smartphone application that speeds up the palm oil harvesting process through automatic counting. The applied model has high accuracy in detecting palm oil fruit. The network connection method used in the system is flexible because it allows the amount of palm oil to be calculated inside or outside the warehouse (garden area). Compared with manual calculation, the proposed system increases the efficiency of the harvest process with a low error rate. This yields a precise fruit count and can reduce embezzlement. In general, the application is viable, as indicated by the users' positive responses to all survey questions concerning user-friendliness, accuracy, usefulness, and speed. Differences in users' responses to the binary questions are attributed to differences in smartphone types and camera resolutions.