Artificial Neural Networks and Adaptive Neuro Fuzzy Inference System for wheat yield analysis and prediction

The current study evaluated the prediction of wheat crop yield in the Bagalkot district of Karnataka State, India. The study aimed to provide crop yield predictions to help farmers optimize their cultivation and marketing strategies. The model used several independent variables, such as temperature, air humidity, and water resources, to predict growth in the yield of wheat crops. Correlation analysis was used to determine the strength and direction of the relationships between the variables, and statistical analysis identified the variables that have a significant impact on crop yield growth. The work developed and tested two models, the Artificial Neural Network (ANN) model and the Adaptive Neuro-Fuzzy Inference System (ANFIS), to predict crop yield growth from the selected independent variables. The ANFIS model was of particular interest because it provides a mapping between the input and output parameters, which is useful for understanding the relationships between the variables. ANFIS was considered a better predictor than ANN, as its error percentage ranged from 0 to 3%. Overall, the work highlights the importance of crop yield predictions and the potential benefits that simulations can generate for farmers and the agriculture sector in general.


Introduction
Crop yield prediction, a crucial task in agriculture, plays a vital role in making informed decisions about crop management, harvest planning, and market forecasting. It helps farmers determine the best time for planting, select suitable crops, optimize resource allocation, and maximize yield. Crop yield can be predicted using various approaches, including statistical modeling, machine learning, and remote sensing. The yield of crops depends on several parameters, such as rainfall, temperature, humidity, soil attributes, and agricultural factors. Precision agriculture is one of the challenging problems, wherein yield prediction depends on several datasets such as climate, soil conditions, and seed quality (Beloti et al., 2017; Xu et al., 2019). Statistical modeling for crop yield prediction is based on historical data on weather, soil, and crop characteristics. Statistical models can identify the relationships between these parameters and crop yield, allowing future yields to be predicted. Machine learning models, on the other hand, can handle complex nonlinear relationships between input and output variables and have been widely used for crop yield prediction. However, these models require training on large sets of historical data and depend on the choice of an appropriate algorithm, which makes building a sophisticated predictive model a real challenge.
Rahman et al. (2022) employed statistical models to forecast the production of winter wheat depending on environmental conditions. They compared the performance of three models, namely multiple linear regression, support vector regression, and random forest, in predicting crop yield. The study showed that all three models achieved high accuracy in predicting crop yield, with the random forest model outperforming the other two. The study also revealed that the most important environmental factors for crop yield prediction are precipitation, temperature, solar radiation, and soil organic matter content. Su et al. (2012) employed linear regression models to predict maize yield in China based on weather and soil data and reported that temperature, precipitation, and soil nutrients are important factors for predicting crop yield. Another study used time series analysis to predict rice yield in West Bengal, India, based on historical yield data and weather information (Bhavani et al., 2018) and demonstrated that the autoregressive integrated moving average (ARIMA) model was effective in predicting rice yield.
Machine learning (ML) models have been widely used for crop yield prediction, as they can handle complex nonlinear relationships between input and output variables. ML approaches are used in a wide range of applications, such as manufacturing processes, supply chain management, and agriculture (Ayyappa et al., 2021; Manoj et al., 2020). Linear models make it easy to understand the relationship between the dependent and independent variables, but they fail to achieve high prediction accuracy. This is mainly due to their inability to capture the nonlinear interactions among the variables (Ansarifar et al., 2021). Over the years, researchers have employed various ML methods to address crop yield prediction. For instance, Drummond et al. (2003) utilized stepwise linear regression models to examine the relationship between yield and topographic characteristics; they minimized overfitting and predicted an optimal yield model using a neural network. Similarly, another study predicted crop yield with a random forest model, compared the results with those of multiple linear regression (MLR), and reported that the random forest method is effective in crop yield prediction, providing acceptable predictions in smart agriculture with reasonable data (Jeong et al., 2016).
Despite the effectiveness of the random forest method, its usage in predicting crop yield and related output is limited (Fukuda et al., 2013), as each dataset is unique and some ML models may not be suitable for a particular system (Gromping et al., 2009). Moreover, crop yield prediction changes over time, as it behaves nonlinearly depending on several factors, such as soil characteristics, climate, the genetic makeup of the crop, and pest infestations. To overcome these challenges, several ML techniques have been proposed for crop yield prediction, including support vector machines, deep neural networks, and ensemble models, among others. Each of these models has its strengths and weaknesses, and selecting the appropriate model depends on the data available and the research objectives. A deep neural network (DNN) can capture complex nonlinear relationships between crop yield and its predictors. Ensemble models such as random forests and gradient boosting can handle high-dimensional data and noisy datasets. ML methods have shown immense potential in predicting crop yield and related outputs. However, selecting an appropriate model for a particular dataset is critical for making accurate predictions. Additionally, researchers should continually evaluate the performance of their models over time, as crop yield prediction changes with time and depends on several factors.
Of late, the research community has proposed several models and methods for yield prediction. These include corn yield predictions with detailed correlation and regression algorithms (Sarturi et al., 2022). Predicting crop yield and enhancing crop management are two associated and crucial concepts (Russo et al., 2015). Furthermore, support vector machines (SVM) have proven to be a better method for performing regression analysis and predicting crop yield. The correlation of sediment yield with monsoon data in determining the accuracy of prediction has also been established (Misra et al., 2009).
Another study examined the details of crop yield prediction using different models and their importance in ML processes (Elavarasan et al., 2018). Another study aimed at predicting the soil organic matter (SOM) content in crop fields using artificial neural networks (ANNs) and regression models (Achieng et al., 2019) and demonstrated that the ANN model outperformed the regression models, with an R-squared value of 0.804 and a root mean squared error (RMSE) of 0.261% for the ANN model compared to an R-squared value of 0.694 and an RMSE of 0.323% for the best regression model. The ANN and adaptive neuro-fuzzy inference system (ANFIS) were evaluated for the prediction of crop yield using two different models, and the ANFIS model was suggested as a reliable tool for crop yield prediction in agriculture (Kalpana et al., 2020). A study on the modeling of wheat yield using ANN and ANFIS models employed 16 years of wheat yield observations and weather data and evaluated the models using the coefficient of determination (R²) and RMSE (Kumar et al., 2023). The ANFIS model showed better performance, with R² values of 0.85 and 0.83 for training and testing, respectively.
Razavi et al. (2011) used ANFIS to predict pear yield based on weather parameters and soil properties and demonstrated that the ANFIS model outperformed traditional regression models in predicting crop yield. The prediction of soil nutrients based on soil properties and weather data was carried out with the ANFIS method (Ramsauer et al., 2022). The study revealed that ANFIS could accurately predict soil nutrients, which can help farmers in fertilizer management. The use of ANFIS to predict dairy cow milk yield based on environmental factors and management practices was also studied (Gholami et al., 2018). The authors showed that ANFIS can accurately predict milk yield, thereby helping farmers optimize feeding and management practices. ANFIS was also used to identify tomato diseases from photographs of the crop leaves and was shown to achieve high diagnostic accuracy, which can help farmers in timely disease management (Kamath et al., 2021).
A literature survey on crop yield prediction reveals that various modeling techniques have been applied for predicting crop yields, including statistical modeling, ML, and deep learning-based approaches. Some of the commonly used statistical techniques are regression, multivariate regression, and time series analyses. ML-based approaches include artificial neural networks, support vector regression, decision trees, and random forests. Deep learning-based approaches such as convolutional neural networks and recurrent neural networks have also been used for crop yield prediction using remote sensing and weather data.
To elaborate on the importance of using tailored modeling techniques for specific crops and regions, understanding the varying requirements of different crops for growth and development is essential. For instance, wheat has specific temperature and water requirements for optimal growth and yield production. In addition, environmental factors, such as soil quality, sunlight exposure, and pests and diseases, vary significantly across different regions, even within the same country. Thus, modeling techniques that work well for one crop or region may not necessarily perform well for another.
Bagalkot district in Karnataka, India, predominantly cultivates wheat as a major crop. However, the region experiences irregular rainfall patterns throughout the year, which poses a significant challenge to wheat cultivation. Accurately predicting the atmospheric conditions is therefore crucial to aid farmers in planning and managing their crops. In this regard, novel modeling techniques that consider the specific requirements of wheat in terms of temperature and water sources can provide valuable insights into the potential yield production of the crop under varying environmental conditions. In conclusion, employing tailored modeling techniques for specific crops and regions can significantly enhance the accuracy and reliability of yield predictions. Timely and optimum use of temperature and water sources as predictors for wheat yield prediction in Bagalkot can aid farmers in planning and managing their crops to mitigate the adverse effects of irregular rainfall patterns. Therefore, it is essential to assess and compare the efficiency of different modeling approaches tailored to specific crops and regions to optimize crop yield and ensure food security.

Material and Methods
Bagalkot district, located in the state of Karnataka, India, is well known for cultivating wheat on a large scale. The farmers in this region begin sowing seeds in September and complete the harvest by March of the following year. Data on crop yield and water sources in Bagalkot district were collected from the government website (https://eands.dacnet.nic.in). Additionally, temperature and humidity data spanning from 2000 to 2020 were obtained from https://www.timeanddate.com/weather/. The variables considered for a given year of wheat crop cultivation were the area, production, and yield. Initial examination of the data revealed that wheat crop cultivation had steadily increased over the years, which could be attributed to the variation in the availability of water resources and changing levels of rainfall in the region. Correlation techniques are statistical methods that allow us to explore and quantify the relationship between two or more variables. Table 2 presents the coefficient of correlation, a measure of the strength and direction of the relationship between the dependent and independent variables.
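As a minimal sketch of how such a coefficient can be computed, the following Python function evaluates the Pearson correlation coefficient for two series. The text does not state which correlation coefficient was used, so Pearson's r is an assumption here, and the rainfall and yield values are hypothetical placeholders rather than the study's data.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    cov = sum(a * b for a, b in zip(dx, dy))
    var_x = sum(a * a for a in dx)
    var_y = sum(b * b for b in dy)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values for illustration only (not the study's data).
rainfall_mm = [520.0, 610.0, 480.0, 700.0, 650.0]
yield_kg_ha = [900.0, 1050.0, 870.0, 1200.0, 1100.0]
r = pearson_r(rainfall_mm, yield_kg_ha)
print(f"r = {r:.3f}")
```

A value close to +1 indicates a strong positive linear relationship, a value close to -1 a strong negative one, and a value near 0 little linear association.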
Scatter diagrams are visual representations of the relationship between two variables, with each point representing an observation of the two variables. In the present study, Figure 1 shows the relationship between different variables and provides insight into the nature of the relationship between them. The results indicate a positive relationship between rainfall and crop production. This suggests that when there is more rainfall, there is a higher crop yield. Additionally, moderate correlations ranging from 0.552 to 0.850 were observed between different sources of water used for crop cultivation, including wells, canals, and other sources. This implies that the type of water source used for crop cultivation affects crop yield.
The scatter diagram between water sources and crop yield also established a positive correlation, further highlighting the importance of water as the main resource for crop cultivation. The use of different water sources, such as canals, wells, or tankers, showed a strong correlation with the demand for crops. This suggests that the availability of water resources influences the demand for crops. The study also showed that the main water resource for wheat crop cultivation was well or bore water. This indicates that the type of water source used for crop cultivation may vary depending on the type of crop grown, along with other variable parameters.
Furthermore, the study revealed that the surrounding temperature, humidity, and water resources are key factors affecting wheat crop cultivation. Farmers choose to cultivate crops based on these environmental conditions, which can vary throughout the year. As shown in Figure 1B, the yield increases with an increase in the area of land used for cultivation, which also depends on the available environmental conditions. Water resources are critical for crop cultivation as they have a strong positive correlation with crop yield over time. Figure 1C illustrates this relationship. Adequate availability of water plays a crucial role in crop growth and development. Without adequate water resources, crops cannot grow properly and their yield decreases. Therefore, farmers must ensure that they have enough water for their crops. They can do this by using irrigation techniques or relying on rainfall. Overall, the study highlights the importance of water resources in crop cultivation, with a strong positive correlation between water resources and crop yield over time.
Temperature and humidity also play a predominant role in farming. Temperature affects plant growth and development by influencing the rate of photosynthesis, respiration, and transpiration. The optimal temperature range for most crops is between 20°C and 30°C. Above or below this range, the yield decreases. Similarly, humidity affects crop growth by regulating transpiration, which also involves water loss from plants. The optimal humidity range for most crops is between 60% and 80%. In dry weather conditions, the duration of sunshine has a good correlation with crop yield (Chmielewski and Pots, 1995). Longer periods of sunshine can compensate for the lack of humidity to some extent.
In Bagalkot, crop cultivation begins in September and harvesting happens in March (seven months). During this time, humidity and temperature strongly affect crop yield. Figure 1D shows the correlation between the maximum temperature and minimum humidity value for each month and how it affects crop yield for different years. The maximum humidity was always found between 12 am and 5 am, and the minimum humidity was found in mid-afternoon. Both temperature and humidity correlated positively with crop yield. This suggests that crop yield mainly depends on the duration of sunshine and on humidity during the afternoon. From September to November, rainfall raised the humidity, resulting in improved crop cultivation. A higher correlation was found during the harvesting period in March, when higher daytime humidity and temperature improved crop cultivation. Therefore, farmers must pay attention to weather patterns and adjust their farming practices accordingly to ensure optimal crop yield.

Results and Discussion
In the current study, various input parameters were considered, such as year, temperature, humidity, average rainfall, total water source, area, production, and yield, out of which temperature, humidity, average rainfall, and total water source were found to be significant predictors of yield. The ANN and ANFIS models were used to predict the yield, and both were implemented with a MATLAB toolbox. For the ANN model, 22 iterations were performed, and the model was trained with 70% of the data, validated with 15%, and tested with 15% using the Levenberg-Marquardt training method and feedforward backpropagation.
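The data partition and the forward pass of such a feedforward network can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the MATLAB implementation used in the study: the sample count of 21, the random seeds, the random weights, and the scaled input vector are all hypothetical; the hidden-layer width of 10 follows the architecture reported for the optimal model; and Levenberg-Marquardt training itself is not implemented, so the weights are untrained.

```python
import math
import random


def split_indices(n, train=0.70, val=0.15, seed=0):
    """Randomly split range(n) into train/validation/test index lists."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(train * n)
    n_val = round(val * n)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def logsig(z):
    """Log-sigmoid transfer function, written to avoid overflow for large |z|."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def forward(x, w_hidden, b_hidden, w_out, b_out):
    """One forward pass: log-sigmoid hidden layer followed by a linear output."""
    hidden = [logsig(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(w_hidden, b_hidden)]
    return sum(w * h for w, h in zip(w_out, hidden)) + b_out


# Hypothetical setup: 21 yearly records, 4 inputs, 10 hidden neurons.
train_idx, val_idx, test_idx = split_indices(21)
rng = random.Random(1)
w_hidden = [[rng.uniform(-1, 1) for _ in range(4)] for _ in range(10)]
b_hidden = [rng.uniform(-1, 1) for _ in range(10)]
w_out = [rng.uniform(-1, 1) for _ in range(10)]
b_out = 0.0
x = [0.5, 0.6, 0.4, 0.7]  # scaled temperature, humidity, rainfall, water source
y_hat = forward(x, w_hidden, b_hidden, w_out, b_out)
print(len(train_idx), len(val_idx), len(test_idx))
```

With so few records, the validation and test subsets contain only a handful of samples each, so error estimates from them should be read with caution.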
The optimal architecture, obtained by trial and error, was 4-10-1-1, as shown in Figure 2A. The regression plot in Figure 3 shows the training, testing, and validation stages, based on which the optimal level of the model was decided. The transfer functions used were the log-sigmoid and pure linear (purelin) functions, and the mean square error was chosen as the performance metric for evaluation. On the other hand, the Adaptive Neuro-Fuzzy Inference System (ANFIS) model was trained with 81 rules, 3 input layers, and 1 output layer, using the triangular (trimf) membership function type with constant output and 4 hidden layers (Figure 2B). The membership function was chosen by trial and error based on the average training and testing errors. For the optimal model, the average training and testing errors were 3.05E-08 and 0.29934, respectively. The model was trained for 1000 epochs. It used 17 training and 5 testing samples for optimal performance. Table 3 shows that the ANFIS model has about half the error (0-3%) of the ANN model (0-6%), indicating that the ANFIS model is a better predictor than the ANN model. The performance of the models was evaluated using mean square error (MSE), mean absolute error (MAE), and root mean square error (RMSE), as reported in Table 4. The optimal ANFIS model also provides a 3-D mapping between input and output parameters, as shown in Figure 4. Similar approaches have been used to design optimal models (Manoj et al., 2021; 2022).
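To make the fuzzy inference step concrete, the sketch below evaluates a zero-order Sugeno system with triangular membership functions in Python. It rests on several assumptions that the text does not confirm: a grid partition with four inputs and three membership functions per input (which gives 3^4 = 81 rules, matching the reported rule count), inputs scaled to [0, 1], and placeholder constant consequents. The hybrid learning that ANFIS uses to tune these parameters is not shown.

```python
from itertools import product


def trimf(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        # Outside the support; shoulders (a == b or b == c) still peak at b.
        return 1.0 if x == b else 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)


# Assumed fuzzy sets on inputs scaled to [0, 1]: low, medium, high.
MFS = [(0.0, 0.0, 0.5), (0.0, 0.5, 1.0), (0.5, 1.0, 1.0)]


def sugeno_predict(x, consequents):
    """Zero-order Sugeno inference over a full grid of rules."""
    memberships = [[trimf(xi, *p) for p in MFS] for xi in x]
    num = den = 0.0
    for rule in product(range(len(MFS)), repeat=len(x)):
        w = 1.0  # firing strength: product of the rule's memberships
        for i, j in enumerate(rule):
            w *= memberships[i][j]
        num += w * consequents[rule]
        den += w
    if den == 0.0:
        raise ValueError("input lies outside the support of every rule")
    return num / den  # weighted average of the rule outputs


rules = list(product(range(3), repeat=4))
consequents = {r: sum(r) / 8.0 for r in rules}  # placeholder constants
print(len(rules))  # 81
```

Each rule pairs one fuzzy set per input, so the rule count grows exponentially with the number of inputs, which is one reason ANFIS models are usually kept to a few predictors.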
This study showed a positive relationship between rainfall and crop production, indicating that higher rainfall results in a higher crop yield. Additionally, moderate correlations were observed between different sources of water used for crop cultivation, including wells, canals, and other sources. This suggests that the type of water source used for crop cultivation may affect crop yield. The present study also revealed that the type of water source used for crop cultivation may vary depending on the type of crop being grown. Well or bore water was found to be the main water resource for wheat crop cultivation. Overall, these findings suggest that both rainfall and water sources influence crop yield. While rainfall can be unpredictable and subject to seasonal variations, the availability and management of water resources can have a significant impact on crop cultivation. Therefore, farmers need to consider both rainfall and water sources in making informed decisions regarding crop cultivation and water resource management (Figure 4A).
Rainfall and temperature are crucial factors that can significantly affect crop yield. Adequate rainfall is necessary for crop growth, but excessive or inadequate rainfall can adversely affect crop yield through waterlogging or drought stress. Similarly, temperature plays a critical role in crop growth and development, and each crop has its own optimal temperature range. Extreme temperatures, whether unusually high or low, can result in decreased agricultural output, impaired crop quality, and increased vulnerability to pests and diseases.
The model employed in the present study examined the correlation between average rainfall, temperature, and wheat crop yield, considering the local environmental conditions. Notably, this model is specific to wheat crops and may not be applicable to other crops or different environmental conditions, as each crop has distinct water and temperature requirements that vary with factors such as soil type, topography, and altitude. Nonetheless, the developed model can be useful for predicting crop yield with respect to temperature and rainfall (Figure 4B). The model indicates that humidity and temperature jointly influence crop growth and development. Temperature and humidity are directly related and determine the atmospheric water vapor content, which is essential for crop growth (psychrometry). Maintaining moderate humidity and temperature values is crucial for improving crop yield. High humidity can cause moisture stress and increase the probability of pest and disease attacks, whereas low humidity can lead to moisture loss and lower crop yield. High temperatures can increase the rate of evapotranspiration and result in water stress, whereas low temperatures can slow down crop growth.
Thus, while there may not be a straightforward correlation between humidity, temperature, and crop yield in the study, these environmental factors have been shown to play a crucial role in crop cultivation and can affect crop yield in various ways (Figure 4C). Furthermore, the results suggested that the type of water source used for crop cultivation may affect crop yield, and there is a strong correlation between water sources and the demand for crops. The main water resource for wheat crop cultivation is well or bore water. This suggests that the type of water source used for crop cultivation may vary depending on the type of crop being grown. Additionally, surrounding temperature and water resources are found to be the key factors affecting wheat crop cultivation, and farmers choose to cultivate crops based on these environmental conditions, which can vary throughout the year. The crop yield was observed to increase with the rise in the area of land used for cultivation, which also depends on the available environmental conditions. Therefore, a correlation between water sources, temperature, and crop yield seems probable, with the optimal combination of these factors varying depending on the type of crop being grown and the local environmental conditions (Figure 4D).
This study examined the relationship between humidity, water source, and crop yield. High humidity can increase the atmospheric water vapor content, while water sources provide the necessary water for crop growth. However, too much humidity can lead to moisture stress and increased susceptibility to pests and diseases. On the contrary, inadequate water supply can lead to drought stress and reduced crop yield. Therefore, both humidity and water sources should be balanced to ensure optimal crop growth and yield. The study also found a moderate correlation between different sources of water used for crop cultivation, including wells, canals, and other sources, and crop yield. This indicates that the type of water source used for crop cultivation can have an impact on crop yield.
The scatter diagram between water sources and crop yield (Figure 1A) also revealed a positive correlation, further supporting the importance of water as the main resource for crop cultivation. The use of different water sources, such as canals, wells, or tankers, showed a strong correlation with the demand for crops, indicating that the availability of water resources may influence the demand for crops. In terms of humidity, while there may not be a direct correlation between humidity and crop yield in the study, high humidity can increase the atmospheric water vapor content, leading to moisture stress and increased susceptibility to pests and diseases. Therefore, humidity levels should be monitored and balanced to ensure optimal crop growth and yield (Figure 4E).

Conclusions
This study has implications for policy formulation and farmer decision-making regarding crop cultivation and water resource management, enabling policymakers to shape sustainable agricultural policies and enabling farmers to increase yield, enhance profitability, and meet market demands. Temperature, humidity, average rainfall, and total water source are significant environmental determinants driving crop growth and development. Rainfall and temperature play a significant role in crop growth, with extreme weather conditions hampering yield and deviations from the optimal temperature range affecting crop yield and quality. The interplay of humidity and temperature is crucial for crop growth and development, with well and bore water dominating as the primary water sources for wheat cultivation. The results also showed that the type of water source used, such as wells, canals, or tankers, may influence crop productivity and demand. The correlation coefficients, which range from about 0.55 to 0.85, are consistent with these findings. Temperature, humidity, average rainfall, and total water source were found to be important predictors of crop yield in the study.

Figure 1. (A) Variation of yield with average rainfall, (B) variation of yield with area, (C) variation of yield with total water source, and (D) monthly coefficients of correlation.

Figure 2. Architecture of the optimal (A) ANN model and (B) ANFIS model.

Figure 3. Regression plot for the training, validation, and testing stages of the ANN model.

Figure 4. ANFIS input and output parametric variation obtained from the optimal model.

Table 1. Agricultural dataset
Table 1 summarizes the dataset obtained from various sources, including the four independent variables: temperature, humidity, rainfall, and irrigation sources (canal, well, or other means) for an average month from September to March.

Revista de Agricultura Neotropical, Cassilândia-MS, v. 10, n. 3, e7553, July/Sep., 2023.

Table 2. Coefficients of correlation

Table 3. Agricultural dataset used for ANN and ANFIS prediction of yield, with percentage error

Table 4. Evaluation metrics for ANN and ANFIS models
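The error measures used to compare the two models (MSE, MAE, RMSE, and the per-sample percentage error) follow their standard definitions and can be computed as in the Python sketch below. The actual and predicted values are hypothetical placeholders, not entries from Tables 3 or 4.

```python
import math


def mse(actual, predicted):
    """Mean square error."""
    return sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual)


def mae(actual, predicted):
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual, predicted):
    """Root mean square error."""
    return math.sqrt(mse(actual, predicted))


def percentage_error(actual, predicted):
    """Absolute error of each sample as a percentage of its actual value."""
    return [100.0 * abs(a - p) / abs(a) for a, p in zip(actual, predicted)]


# Hypothetical yields for illustration only.
actual = [100.0, 200.0]
predicted = [110.0, 190.0]
print(mse(actual, predicted), mae(actual, predicted), rmse(actual, predicted))
print(percentage_error(actual, predicted))
```

MSE and RMSE penalize large deviations more heavily than MAE, so reporting all three gives a fuller picture of how the prediction errors are distributed.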