
Predicting Soil Moisture Using Weather Data

by

Aryan Parashar (2000320120048)
Aryan Gupta (2000320120046)
Aryan Tyagi (2000320120050)
Anurag Bhardwaj (2000320120039)

Submitted to the Department of Computer Science
in partial fulfilment of the requirements
for the degree of
Bachelor of Technology in Computer Science

ABES Engineering College, Ghaziabad
Dr. A.P.J. Abdul Kalam Technical University, Uttar Pradesh
Lucknow, May 2024
DECLARATION

We hereby declare that, to the best of our knowledge and belief, this submission is entirely our
own original work, except where due credit has been given within the text. It contains no
material previously published or written by another person, nor material that has been
substantially accepted for the award of any other degree or certificate of this university or any
other institution of higher education.

Signature:
Name: Aryan Parashar
Roll number: 2000320120048
Date:

Signature:
Name: Aryan Gupta
Roll number: 2000320120046
Date:

Signature:
Name: Aryan Tyagi
Roll number: 2000320120050
Date:

Signature:
Name: Anurag Bhardwaj
Roll number: 2000320120039
Date:
CERTIFICATE

This is to certify that the project report "Predicting Soil Moisture Using Weather Data" by
Aryan Parashar, Aryan Gupta, Aryan Tyagi, and Anurag Bhardwaj is a record of their own work
carried out under my supervision, in partial fulfilment of the requirements for the award of a
B.Tech. degree in the Department of Computer Science at Dr. A.P.J. Abdul Kalam Technical
University. The content of this report is original and has not been submitted for consideration
for any other degree.

Date: 17-05-2024 (Supervisor Signature)


Name
Ms. Disha Mohini Pathak

Designation: Assistant Professor


Department of Computer Science
ABES Engineering College,
Ghaziabad.
ACKNOWLEDGEMENT

It gives us great pleasure to present the report of the B.Tech project completed during our
final year. We are especially grateful to Ms. Disha Mohini Pathak, Assistant Professor in the
Computer Science Department of ABES Engineering College, Ghaziabad, for her unwavering
support and guidance throughout the project. Her honesty, diligence, and tenacity have always
been an example for us, and our efforts have succeeded only because of her conscientious
guidance.

We also take this opportunity to thank Professor (Dr.) Pankaj Kumar Sharma, Head of the
Computer Science Department at ABES Engineering College, Ghaziabad, for his invaluable
support and help during the project's development.
Furthermore, we thank the department's faculty for their generous support and cooperation
during the development of the project. Last but not least, we thank our friends for helping to
see the project through to completion.

Signature:
Name: Aryan Parashar
Roll No. 2000320120048
Date:

Signature:
Name: Aryan Gupta
Roll No. 2000320120046
Date:

Signature:
Name: Aryan Tyagi
Roll No. 2000320120050
Date:

Signature:
Name: Anurag Bhardwaj
Roll No. 2000320120039
Date:
ABSTRACT

Agricultural research has the potential to ease food and water scarcity, two issues the world
is currently grappling with. Economically viable machine learning approaches that predict soil
moisture from meteorological data can help address this problem. While some farmers can
afford several moisture sensors and monitor them continuously, many lack the means to check
soil moisture levels over a long-term crop cycle. One solution is for a farmer to hire a
specialist to conduct a one-time sensor-based study of the property; a model trained on that
data could then forecast soil moisture levels from meteorological information alone.

Keeping the soil at the proper moisture level during the growing season can result in higher
yields and fewer crop-related problems overall. Water surplus or deficit has different, and
sometimes negligible, effects at different stages of growth. It is critical to understand how
land uses and stores water, since these factors vary greatly with the type of plants, the
terrain, and elevation changes. We address this problem with regression models trained under
several strategies; judged on prediction time, fit time, and r2 score, Random Forest emerges
as the top option.
TABLE OF CONTENTS

DECLARATION
CERTIFICATE
ACKNOWLEDGEMENTS
ABSTRACT
LIST OF TABLES
LIST OF FIGURES
LIST OF SYMBOLS
LIST OF ABBREVIATIONS
CHAPTER 1 (INTRODUCTION)
1.1 Motivation
1.2 Project Objective
1.3 Scope of the Project
1.4 Related Work
1.5 Organization of the Report
CHAPTER 2 (LITERATURE REVIEW)
CHAPTER 3 (SYSTEM DESIGN AND METHODOLOGY)
CHAPTER 4 (RESULTS AND DISCUSSION)
CHAPTER 5 (CONCLUSION)
APPENDIX A
APPENDIX B
REFERENCES
LIST OF TABLES

Table Description

Table 2 Literature Survey Paper


LIST OF FIGURES

Figure Description

Fig 2.1 Diagrams


LIST OF SYMBOLS

[x] Integer value of x.

≠ Not equal.

∈ Belongs to.

€ Euro, a currency.

_ Optical distance.

Optical thickness or optical half-thickness.
LIST OF ABBREVIATIONS

AAM Active Appearance Model

ICA Independent Component Analysis

ISC Increment Sign Correlation

PCA Principal Component Analysis

ROC Receiver Operating Characteristics


CHAPTER 1
INTRODUCTION

Maintaining the right amount of moisture in the soil during the plant growth season
can increase yields and reduce crop issues overall. Water excess or shortage has
varying, or even insignificant, impacts on different development stages.
Because it can vary widely based on the plants you utilize, the terrain, and the
elevation of the region, it is important to understand how your land consumes and
stores water. This kind of approach has been utilized by farmers for hundreds of
years. What counts is the precision we can get with real data. For the last few
centuries, farmers had to evaluate the moisture level of their soil mostly by touch
and experience. Even though many farmers were successful in the sense that they
were able to grow crops, there were still ways they might have improved the
productivity of their harvests.
Although there are other factors besides plant availability that affect yields, the
goal of this study is to develop a model that farmers can use to estimate soil
moisture levels without having to purchase and install costly sensors. There are
several possible uses for the developed model. The ability to track present soil
conditions is the primary usage, allowing for the potential correction of any
problems by making necessary adjustments.
Second, a farmer might assess past data, compare it to yields or other harvest
outcomes, and utilize this analytical knowledge to guide actions in the future. A
maize farmer, for instance, could just be concerned with the circumstances that
are anticipated to ensure that they fall within acceptable bounds. Together with
other data, a grape farmer in a wine vineyard may use this information to forecast
the wine's quality or even the blend of wine that would be best made from grapes
grown under these circumstances.
The goal of this research is to precisely observe how weather affects a specific
plot of land in the state of Washington. To get benchmarks, this process might be
carried out anywhere in the world. For a farmer without the resources to undertake
a comprehensive study of water consumption on their field, these benchmarks
may be an affordable alternative for training data. Alternatively, they may choose a
model with a comparable soil composition and/or topography, and then estimate
the soil moisture content using their own meteorological data.
This project's main objective is to provide the greatest tool at a price that will allow
for widespread use.
1.1 Motivation
Predicting soil moisture using machine learning and weather data is crucial for
optimizing agriculture, enhancing water resource management, and mitigating the
impacts of droughts. Accurate predictions aid in crop yield optimization through
precise irrigation, promoting resource efficiency and sustainable farming practices.
Early warning systems for droughts benefit communities, governments, and farmers,
enabling proactive measures. The information is valuable for reservoir management,
urban planning, and infrastructure development, reducing risks associated with soil
moisture variations. Integration into weather models improves forecasting accuracy,
while the insights contribute to climate change research. Soil moisture predictions
also play a role in insurance risk assessment for agriculture. Overall, this approach
not only addresses immediate challenges but fosters scientific understanding,
supporting innovation and informed decision-making in various sectors.

1.2 Project Objective

The project aims to develop a robust machine learning model for predicting soil
moisture levels based on weather data. By leveraging advanced algorithms, the
objective is to provide accurate and timely information to optimize agricultural
practices, enhance water resource management, and mitigate the impact of
droughts. The model will contribute to precision farming by guiding farmers in
optimal irrigation scheduling, ultimately improving crop yields and resource
efficiency. Additionally, the project seeks to facilitate early warning systems for
droughts, empowering communities, governments, and farmers to proactively
respond to water scarcity challenges. The research will also explore the integration
of soil moisture predictions into weather forecasting models, with the potential to
improve overall forecasting accuracy. The overarching goal is to create a practical
tool that addresses real-world challenges in agriculture, water management, and
environmental sustainability, fostering informed decision-making and contributing to
the advancement of scientific knowledge.

1.3 Scope of the Project

The project's scope encompasses the development and implementation of a machine
learning-based soil moisture prediction system using weather data. It involves data
collection, preprocessing, and the application of advanced algorithms to create a reliable
model.
The primary focus is on optimizing agricultural practices, enhancing water resource
management, and providing early drought warnings. The scope extends to
assessing the model's impact on precision farming, improving crop yields, and
promoting resource-efficient irrigation. Additionally, the project explores the potential
integration of soil moisture predictions into broader weather forecasting models to
enhance overall prediction accuracy. Beyond agriculture, the research aims to
contribute valuable insights to urban planning, infrastructure development, and
climate change studies. The project's outcomes are expected to offer practical
applications in diverse sectors, providing a holistic solution to soil moisture
prediction challenges and fostering sustainable practices through informed
decision-making.

1.4 Related Previous Work


Previous work in soil moisture prediction has predominantly focused on empirical
models and remote sensing technologies. Many studies have employed statistical
approaches, such as regression models, to correlate weather variables with soil
moisture levels. Remote sensing techniques, including satellite and sensor-based
observations, have been instrumental in providing spatially distributed data for
large-scale analysis. However, machine learning techniques offer a promising
avenue for improved accuracy and predictive capabilities. Recent efforts have
explored the use of algorithms like Random Forest, Support Vector Machines, and
Neural Networks in soil moisture prediction, showcasing their potential in handling
complex, non-linear relationships. While existing research has laid a foundation,
this project seeks to advance the field by integrating state-of-the-art machine
learning methodologies, enhancing prediction accuracy, and addressing practical
applications in agriculture, water resource management, and environmental
sustainability.

1.5 Organization of the Report


The report will follow a structured framework comprising several key sections. The
introduction will provide background information, the project's significance, and its
objectives. The literature review will delve into existing research on soil moisture
prediction, emphasizing previous methodologies, technologies, and their
limitations. The methodology section will detail data collection processes,
preprocessing steps, and the application of machine learning algorithms. Results
will present findings, including model performance metrics and validation
outcomes. A discussion section will interpret results, compare findings with
existing literature, and address any limitations. Practical implications and potential
applications in agriculture, water management, and beyond will be explored.

Chapter 2: Literature Survey and Software Requirement Specification


This chapter reviews existing literature on soil moisture prediction, summarizing
methodologies, algorithms, and technologies. It provides insights crucial for shaping the
project's methodology. Additionally, it outlines the software requirements, specifying
tools, programming languages, and frameworks needed for data processing and
machine learning implementation.

Chapter 3: System Design and Methodology


Detailing the project's architecture and methodology, this chapter serves as a blueprint
for development. It outlines the data flow, component interactions, and machine learning
approaches chosen. By defining the systematic approach, it provides clarity for
subsequent phases of the project.

Chapter 4: Implementation and Results


This chapter focuses on the practical aspects, detailing the implementation of the soil
moisture prediction system. It covers coding, model training, and the integration of
weather data. Results are thoroughly analyzed, and visualizations may be included to
validate the system's efficacy.

Chapter 5: Conclusion
Synthesizing the project, this chapter summarizes key findings, reflecting on limitations
and proposing avenues for future research. It underscores the project's significance in
agriculture and environmental sustainability, providing closure to the report while
emphasizing real-world applications.
CHAPTER 2

LITERATURE SURVEY

India, recognized for its agricultural legacy, saw around 73 million hectares fitted with irrigation
infrastructure in fiscal year 2022-23, accounting for 52% of the total 141 million hectares of gross planted
land [10]. The nation's available utilizable water resources are limited to an annual capacity of 1122 BCM
(Billion Cubic Meters), which includes 690 BCM from surface water and 432 BCM from groundwater. The
utilized water potential for this allotment is roughly 699 BCM, with 450 BCM generated from surface water
and 249 BCM from groundwater. Notably, the agriculture sector is estimated to account for
85-90% of the country's overall water demand.
The work in [2] focuses on predicting soil moisture using specific implementations of RNN
(Recurrent Neural Network), LSTM (Long Short-Term Memory), and volumetric soil-moisture
clustering driven by weather data [5]. The data set was collected from 28 dissimilar locations
near Siberia, covering four distinct types of soil. It was collected over 2017-2018 and tested
for about eight months, from September 2019 to April 2020. The results were about 84 percent
accurate, though accuracy degraded when the number of days was reduced and the soil depth was
increased.
Gap: Shorter-time-frame prediction produced an unsatisfactory accuracy rate, indicating that a
larger data set should be studied.
The research in [3] focuses on soil moisture prediction for irrigation automation and forecasting utilising
time-series modelling. The models utilised were Lasso, Decision Tree, Random Forest, and Support Vector
Machine. The investigation was conducted on a cotton farm, employing wireless soil moisture monitoring
equipment set in five plots. Temperature, depth, air humidity, and wind speed were some of the factors used.
Gap: Only a single data set with a small number of rows is used, and no comparative analysis
is provided in the results section to select the best of the models.
The study in [4] focuses on predicting soil moisture. The model utilised was Deep Neural
Network Regression (DNNR). The research region included 25 weather stations that collected
data on 32 factors such as air temperature, rainfall, soil moisture, and wind direction, with
15 placed in 8 North Dakota counties and 10 in seven Minnesota counties. The study included
various sorts of crops, including wheat, dry beans, canola, oats, and barley.
Gap: Predicts soil moisture only at a depth of 20 cm.
The research in [5] focuses on forecasting soil moisture. The models used were Random Forest (RF),
Extremely Randomised Trees (ET), and Gradient Boosting Machines (GBM), with temperature, wind speed,
and air humidity as variables.
Gap: Limited parameters; not suitable for varied depths.
In this model, we are going to fill the following gaps discovered in several research papers.

1. Comparative analysis between all the regression techniques.


2. Combining multiple datasets for better results.
3. Result values such as prediction time, fit time and r2 score at various depths.
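Gap 3, reporting metrics per depth, reduces to looping one model over depth-specific target columns. A minimal sketch with synthetic stand-ins for the depth columns (the real sensor data is not reproduced here):

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import r2_score

# One synthetic regression problem per depth stands in for the
# depth-specific sensor columns used in the report.
depths = ['30cm', '60cm', '90cm', '120cm', '150cm']
r2_by_depth = {}
for i, depth in enumerate(depths):
    X, y = make_regression(n_samples=300, n_features=6, noise=0.2, random_state=i)
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=i)
    model = RandomForestRegressor(n_estimators=50, random_state=0)
    model.fit(X_train, y_train)
    r2_by_depth[depth] = r2_score(y_test, model.predict(X_test))

for depth, r2 in r2_by_depth.items():
    print(f"{depth}: r2 = {r2:.3f}")
```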
Related work

Table 2: Literature Survey

Authors: Sagarika Paul, et al. [2]
Methodology: SVM, Linear Regression, Naive Bayes
Attributes selected: temperature, humidity, moisture
Gaps: the study covers fewer types of crops

Authors: Umesh Acharya, et al. [3]
Methodology: random forest regression (including advanced variants such as boosted regression
trees), support vector regression, multiple regression, artificial neural network
Attributes selected: air temperature, rainfall, soil moisture, wind direction
Gaps: data sets are not big enough

Authors: Ramendra Prasad, et al. [4]
Methodology: Lasso, Decision Tree, Random Forest, Support Vector Machine
Attributes selected: rainfall, soil moisture, humidity
Gaps: gaps related to the usage of surface reflectance and land-use data

Authors: Cai Y, Zheng W, et al. [5]
Methodology: Deep Neural Network Regression (DNNR)
Attributes selected: air temperature, air humidity, atmospheric pressure, soil moisture, daily
precipitation, illumination duration
Gaps: predicts soil moisture only at a depth of 20 cm

Authors: Jang, Young-bin, et al. [6]
Methodology: Support Vector Machines (SVM), Random Forest (RF), Extremely Randomized Trees
(ET), Gradient Boosting Machines (GBM), Deep Feedforward Network (DFN)
Attributes selected: air temperature, rainfall, wind direction
Gaps: limited parameters; not suitable for variable depths

CHAPTER 3

SYSTEM DESIGN AND METHODOLOGY

1. System Design

1.1. System Architecture

1.2. DFD, Class Diagram, flow charts, ER Diagrams

2. Methodology
1. Linear Regression

2. Ridge Regression

3. Lasso Regression

4. Decision Tree Regression

5. Random Forest Regression

6. Gradient Boosting Regression

7. Support Vector Regression (SVR)

8. K-Nearest Neighbors Regression (KNN)

9. ElasticNet Regression

10. Bayesian Ridge Regression
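The ten regression techniques listed above can be benchmarked on the three metrics the report uses (fit time, prediction time, r2 score). A minimal sketch on synthetic data with default hyperparameters (the real weather features and soil-moisture targets are not reproduced here):

```python
import time
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.linear_model import (LinearRegression, Ridge, Lasso,
                                  ElasticNet, BayesianRidge)
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor, GradientBoostingRegressor
from sklearn.svm import SVR
from sklearn.neighbors import KNeighborsRegressor
from sklearn.metrics import r2_score

# Synthetic stand-in for the weather / soil-moisture feature matrix.
X, y = make_regression(n_samples=500, n_features=8, noise=0.1, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

models = {
    'Linear': LinearRegression(),
    'Ridge': Ridge(),
    'Lasso': Lasso(),
    'DecisionTree': DecisionTreeRegressor(random_state=0),
    'RandomForest': RandomForestRegressor(n_estimators=50, random_state=0),
    'GradientBoosting': GradientBoostingRegressor(random_state=0),
    'SVR': SVR(),
    'KNN': KNeighborsRegressor(),
    'ElasticNet': ElasticNet(),
    'BayesianRidge': BayesianRidge(),
}

# Record the same three metrics the report compares.
results = {}
for name, model in models.items():
    t0 = time.time()
    model.fit(X_train, y_train)
    t1 = time.time()
    preds = model.predict(X_test)
    t2 = time.time()
    results[name] = {'fit_time': t1 - t0, 'pred_time': t2 - t1,
                     'r2': r2_score(y_test, preds)}

for name, r in results.items():
    print(f"{name:16s} fit={r['fit_time']:.3f}s "
          f"pred={r['pred_time']:.4f}s r2={r['r2']:.3f}")
```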



1. Architecture diagrams

Figure 2: Tier Architecture Diagram


2. Data Flow Diagram

Figure 3

3. Class Diagram

Figure 4
4. Database schema diagrams

Figure 5
CHAPTER 4

RESULTS AND DISCUSSION

Figure 6
Figure 7
Figure 8
Figure 9
Figure 10
Figure 11

Hybrid KNN-Decision Tree and Hybrid Decision Tree-Ridge models have a longer training time.
This could imply that they need more time to study the initial data and understand the underlying
patterns. However, Hybrid Lasso-Ridge and Hybrid KNN-Lasso models can be trained much faster,
approximately between 1.5 to 2 seconds. It seems like they can learn quickly due to getting rid of
features that are unimportant or reducing complexity in general.
On the other hand, both Hybrid Decision Tree-Ridge and Hybrid Lasso-Ridge also produce
predictions in 0.2 seconds or less, unlike the Hybrid KNN-Decision Tree and Hybrid KNN-Lasso
predictions, which require far more time, around 24 seconds. This is likely because, before
making a prediction, KNN must compare each new data point to all of the data it was trained
on. The r2 score measures how well the model's predictions correspond with the actual values;
here, Hybrid Decision Tree-Ridge performs excellently, with the highest score indicating a
strong fit and highly reliable predictions.
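The construction of the hybrid models is not shown in the report; one plausible reading is a simple averaging ensemble of the two base regressors. A hypothetical sketch using scikit-learn's VotingRegressor on synthetic data:

```python
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.ensemble import VotingRegressor
from sklearn.neighbors import KNeighborsRegressor
from sklearn.tree import DecisionTreeRegressor
from sklearn.linear_model import Ridge, Lasso
from sklearn.metrics import r2_score

X, y = make_regression(n_samples=400, n_features=6, noise=0.1, random_state=1)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=1)

# Each "hybrid" here averages the predictions of its two base regressors.
hybrids = {
    'KNN + DecisionTree': VotingRegressor([
        ('knn', KNeighborsRegressor()),
        ('dt', DecisionTreeRegressor(random_state=1))]),
    'DecisionTree + Ridge': VotingRegressor([
        ('dt', DecisionTreeRegressor(random_state=1)),
        ('ridge', Ridge())]),
    'Lasso + Ridge': VotingRegressor([
        ('lasso', Lasso()),
        ('ridge', Ridge())]),
}

scores = {}
for name, model in hybrids.items():
    model.fit(X_train, y_train)
    scores[name] = r2_score(y_test, model.predict(X_test))
    print(f"{name}: r2 = {scores[name]:.3f}")
```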
Figure 12

Random Forest demonstrates remarkable efficiency, showcasing a reduction in processing time as


the dataset expands. Its scalability is evident, making it adept at handling larger datasets efficiently.
Conversely, Linear Regression and SVM experience an uptick in computation time with growing
dataset sizes, potentially due to sequential processing constraints. Ridge CV, while generally
efficient, displays performance variations, indicating sensitivity to dataset size. These observations
underscore the nuanced interplay between algorithmic efficiency and dataset characteristics. When
selecting models, it becomes crucial to consider scalability and processing capabilities, tailoring
choices to the specific demands of the dataset's size and complexity. This nuanced understanding
aids in optimizing model performance across a diverse array of scenarios, ensuring effective and
efficient utilization of machine learning algorithms.
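The scaling behaviour described above can be measured directly by timing each model's fit over growing data sets. A minimal sketch on synthetic data (the model settings and sizes are illustrative, not the report's actual experiment):

```python
import time
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression, RidgeCV
from sklearn.svm import SVR

# Fit each model at increasing dataset sizes and record wall-clock fit time.
sizes = [200, 400, 800]
timings = {name: [] for name in ['RandomForest', 'Linear', 'SVM', 'RidgeCV']}
for n in sizes:
    X, y = make_regression(n_samples=n, n_features=10, noise=0.1, random_state=0)
    models = {
        'RandomForest': RandomForestRegressor(n_estimators=20, random_state=0),
        'Linear': LinearRegression(),
        'SVM': SVR(),
        'RidgeCV': RidgeCV(alphas=[0.01, 0.1, 1, 10]),
    }
    for name, model in models.items():
        t0 = time.time()
        model.fit(X, y)
        timings[name].append(time.time() - t0)

for name, ts in timings.items():
    print(name, ['%.3fs' % t for t in ts])
```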
CHAPTER 5

CONCLUSION

1. Performance Evaluation

The outcome of all experimentation was a procedure in which two data sets could be joined and
fed into a model to forecast soil moisture with great accuracy: the r2 score lies between
0.977 and 0.991, depending on depth, using a Random Forest Regressor with default settings.
This could be a repeatable process in which a farmer contracts a company to collect training
data on their land for a single growing season. Since permanently installed sensor networks
can be cumbersome and expensive for a farmer to maintain, this alternative is cheaper and
still gives nearly the same outcomes as sensors that run constantly. Alternatively, this
process could be a sub-process in a larger software suite that farmers could use for
forecasting, or to build a season's soil moisture record for post-season analysis of their
crop. As long as large-scale AI programs remain expensive and cumbersome for farmers,
adoption will stay low. This project has shown that large-scale soil moisture prediction
software can be built with relatively low computational costs.
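The join-then-train procedure can be sketched as follows. The column names, the merge key, and the synthetic data are hypothetical; only the overall flow (merge two data sets on a shared date, train a default RandomForestRegressor, report r2) mirrors the report:

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import train_test_split
from sklearn.metrics import r2_score

rng = np.random.default_rng(0)
n = 300

# Hypothetical weather data set (column names are illustrative).
weather = pd.DataFrame({
    'date': pd.date_range('2021-01-01', periods=n, freq='D'),
    'air_temp': rng.normal(15, 8, n),
    'humidity': rng.uniform(20, 95, n),
    'wind_speed': rng.uniform(0, 12, n),
})
# Hypothetical soil data set: moisture depends on the weather plus noise,
# so the model has signal to learn.
soil = pd.DataFrame({
    'date': weather['date'],
    'moisture_30cm': (40 - 0.8 * weather['air_temp']
                      + 0.1 * weather['humidity'] + rng.normal(0, 1, n)),
})

# Join the two data sets on the shared date column.
merged = weather.merge(soil, on='date')
X = merged[['air_temp', 'humidity', 'wind_speed']]
y = merged['moisture_30cm']
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = RandomForestRegressor(random_state=0)  # default settings, as in the report
model.fit(X_train, y_train)
print('r2:', r2_score(y_test, model.predict(X_test)))
```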

2. Comparison with existing State-of-the-Art Technologies


A comparison with existing state-of-the-art technologies in soil moisture prediction reveals the
project's innovative contributions. While traditional methods often rely on empirical models and
remote sensing, this project leverages advanced machine learning algorithms to enhance accuracy
and adaptability. State-of-the-art technologies may incorporate complex models like ensemble
methods or deep learning, yet this project aims to balance performance and interpretability.
Additionally, the integration of soil moisture predictions into weather forecasting models
represents a forward-looking approach, aligning with the evolving landscape of environmental
data science. The project's focus on practical applications in agriculture and water management
further distinguishes it in the context of existing technologies.
3. Future Directions
Future directions for the soil moisture prediction project involve several potential
enhancements and expansions. Firstly, the integration of real-time data feeds could enable
more dynamic and responsive predictions. Exploring additional machine learning models,
including state-of-the-art deep learning architectures, could further improve accuracy.
Collaboration with meteorological agencies for access to comprehensive weather data may
enhance the model's predictive capabilities. The scalability of the system for broader
geographical coverage and adaptation to diverse ecosystems is another avenue for
development. Moreover, incorporating feedback loops for continuous model improvement
and addressing uncertainties in predictions through probabilistic models are areas
warranting exploration. Finally, expanding the project's applications to include soil health
assessment and irrigation system optimization could provide holistic solutions for
sustainable agriculture.
Code for the Project

The code below is reconstructed from the project's Jupyter notebook. Earlier cells (not
included in this excerpt) define the preprocessor, the train/test splits (X_train_set,
y_train_set, X_test_set, y_test_set), and the imports (Pipeline, Lasso, RidgeCV,
RandomForestRegressor, r2_score, pandas, time, datetime).

Results are better! Let's try Lasso:

pipe_with_estimator = Pipeline(steps=[('preprocessor', preprocessor),
                                      ('classifier', Lasso(alpha=1))])

data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']
try:
    log
except NameError:
    log = pd.DataFrame(columns=['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
                                'r2_score', 'datetime'])

for cols in data_cols:
    t0 = time.time()
    pipe_with_estimator.fit(X_train_set[cols], y_train_set[cols])
    t1 = time.time()
    preds = pipe_with_estimator.predict(X_test_set[cols])
    t2 = time.time()
    r2sc = r2_score(y_test_set[cols], preds)
    now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    log.loc[len(log)] = ['Lasso Reg - Alpha = 1', cols, t1 - t0, t2 - t1, r2sc, now]

print(log)

Output:

               Experiment  Depth   Fit_Time  Pred_Time      r2_score
0        First Linear Reg   30cm  32.042364  14.371771  9.154623e-01
1        First Linear Reg   60cm   2.874317   0.135288 -1.662894e+14
2        First Linear Reg   90cm   2.858727   0.156215  9.487954e-01
3        First Linear Reg  120cm   2.874285   0.171831  9.460321e-01
4        First Linear Reg  150cm   2.952462   0.125001  9.433287e-01
5   Ridge Reg - Alpha = 1   30cm  28.188629   0.140590  9.162112e-01
6   Ridge Reg - Alpha = 1   60cm   1.312190   0.218699  9.427566e-01
7   Ridge Reg - Alpha = 1   90cm   1.202838   0.156215  9.487904e-01
8   Ridge Reg - Alpha = 1  120cm   1.140353   0.140597  9.460320e-01
9   Ridge Reg - Alpha = 1  150cm   1.140351   0.140591  9.433202e-01
10  Lasso Reg - Alpha = 1   30cm  15.871741   0.156247 -1.832157e-04
11  Lasso Reg - Alpha = 1   60cm   1.327864   0.140546 -4.613909e-05
12  Lasso Reg - Alpha = 1   90cm   1.359043   0.140626 -5.673799e-06
13  Lasso Reg - Alpha = 1  120cm   1.373374   0.140591 -1.131381e-06
14  Lasso Reg - Alpha = 1  150cm   1.348597   0.140538 -1.814059e-04
(datetime column, 2021-12-22 15:17:55 through 15:19:07, omitted here)
At least with these parameters, Lasso fits poorly.

Ridge with a built-in grid-search cross-validation:
pipe_with_estimator = Pipeline(steps=[('preprocessor', preprocessor),
                                      ('classifier', RidgeCV(alphas=[0.001, 0.01, 0.1, 1,
                                                                     10, 100, 1000]))])

data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']
try:
    log
except NameError:
    log = pd.DataFrame(columns=['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
                                'r2_score', 'datetime'])

for cols in data_cols:
    t0 = time.time()
    pipe_with_estimator.fit(X_train_set[cols], y_train_set[cols])
    t1 = time.time()
    preds = pipe_with_estimator.predict(X_test_set[cols])
    t2 = time.time()
    r2sc = r2_score(y_test_set[cols], preds)
    now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')
    log.loc[len(log)] = ['Ridge Reg - GSCV', cols, t1 - t0, t2 - t1, r2sc, now]

print(log)

Output (rows 0-14 are unchanged from the previous table):

          Experiment  Depth   Fit_Time  Pred_Time      r2_score
15  Ridge Reg - GSCV   30cm  15.183893   0.187457  9.162351e-01
16  Ridge Reg - GSCV   60cm   5.366714   0.156221  9.427570e-01
17  Ridge Reg - GSCV   90cm   5.436197   0.140594  9.487957e-01
18  Ridge Reg - GSCV  120cm   5.483074   0.171830  9.460322e-01
19  Ridge Reg - GSCV  150cm   5.561178   0.203086  9.433280e-01
Grid search found alpha = 1 to be the best parameter.

Other Regressor Tests

Right now Ridge Regression with an alpha of 1 is winning as the best model so far. Let's see
if we can beat it.
{
"cell_type": "code",
"execution_count":
20, "metadata": {},
"outputs": [
{
"name": "stdout",
"output_type":
"stream", "text": [
" Experiment Depth Fit_Time Pred_Time r2_score \\\n",

"0 Random Forest - Default 30cm 688.237841 0.860152 0.980310 \n",


"1 Random Forest - Default 60cm 680.878078 0.687300 0.990726 \n",
"2 Random Forest - Default 90cm 689.800418 0.734206 0.992370 \n",
"3 Random Forest - Default 120cm 722.545116 0.718582 0.992590 \n",
"4 Random Forest - Default 150cm 733.399503 0.725267 0.993203 \n",
"\n",
}
],
"source": [
"pipe_with_estimator = Pipeline(steps=[('preprocessor',
preprocessor),\n", " ('classifier', RandomForestRegressor())])\n",
"\n",
"data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']\n",
"try:\n",
" log_other\n",
"except
NameError:\n",
" log_other = pd.DataFrame(columns = ['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
'r2_score', 'datetime'])\n",
"for cols in
data_cols:\n", " t0 =
time.time()\n",
" pipe_with_estimator.fit(X_train_set[cols],
y_train_set[cols])\n", " t1 = time.time()\n",
" preds =
pipe_with_estimator.predict(X_test_set[cols])\n", " t2 =
time.time()\n",
" r2sc = r2_score(y_test_set[cols], preds)\n",
" now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')\n",
" log_other.loc[len(log_other)] = ['Random Forest - Default', cols, t1-t0, t2-t1, r2sc,
now]\n", " \n",
"print(log_other)"
]
},
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"Amazing results! Although it takes considerably longer to train, the default does rather well"
]
},
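{
"cell_type": "markdown",
"metadata": {},
"source": [
"An optional follow-up (a sketch, assuming the fitted Random Forest pipeline from the last depth is still in scope; the variable names are illustrative) is to inspect which features the forest leans on via its impurity-based importances:"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical inspection: top impurity-based feature importances of the fitted forest\n",
"import numpy as np\n",
"rf = pipe_with_estimator.named_steps['classifier']\n",
"top = np.argsort(rf.feature_importances_)[::-1][:10]\n",
"print(rf.feature_importances_[top])"
]
},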
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"As a litmus test, lets just try a few more models."
]
},
{
"cell_type": "code",
"execution_count":
21, "metadata": {},
"outputs": [
{
"name": "stdout",
"output_type":
"stream", "text": [
" Experiment Depth Fit_Time Pred_Time r2_score \\\n",

"0 Random Forest - Default 30cm 688.237841 0.860152 0.980310 \n",


"1 Random Forest - Default 60cm 680.878078 0.687300 0.990726 \n",
"2 Random Forest - Default 90cm 689.800418 0.734206 0.992370 \n",
"3 Random Forest - Default 120cm 722.545116 0.718582 0.992590 \n",
"4 Random Forest - Default 150cm 733.399503 0.725267 0.993203 \n",
"5 SVM - Default 30cm 63.639315 7.706923 0.658935 \n",
"6 SVM - Default 60cm 148.874223 10.001588 0.753807 \n",
"7 SVM - Default 90cm 150.792928 10.414850 0.775367 \n",
"8 SVM - Default 120cm 127.845249 9.556673 0.746775 \n",
"9 SVM - Default 150cm 158.235881 11.079853 0.747956 \n",
"\n",
}
],
"source": [
"pipe_with_estimator = Pipeline(steps=[('preprocessor',
preprocessor),\n", " ('classifier', SVR())])\n",
"\n",
"data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']\n",
"try:\n",
" log_other\n",
"except
NameError:\n",
" log_other = pd.DataFrame(columns = ['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
'r2_score', 'datetime'])\n",
"for cols in
data_cols:\n", " t0 =
time.time()\n",
" pipe_with_estimator.fit(X_train_set[cols],
y_train_set[cols])\n", " t1 = time.time()\n",
" preds =
pipe_with_estimator.predict(X_test_set[cols])\n", " t2 =
time.time()\n",
" r2sc = r2_score(y_test_set[cols], preds)\n",
" now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')\n",
" log_other.loc[len(log_other)] = ['SVM - Default', cols, t1-t0, t2-t1, r2sc,
now]\n", " \n",
"print(log_other)"
]
},
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"Just with the default values, SVM, did not perform well, but this could just mean that default
parameters are not good"
]
},
{
"cell_type": "code",
"execution_count":
22, "metadata": {},
"outputs": [
{
"name": "stdout",
"output_type":
"stream", "text": [
" Experiment Depth Fit_Time Pred_Time r2_score \\\n",

"0 Random Forest - Default 30cm 688.237841 0.860152 0.980310 \n",


"1 Random Forest - Default 60cm 680.878078 0.687300 0.990726 \n",
"2 Random Forest - Default 90cm 689.800418 0.734206 0.992370 \n",
"3 Random Forest - Default 120cm 722.545116 0.718582 0.992590 \n",
"4 Random Forest - Default 150cm 733.399503 0.725267 0.993203 \n",
"5 SVM - Default 30cm 63.639315 7.706923 0.658935 \n",
"6 SVM - Default 60cm 148.874223 10.001588 0.753807 \n",
"7 SVM - Default 90cm 150.792928 10.414850 0.775367 \n",
"8 SVM - Default 120cm 127.845249 9.556673 0.746775 \n",
"9 SVM - Default 150cm 158.235881 11.079853 0.747956 \n",
"10 SGD - Default 30cm 6.263381 0.558115 0.889629 \n",
"11 SGD - Default 60cm 1.503087 0.148150 0.930496 \n",
"12 SGD - Default 90cm 1.475703 0.139627 0.941424 \n",
"13 SGD - Default 120cm 1.440918 0.169494 0.935683 \n",
"14 SGD - Default 150cm 29.228487 0.136069 0.928644 \n",
"\n",
}
],
"source": [
"pipe_with_estimator = Pipeline(steps=[('preprocessor',
preprocessor),\n", " ('classifier', SGDRegressor())])\n",
"\n",
"data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']\n",
"try:\n",
" log_other\n",
"except
NameError:\n",
" log_other = pd.DataFrame(columns = ['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
'r2_score', 'datetime'])\n",
"for cols in
data_cols:\n", " t0 =
time.time()\n",
" pipe_with_estimator.fit(X_train_set[cols],
y_train_set[cols])\n", " t1 = time.time()\n",
" preds =
pipe_with_estimator.predict(X_test_set[cols])\n", " t2 =
time.time()\n",
" r2sc = r2_score(y_test_set[cols], preds)\n",
" now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')\n",
" log_other.loc[len(log_other)] = ['SGD - Default', cols, t1-t0, t2-t1, r2sc, now]\n",
" \n",
"print(log_other)
"
]
},
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"## Hyper Parameter Tuning Random Forest"
]
},
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"The following, will take a considerable amount of time to run. Run with caution!!"
]
},
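{
"cell_type": "markdown",
"metadata": {},
"source": [
"Before committing to the full search, a much smaller dry run at a single depth can gauge the runtime (a sketch; the tiny grid, single depth, and variable names here are illustrative choices, not the settings used below):"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"# Hypothetical smoke test: 2 candidates x 3 folds at one depth only\n",
"rf_pipe = Pipeline(steps=[('preprocessor', preprocessor),\n",
"                          ('classifier', RandomForestRegressor())])\n",
"quick = RandomizedSearchCV(rf_pipe, {'classifier__n_estimators': [100, 200]},\n",
"                           n_iter=2, cv=3, random_state=42, n_jobs=-1)\n",
"quick.fit(X_train_set['30cm'], y_train_set['30cm'])\n",
"print(quick.best_params_)"
]
},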
{
"cell_type":
"markdown",
"metadata": {},
"source": [
"This experiment is not included in the final report, but shows an extension of trying to get better
results."
]
},
{
"cell_type": "code",
"execution_count":
null, "metadata": {},
"outputs": [
{
"name": "stdout",
"output_type":
"stream", "text": [
"Fitting 3 folds for each of 10 candidates, totalling 30 fits\n"
]
},
{
"name": "stderr",
"output_type":
"stream", "text": [
"[Parallel(n_jobs=-1)]: Using backend LokyBackend with 8 concurrent
workers.\n", "[Parallel(n_jobs=-1)]: Done 2 tasks | elapsed: 7.9min\n"
]
}
],
"source": [
"## Param grid comes from the following site:\n",
"##
https://towardsdatascience.com/hyperparameter-tuning-the-random-forest-in-python-using-scikit-learn
-28d2aa77dd74\n",
"\n",
"pipe_with_estimator = Pipeline(steps=[('preprocessor',
preprocessor),\n", " ('classifier', RandomForestRegressor())])\n",
"\n",
"param_grid = {'classifier bootstrap': [True, False],\n",
" 'classifier max_depth': [10, 20, 30, 40, 50, 60, 70, 80, 90, 100, None],\n",
" 'classifier max_features': ['auto', 'sqrt'],\n",
" 'classifier min_samples_leaf': [1, 2, 4],\n",
" 'classifier min_samples_split': [2, 5, 10],\n",
" 'classifier n_estimators': [200, 400, 600, 800, 1000, 1200, 1400, 1600, 1800, 2000]}\n",
"\n",
"data_cols = ['30cm', '60cm', '90cm', '120cm', '150cm']\n",
"cv_res = {}\n",
"try:\n",
" log_rf\n",
"except NameError:\n",
" log_rf = pd.DataFrame(columns = ['Experiment', 'Depth', 'Fit_Time', 'Pred_Time',
'r2_score', 'best_params' 'datetime'])\n",
"for cols in
data_cols:\n", " t0 =
time.time()\n",
" random_search = RandomizedSearchCV(estimator = pipe_with_estimator, param_distributions
= param_grid, n_iter = 10, cv = 3, verbose=10, random_state=42, n_jobs =
-1)\n", " random_search.fit(X_train_set[cols], y_train_set[cols])\n",
" best =
random_search.best_params_\n", "
t1 = time.time()\n",
" preds =
random_search.predict(X_test_set[cols])\n", " t2 =
time.time()\n",
" r2sc = r2_score(y_test_set[cols], preds)\n",
" now = datetime.now().strftime('%Y-%m-%d %H:%M:%S')\n",
" log_rf.loc[len(log_rf)] = ['RF - random search', cols, t1-t0, t2-t1, r2sc, best,
now]\n", " cv_res[cols] = random_search.cv_results_\n",
"
print(log_rf)\n", "
\n", "print(log_rf)"
]
}
],
"metadata": {
"kernelspec":
{
"display_name": "Python
3", "language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode":
{ "name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter":
"python", "pygments_lexer":
"ipython3", "version": "3.8.3"
}
},
"nbformat": 4,
"nbformat_minor": 4
}