Multi-Linear Regression
Prof. D.S. Zingade, Omkar Buchade, Nilesh Mehta, Shubham Ghodekar, Chandan Mehta
deeplakshmisach@gmail.com, omkar543210@gmail.com, mehtanilesh13@gmail.com, smghodekar19@gmail.com,
chandanmehtahelpus@gmail.com
B.E. Computer Dept., AISSMS IOIT, Kennedy Road, Pune
ABSTRACT
India being an agricultural country, its economy predominantly depends on agricultural yield growth and allied agro-industry products. In India, agriculture is largely influenced by rainwater, which is highly unpredictable. Agricultural growth also depends on diverse soil parameters, namely Nitrogen, Phosphorus, Potassium, crop rotation, soil moisture, pH and surface temperature, and on weather aspects like temperature and rainfall. India is now rapidly progressing towards technical development. Thus, technology will prove beneficial to agriculture, increasing crop productivity and resulting in better yields for the farmer. The proposed project provides a solution for Smart Agriculture by monitoring the agricultural field, which can assist farmers in increasing productivity to a great extent. Weather forecast data obtained from IMD (India Meteorological Department), such as temperature and rainfall, and a soil parameters repository give insight into which crops are suitable to be cultivated in a particular area. This work presents a system, in the form of an Android-based application and a website, which uses Machine Learning techniques to predict the most profitable crop in the current weather and soil conditions. The proposed system integrates the data obtained from the repository and the weather department and, by applying a machine learning algorithm, Multiple Linear Regression, predicts the most suitable crops according to current environmental conditions. This provides the farmer with a variety of options of crops that can be cultivated. Thus, the project develops a system by integrating data from various sources, data analytics and prediction analysis, which can improve crop yield productivity and increase the profit margins of farmers, helping them over the long run.

Keywords
Data Analytics, Prediction, Machine learning, Multiple linear regression.

1. INTRODUCTION
Agriculture is one of the most important occupations practiced in our country. It is the broadest economic sector and plays an important role in the overall development of the country. About 60 % of the land in the country is used for agriculture in order to meet the needs of 1.2 billion people. Thus, modernization of agriculture is very important and will lead the farmers of our country towards profit. [1] Data analytics (DA) is the process of examining data sets in order to draw conclusions about the information they contain, increasingly with the aid of specialized systems and software. [2] Earlier, yield prediction was performed by considering the farmer's experience with a particular field and crop. However, as conditions change very rapidly from day to day, farmers are forced to cultivate more and more crops. In this situation, many of them do not have enough knowledge about the new crops and are not completely aware of the benefits they get while farming them. Also, farm productivity can be increased by understanding and forecasting crop performance in a variety of environmental conditions. Thus, the proposed system takes the location of the user as an input. From the location, the nutrients of the soil, such as Nitrogen, Phosphorus and Potassium, and the forecasted weather are obtained. The proposed system applies Machine Learning and a prediction algorithm, Multiple Linear Regression, to identify the pattern among data and then process it as per input conditions. This in turn will propose the best feasible crops according to the given environmental conditions. As past year production is also taken into account, the prediction will be more accurate. Thus, this system will suggest profitable crops, providing a choice directly to the farmer.

2. SYSTEM DESCRIPTION
No existing system recommends crops based on multiple factors such as Nitrogen, Phosphorus and Potassium nutrients in soil, pH and weather components, which include temperature and rainfall. The proposed system suggests an Android and a web-based application which can precisely predict the most profitable crop for the farmer. The user location is identified with the help of GPS. According to the user location, the feasible crops in the respective location are identified from the soil, pH and weather database. These crops are compared with the past year production database to identify the most profitable crop in the current location. After this processing is done at the server side, the result is sent to the user's Android and web application. The previous production of the crops is also taken into account, which in turn leads to a precise crop proposition. Location is the only input for the extrapolation system. Depending on the numerous scenarios and additional filters according to the user requirement, the most producible crop is suggested.

Fig 1: System Architecture
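Fig. 5: Point plot of linear regression

5. MULTI-LINEAR REGRESSION
A Linear Regression model that contains more than one predictor variable is called a Multiple Linear Regression model. With two predictor variables, the model is:
$$Y = \beta_0 + \beta_1 x_1 + \beta_2 x_2 + \epsilon$$
Where,
$\beta_0, \beta_1, \ldots$ are coefficients of Multiple Linear Regression
$x_1, x_2, \ldots$ are independent variables.
The model is linear because it is linear in the parameters $\beta_0$, $\beta_1$ and $\beta_2$. The model describes a plane in the three-dimensional space of $Y$, $x_1$ and $x_2$. The parameter $\beta_0$ is the intercept of this plane. Parameters $\beta_1$ and $\beta_2$ are referred to as partial regression coefficients. Parameter $\beta_1$ represents the change in the mean response corresponding to a unit change in $x_1$ when $x_2$ is held constant. Parameter $\beta_2$ represents the change in the mean response corresponding to a unit change in $x_2$ when $x_1$ is held constant.
2. Regression Coefficients:
To obtain the regression model, $\beta$ should be known. $\beta$ can be estimated by the method of least squares. The equation for it is:
$$\hat{\beta} = (X^{T} X)^{-1} X^{T} y$$
where $X$ is the matrix of observed predictor values with a leading column of ones and $y$ is the vector of observed responses. Thus,
$$\hat{y} = X \hat{\beta}$$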
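As a minimal sketch of how the regression coefficients can be estimated by least squares, the following dependency-free Python solves the normal equations $(X^{T} X)\beta = X^{T} y$ by Gauss-Jordan elimination. The function names (`solve_linear_system`, `fit_mlr`, `predict_mlr`) and the toy data are assumptions made for illustration; this is not the authors' implementation, and a production system would more likely rely on a numerical library.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    # Build the augmented matrix [a | b] so row operations update both sides.
    m = [list(map(float, a[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        # Pick the row with the largest entry in this column as the pivot.
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("X^T X is singular; predictors are collinear")
        m[col], m[pivot] = m[pivot], m[col]
        # Eliminate this column from every other row.
        for row in range(n):
            if row != col:
                factor = m[row][col] / m[col][col]
                m[row] = [rv - factor * cv for rv, cv in zip(m[row], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def fit_mlr(rows, y):
    """Return [b0, b1, ..., bk] from the normal equations (X^T X) b = X^T y."""
    # A leading 1.0 in every row makes the first coefficient the intercept.
    design = [[1.0] + [float(v) for v in row] for row in rows]
    p = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(p)] for i in range(p)]
    xty = [sum(r[i] * yi for r, yi in zip(design, y)) for i in range(p)]
    return solve_linear_system(xtx, xty)


def predict_mlr(beta, row):
    """Evaluate b0 + b1*x1 + ... + bk*xk for one observation."""
    return beta[0] + sum(b * x for b, x in zip(beta[1:], row))


# Toy data generated exactly from y = 1 + 2*x1 + 3*x2 (illustrative, not the paper's data).
X_demo = [[0, 0], [1, 0], [0, 1], [1, 1], [2, 1]]
y_demo = [1, 3, 4, 6, 8]
beta_demo = fit_mlr(X_demo, y_demo)
```

Because the toy responses were generated from a known plane, `beta_demo` recovers an intercept close to 1 and partial regression coefficients close to 2 and 3, up to floating-point rounding. In the proposed system the predictor columns would be quantities such as soil nutrients, pH, temperature and rainfall.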
3. R-Square:
$R^2$ is the regression sum of squares divided by the total sum of squares. Alternatively, since SSTO = SSR + SSE, the quantity $R^2$ also equals one minus the ratio of the error sum of squares to the total sum of squares:
$$R^2 = \frac{SSR}{SSTO} = 1 - \frac{SSE}{SSTO}$$
1. Since $R^2$ is a proportion, it is always a number between 0 and 1.
2. If $R^2 = 1$, all of the data points fall perfectly on the regression line.
3. If $R^2 = 0$, the estimated regression line is perfectly horizontal.
4. Multiple R:
This is the correlation coefficient. It tells you how strong the linear relationship is. For example, a value of 1 means a perfect positive relationship and a value of zero means no relationship at all. It is the square root of $R^2$. Thus,
$$\text{Multiple } R = \sqrt{R^2}$$
5. Adjusted R-Square:
Adjusted $R^2$ shows how well data points fit a curve or a line, but adjusts for the number of variables in the model. If you add more and more useless variables to your model, adjusted $R^2$ will decrease. If you add more useful variables to your model, adjusted $R^2$ will increase.
$$R^2_{adj} = 1 - \frac{(1 - R^2)(N - 1)}{N - K - 1}$$
Where,
N = Number of points in data sample.
K = Number of independent regressors, i.e. number of variables in model, excluding constant.
6. Standard Error:
Standard Error (S) represents the average distance of the observed values from the regression line. It tells how wrong the regression model is on average. A smaller value of S is desirable, as it indicates that the observations are close to the regression line. It is calculated by,
$$S = \sqrt{\frac{SSE}{N - K - 1}}$$
7. ANOVA:
ANOVA is used to compare differences of means among more than two groups. It does this by looking at the variation in the data and where the variation is found. Specifically, ANOVA compares the amount of variation between groups to the amount of variation within groups. The ANOVA table deals with multiple deciding factors of Multiple Linear Regression by means of calculating Degrees of Freedom, Sum of Squares, Mean of Squares and F-Test to find the F-Ratio.
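The fit statistics described in this section can be computed together from observed and fitted values. The sketch below is illustrative only: it assumes the fitted values come from a least-squares model with an intercept (so that SSTO = SSR + SSE holds), takes the standard error as $\sqrt{SSE/(N-K-1)}$, and uses K and N−K−1 as the regression and residual degrees of freedom for the F-Ratio. The function name `regression_summary` and the sample numbers are hypothetical.

```python
import math


def regression_summary(y_true, y_pred, k):
    """Fit statistics for a least-squares model with k predictors (constant excluded)."""
    n = len(y_true)
    if n - k - 1 <= 0:
        raise ValueError("need more observations than fitted parameters")
    mean_y = sum(y_true) / n
    ssto = sum((y - mean_y) ** 2 for y in y_true)            # total sum of squares
    sse = sum((y - f) ** 2 for y, f in zip(y_true, y_pred))  # error sum of squares
    ssr = ssto - sse                                          # regression sum of squares
    r2 = 1.0 - sse / ssto
    msr = ssr / k              # mean square for regression, k degrees of freedom
    mse = sse / (n - k - 1)    # mean square for error, n-k-1 degrees of freedom
    return {
        "r_square": r2,
        "multiple_r": math.sqrt(max(r2, 0.0)),
        "adjusted_r_square": 1.0 - (1.0 - r2) * (n - 1) / (n - k - 1),
        "standard_error": math.sqrt(mse),
        "f_ratio": msr / mse if mse > 0 else math.inf,
    }


# Hypothetical example: y regressed on x = 1..5; the least-squares line is 2.2 + 0.6*x.
y_observed = [2, 4, 5, 4, 5]
y_fitted = [2.8, 3.4, 4.0, 4.6, 5.2]
summary = regression_summary(y_observed, y_fitted, k=1)
```

For this toy fit, SSTO is 6 and SSE is 2.4, so R-Square is 0.6 and the F-Ratio is 4.5, up to floating-point rounding. Adjusted R-Square is lower than R-Square, as expected for a small sample.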
7. IMPLEMENTATION
7.1 Datasets used
5. Android:
[Figure: plot of Y_Test and Y_Predicted values]
IoT may lead to the connection of all farming devices with the help of the internet in future. Different types of sensors employed on the farm will give real-time data on farm conditions, and the devices can be used to adjust the moisture, acidity, etc. accordingly. Farm vehicles like tractors will be connected to the internet in future and will pass real-time data to the farmer about crop harvesting and the diseases crops may be suffering from, thus helping the farmer take appropriate action. Further, the most profitable crop can also be found in light of the monetary and inflation ratio.

8. REFERENCES
[1] https://en.wikipedia.org/wiki/Agriculture
[2] https://en.wikipedia.org/wiki/Data_analysis
[3] Jeetendra Shenoy, Yogesh Pingle, “IOT in agriculture”, 2016 IEEE.
[4] M.R. Bendre, R.C. Thool, V.R. Thool, “Big Data in Precision agriculture”, Sept, 2015 NGCT.
[8] N. Heemageetha, “A survey on Application of Data Mining Techniques to Analyze the soil for agricultural purpose”, 2016 IEEE.
[9] https://en.wikipedia.org/wiki/Nonlinear_regression
[10] https://en.wikipedia.org/wiki/Linear_regression
[11] Dhivya B, Manjula, Siva Bharathi, Madhumathi, “A Survey on Crop Yield Prediction based on Agricultural Data”, International Conference in Modern Science and Engineering, March 2017.
[12] Giritharan Ravichandran, Koteeshwari R S, “Agricultural Crop Predictor and Advisor using ANN for Smartphones”, 2016 IEEE.
[13] R. Nagini, Dr. T.V. Rajnikanth, B.V. Kiranmayee, “Agriculture Yield Prediction Using Predictive Analytic Techniques”, 2nd International Conference on Contemporary Computing and Informatics (ic3i), 2016.
[14] Awanit Kumar, Shiv Kumar, “Prediction of production of crops using K-Means and Fuzzy Logic”, IJCSMC, 201