Sip FINAL

Deriving the Effects of Online Purchasing
Habits
A Senior Integration Project
Submitted to the Economics Department of Covenant College
In Partial Fulfillment of the Requirements
for the Degree of Bachelor of Arts
By Matt Schroeder
Discovering the Effects of Online Purchasing
Habits
Abstract
by
Matt Schroeder
Research Topic:
This paper will attempt to analyze what variables affect online
purchasing value, using zip code income in the United States as a proxy for
individual household income.
Abstract:
This empirical study was conducted to see if zip code income is a good
explainer for order value when shopping online. I am using Shelly Cove, LLC
as the proxy for all online purchases. I am also including the variables:
population, number of orders within a zip code, and percent of the population
that is dependent. For the second half of the study, I will be testing the
percent of people who earn over $75,000 a year, rather than average
income, in an attempt to remove any outlier bias that can result from mean
measures. Last, I will run a simple test to see if income has any effect
during holiday times. I will split zip codes into quartiles based on income
and create a time-series line chart to see if order behavior converges
closer to holiday times.
Two equations were created from this study that attempt to describe
order value: one with mean income as the primary independent variable, and
the other with the percent of the population earning over $75,000 as the
primary independent variable. All regressions were run using simple OLS
models.
Table of Contents
Abstract
Appreciations
Chapter 1: Introduction
Chapter 2: Literature Review
Chapter 3: Data, Theory, and Equations
3.1 Data
3.1.1 Data Source
3.1.2 Data Limitations
3.1.3 Data Manipulation
3.2 Theory
3.3 Equations and Variables
Chapter 4: Results
4.1 Mean Income Regression
4.2 Percent of Income over $75,000 Regression
4.3 Seasonal Convergence Test
Chapter 5: Conclusion & Improvements
5.1 Conclusion
5.2 Improvements
Appendix: Regression Tables
Works Cited
Appreciations
I would like to thank my parents, who showed me the importance of
hard work and following your passions. Without you, I would not have been
able to attend Covenant, and receive this wonderful education. I would also
like to thank my readers Oliver Beers and Hunter Davis, as well as my
professors Dr. Lance Wescher and Dr. John Rush, who have all helped me
design my experiment.
Chapter 1:
Introduction
Online shopping has become a major force among retailers across the
globe. With the rise of Amazon (revenue of $107 billion in 2015 alone)[i],
and more and more retailers creating online companions to their stores,
eCommerce is not going away any time soon. In fact, in the United States
alone, eCommerce is expected to rise 56%, from $335 billion (2015) to $523
billion by 2020[ii]. One major difference between online shopping and
in-store shopping is the intangibility of money that online shopping
creates. Not only are customers using a credit card instead of physical
cash, but sites like Amazon and other major retailers have credit card
information saved, so we don't even have to enter it at checkout. Less time
spent thinking about our decisions means easier consumption on our end.
Purchasing from home is quite literally a click away.
Does this simplicity cause irresponsibility in online shopping? More
specifically, does one's income make a difference in how much one spends
online, or does income not matter at all? You could argue that necessary
spending (insurance, mortgage, food, gas, etc.) does not happen online,
whereas luxury spending (clothes, electronics, accessories, etc.) happens
mainly through online purchases. Therefore, if one's income is higher, it
would make sense that the higher level of disposable income is spent through
online shopping, versus someone with a lower income who cannot afford as
many of the luxuries that online shopping may provide.
This paper is going to test the theory that a customer's online
purchasing value can be described by zip code income, number of
dependents, order activity within that zip code, and population. I also
believe that orders placed around holiday periods will be of higher value,
and that mean income will have less of an effect on the order value from a
zip code during those periods. There should also be some correlation between
the number of dependents in a household and its spending. I will discuss
this theory later, but the correlation could feasibly go either way: my
hypothesis is that there is a correlation between order value and
dependents, which could be either positive or negative. I will attempt to
structure a model around mean income, noting the average order value within
the zip code, average number of dependents, number of orders from that zip
code, and the population within the zip code. I also plan on accounting for
mean bias by running two regressions, one for the mean, and one structured
around quartile analysis.
Chapter 2:
Literature Review and Theoretical Analysis
Based on basic economic theory, it would make sense that as mean
income increases, disposable income increases, and therefore online
spending increases. However, we also know that people who have less
money are often less wise with their money. I believe that the disposable
income theory will outweigh people's lack of self-control with their money,
especially given the nature of the goods that Shelly Cove, LLC sells in its
online store.
Blanca Hernández, Julio Jiménez, and M. José Martín ran a study in
2011 to see if age, gender, and income had an effect on whether or not
people did their primary shopping online[iii]. They found, contrary to their
hypothesis, that these variables have no effect on whether people do most
of their buying online or in a brick-and-mortar shop. This is beneficial
to my study, as it shows that there is no inherent bias in the market we
are studying, which is anybody who shops online. If the study had found
that only people above a certain income level did their shopping online,
then we would be faced with a bias issue.
Brendan Hannah and Kristina M. Lybecker, however, ran a study in
2009 about advertising and online spending across income brackets. They
concluded that advertisers who can reach the top income bracket
(specifically men) would see a larger increase in revenue than from any
other demographic. This implies that spending in the higher income
bracket, while maybe on par with other income levels in the aggregate, is
more volatile as a whole: people will spend more money, just not as
often[iv].
Another similar study in the International Journal of Humanities and
Social Sciences found that there is a significant difference in attitudes
toward online shopping across income brackets. Wealthier people are more
open to online shopping. Age group and occupational group make little to no
difference[v].
A study from BI Intelligence in 2014 found that although millennials
aged 18-34 have much lower incomes than older adults, they make up a
larger proportion of online spending. However, in the aggregate, online
shoppers tend to live in households with above-average incomes, with 55%
of shoppers living in households with incomes above $75,000[vi]. The study
also states that while women make up a majority of a household's spending,
men are more likely to purchase online and on devices.
We can also think of online shopping in terms of convenience. People
who have higher incomes are often more conscious about how they spend
their time, and if saving time by purchasing online is an option, it would
make sense that they would take advantage of it. A study by Girish Punj
supports this theory, finding that people with higher incomes are more
inclined to buy online because they view online shopping as a time-saving
mechanism. People with college degrees were especially inclined to shop
online. The explanation offered is that college-educated people are often
more conscious of maintaining efficiency, and online shopping can provide
this efficiency.
A study in 2004 shows that recommendations from friends have a large
effect on buying among women specifically. The risk of trying a new site is
neutral across men and women; however, if women get a recommendation from a
friend to try a site, they are much more likely than men to purchase from
it[vii]. Since Shelly Cove is geared more towards women's apparel, this is
something to consider.
Chapter 3:
Data, Theory, and Equations
3.1 Data
3.1.1 Data Source
This paper uses two data sets. The first is the IRS Individual Income
Tax Statistics from 2014[viii]. This data set gives numerous variables
describing zip codes in the United States, notably including average
income, population, total number of dependents, and categorical income
levels. The second data set is from the online store Shelly Cove, LLC, and
includes online purchases from July 2015 through September 2016. Data from
over 16,000 customers (zip code, date, and order value) will be
cross-referenced with the IRS zip code data (mean income, number of
dependents, etc.) to see how order value, mean income, and percentage of
dependents are correlated. I will run two variations of this test. One is a
simple regression with the average income value as the primary independent
variable. The other uses the percent of people in the zip code who earn
over $75,000 a year.
3.1.2 Data Limitations
This pair of data sets contains a number of limitations, which I am
attempting to control for. First, there is an issue with individual
household income. While zip code average income may be a suitable proxy for
household income, the accuracy of this generalization may be lacking. Also,
while the Shelly Cove data set with 16,000 customers is relatively large in
itself, it is divided among individual zip codes, which may introduce bias
from zip codes with only 1 or 2 orders. I will attempt to control for this
issue by only regressing zip codes with at least 2 orders. However,
excluding zip codes with few orders also reduces the size of the overall
data set, so a real tradeoff exists. Last, there are other variables I
simply do not have access to that could be helpful in describing online
purchasing value. Age and gender specifically would be interesting to test
this theory on.
3.1.3 Data Manipulation
These raw data sets included data that needed to be manipulated in
order to properly run tests. First, I averaged the purchase values for each
zip code into a new data set, as I was not interested in individual
purchases but rather in the mean of the zip code as a whole. Next, the zip
code data set included 4 rows for each zip code (one for each quartile of
income); these were consolidated into one row, allowing me to match the zip
code data with the purchasing data. Once this was complete, I combined the
zip code data and purchase data using a matching function, and removed zip
codes that did not have any purchases, as well as zip codes that had only
one purchase. I also changed the number of dependents to a percentage to
normalize the variable across zip codes. Finally, I created a new variable,
% over 75k, which gives the percent of people who earn over $75,000 a year,
as an attempt to remove outlier bias (i.e., people who earn millions of
dollars a year).
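The manipulation steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure actually used for the study: the order records, the zip-level figures, the function name build_dataset, and the choice of population as the denominator for both percentages are all hypothetical.

```python
from collections import defaultdict

# Hypothetical order records: (zip code, order value in dollars).
orders = [
    ("30750", 40.00), ("30750", 50.00),
    ("37402", 30.00), ("37402", 60.00), ("37402", 45.00),
    ("10001", 99.00),                      # single order: dropped below
]

# Hypothetical zip-level rows, already consolidated to one row per zip code:
# zip -> (mean income, population, dependents, people earning over $75,000).
zip_data = {
    "30750": (85000.0, 4000, 1200, 1800),
    "37402": (52000.0, 6000, 1500, 1200),
    "10001": (110000.0, 9000, 1800, 5000),
}

def build_dataset(orders, zip_data, min_orders=2):
    """Average order values per zip code and attach zip-level measures."""
    values_by_zip = defaultdict(list)
    for zip_code, value in orders:
        values_by_zip[zip_code].append(value)

    rows = []
    for zip_code, values in sorted(values_by_zip.items()):
        # Skip zip codes with too few orders or no matching zip-level row.
        if len(values) < min_orders or zip_code not in zip_data:
            continue
        avincome, population, dependents, over_75k = zip_data[zip_code]
        rows.append({
            "zip": zip_code,
            "avordervalue": sum(values) / len(values),
            "numorders": len(values),
            "avincome": avincome,
            "population": population,
            "percofdep": dependents / population,   # share, not a count
            "over75k": over_75k / population,       # share earning > $75,000
        })
    return rows

dataset = build_dataset(orders, zip_data)
```

Raising min_orders to 5 in this sketch would correspond to the stricter filter, at the cost of keeping fewer zip codes.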
3.2 Theory
Model and Equations:
The model in this paper uses a series of simple multivariate
regressions to analyze how different variables affect one's online order
value. The dependent variable is average order value, and the independent
variables are mean income (or percent of people with income over $75,000),
time of year, population, number of orders from that zip code, and percent
of the population that are dependents. Time of year will be analyzed
separately, as the formatting is different for that analysis. The average
order value, number of orders per zip code, and time of year come from the
Shelly Cove data set, while the mean income, population, and number of
dependents come from the 2014 IRS data. The mean order value in the Shelly
Cove data set is $43.11, ranging from $2.99 to $306. All 50 states are
represented, and over 3,000 zip codes will be tested. The average number of
orders from a zip code (excluding the zip codes with only one order) is 5.1.
If the mean income in a zip code is larger, I would expect a positive
effect on the zip code's average online order value. For the Shelly Cove
data, I have removed zip codes that only have 1 order in an attempt to get
a better aggregate picture of what the average person within the zip code
will purchase. For time of year, I plan on visualizing this variable with a
line chart, separating zip codes into 4 levels of income and seeing if
average order value converges as holiday periods roll around. For the
average number of dependents in a household, the effect could be a tossup.
I am hypothesizing a negative effect on average order value, but there are
a couple of ways to view this variable. The more children you have, the
less disposable income you have, meaning you will spend less. On the other
hand, the more children you have, the more people you need to buy for, so
the order value would go up since you are buying for 4 people, for
instance, instead of 2. Population could also be a tossup. The Shelly Cove
brand appeals more to preppy individuals, who often reside in cities, so I
hypothesize that as population goes up, the average order value will also
go up. Finally, consider the number of orders per zip code. If there is a
large influx of orders from a zip code, I would expect that there may be
some sort of trend happening within that zip code for Shelly Cove apparel.
Therefore, if the number of orders within a zip code increases, I expect
the average order value to also increase.
I am predicting that mean income will have the strongest correlation
with order value of all the variables, followed by holiday times. During
the Christmas season, I would expect the quantity of orders to increase,
but because sales offset how much people are buying, I expect the average
order value to increase only slightly.
3.3 Equations and Variables
Variable       Description                                      Hypothesis

AVINCOME       This represents the mean income in the zip      H0: β > 0
               code for dependents. It does not represent      HA: β ≤ 0
               households.

OVER75K        This is another income measure, which           H0: β > 0
               describes the percent of the zip code           HA: β ≤ 0
               population that earns over $75,000 a year.

TIME OF YEAR   Time of year is a dummy variable where (1) is   H0: β > 0
               from October through December, representing     HA: β ≤ 0
               holiday times, and (0) is from January
               through September, representing off-holiday
               times.

PERCOFDEP      This represents the percent of the population   H0: β ≠ 0
               that is dependents.                             HA: β = 0

POPULATION     Population represents the population of the     H0: β > 0
               zip code. This is both dependents and           HA: β ≤ 0
               non-dependents.

NUMORDERS      This represents the number of orders that       H0: β > 0
               have come from the zip code.                    HA: β ≤ 0
Equation 1:
Average Order Value = f(Mean Income (+), Time of Year (+),
Number of Dependents (-), Population (+), Number of Zip Code Orders (+))
Equation 2:
Average Order Value = f(Over75k (+), Time of Year (+),
Number of Dependents (-), Population (+), Number of Zip Code Orders (+))
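For readers who want to see how an equation of this form is estimated, the following is a minimal OLS sketch using only the Python standard library. It assumes a linear specification with a constant term and solves the normal equations directly; the helper names (solve_linear_system, ols) and the sample rows are hypothetical, and this is not the software used to produce the results in this paper.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    # Augmented matrix copy so the inputs are not modified.
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    # Back substitution on the upper-triangular system.
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def ols(rows, y_name, x_names):
    """Return OLS coefficients as a dict, with 'Constant' as the intercept."""
    design = [[1.0] + [float(r[name]) for name in x_names] for r in rows]
    y = [float(r[y_name]) for r in rows]
    k = len(design[0])
    # Normal equations: (X'X) beta = X'y.
    xtx = [[sum(d[i] * d[j] for d in design) for j in range(k)] for i in range(k)]
    xty = [sum(d[i] * yi for d, yi in zip(design, y)) for i in range(k)]
    beta = solve_linear_system(xtx, xty)
    return dict(zip(["Constant"] + x_names, beta))

# Hypothetical zip-level rows generated from a known line, so the fit can be
# checked: avordervalue = 40 + 0.00002 * avincome + 0.1 * numorders
rows = [
    {"avincome": inc, "numorders": n, "avordervalue": 40 + 0.00002 * inc + 0.1 * n}
    for inc, n in [(30000, 2), (50000, 5), (70000, 3), (90000, 8), (120000, 4)]
]
coefs = ols(rows, "avordervalue", ["avincome", "numorders"])
```

Because the sample rows lie exactly on a known line, the fitted coefficients should come back as approximately 40, 0.00002, and 0.1. The sketch assumes the regressors are not perfectly collinear and does not compute standard errors.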
Chapter 4:
Results
4.1 Mean Income Regressions
The following are the results from the series of regressions I ran to
see if the variables I selected were good predictors of average order
value. I ran two regressions using mean income. The first was on zip codes
with at least two orders, and the second was on zip codes with at least
five orders. This was an attempt to keep a single large (or small) order
from skewing the data too much. There is a tradeoff in this kind of test:
raising the minimum number of orders leaves fewer zip codes in the sample,
but lowering it leaves fewer orders behind each zip code's average.
The table that follows shows my regression results. The first column is
the regression on zip codes that have at least two orders, and the second
column is the regression on zip codes with at least five orders. As you can
see, avincome and the constant term are the only statistically significant
variables in these regressions. Also, you will notice that percofdep is
removed from the second regression. This is because the statistical
significance of that variable was far lower than that of all other
variables (a 0.0741 coefficient with a standard error of over 1), so I left
it out. I also removed the percofdep variable from the first regression and
re-ran it; this hardly altered the results, so I did not include that
regression in the table and left only my original theory in the report.
You'll also notice that the R^2 values in these regressions are very low.
This is expected, as we are clearly missing variables from our data set
that could help explain the average order value. What I am more concerned
about is statistical significance. My main theory was that average income
would be a good explainer for average order value. It appears that as
income increases by $1,000, the average order value is expected to increase
by $.022 and $.039, respectively. Clearly, this has very little marketing
value, as an order increase of only pennies makes the extra cost of
targeted marketing a wash, and maybe even a loss. The remaining variables
should be taken with a grain of salt, as they are not statistically
significant, but the results will still be described. As the population in
the zip code increases by 1,000 people, the predicted order value decreases
by $.029 and $.032. The percent of the population that is dependent has a
positive correlation with the order value. As the percent of dependents
increases by 1%, the average order value increases by less than a penny
(the coefficient needs to be divided by 100, since the data is in terms of
a proportion). Lastly, the number of orders within a zip code has a
positive correlation with the order value. For every extra order a zip code
receives, the average order value is expected to increase by $.11 and
$.048, respectively.
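The dollar figures quoted above follow from multiplying each reported coefficient by the size of the change being considered. The short Python sketch below reproduces that arithmetic; the coefficients are the ones reported for these regressions, while the dictionary layout and the function name scaled_effect are only illustrative.

```python
# Coefficients as reported for zip codes with at least two orders (first
# value) and at least five orders (second value).
coefficients = {
    "avincome":   (2.17e-05, 3.90e-05),    # dollars of order value per dollar of income
    "population": (-2.86e-05, -3.19e-05),  # per additional resident
    "numorders":  (0.108, 0.0484),         # per additional order
}

def scaled_effect(coefficient, change):
    """Predicted change in average order value for a given change in a regressor."""
    return coefficient * change

income_effects = [scaled_effect(b, 1_000) for b in coefficients["avincome"]]
population_effects = [scaled_effect(b, 1_000) for b in coefficients["population"]]
# percofdep is stored as a proportion, so one percentage point is 0.01.
percofdep_effect = scaled_effect(0.0741, 0.01)

print([round(e, 3) for e in income_effects])      # [0.022, 0.039]
print([round(e, 3) for e in population_effects])  # [-0.029, -0.032]
print(round(percofdep_effect, 5))                 # 0.00074
```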
                 (≥2 Orders)     (≥5 Orders)
VARIABLES        avordervalue    avordervalue
avincome         2.17e-05***     3.90e-05***
                 (6.21e-06)      (1.23e-05)
population       -2.86e-05       -3.19e-05
                 (3.63e-05)      (6.08e-05)
percofdep        0.0741
                 (1.062)
numorders        0.108           0.0484
                 (0.112)         (0.128)
Constant         41.18***        40.32***
                 (0.898)         (1.553)
Observations     2,923           549
R-squared        0.005           0.019
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
4.2 Percent of Income over $75,000 Regressions
Next, I ran a series of regressions that included the percent of the
population who earn over $75,000, rather than mean income. This was an
attempt to control for extreme cases like millionaires living in an area
(with zip codes typically containing only a few thousand people, extremely
rich individuals could easily skew the data).
The table below shows my regression results. The first column is all zip
codes with at least two orders, and the second column is all zip codes with
at least five orders. I have left out the variable percofdep (percent of
population that is dependent) due to its extreme lack of significance. We
see again that the income measure and the constant are the only variables
with any statistical significance. However, the income measure is less
statistically significant than before. Similar to the average income
measure, we see a positive correlation between income and order value. As
the percent of people who earn over $75,000 a year increases by 1
percentage point, we can expect the order value within that zip code to
increase by $.056 and $.068, respectively. The remaining variables have no
statistical significance, but their results are still noted. Population
again has a negative effect on order value. As the population increases by
1,000 people, the order value is predicted to decrease by $.03 to $.04,
similar to the previous set of regressions. Last, as the number of orders
within a zip code increases, we see another positive correlation with order
value. For every additional order within a zip code, we can expect a $.09
or $.042 increase in the order value, respectively. Our R^2 became even
lower, and our statistical significance also became lower, which leads me
to believe that there was not much rich-income bias in the average income
measure to begin with.
                 (1)             (2)
VARIABLES        avordervalue    avordervalue
over75k          5.586**         6.785*
                 (2.239)         (3.845)
population       -3.01e-05       -3.95e-05
                 (3.62e-05)      (6.14e-05)
numorders        0.0897          0.0417
                 (0.113)         (0.130)
Constant         41.23***        41.32***
                 (0.801)         (1.672)
Observations     2,923           549
R-squared        0.003           0.006
Standard errors in parentheses
*** p<0.01, ** p<0.05, * p<0.1
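The drop in significance between the two income measures can be checked directly from the reported coefficients and standard errors, since each t-statistic is the coefficient divided by its standard error. The sketch below is a rough check only: it uses standard normal critical values as a large-sample approximation (an assumption that seems reasonable with 549 or more observations), and the names reported, t_statistic, and stars are illustrative.

```python
# Coefficients and standard errors as reported in the two regression tables
# (first pair: zip codes with at least two orders; second: at least five).
reported = {
    "avincome": [(2.17e-05, 6.21e-06), (3.90e-05, 1.23e-05)],
    "over75k":  [(5.586, 2.239), (6.785, 3.845)],
}

# Two-sided critical values of the standard normal distribution, used here
# as a large-sample approximation to the t distribution.
CRITICAL_VALUES = [(2.576, "***"), (1.960, "**"), (1.645, "*")]

def t_statistic(coefficient, standard_error):
    """Ratio of a coefficient to its standard error."""
    return coefficient / standard_error

def stars(t_value):
    """Significance stars matching the table legend (1%, 5%, 10%)."""
    for critical, label in CRITICAL_VALUES:
        if abs(t_value) >= critical:
            return label
    return ""

for name, columns in reported.items():
    for column, (coef, se) in enumerate(columns, start=1):
        t = t_statistic(coef, se)
        print(f"{name} (column {column}): t = {t:.2f} {stars(t)}")
```

The t-statistics come out to roughly 3.49 and 3.17 for avincome and 2.49 and 1.76 for over75k, which matches the stars shown in the tables.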
4.3 Seasonal Convergence Test
For this next test, I wanted to see if holiday times affected
income-based buying habits. In other words, does it matter what your income
is when the holidays roll around? I separated the zip codes into quartiles
based on income. Next, I divided the sales data into monthly totals and
tabulated them based on which quartile the sale came from. My theory was
that there would be a noticeable separation between quartiles during the
off-season (between holiday times), and that in November and December there
would be a convergence, as people spend more freely during those two months
and it doesn't matter as much what your income is. In the first graph, the
X-axis is time (month) and the Y-axis is percent of monthly sales. As we
can see, a relatively large separation appears close to the holiday season,
where the top quartile separates from the others and the bottom quartile
dips below the middle 50%. Oddly enough, from May to October the monthly
spends become inverted, and the poorer zip codes spend more than the
richest ones. Contrary to my hypothesis, there is a noticeable divergence
of the rich zip codes in November and December, rather than a convergence.
These findings are still notable.
[Figure: Seasonality Convergence Test. Line chart of percent of monthly
sales (Y-axis, 15% to 35%) by month (X-axis, Jan to Dec), with one line per
income quartile: Poorest 25%, Lower Middle 25%, Upper Middle 25%, Richest
25%.]
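The tabulation behind this chart can be sketched as follows. This is an illustration under assumptions rather than the exact procedure used: it assumes that "percent of monthly sales" means each quartile's share of that month's total sales, it splits zip codes at unweighted income quartile cut points, and the sample data and function names are hypothetical.

```python
from collections import defaultdict
from statistics import quantiles

# Hypothetical inputs: mean income per zip code, and orders as
# (zip code, month number, order value in dollars).
zip_income = {"A": 30000, "B": 45000, "C": 60000, "D": 80000,
              "E": 35000, "F": 50000, "G": 70000, "H": 120000}
orders = [("A", 11, 40.0), ("D", 11, 60.0), ("H", 11, 100.0),
          ("B", 6, 50.0), ("C", 6, 50.0), ("G", 6, 25.0), ("E", 6, 75.0)]

LABELS = ["Poorest 25%", "Lower Middle 25%", "Upper Middle 25%", "Richest 25%"]

def income_quartile(income, cut_points):
    """Index 0-3 of the quartile an income falls into."""
    return sum(income > cut for cut in cut_points)

def monthly_shares(orders, zip_income):
    """Share of each month's total sales that comes from each income quartile."""
    cut_points = quantiles(zip_income.values(), n=4)   # three cut points
    totals = defaultdict(lambda: [0.0] * 4)
    for zip_code, month, value in orders:
        q = income_quartile(zip_income[zip_code], cut_points)
        totals[month][q] += value
    return {
        month: [amount / sum(amounts) for amount in amounts]
        for month, amounts in sorted(totals.items())
    }

shares = monthly_shares(orders, zip_income)
```

Plotting one line per entry of LABELS across the twelve months would give a chart of the kind described here; leaving the monthly totals undivided would give the dollar-value version instead.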
In the graph below, I changed the Y-axis from a percent value to a total
dollar value to see another side of this picture. In the months of April
through October, we see little to no separation between the quartiles. In
fact, the poorest zip codes had a relatively high purchase total in August,
higher than all the other categories of zip codes. Similar to above,
however, we see a noticeable divergence in the holiday months, where the
richest zip codes end up spending more than the poorer ones. Again,
although my hypothesis was that zip code spending would converge in
November and December, I still find these findings extremely interesting.
Rich people appear to have spending habits similar to everyone else's
throughout the year, but treat the holidays as a time to stretch the wallet
a little farther than other folks. Maybe holiday spending is proportional
to your income, while spending throughout the year is simply a fixed cost
on a site such as Shelly Cove.
[Figure: Seasonality Convergence Test. Line chart of total dollar value of
sales (Y-axis, 0 to 40,000) by month (X-axis, Jan to Dec), with one line
per income quartile: Poorest 25%, Lower Middle 25%, Upper Middle 25%,
Richest 25%.]
So what does this mean? Two possible conclusions can be drawn. One is
that rich people don't spend as much of their disposable income during the
year (or, put differently, in off-season times everyone spends about the
same regardless of income). The second is that rich areas do tend to spend
more during holiday times (or at least buy in greater quantities, since we
are measuring total spend in the zip codes rather than the average order
value). While this is the opposite of my hypothesis, it is still
encouraging to find notable results in this test.
Chapter 5:
Conclusion & Improvements
5.1 Conclusion
In this paper, I aimed to test the statistical significance of a series
of variables on the purchasing habits within a zip code. Data that describe
the characteristics of the highest-spending customers are extremely
valuable to marketers and business owners, but are not easy to nail down
without expansive data sets that span a large number of variables. Using
variables that could be derived from my data sets, I was able to run tests
on income levels, number of dependents, number of orders from a zip code,
and population to see if they accurately predicted checkout totals for the
store Shelly Cove, LLC. After conducting this study, I believe that the
number of dependents does not have an effect on purchasing habits. The
statistical significance is severely lacking, and theory says the effect
could go either way. I believe the rest of the variables would be useful in
a larger-scale study, with more variables included and a much larger data
set. The statistical significance of many of these variables is low, but I
hypothesize that this is due to an already small data set being broken into
even smaller zip code segments. Population had a very small negative effect
on order value, which contradicted my hypothesis. Number of orders received
from a zip code had a comparatively larger positive effect on order value,
but the statistical significance was low. Again, I believe this problem
could be fixed with a larger data set. The most significant variable I
found was income. Having tested this variable in two ways, I have come to
the conclusion that mean income is a better predictor of order value than
the $75,000 measure. While income had a positive effect on order value, it
did not seem to be the biggest deciding factor, which further leads me to
believe that omitted variable bias is present. Even with the small data
set, mean income was statistically significant at the 1% level, which is
encouraging for possible future study on this topic. Although I believe
most of the variables I chose would be statistically significant with a
larger data set, I do not believe this is the whole story. To create a
better model with a more precise specification, there needs to be a data
set that includes geographical and cultural data, along with purchases from
a company that appeals to a wider and more diverse population.
5.2 Improvements
As stated above, there is a series of improvements that I would attempt
to implement if this study were conducted again in the future. First,
obtain a larger data set so that there are at least 10 orders from every
zip code being studied. Second, obtain individual household income (which
would remove the need for zip code data as a proxy). Third, use data from a
company that has a wider range of products and has been around longer.
Finally, collect more variables. Gender, age, where the customer heard of
the company, who they are buying for, etc. would be extremely valuable
information to a marketing team, and could cut down on advertising costs by
precisely targeting advertising campaigns.
Appendix:
Regression Tables
Table 1: The first regression for average income on the zip codes with over 2 orders.
Table 2: I regressed the average zip code order value on average income to see if there was a noticeable correlation with just
these two variables involved. There was nothing more significant than before.
Table 3: For the last regression in the first "half" of the study, I regressed the same variables as the first regression (minus
number of dependents), on the zip codes that contained more than 5 orders.
Table 4: I ran two regressions below, with the percent over $75k as the new income descriptor. In the second regression, I left out
the number of dependents measure, as it was extremely statistically insignificant.
Table 5: Finally, I ran the same regression as above (omitting the dependents measure) on the zip codes with at least 5 orders.
Table 6: A summary of descriptive statistics for my main variables in the regression analysis
Works Cited:
[i] Statista. Feb. 2016. Accessed September 28, 2016.
https://www.statista.com/statistics/266282/annual-net-revenue-of-amazoncom/
[ii] "Ecommerce Sales," Internet Retailer. Matt Linder, Jan 29, 2016.
Accessed Sept 28, 2016.
https://www.internetretailer.com/2016/01/29/online-sales-will-reach-523-billion-2020-us
[iii] Blanca Hernández, Julio Jiménez, M. José Martín, "Age, gender and
income: do they really moderate online shopping behaviour?", Online
Information Review, Vol. 35 Iss: 1, pp. 113-133
[iv] Hannah, Brendan and Lybecker, Kristina M., Determinants of Recent
Online Purchasing and the Percentage of Income Spent Online (May 29,
2009). Colorado College Working Paper No. 2009-02. Available at SSRN:
http://ssrn.com/abstract=1413983 or http://dx.doi.org/10.2139/ssrn.1413983
[v] Zuroni Md Jusoh, Goh Hai Ling, International Journal of Humanities and
Social Sciences, Vol. 2 No. 4, "Factors Influencing Consumers' Attitude
Towards E-Commerce Purchases Through Online Shopping"
[vi] http://www.businessinsider.com/the-surprising-demographics-of-who-shops-online-and-onmobile-2014-6
[vii] Ellen Garbarino & Michael Strahilevitz, 2004, Journal of Business
Research 57, pp. 768-775, "Gender differences in the perceived risk of
buying online and the effects of receiving a site recommendation"
[viii] IRS 2014. Accessed Aug 25, 2016.
https://www.irs.gov/uac/soi-tax-stats-individual-income-tax-statistics-2014-zip-code-data-soi
