This dataset is based on data provided by the US Census Bureau, collected
as part of the 1990 US census. The data are mostly counts aggregated at
different survey levels. For this dataset the State-Place level was used,
and data from all states were obtained. Most of the counts were converted
into appropriate proportions.
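As a rough illustration of turning counts into proportions, the sketch below divides each count in a region record by that region's total. The field names (persons, persons_under_18, persons_over_65) and the helper counts_to_proportions are hypothetical and are not taken from the census files; the source does not say which totals were used as denominators.

```python
def counts_to_proportions(row, count_keys, total_key):
    """Return a copy of row with each listed count divided by the row total."""
    total = row[total_key]
    out = dict(row)
    for key in count_keys:
        # Guard against empty regions so we never divide by zero.
        out[key] = row[key] / total if total else 0.0
    return out


# Hypothetical record for one State-Place region (made-up numbers).
region = {"persons": 2000, "persons_under_18": 500, "persons_over_65": 300}

proportions = counts_to_proportions(
    region, ["persons_under_18", "persons_over_65"], "persons"
)
print(proportions)
```

Working with proportions rather than raw counts keeps regions of very different sizes comparable.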
There are 4 different data sets obtained from this database:
House(8H)
House(8L)
House(16H)
House(16L)
All four are concerned with predicting the median house price in a region
based on its demographic composition and the state of the housing market in
that region.
The number in each name gives the number of attributes in the data set. The
letter that follows denotes a very rough approximation of the difficulty of
the task. For Low (L) difficulty, the attributes chosen were those more
strongly correlated with the target, as indicated by a univariate smooth fit
of each input on the target. For High (H) difficulty, the attributes were
chosen to make modelling harder, through higher variance or lower
correlation of the inputs with the target.
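One way to picture this kind of attribute selection is to score each attribute by how strongly it relates to the target and then take attributes from the top of the ranking (Low difficulty) or from the bottom (High difficulty). The sketch below uses absolute Pearson correlation as a simple stand-in for the univariate smooth fit mentioned above; it is not the procedure actually used to build the datasets, and the attribute names and values are invented for illustration.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences (0.0 if degenerate)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0
    return cov / math.sqrt(var_x * var_y)


def rank_attributes(columns, target):
    """Return attribute names sorted from most to least correlated with target."""
    scores = {name: abs(pearson(values, target)) for name, values in columns.items()}
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical attribute columns and target values (made-up numbers).
columns = {
    "prop_owner_occupied": [0.2, 0.4, 0.6, 0.8],
    "prop_vacant": [0.5, 0.1, 0.4, 0.2],
}
median_price = [50000, 90000, 130000, 170000]

ranked = rank_attributes(columns, median_price)
low_difficulty_pick = ranked[:1]    # most correlated attribute(s)
high_difficulty_pick = ranked[-1:]  # least correlated attribute(s)
print(ranked)
```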
The dataset contains 4 tasks, each concerned with predicting the median
house price in a small survey region. It consists of data about houses that
are vacant for rent or vacant for sale. The data have been categorized on
the basis of:
State/Location
Age of the residents
Ethnicity of the residents
Marital status
Number of members in each household
Rent or price for each combination of the above attributes, etc.
Approach used
Data Cleaning
Variable Reduction
Model Building
Data Reduction
MODEL BUILDING