
Unstructured Data Management Mini Project 3

Team: Puneet Gujral; Shakib Enam


a. Clearly describe the purpose of your project
By compiling a database of reported Ebola cases, we aim to analyze the number
of deaths due to Ebola in Sierra Leone and Guinea during September 2014,
according to the data provided by the WHO and the respective governments.
b. Clearly describe the dataset you are working with as well as provide
snapshots
We store reported Ebola outbreak cases from Guinea and Sierra Leone. The data
covers various regions within these two countries and records confirmed cases
and number of deaths for September 2014. The dataset also contains links to
more in-depth information on the actions taken by the local government.
c. Clearly list the steps you will follow to execute your project
(i) Identify the Ebola dataset.
(ii) Download the data on local machine in csv format.
(iii) Create a database for this project in mongodb.
(iv) Create 2 different collections to store Ebola data from 2 countries i.e.
Guinea and Sierra Leone.
(v) Import the data in respective collections within the mongodb database.
(vi) Query the data to find the number of deaths due to Ebola in some regions
of these 2 countries.
(vii) Insert, Delete and Update sample records in one of the collections.

d. Clearly list the code lines. Ensure that they will run successfully when
we test your code.
Create database: use mongodb_lab3
Create collection1: db.createCollection("Sierraleone_Data")
Create collection2: db.createCollection("Guinea_Data")
Import data for Sierra Leone collection: mongoimport -d mongodb_lab3 -c
Sierraleone_Data --type csv --file C:\Temp\sierra_leone_ebola_data.csv --headerline
connected to: 127.0.0.1
2015-02-01T23:06:34.961-0500 check 9 750
2015-02-01T23:06:34.999-0500 imported 749 objects
Import data for Guinea collection: mongoimport -d mongodb_lab3 -c
Guinea_Data --type csv --file C:\Temp\guinea_ebola_data.csv --headerline
connected to: 127.0.0.1
2015-02-01T23:18:10.181-0500 check 9 871
2015-02-01T23:18:10.202-0500 imported 870 objects
List out the data imported in each collection: db.Guinea_Data.find().pretty()
db.Sierraleone_Data.find().pretty()
Query data in Sierraleone_Data collection for finding the number of deaths in
Kambia and Kailahun regions of Sierra Leone:
db.Sierraleone_Data.find( { Localite: { $in: ["Kambia", "Kailahun"] }, Category : "Deaths"}).pretty()
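
The mongo shell reads a query filter as a JavaScript object literal, and an object literal keeps only the last value given for a repeated key. The sketch below illustrates this outside MongoDB: a filter that lists Localite twice ends up with a single Localite entry, whereas a filter using $in carries both regions. The variable names are hypothetical and used only for illustration.

```javascript
// Hypothetical filter objects; the field names follow the queries in this report.
// A repeated key is silently overwritten: only the last Localite value survives.
const repeatedKey = { Localite: "Kambia", Localite: "Kailahun", Category: "Deaths" };

// With $in, both regions are kept in one array under a single key.
const withIn = { Localite: { $in: ["Kambia", "Kailahun"] }, Category: "Deaths" };

console.log(Object.keys(repeatedKey)); // [ 'Localite', 'Category' ]
console.log(repeatedKey.Localite);     // Kailahun
console.log(withIn.Localite.$in);      // [ 'Kambia', 'Kailahun' ]
```

A filter built like repeatedKey would therefore only ever match Kailahun records.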

Query data in Guinea collection for finding the number of deaths in
Kouroussa region of Guinea:
db.Guinea_Data.find( { Localite: "Kouroussa", Category : "Deaths"}).pretty()
Inserting a sample record in Guinea_Data collection:
db.Guinea_Data.insert({
Country : "Guinea",
Localite: "Kouroussa",
Category : "Deaths",
Value : 3,
Date: "9/30/2014",
Sources : "WHO",
Link : "Siterep169 30Sep" })
WriteResult({ "nInserted" : 1 })
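Deleting a sample record in Guinea_Data collection:
db.Guinea_Data.remove({'Value':3})
WriteResult({ "nRemoved" : 135 })
Note that this filter matches every document whose Value is 3, so it removed
135 documents and not only the sample record inserted above.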
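
A remove filter deletes every document it matches, so how many records go depends on how specific the filter is. The following is a minimal in-memory sketch of that idea, not MongoDB itself: the documents are made up for illustration (they are not rows from the dataset), and matches is a hypothetical helper that only handles simple equality filters.

```javascript
// Illustrative documents only; these values are made up, not taken from the dataset.
const docs = [
  { Localite: "Kouroussa", Category: "Deaths", Value: 3, Date: "9/30/2014" },
  { Localite: "Conakry", Category: "Cases", Value: 3, Date: "9/12/2014" },
  { Localite: "Kouroussa", Category: "Deaths", Value: 5, Date: "9/29/2014" },
];

// Hypothetical helper: true when every field in the filter equals the document's field.
const matches = (doc, filter) =>
  Object.entries(filter).every(([key, value]) => doc[key] === value);

// A filter on Value alone catches unrelated records that happen to share the value.
const broad = docs.filter((d) => matches(d, { Value: 3 }));

// Adding the identifying fields narrows the match to the one sample record.
const narrowFilter = { Localite: "Kouroussa", Category: "Deaths", Value: 3, Date: "9/30/2014" };
const narrow = docs.filter((d) => matches(d, narrowFilter));

console.log(broad.length, narrow.length); // 2 1
```

In the shell, a filter shaped like narrowFilter could be passed to db.Guinea_Data.remove() in place of {'Value':3} when only the sample record should be deleted.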
Updating a sample record in Guinea_Data collection:
db.Guinea_Data.update({'Category':'Deaths', 'Value':3}, {$set:{'Value':2}})
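
In the shell's update(query, update, options) call, the second argument is the update document: with a $set operator only the named fields change, while a plain document without operators replaces the matched record. The sketch below mimics that difference in memory under simplifying assumptions. applyUpdate is a hypothetical helper written for this illustration, not part of MongoDB, and the sample document is made up.

```javascript
// Hypothetical helper that mimics, in memory, how an update document is applied.
// It only covers the two cases discussed here: $set versus a plain replacement.
function applyUpdate(doc, update) {
  if (update.$set) {
    // $set: keep all existing fields and overwrite only the listed ones.
    return { ...doc, ...update.$set };
  }
  // No operator: the document is replaced, keeping only its _id.
  return { _id: doc._id, ...update };
}

// Made-up sample document for illustration.
const sample = { _id: 1, Localite: "Kouroussa", Category: "Deaths", Value: 3 };

const replaced = applyUpdate(sample, { Value: 3 });
const patched = applyUpdate(sample, { $set: { Value: 2 } });

console.log(replaced); // { _id: 1, Value: 3 }
console.log(patched);  // { _id: 1, Localite: 'Kouroussa', Category: 'Deaths', Value: 2 }
```

This is why both match conditions belong in the first argument and the $set document in the second.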

e. Also ensure you provide the code to insert the data as well as snapshot
of the output
Please find below the following snapshots:
Snapshot1: Snapshot of Sierra Leone dataset
Snapshot2: Snapshot of Guinea dataset

Snapshot3: Inserting data into 2 collections i.e. Sierraleone_Data and Guinea_Data
Snapshot4: Showing data inserted into Guinea_Data collection.
Snapshot5: Showing data inserted into Sierraleone_Data collection.
Snapshot6: Query result for finding the number of deaths in Kambia and Kailahun
regions of Sierra Leone.
Snapshot7: Query results for finding the number of deaths in Kouroussa region of
Guinea.
Snapshot8: Inserting a sample record in Guinea_Data collection
Snapshot9: Deleting a sample record in Guinea_Data collection
Snapshot10: Updating a sample record in Guinea_Data collection

1. Snapshot of Sierra Leone dataset:

2. Snapshot of Guinea dataset:

3. Snapshot for inserting data in our database mongodb_lab3:

4. Showing data inserted into Guinea_Data collection:

5. Showing data inserted into Sierraleone_Data collection:

6. Query result for finding the number of deaths in Kambia and Kailahun regions
of Sierra Leone.

7. Query results for finding the number of deaths in Kouroussa region of
Guinea.

8. Inserting sample record into Guinea_Data collection:

9. Deleting sample record from Guinea_Data collection:

10. Updating sample record in Guinea_Data collection:
