
ASSIGNMENT: MODULE 10

ALGORITHM DESIGN
Input format: The input file has 12 columns with the following headings:
1. event_epoch_time
2. user_id
3. device_id
4. user_agent
5. pizza_name
6. isCheeseBurst
7. Size
8. AddedToppings (colon-separated string)
9. Price
10. CouponCode
11. Order_Event
12. isVeg

1. Map-only algorithm that takes the original dataset as input and filters out every record in
which event_epoch_time, user_id, device_id or user_agent is NULL.
/* Here Row[1], Row[2], Row[3], ... denote the data in the columns event_epoch_time, user_id,
 * device_id and so on, indexed as in the list above.
 */
Map(Key, Value)
    Row = split(Value, ‘\t’)    // ‘\t’ denotes the tab character

    IF(Row[1] != NULL AND Row[2] != NULL AND Row[3] != NULL AND Row[4] != NULL)
        Write(Row)

EXIT Map Function
/* Over the whole dataset, this function outputs the surviving rows as a 2D array of the
 * table's data. From here onwards, Map(Key, Row) denotes that one row of this output is
 * taken as the input of the map function.
 */
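
The filter above can be sketched in Python as a map-only step. This is a minimal sketch under stated assumptions, not the assignment's reference solution: the set NULL_MARKERS (how a missing value is written in the file), the function name map_filter and the two sample records are all hypothetical.

```python
from typing import Iterable, Iterator, List

# Assumption: a missing value is an empty field or the literal text "NULL".
NULL_MARKERS = {"", "NULL"}


def map_filter(lines: Iterable[str]) -> Iterator[List[str]]:
    """Yield only the rows whose first four columns are all present."""
    for line in lines:
        row = line.rstrip("\n").split("\t")
        # Python lists are 0-based, so Row[1]..Row[4] become row[0]..row[3].
        if len(row) < 4:
            continue  # too short to contain the four required columns
        if all(field not in NULL_MARKERS for field in row[:4]):
            yield row


# Two invented records: the second has a NULL user_id and is dropped.
sample = [
    "1500000000\tu1\td1\tAndroid:7.0\tMargherita\tN\tR\t\t199\t\tplaced\tY",
    "1500000001\tNULL\td2\tiOS:10.3\tFarmhouse\tY\tM\tOlives\t450\t\tplaced\tY",
]
kept = list(map_filter(sample))
print(len(kept))
```

No reducer is involved: each record is accepted or rejected on its own, which is what makes the step map-only.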

2. An algorithm to read the user agent and extract OS Version and Platform from it.
Map(Key, Row)    // Takes the output of question 1 as input, one row at a time from the
                 // 2D array, so the input is a 1D array
    OS_P = split(Row[4], ‘:’)
    OS_version = OS_P[2]    // Assuming array indices start from 1 instead of 0
    Platform = OS_P[1]
    Write(OS_version, 1)
    Write(Platform, 1)
    // This outputs the OS version and platform from user_agent
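
As an illustration, the split can be written in Python as follows. The pseudocode implies that user_agent has the form platform, colon, OS version; the function name parse_user_agent and the sample value "Android:7.0" are hypothetical.

```python
from typing import Tuple


def parse_user_agent(user_agent: str) -> Tuple[str, str]:
    """Split a 'Platform:OS_version' string into its two parts."""
    # maxsplit=1 keeps any further colons inside the version part.
    platform, os_version = user_agent.split(":", 1)
    return platform, os_version


platform, os_version = parse_user_agent("Android:7.0")
print(platform, os_version)
```

A value without a colon raises ValueError in this sketch, so a real job would need to decide whether to skip or log such rows.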

3. getCounter(“Orders”) creates a global variable of the same name if it is not already available.


3.1. To find out the number of veg and non-veg pizzas sold.
Map(Key, Row)    // Takes the output of question 1 as input
    getCounter(“Veg”)
    getCounter(“Non-Veg”)
    IF(Row[12] == “Y”)
        getCounter(“Veg”).incrementBy(1)
    ELSE IF(Row[12] == “N”)
        getCounter(“Non-Veg”).incrementBy(1)
    ELSE
        EXIT Map Function

EXIT Map Function
PRINT Veg
PRINT Non-Veg
/* The PRINT statements display the total number of veg and non-veg pizzas sold, since
 * Veg and Non-Veg are global variables. */
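
The counter-based approach can be imitated locally in Python. In this sketch a collections.Counter stands in for the framework's global counters; the names counters, map_veg and make_row and the sample flags are hypothetical.

```python
from collections import Counter
from typing import List

counters: Counter = Counter()  # stand-in for the global counters behind getCounter


def map_veg(row: List[str]) -> None:
    """Increment the Veg or Non-Veg counter for one filtered row."""
    is_veg = row[11]  # Row[12] in the 1-based pseudocode
    if is_veg == "Y":
        counters["Veg"] += 1
    elif is_veg == "N":
        counters["Non-Veg"] += 1
    # Any other value is ignored, matching the ELSE branch of the pseudocode.


def make_row(is_veg: str) -> List[str]:
    """Build a 12-column row with only isVeg filled in (illustration only)."""
    row = [""] * 12
    row[11] = is_veg
    return row


for flag in ["Y", "N", "Y", "?"]:
    map_veg(make_row(flag))

print(counters["Veg"], counters["Non-Veg"])
```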
3.2 To find out the size wise distribution of pizzas sold
Map(Key, Row)    // Takes the output of question 1 as input
    getCounter(“Small”)
    getCounter(“Medium”)
    getCounter(“Large”)
    getCounter(“Total”)
    IF(Row[7] == “R”)    // “R” (regular) is the small size
        getCounter(“Small”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE IF(Row[7] == “M”)
        getCounter(“Medium”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE IF(Row[7] == “L”)
        getCounter(“Large”).incrementBy(1)
        getCounter(“Total”).incrementBy(1)
    ELSE
        EXIT Map Function
EXIT Map Function

// Total, Small, Medium and Large are global variables
Total = Small + Medium + Large    // Same value as the Total counter
Distribution_small = (Small / Total) * 100
Distribution_medium = (Medium / Total) * 100
Distribution_large = (Large / Total) * 100
PRINT Distribution_small, Distribution_medium, Distribution_large
// Prints the size-wise distribution as a percentage of the total pizzas sold
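
The percentage calculation can be checked with a small Python sketch. The function size_distribution, the mapping SIZE_NAMES and the sample size codes are hypothetical; the sketch also guards against an empty dataset, a case the pseudocode does not handle (Total would be 0).

```python
from collections import Counter
from typing import Dict, Iterable

# Size codes as used in the pseudocode: "R" (regular) counts as Small.
SIZE_NAMES = {"R": "Small", "M": "Medium", "L": "Large"}


def size_distribution(size_codes: Iterable[str]) -> Dict[str, float]:
    """Return each size's share of the recognised pizzas, in percent."""
    counts: Counter = Counter()
    for code in size_codes:
        name = SIZE_NAMES.get(code)
        if name is not None:  # unknown codes are skipped
            counts[name] += 1
    total = sum(counts.values())
    if total == 0:
        # Avoid dividing by zero when no valid size was seen.
        return {name: 0.0 for name in SIZE_NAMES.values()}
    return {name: counts[name] / total * 100 for name in SIZE_NAMES.values()}


dist = size_distribution(["R", "M", "M", "L", "X"])
print(dist)
```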

3.3 To find out how many cheese burst pizzas were sold
Map(Key, Row)    // Takes the output of question 1 as input
    getCounter(“Cheese_Burst_Total”)
    IF(Row[6] == “Y”)
        getCounter(“Cheese_Burst_Total”).incrementBy(1)
    ELSE
        EXIT Map Function
EXIT Map Function
PRINT Cheese_Burst_Total

3.4 To find out how many small cheese burst pizzas were sold
Map(Key, Row)    // Takes the output of question 1 as input
    getCounter(“Cheese_Burst_Small”)
    IF(Row[6] == “Y” AND Row[7] == “R”)
        getCounter(“Cheese_Burst_Small”).incrementBy(1)
EXIT Map Function
PRINT Cheese_Burst_Small
// Ideally Cheese_Burst_Small will be 0, as cheese burst is available only in medium and
// large sizes. If there is a data-entry error, it will show up here as a non-zero count.

3.5 To find out the number of cheese burst pizzas whose cost is below Rs 500
Map(Key, Row)    // Takes the output of question 1 as input
    getCounter(“Cheese_Burst_Cheap”)
    IF(Row[6] == “Y” AND Row[9] < 500)
        getCounter(“Cheese_Burst_Cheap”).incrementBy(1)
EXIT Map Function
PRINT Cheese_Burst_Cheap    // Prints the number of cheese burst pizzas sold below
                            // Rs.500
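
One detail the comparison Row[9] < 500 hides is that every field is a string after split, so the price has to be converted to a number first. The Python sketch below shows one way to do this; the names parse_price, is_cheap_cheese_burst and make_row are hypothetical, as is the choice to treat an unparseable price as "not cheap".

```python
from typing import List, Optional


def parse_price(text: str) -> Optional[float]:
    """Convert the Price field to a number, or None if it is not numeric."""
    try:
        return float(text)
    except ValueError:
        return None


def is_cheap_cheese_burst(row: List[str], limit: float = 500.0) -> bool:
    """True for a cheese burst pizza priced strictly below the limit."""
    price = parse_price(row[8])  # Row[9] in the 1-based pseudocode
    return row[5] == "Y" and price is not None and price < limit


def make_row(cheese_burst: str, price: str) -> List[str]:
    """Build a 12-column row with two fields filled in (illustration only)."""
    row = [""] * 12
    row[5] = cheese_burst  # isCheeseBurst
    row[8] = price         # Price
    return row


print(is_cheap_cheese_burst(make_row("Y", "450")))
print(is_cheap_cheese_burst(make_row("Y", "500")))
```

Comparing the raw strings would give wrong answers: as text, "90" sorts after "500".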
4. The getCounter(“Orders”) function is not available; rewrite the algorithms from question 3
without it.

4.1 To find out the number of veg and non-veg pizzas sold.
Map(Key, Row)    // Takes the output of question 1 as input
    IF(Row[12] == “Y”)
        Pizza_type = “Veg”
    ELSE IF(Row[12] == “N”)
        Pizza_type = “Non-Veg”
    ELSE
        EXIT Map Function    // Skip rows with any other isVeg value
    Write(Pizza_type, 1)
EXIT Map Function

Reduce(Key, ValueList)    // Takes the aggregated output of the Map function as input
    Pizza_count = 0
    for i = 1 to ValueList.length    // Array indices start from 1
        Pizza_count = Pizza_count + ValueList[i]
    Write(Key, Pizza_count)
EXIT Reduce Function
// The output is the number of veg and non-veg pizzas sold
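
The map, group and reduce phases can be simulated in a few lines of Python to check the logic. This is only a local sketch: run_job and its in-memory grouping are a hypothetical stand-in for the framework's shuffle, and the sample rows are invented.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_veg(row: List[str]) -> Iterator[Tuple[str, int]]:
    """Emit ("Veg", 1) or ("Non-Veg", 1) for one filtered row."""
    is_veg = row[11]  # Row[12] in the 1-based pseudocode
    if is_veg == "Y":
        yield ("Veg", 1)
    elif is_veg == "N":
        yield ("Non-Veg", 1)


def reduce_count(key: str, values: Iterable[int]) -> Tuple[str, int]:
    """Sum the 1s emitted for a key."""
    return key, sum(values)


def run_job(rows: Iterable[List[str]]) -> Dict[str, int]:
    """Run map, group values by key (the shuffle), then reduce."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for row in rows:
        for key, value in map_veg(row):
            grouped[key].append(value)
    return dict(reduce_count(key, values) for key, values in grouped.items())


# Four invented rows in which only the isVeg column is filled in.
rows = [[""] * 11 + [flag] for flag in "YNYY"]
result = run_job(rows)
print(result)
```

Summing the values rather than counting list entries keeps the reducer correct even if a combiner has already pre-aggregated some of the 1s.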

4.2 To find out the size wise distribution of pizzas sold


Map(Key, Row)    // Takes the output of question 1 as input
    IF(Row[7] == “R”)
        Pizza_size = “Regular”
    ELSE IF(Row[7] == “M”)
        Pizza_size = “Medium”
    ELSE IF(Row[7] == “L”)
        Pizza_size = “Large”
    ELSE
        EXIT Map Function    // Skip rows with any other size value
    Write(Pizza_size, 1)
EXIT Map Function

Reduce(Key, ValueList)    // Takes the aggregated output of the Map function as input
    Size_count = 0
    for i = 1 to ValueList.length    // Array indices start from 1
        Size_count = Size_count + ValueList[i]
    Write(Key, Size_count)
EXIT Reduce Function

Distribution(Key_list, Size_count_list)    // Takes the output of the Reduce function as input
    Regular = 0, Medium = 0, Large = 0    // Integer variables
    for i = 1 to Key_list.length
        IF(Key_list[i] == “Regular”)
            Regular = Size_count_list[i]
        IF(Key_list[i] == “Medium”)
            Medium = Size_count_list[i]
        IF(Key_list[i] == “Large”)
            Large = Size_count_list[i]

    Total = Regular + Medium + Large
    Distribution_small = (Regular / Total) * 100
    Distribution_medium = (Medium / Total) * 100
    Distribution_large = (Large / Total) * 100

    PRINT Distribution_small, Distribution_medium, Distribution_large
    // Prints the size-wise distribution as a percentage of the total pizzas sold
4.3 To find out how many cheese burst pizzas were sold
Map(Key, Row)    // Takes the output of question 1 as input
    IF(Row[6] == “Y”)
        Crust = “Cheese_burst”
    ELSE
        Crust = “other”
    Write(Crust, 1)
EXIT Map Function

Reduce(Key, ValueList)    // Takes the aggregated output of the Map function as input
    CB_count = 0
    for i = 1 to ValueList.length    // Array indices start from 1
        CB_count = CB_count + ValueList[i]
    IF(Key == “Cheese_burst”)
        Write(Key, CB_count)    // Output is the total number of cheese burst pizzas sold
    ELSE
        Return -1               // No output for any other key
EXIT Reduce Function

4.4 To find out how many small cheese burst pizzas were sold.

Map(Key, Row)    // Takes the output of question 1 as input
    IF(Row[6] == “Y” AND Row[7] == “R”)
        CB_Pizza_Size = “Cheese burst Small”
    ELSE
        CB_Pizza_Size = “Cheese burst other”

    Write(CB_Pizza_Size, 1)
EXIT Map Function

Reduce(Key, ValueList)    // Takes the aggregated output of the Map function as input
    CB_size_count = 0
    for i = 1 to ValueList.length    // Array indices start from 1
        CB_size_count = CB_size_count + ValueList[i]

    IF(Key == “Cheese burst Small”)
        Write(Key, CB_size_count)
    ELSE
        Return -1
EXIT Reduce Function
// Ideally there are no small cheese burst pizzas, as cheese burst is available only in medium
// and large sizes. In that case Map never writes the key “Cheese burst Small”, so Reduce
// produces no output for it. If there is an error in the dataset, it will show up here as a
// non-zero count.

4.5 To find out the number of cheese burst pizzas whose cost is below Rs.500
Map(Key, Row)    // Takes the output of question 1 as input
    IF(Row[6] == “Y” AND Row[9] < 500)
        CB_cheap = “Cheese burst Price < 500”
    ELSE
        CB_cheap = “other”    // Not cheese burst, or priced at Rs.500 or more
    Write(CB_cheap, 1)
EXIT Map Function

Reduce(Key, ValueList)    // Takes the aggregated output of the Map function as input
    CB_cheap_count = 0
    for i = 1 to ValueList.length    // Array indices start from 1
        CB_cheap_count = CB_cheap_count + ValueList[i]
    IF(Key == “Cheese burst Price < 500”)
        Write(Key, CB_cheap_count)    // Output is “Cheese burst Price < 500,
                                      // <value of CB_cheap_count>”
    ELSE
        Return -1                     // No output for any other key
EXIT Reduce Function
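
A possible variation, which is not the method used above, is to emit a pair only when the condition holds, so the reducer never sees a catch-all key and needs no filtering branch. The Python sketch below simulates this locally; KEY, map_cheap, run_job and make_row are hypothetical names, and rows with an unparseable price are simply skipped.

```python
from collections import defaultdict
from typing import Dict, Iterator, List, Tuple

KEY = "Cheese burst Price < 500"


def map_cheap(row: List[str]) -> Iterator[Tuple[str, int]]:
    """Emit (KEY, 1) only for cheese burst rows priced below 500."""
    try:
        price = float(row[8])  # Row[9] in the 1-based pseudocode
    except ValueError:
        return  # unparseable price: emit nothing
    if row[5] == "Y" and price < 500:
        yield (KEY, 1)


def run_job(rows: List[List[str]]) -> Dict[str, int]:
    """Map every row, group by key, then sum the values per key."""
    grouped: Dict[str, List[int]] = defaultdict(list)
    for row in rows:
        for key, value in map_cheap(row):
            grouped[key].append(value)
    return {key: sum(values) for key, values in grouped.items()}


def make_row(cheese_burst: str, price: str) -> List[str]:
    """Build a 12-column row with two fields filled in (illustration only)."""
    row = [""] * 12
    row[5] = cheese_burst  # isCheeseBurst
    row[8] = price         # Price
    return row


rows = [make_row("Y", "450"), make_row("Y", "650"),
        make_row("N", "300"), make_row("Y", "499.5")]
result = run_job(rows)
print(result)
```

The trade-off is that a dataset with no matching rows produces no output line at all instead of an explicit zero.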

Submitted By:

Animesh Anand

STUDENT ID: 2017CBDE037
