Python Assignment
All code should be submitted as PDF and not as a picture (you can use pictures for your flowcharts only). PDFs should be submitted as a primary resource, and a zip file including the .ipynb file and any additional files (for instance, a picture or pdf for your flowchart) as a secondary resource.
Important note: You can use either Anaconda or Colab to work on the Jupyter notebook that you will submit as your final project on Forum:
1 – Start by downloading this Jupyter notebook to your local machine.
2 – Open a tab in your browser and type https://colab.research.google.com/.
3 – This will open a small window. On the upper menu, choose the last option, “Upload”. Then choose the Jupyter notebook you saved in step 1.
4 – You can start working on your assignment by answering the questions in the corresponding cells.
5 – If you have any questions, reach out to your instructors and the CIS tutors.
As you solve each problem, follow the steps of algorithmic thinking as outlined below.
Step 1: Algorithm Description. Use a written step-by-step description and a flowchart to develop and express an algorithm that accomplishes the given task. Remember, you have to be very explicit and clear so that someone else could accomplish the task by following your directions. Describe the input(s), output(s) and the process of the algorithm.
Step 2: Program Code – Implementation: Implement the algorithm in Python using the basic structures we covered in class (ONLY USE CONCEPTS COVERED IN CLASS):
User input
Variables
Operators
Conditional execution
For/while loops
Data structures
Functions and modules
Pandas
Step 3: Program Testing: Create a Test Plan with two or three test cases that demonstrate your code works as intended. Explain how you used these test cases in your comments.
Step 4: Program Documentation: Be sure to comment thoroughly so that it is clear that you understand what every line of the code is intended to accomplish.
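As an illustration of Steps 3 and 4 only, here is a minimal sketch of a commented function with a three-case test plan. The function total_cost and the test values are hypothetical and are not part of any question in this assignment; your own test plan should match the problem you are solving.

```python
def total_cost(unit_price, quantity):
    """Return the total cost of buying `quantity` items at `unit_price` each."""
    # Multiply the price of one item by the number of items purchased.
    return unit_price * quantity


# Test plan: each case lists the two inputs and the result worked out by hand.
test_cases = [
    (2.5, 4, 10.0),  # typical case: 4 items at 2.50 each
    (3.0, 0, 0.0),   # edge case: nothing purchased
    (0.0, 7, 0.0),   # edge case: a free item
]

for unit_price, quantity, expected in test_cases:
    actual = total_cost(unit_price, quantity)
    # Compare the function's result with the hand-calculated result.
    if actual == expected:
        outcome = "PASS"
    else:
        outcome = "FAIL"
    print(unit_price, quantity, "->", actual, outcome)
```

Each test case records why it was chosen (typical value, edge value), which is the kind of explanation Step 3 asks for in your comments.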
Part 1: Data Analysis and Visualization
You will work with a dataset that contains information on a coffee shop’s sales. The dataset is below. DOWNLOAD THE DATASET AS A CSV FILE ON YOUR COMPUTER FROM THE LINK BELOW AND READ IT IN PANDAS FROM THERE. DO NOT READ IT FROM THE LINK BELOW.
Dataset: https://drive.google.com/file/d/141afTVoF0J2FjpLI-VfERyJM7aWUQ8az/view?usp=sharing
Variables:
transaction_id – transaction id
transaction_date – transaction date
transaction_time – transaction time
sales_outlet_id – sales outlet (A, B, C, D, E, F or G)
staff_id – id of the staff member
customer_id – ID of the customer
instore_yn – whether the sale was in the store (yes or no)
product_id – id of the product
quantity – quantity purchased
unit_price – price per unit (item) in USD
promo_item_yn – whether the item was on promotion (yes or no)
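The sketch below shows one way to read a CSV file from a local path into pandas, as opposed to reading it from a link. So that it runs on its own, it first writes a tiny made-up file; the file name coffee_sales_sample.csv and the three rows are assumptions for illustration, not the real dataset. In your notebook you would point pd.read_csv at the file you downloaded.

```python
import os
import tempfile

import pandas as pd

# Hypothetical stand-in for the downloaded file: three made-up rows with a few
# of the columns listed above, written to a temporary folder.
sample_csv = (
    "transaction_id,sales_outlet_id,quantity,unit_price\n"
    "1,A,2,2.50\n"
    "2,B,1,3.00\n"
    "3,A,4,4.50\n"
)
csv_path = os.path.join(tempfile.mkdtemp(), "coffee_sales_sample.csv")
with open(csv_path, "w") as handle:
    handle.write(sample_csv)

# Read the file from the local path (not from a URL) into a dataframe.
sales = pd.read_csv(csv_path)

print(sales.head(2))  # first two rows
print(sales.tail(2))  # last two rows
print(sales.shape)    # (number of rows, number of columns)
```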
Question 1.
Import the CSV file in pandas and save it as a dataframe. Then, write code that returns: (1) the first 10 and last 10 rows; and (2) the number of rows and columns in the data set. Discuss what the code shows you about the data set.
Question 2.
Write code that returns: (1) the distribution of sales outlets (including a count of each outlet type and a bar chart); (2) the minimum and maximum transaction_id; (3) the minimum, maximum and average customer_id; and (4) the distribution of products bought in store (yes or no) using a pie chart.
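A minimal sketch of counting categories and finding a minimum and maximum in pandas, using a made-up four-row dataframe. The values, and the "Y"/"N" encoding of instore_yn, are assumptions; check how the real file encodes yes and no. Charts are left out so the sketch runs without a plotting library.

```python
import pandas as pd

# Made-up rows, for illustration only.
sales = pd.DataFrame({
    "transaction_id": [101, 102, 103, 104],
    "sales_outlet_id": ["A", "B", "A", "C"],
    "instore_yn": ["Y", "N", "Y", "Y"],  # assumed encoding of yes/no
})

# Count how many rows fall in each category.
outlet_counts = sales["sales_outlet_id"].value_counts()
instore_counts = sales["instore_yn"].value_counts()

# Smallest and largest value of a numeric column.
smallest_id = sales["transaction_id"].min()
largest_id = sales["transaction_id"].max()

print(outlet_counts)
print(instore_counts)
print(smallest_id, largest_id)
```

With matplotlib installed, a count series like these can be drawn with its plot method, for example outlet_counts.plot(kind="bar") or instore_counts.plot(kind="pie").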
Question 3.
You discover that the variable unit_price was incorrectly recorded. Create a new variable unit_price_corrected where you add 1.50 to unit_price for the first 100 items, and you subtract 1.50 from the unit price for the remaining items in the data set. Then, calculate and compare the average of unit_price and unit_price_corrected.
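The position-based correction can be sketched with a loop and a counter. This is one possible approach, not the required one. The prices are made up, and first_n is set to 2 instead of 100 only to keep the toy example readable.

```python
import pandas as pd

# Made-up prices, for illustration only.
prices = pd.DataFrame({"unit_price": [2.0, 3.0, 4.0, 5.0, 6.0]})

first_n = 2        # the assignment uses 100; 2 keeps this toy example small
adjustment = 1.50

corrected = []
position = 0
for price in prices["unit_price"]:
    if position < first_n:
        # Rows in the first group get the adjustment added.
        corrected.append(price + adjustment)
    else:
        # All remaining rows get the adjustment subtracted.
        corrected.append(price - adjustment)
    position += 1

# Store the corrected values as a new column next to the original.
prices["unit_price_corrected"] = corrected

print(prices)
print(prices["unit_price"].mean())
print(prices["unit_price_corrected"].mean())
```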
Question 4.
The coffee shop’s management wants to find out which of the outlets has the highest revenue. Calculate the total revenue for each of the outlets. Remember that total revenue will be unit_price_corrected multiplied by quantity. Also, present your calculations using a line graph. Explain what you found and what the chart shows.
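As a sketch of the revenue calculation described above (unit_price_corrected multiplied by quantity, then summed per outlet), the example below uses made-up rows. The column name revenue is a hypothetical choice, and groupby is only one way to total by outlet; use whichever approach was covered in class.

```python
import pandas as pd

# Made-up rows, for illustration only.
sales = pd.DataFrame({
    "sales_outlet_id": ["A", "B", "A", "C", "B"],
    "quantity": [2, 1, 3, 1, 2],
    "unit_price_corrected": [3.5, 4.5, 2.5, 5.0, 3.0],
})

# Revenue of each transaction: corrected price times quantity.
sales["revenue"] = sales["unit_price_corrected"] * sales["quantity"]

# Total revenue per outlet.
revenue_by_outlet = sales.groupby("sales_outlet_id")["revenue"].sum()

# Label of the outlet with the largest total.
top_outlet = revenue_by_outlet.idxmax()

print(revenue_by_outlet)
print(top_outlet)
```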
Question 5.
The coffee shop’s management wants to find out how the staff are doing in terms of sales. For each of the staff ids, calculate the total product units sold and the total revenue generated. Provide two bar charts (one for total product units, one for total revenue) by staff id, and interpret your findings.
Question 6.
Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.
Question 7.
Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.
Part 2
You are hired to develop an online management system for a café. This program will be used by the café admins and will help them manage online orders.
Use a function to develop a program with the following features:
1 – Allow the café admin to enter the menu items until the admin enters quit to stop. The list should include a minimum of 10 items. For example: main_categories = ["Americano", "Espresso", "Cheese sandwich"]
2 – Use the main menu list you created in step 1 to create a dictionary that contains the price of each menu item. For example: items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}
3 – Use the main menu list you created in step 1 to create another dictionary that contains the quantity of each menu item. For example: items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}
4 – Use the main menu list you created in step 1 to create another dictionary that allows the café admin to record the rating received from customers on menu items. The ratings are scored on a scale from 1 to 5, with 5 indicating the maximum customer satisfaction. For example: items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}
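The "enter items until quit" feature can be sketched as a while loop that stops on a sentinel value. The function name collect_menu_items and its read_entry parameter are hypothetical; the parameter defaults to the built-in input and exists only so the loop can be tried with scripted entries. The sketch does not enforce the 10-item minimum, which your own program still needs to handle.

```python
def collect_menu_items(read_entry=input):
    """Keep asking for menu items until the admin types quit."""
    menu_items = []
    while True:
        # Remove surrounding spaces so " Espresso " is stored as "Espresso".
        entry = read_entry("Enter a menu item (or quit to stop): ").strip()
        if entry.lower() == "quit":
            # The sentinel word ends the loop and is not stored.
            break
        if entry != "":
            # Ignore empty entries; store everything else.
            menu_items.append(entry)
    return menu_items
```

In a notebook, calling collect_menu_items() with no argument prompts the admin interactively and returns the list once quit is entered.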
Your function should return the following data structures separately:
The dictionary that includes all entries.
A list named satisfied_item, which includes the items with satisfaction of 3 or higher.
A list named highprice_item, which includes the items with price above 10.
A list named few_items, which includes the items with quantity less than 5.
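The three lists above can be built with one loop and three conditions. The sketch below is one possible arrangement, using the example dictionaries from this section; the function name summarize_items is hypothetical, and it assumes all three dictionaries share the same keys.

```python
def summarize_items(items_price, items_quantity, items_rating):
    """Build the three lists from the price, quantity and rating dictionaries."""
    satisfied_item = []
    highprice_item = []
    few_items = []
    for item in items_price:
        if items_rating[item] >= 3:   # satisfaction of 3 or higher
            satisfied_item.append(item)
        if items_price[item] > 10:    # price above 10
            highprice_item.append(item)
        if items_quantity[item] < 5:  # quantity less than 5
            few_items.append(item)
    return satisfied_item, highprice_item, few_items


# Example dictionaries taken from this section.
items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}
items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}
items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}

satisfied_item, highprice_item, few_items = summarize_items(
    items_price, items_quantity, items_rating
)
print(satisfied_item)
print(highprice_item)
print(few_items)
```

With these example values, no item has a quantity below 5, so few_items comes back empty; a useful test case would include at least one low-stock item.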
For Part 2 only: First, create a step-by-step algorithm and a flowchart, and then translate them into fully functional and documented Python code. Follow the flowchart shape conventions from the session 3 reading, available here.