Python Assignment

All code should be submitted as PDF and not as a picture (you can use pictures for your flowcharts only). PDFs should be submitted as a primary resource, and a zip file including the .ipynb file and any additional files (for instance, a picture or pdf for your flowchart) as a secondary resource.

Important note: You can use either Anaconda or Colab to work on the Jupyter notebook that you will submit as your final project on Forum:

1 – Start by downloading this Jupyter notebook to your local machine.

2 – Open a tab in your browser and type https://colab.research.google.com/.

3 – This will open a small window. On its upper menu, choose the last option, “Upload”, and then select the Jupyter notebook you saved in step 1.

4 – You can start working on your assignment by answering the questions in the corresponding cells.

5 – If you have any questions, reach out to your instructors and the CIS tutors.

As you solve each problem, follow the steps of algorithmic thinking as outlined below.

Step 1: Algorithm Description. Use a written step-by-step description and a flowchart to develop and express the algorithm that accomplishes the given task. Be explicit and clear enough that someone else could accomplish the task by following your directions. Describe the input(s), output(s), and process of the algorithm.

Step 2: Program Code – Implementation: Implement the algorithm in Python using the basic structures we covered in class (ONLY USE CONCEPTS COVERED IN CLASS):

User input

Variables

Operators

Conditional execution

For/while loops

Data structures

Functions and modules

Pandas

Step 3: Program Testing: Create a Test Plan with two or three test cases that demonstrate your code works as intended. Explain how you used these test cases in your comments.

Step 4: Program Documentation: Be sure to comment thoroughly so that it is clear that you understand what every line of the code is intended to accomplish.

Part 1: Data Analysis and Visualization

You will work with a dataset that contains information on a coffee shop’s sales. The dataset is below. DOWNLOAD THE DATASET AS A CSV FILE ON YOUR COMPUTER FROM THE LINK BELOW AND READ IT IN PANDAS FROM THERE. DO NOT READ IT FROM THE LINK BELOW.

Dataset: https://drive.google.com/file/d/141afTVoF0J2FjpLI-VfERyJM7aWUQ8az/view?usp=sharing

Variables:

transaction_id – transaction id

transaction_date – transaction date

transaction_time – transaction time

sales_outlet_id – sales outlet (A, B, C, D, E, F or G)

staff_id – id of the staff member

customer_id – ID of the customer

instore_yn – whether the sale was in the store (yes or no)

product_id – id of the product

quantity – quantity purchased

unit_price – price per unit (item) in USD

promo_item_yn – whether the item was on promotion (yes or no)

Question 1.

Import the CSV file into pandas and save it as a dataframe. Then, write code that returns: (1) the first 10 and last 10 rows; and (2) the number of rows and columns in the data set. Discuss what the code shows you about the data set.
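As a rough illustration of the calls involved, here is a minimal sketch. It reads a tiny made-up CSV from memory so that it runs on its own; the toy rows and the variable names are assumptions, and in the assignment you would pass the local path of the downloaded file to read_csv instead.

```python
import io
import pandas as pd

# Toy stand-in for the downloaded file (made-up rows, only a few columns).
# In the assignment, pass the local path of your CSV to pd.read_csv instead.
toy_csv = io.StringIO(
    "transaction_id,sales_outlet_id,quantity,unit_price\n"
    "1,A,2,3.50\n"
    "2,B,1,2.00\n"
    "3,A,3,4.25\n"
)
df = pd.read_csv(toy_csv)

first_rows = df.head(10)   # first 10 rows (fewer if the frame is shorter)
last_rows = df.tail(10)    # last 10 rows
n_rows, n_cols = df.shape  # shape is a (rows, columns) tuple

print(first_rows)
print(last_rows)
print(f"{n_rows} rows, {n_cols} columns")
```

With the real file, head and tail give a quick look at how the values are recorded, and shape tells you how many transactions and variables there are.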

Question 2.

Write code that returns: (1) the distribution of sales outlets (including a count of each outlet and a bar chart); (2) the minimum and maximum transaction_id; (3) the minimum, maximum and average customer_id; and (4) the distribution of products bought in store (yes or no) using a pie chart.
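The summaries requested here can be sketched as follows. The data frame is a made-up toy, and the "Y"/"N" coding of instore_yn is an assumption; check how the real file records it.

```python
import pandas as pd

# Made-up rows using the assignment's column names.
df = pd.DataFrame({
    "transaction_id": [1, 2, 3, 4, 5],
    "customer_id": [10, 20, 30, 40, 50],
    "sales_outlet_id": ["A", "B", "A", "C", "A"],
    "instore_yn": ["Y", "N", "Y", "Y", "N"],  # assumed coding
})

# Count how often each value occurs.
outlet_counts = df["sales_outlet_id"].value_counts()
instore_counts = df["instore_yn"].value_counts()

# Smallest and largest transaction id.
id_min = df["transaction_id"].min()
id_max = df["transaction_id"].max()

# Minimum, maximum and average customer id in one call.
cust_stats = df["customer_id"].agg(["min", "max", "mean"])

print(outlet_counts)
print(id_min, id_max)
print(cust_stats)
print(instore_counts)
```

If matplotlib is installed, outlet_counts.plot(kind="bar") and instore_counts.plot(kind="pie") are one way to draw the two charts from these counts.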

Question 3.

You discover that the variable unit_price was incorrectly recorded. Create a new variable unit_price_corrected where you add 1.50 to unit_price for the first 100 items, and you subtract 1.50 from the unit price for the remaining items in the data set. Then, calculate and compare the average of unit_price and unit_price_corrected.
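One possible way to build the corrected column with a loop and a conditional is sketched below. It assumes that "the first 100 items" means the first 100 rows of the data frame; N_FIRST is a hypothetical name, set to 2 here so that the toy data exercises both branches.

```python
import pandas as pd

# Made-up prices; the real data has many more rows.
df = pd.DataFrame({"unit_price": [2.00, 3.00, 4.00, 5.00, 6.00]})

N_FIRST = 2  # assumption for the toy data; use 100 for the assignment

corrected = []
for position, price in enumerate(df["unit_price"]):
    if position < N_FIRST:
        corrected.append(price + 1.50)  # first rows: add 1.50
    else:
        corrected.append(price - 1.50)  # remaining rows: subtract 1.50

df["unit_price_corrected"] = corrected

avg_original = df["unit_price"].mean()
avg_corrected = df["unit_price_corrected"].mean()
print(avg_original, avg_corrected)
```

Because only the first rows move up while all the others move down, the corrected average on a large data set should end up close to the original average minus 1.50.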

Question 4.

The coffee shop’s management wants to find out which of the outlets has the highest revenue. Calculate the total revenue for each of the outlets. Remember that total revenue will be unit_price_corrected multiplied by quantity. Also, present your calculations using a line graph. Explain what you found and what the chart shows.
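The revenue calculation can be sketched as below, again on made-up rows. The revenue column name and the variable names are assumptions; the formula itself (unit_price_corrected multiplied by quantity) comes from the question.

```python
import pandas as pd

# Made-up rows using the assignment's column names.
df = pd.DataFrame({
    "sales_outlet_id": ["A", "B", "A", "C"],
    "quantity": [2, 1, 3, 4],
    "unit_price_corrected": [3.00, 5.00, 2.00, 1.50],
})

# Revenue of each transaction line.
df["revenue"] = df["unit_price_corrected"] * df["quantity"]

# Total revenue per outlet, and the outlet with the largest total.
revenue_by_outlet = df.groupby("sales_outlet_id")["revenue"].sum()
top_outlet = revenue_by_outlet.idxmax()

print(revenue_by_outlet)
print("Highest revenue:", top_outlet)
```

With matplotlib installed, revenue_by_outlet.plot(kind="line") is one way to produce the requested line graph from this series.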

Question 5.

The coffee shop’s management wants to find out how the staff are doing in terms of sales. For each staff id, calculate the total product units sold and the total revenue generated. Provide two bar charts (one for total product units, one for total revenue) by staff id, and interpret your findings.
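A per-staff summary could look like the sketch below, which groups made-up rows by staff_id and sums two columns at once. The names revenue, total_units and total_revenue are assumptions chosen for illustration.

```python
import pandas as pd

# Made-up rows using the assignment's column names.
df = pd.DataFrame({
    "staff_id": [1, 2, 1, 2],
    "quantity": [2, 1, 3, 4],
    "unit_price_corrected": [3.00, 5.00, 2.00, 1.50],
})
df["revenue"] = df["unit_price_corrected"] * df["quantity"]

# One row per staff member with both totals.
staff_summary = df.groupby("staff_id").agg(
    total_units=("quantity", "sum"),
    total_revenue=("revenue", "sum"),
)
print(staff_summary)
```

Each column of staff_summary can then be drawn separately, for example with staff_summary["total_units"].plot(kind="bar") if matplotlib is available.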

Question 6.

Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Question 7.

Develop one question yourself that can be answered with the information included in this dataset. Write the code to answer the question, and include a visualization.

Part 2

You are hired to develop an online management system for a café. This program will be used by the café admins and will help them manage online orders.

Use a function to develop a program with the following features:

Allow the café admin to enter menu items until they enter quit to stop. The list should include a minimum of 10 items. For example: main_categories = ["Americano", "Espresso", "Cheese sandwich"]

Use the main menu list you created in step 1 to create a dictionary that contains the price of each menu item. For example: items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}

Use the main menu list you created in step 1 to create another dictionary that contains the quantity of each menu item. For example: items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}

Use the main menu list you created in step 1 to create another dictionary that allows the café admin to record the ratings received from customers on menu items. Ratings are scored on a scale from 1 to 5, with 5 indicating maximum customer satisfaction. For example: items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}

Your function should return the following data structures separately:

The dictionary that includes all entries.

A list named satisfied_item, which includes the items with satisfaction of 3 or higher.

A list named highprice_item, which includes the items with price above 10.

A list named few_items, which includes the items with quantity less than 5.
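The filtering behind the three lists can be sketched as below. This covers only the last part of the task: the dictionaries are passed in directly rather than collected with input(), summarize_menu is a hypothetical function name, and combining the three dictionaries into one nested dictionary is just one reading of "the dictionary that includes all entries".

```python
def summarize_menu(items_price, items_quantity, items_rating):
    """Return a combined dictionary plus the three filtered lists."""
    all_entries = {}      # assumed layout: one inner dict per menu item
    satisfied_item = []   # rating of 3 or higher
    highprice_item = []   # price above 10
    few_items = []        # quantity less than 5

    for item in items_price:
        price = items_price[item]
        quantity = items_quantity[item]
        rating = items_rating[item]

        all_entries[item] = {
            "price": price,
            "quantity": quantity,
            "rating": rating,
        }
        if rating >= 3:
            satisfied_item.append(item)
        if price > 10:
            highprice_item.append(item)
        if quantity < 5:
            few_items.append(item)

    return all_entries, satisfied_item, highprice_item, few_items


# Example values taken from the task description.
items_price = {"Americano": 13, "Espresso": 9, "Cheese sandwich": 15}
items_quantity = {"Americano": 50, "Espresso": 30, "Cheese sandwich": 10}
items_rating = {"Americano": 4, "Espresso": 1, "Cheese sandwich": 5}

all_entries, satisfied_item, highprice_item, few_items = summarize_menu(
    items_price, items_quantity, items_rating
)
print(satisfied_item)
print(highprice_item)
print(few_items)
```

With these example values no item has a quantity below 5, so few_items comes back empty; a test plan would want at least one case that fills each list.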

For Part 2 only: First, create a step-by-step algorithm and a flowchart, and then translate them into fully functional, documented Python code. Follow the flowchart shape conventions from the session 3 reading, available here.

 
