The original datafile has lat and lon values truncated to 2 decimal places, about 1 km in North America. From the original README: this dataset release re-geocodes all of the addresses for the us_starbucks dataset. Today, with stores around the globe, the Company is the premier roaster and retailer of specialty coffee in the world. The first Starbucks in Russia opened in 2007.

One important feature of this dataset is that not all users get the same offers. In this capstone project, I was free to analyze the data in my own way. The profile.json data holds information on 17000 unique people.

Sep 8, 2022.

With age and income, mean expenditure increases. Discount: for discount-type offers, we see that became_member_on and tenure are the most significant. In the Udacity Data Science capstone, we are given a dataset that contains simulated data mimicking customer behavior on the Starbucks rewards mobile app. I used downsampling instead of other methods like upsampling or SMOTE for two reasons: 1) we still have sufficient data even after downsampling, and 2) to my understanding, the imbalance in the dataset was not due to a biased data collection process but due to having fewer available samples.

Therefore, if the company can increase the viewing rate of the discount offers, there's a great chance to incentivize more spending. Second attempt: the score may still improve through GridSearchCV(). The gender and income columns have the same number of null values, and the corresponding rows of the age column have 118 as the age. The goal of this project is to combine transaction, demographic, and offer data to determine which demographic groups respond best to which offer type. Please do not hesitate to contact me.
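The downsampling mentioned above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the row layout, the `label` key, and the toy numbers are hypothetical and not taken from the original notebook, which may have used a different library or approach.

```python
import random
from collections import Counter


def downsample(rows, label_key, seed=0):
    """Randomly drop rows from larger classes until every class
    has as many rows as the smallest class."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    by_label = {}
    for row in rows:
        by_label.setdefault(row[label_key], []).append(row)
    target = min(len(group) for group in by_label.values())
    balanced = []
    for group in by_label.values():
        balanced.extend(rng.sample(group, target))
    rng.shuffle(balanced)
    return balanced


# Hypothetical toy data: 8 rows of class 0 and 3 rows of class 1.
rows = [{"income": 50_000 + i, "label": 0} for i in range(8)]
rows += [{"income": 70_000 + i, "label": 1} for i in range(3)]

balanced = downsample(rows, "label")
counts = Counter(row["label"] for row in balanced)
print(counts)
```

After downsampling, both classes have the size of the smaller one, so a classifier trained on `balanced` is not rewarded for simply predicting the majority class.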
Here we can see that women have higher spending tendencies at Starbucks than any other gender. time (int): time in hours since start of test. Here is how I created this label. Here is how I handled all of it. fat: a numeric vector; carb: a numeric vector; fiber: a numeric vector; protein

In both graphs, red (N) represents offers that were not completed (only viewed or received) and green (Yes) represents completed offers. Getting BOGO and discount offers is also not a very difficult task. The data has some null values. The testing score of the Information model is significantly lower than 80%. PC0 also shows (again) that the income of females is higher than that of males.

HAILING LI

We see that there are 306534 entries for people and offer_id. This is the sort of information we were looking for. There are three types of offers: BOGO (buy one get one), discount, and informational. One difficulty in merging the 3 datasets was that the value column in the transcript dataset contained both the offer id and the dollar amount. At present, the CEO of Starbucks is Kevin Johnson, and the company has approximately 23,768 locations globally. I decided to investigate this. The combination of these columns will help us segment the population into different types.
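The difficulty with the mixed value column can be illustrated with a small sketch that flattens each record into separate offer_id, amount, and reward fields. The sample records, the `person` field, and the possibility that the key is spelled either `offer id` or `offer_id` are assumptions made for illustration; the original notebook's code is not shown in this text.

```python
def split_value(record):
    """Flatten one transcript record's `value` dict into separate fields."""
    value = record.get("value", {})
    # Assumption: the offer id may appear under "offer id" or "offer_id".
    offer_id = value.get("offer id", value.get("offer_id"))
    return {
        "person": record.get("person"),
        "event": record.get("event"),
        "time": record.get("time"),
        "offer_id": offer_id,
        "amount": value.get("amount"),  # only present for transactions
        "reward": value.get("reward"),  # only present for completed offers
    }


# Hypothetical sample records, one per kind of `value` payload.
transcript = [
    {"person": "p1", "event": "offer received",
     "value": {"offer id": "abc"}, "time": 0},
    {"person": "p1", "event": "transaction",
     "value": {"amount": 12.5}, "time": 6},
    {"person": "p1", "event": "offer completed",
     "value": {"offer_id": "abc", "reward": 5}, "time": 6},
]

flat = [split_value(record) for record in transcript]
for row in flat:
    print(row)
```

Once every record has the same flat shape, the offer fields can be joined to the offer data and the amounts can be summed per customer without inspecting the dictionary again.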
Starbucks Rewards loyalty program 90-day active members in the U.S. increased to 24.8 million, up 28% year-over-year. Full Year Fiscal 2021 highlights: global comparable store sales increased 20%, primarily driven by a 10% increase in average ticket and a 9% increase in comparable transactions.

I wanted to see the influence of these offers on purchases. The profile dataset contains demographic information about the customers. Age also seems to be similarly distributed, and membership tenure doesn't seem to be too different either. Finally, I wanted to see how the offers influence a particular group of people. For future studies, there is still a lot that can be done.

Q2: Do different groups of people react differently to offers?

The accuracy score is important because the purpose of my model is to help the company predict when an offer might be wasted. Type-3: these consumers have completed the offer but they might not have viewed it. Another reason is linked to the first one: it is about the scope. I also highlighted the most difficult part of handling the data and how I approached the problem. Nonetheless, from the standpoint of providing business value to Starbucks, the question is always either: how do we increase sales, or how do we save money?
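The Type-3 group described above (offer completed, but possibly never viewed) can be flagged with a simple rule: a completion counts as "unseen" if the same person has no view of that offer at or before the completion time. The sketch below is a simplification under stated assumptions: the flat event layout with `person` and `offer_id` fields is hypothetical, and it ignores the case where the same offer is sent to one person more than once.

```python
def completed_without_view(events):
    """Return (person, offer_id) pairs that were completed with no
    'offer viewed' event at or before the completion time."""
    first_view = {}
    for e in events:
        if e["event"] == "offer viewed":
            key = (e["person"], e["offer_id"])
            first_view[key] = min(e["time"], first_view.get(key, e["time"]))

    unseen = set()
    for e in events:
        if e["event"] == "offer completed":
            key = (e["person"], e["offer_id"])
            if key not in first_view or first_view[key] > e["time"]:
                unseen.add(key)
    return unseen


# Hypothetical events: p1 views then completes, p2 never views,
# p3 views only after completing.
events = [
    {"person": "p1", "offer_id": "o1", "event": "offer viewed", "time": 2},
    {"person": "p1", "offer_id": "o1", "event": "offer completed", "time": 6},
    {"person": "p2", "offer_id": "o1", "event": "offer completed", "time": 4},
    {"person": "p3", "offer_id": "o1", "event": "offer completed", "time": 3},
    {"person": "p3", "offer_id": "o1", "event": "offer viewed", "time": 5},
]

unseen = completed_without_view(events)
print(sorted(unseen))
```

Pairs in `unseen` are candidates for "wasted" offers, since the reward was paid out without the offer having influenced the purchase.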
Finally, I built a machine learning model using logistic regression.

Linda Chen. Share what I learned, and learn from what I shared.

There are two ways to approach this. Promote the offer via at least 3 channels to increase exposure. The reason is that we don't have too many features in the dataset. They sync better as time goes by, indicating that the majority of people used the offer consciously. Income is also as significant as age. Every dataset tells a story!

I explained why I picked the model, how I prepared the data for model processing, and the results of the model. Third attempt: I made another attempt at doing the same but with amount_invalid removed from the dataframe. I finally picked logistic regression because it is more robust. In summary, I have walked you through how I processed the data to merge the 3 datasets so that I could do data analysis.

A sneak peek of the final data after being cleaned and analyzed: the data contains information about 8 offers sent to 14,825 customers who made 26,226 transactions while completing at least one offer. Internally, they provide a full picture of their data that is available to all levels of retail leadership and partners, to give them a greater sense of the business and encourage accountability for the P&L of each store.
Income also doesn't play as big a role, which might indicate that people of both higher and lower income use this type of offer. calories: Calories. The downside is that accuracy on a larger dataset may be higher than on smaller ones.

The data sets for this project are provided by Starbucks and Udacity in three files. To gain insights from these data sets, we would want to combine them and then apply data analysis and modeling techniques. Customers spent 3% more on transactions on average.

Answer: The discount offer is more popular because it not only has a slightly higher number of completed offers in absolute terms, it also has a higher overall completed/received rate (~7%). I talked about how I used EDA to answer the business questions I asked at the beginning of the article. Or did they use the offer without noticing it? Age, for instance, has a very high score too. For example, if I used: 0: 2017, 1: 2018, 2: 2015, 3: 2016, 4: 2013.

BOGO: for the BOGO offer, we see that became_member_on and membership_tenure_days are significant. One caveat given by Udacity drew my attention. Therefore, I did not analyze the informational offer type. We looked at how the customers are distributed. Informational: this type of offer has no discount or minimum amount to spend.
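The completed/received rate used in the answer above is the number of "offer completed" events divided by the number of "offer received" events for each offer type. A minimal sketch, assuming each event has already been joined with its `offer_type` (the toy counts below are hypothetical and not the article's numbers):

```python
from collections import Counter


def completion_rates(events):
    """Completed/received ratio per offer type."""
    received = Counter()
    completed = Counter()
    for e in events:
        if e["event"] == "offer received":
            received[e["offer_type"]] += 1
        elif e["event"] == "offer completed":
            completed[e["offer_type"]] += 1
    # Only offer types that were received at least once get a rate.
    return {t: completed[t] / n for t, n in received.items() if n}


# Hypothetical toy events.
events = (
    [{"event": "offer received", "offer_type": "bogo"}] * 4
    + [{"event": "offer completed", "offer_type": "bogo"}] * 2
    + [{"event": "offer received", "offer_type": "discount"}] * 4
    + [{"event": "offer completed", "offer_type": "discount"}] * 3
    + [{"event": "transaction", "offer_type": None}]
)

rates = completion_rates(events)
print(rates)
```

Comparing rates rather than raw counts matters because offer types are not necessarily sent out equally often.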
Brazilian Trade Ministry data showed coffee exports fell 45% in February, and broker HedgePoint cut its projection for Brazil's 2023/24 arabica coffee production to 42.3 million bags from 45.4 million.

I found a data set on Starbucks coffee, and got really excited. By looking at the data, we can say that some people did not disclose their gender, age, or income. Here's my thought process when cleaning the data set. The data is collected via the Starbucks rewards mobile app, and the offers were sent out once every few days to the users of the app. So, in this blog, I will try to explain what I did.

age (numeric): numeric column with 118 being unknown or outlier.

Information related to Starbucks: it is an American coffee company and was started in Seattle, Washington, in 1971. The profile data has the same mean age distribution among genders. From time to time, Starbucks sends its customers offers, which can be a purchase discount, an advertisement, or a buy-one-get-one-free (BOGO) deal. Last updated on December 28, 2021 by the Editorial Team.

item: Food item.

The whole analysis is provided in the notebook.

profile.json: demographic data for each customer
transcript.json: records for transactions, offers received, offers viewed, and offers completed
If an offer is promoted only through web and email, it has a much greater chance of not being seen. Being used without viewing appears to be linked to the duration of the offers. Starbucks goes public: 1992. We also did a brief k-means analysis beforehand. Answer: As you can see, there were no significant differences, which was disappointing. A proportion of the profile dataset has missing values, and they will be addressed later in this article. One was to merge the 3 datasets.

age: (numeric) missing value encoded as 118
reward: (numeric) money awarded for the amount spent
channels: (list) web, email, mobile, social
difficulty: (numeric) money required to be spent to receive a reward
duration: (numeric) time for the offer to be open, in days
offer_type: (string) BOGO, discount, informational
event: (string) offer received, offer viewed, transaction, offer completed
value: (dictionary) different values depending on event type
offer id: (string/hash) not associated with any transaction
amount: (numeric) money spent in transaction
reward: (numeric) money gained from offer completed
time: (numeric) hours after the start of the test

I wanted to see if I could find out who these users are and whether we could avoid or minimize this. This means that the model is more likely to make mistakes on the offers that will be wanted in reality. We see that PC0 is significant.
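Because missing ages are encoded as 118, that value should be turned into a real missing marker before computing any age statistics. The sketch below shows one way to do this in plain Python; the row layout and the `id` field are hypothetical, and the original notebook may have handled it differently.

```python
def clean_age(profile_rows, sentinel=118):
    """Return copies of the rows with the sentinel age replaced by None."""
    cleaned = []
    for row in profile_rows:
        age = row.get("age")
        cleaned.append({**row, "age": None if age == sentinel else age})
    return cleaned


# Hypothetical profile rows: the second customer disclosed nothing.
profile = [
    {"id": "a", "gender": "F", "age": 55, "income": 112000.0},
    {"id": "b", "gender": None, "age": 118, "income": None},
]

cleaned = clean_age(profile)
missing = sum(1 for row in cleaned if row["age"] is None)
print(missing)
```

Leaving 118 in place would pull the mean age upward and create an artificial spike at the top of the age distribution.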
Mobile users are more likely to respond to offers. Decision trees often require more tuning and are more sensitive to issues like imbalanced datasets.

Created a database for Starbucks to retrieve data answering business-related questions and supporting better-informed business decisions. Ruibing Ji.