Azure ML Studio and Tableau
For the first assignment, download the Tableau workbook file below and complete the instructions within it using the embedded data (no dataset upload required). Send the completed Tableau workbook.
For the second assignment, create a basic machine learning modeling experiment in Azure ML Studio to predict one of the candidate labels listed below.
Potential features:
Gender: of the person who posted the Tweet
Country or State: of the location where the Tweet originated
Weekday, Day, Hour: of the date it was tweeted
Klout: a score representing how “popular” or “important” the person who posted the tweet is
Sentiment: a score representing the tone of the tweet text
Reach: how many people had viewed the tweet at the time the data was collected
IsReshare: whether or not the tweet was a reshare of another tweet
RetweetCount: the number of “Retweets” other users had given the tweet
Likes: the number of “Likes” other users had given the tweet
Lang: the language that the tweet was written in
Candidate labels: Each of these features might represent the popularity or impact of a tweet. However, you can only use one.
Your goal is to select a label that is 1) as meaningful as possible, and 2) as easy as possible to predict with strong accuracy and fit metrics. However, you’ll find that those objectives can conflict with each other at times: more of one may mean less of the other. Choose carefully.
Reach
IsReshare
RetweetCount
Likes
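One way to get a feel for the trade-off described above is to look at how skewed a count-type label is before and after a transformation. The Python sketch below is only an illustration run outside Azure ML Studio: the counts are randomly generated (they are NOT the assignment's data), and the variable names and the choice of a log transform are assumptions made for the example. A log-transformed count may be easier for a linear model to fit, but it is also less direct to interpret than the raw count.

```python
import math
import random


def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


# Made-up, heavy-tailed counts standing in for a label such as RetweetCount.
rng = random.Random(42)
retweet_counts = [int(rng.lognormvariate(1.0, 1.5)) for _ in range(1000)]

# A hypothetical derived label: log(1 + count), which is defined at zero.
log_counts = [math.log1p(c) for c in retweet_counts]

raw_skew = skewness(retweet_counts)
log_skew = skewness(log_counts)

print("skewness of raw counts:", round(raw_skew, 2))
print("skewness of log1p(counts):", round(log_skew, 2))
```

A skewness closer to zero means the values are spread more symmetrically around their mean, which tends to suit linear regression better than a long right tail does.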
Requirements:
Build an experiment in Azure ML Studio to predict one of the candidate labels listed above or some derived version of those labels.
Follow the pattern and techniques learned in this module to select columns, split the data into training and testing sets, and then train, score, and evaluate the model.
Select/include any feature that you think should logically explain or predict your label.
Use linear regression to train the model.
You will learn later that other algorithms are better suited to count-based data such as RetweetCount, Likes, and Reach, but don’t worry about that for now.
Complete any relevant data preparation tasks demonstrated in the textbook chapters and in class (minimum 3 types).
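In Azure ML Studio the steps in the requirements above are built by connecting modules on the canvas. The Python sketch below mirrors the same select, split, train, score, and evaluate sequence purely as an illustration of what each step does. It rests on several assumptions: the data is synthetic (not the assignment's dataset), only one feature is used to keep the math short, the 70/30 split is an arbitrary choice, and all function names are hypothetical.

```python
import math
import random


def make_synthetic_rows(n, seed=0):
    """Generate made-up (klout, likes) pairs; NOT the assignment's data."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        klout = rng.uniform(10, 80)
        likes = 2.0 + 0.5 * klout + rng.gauss(0, 3)
        rows.append((klout, likes))
    return rows


def split_rows(rows, train_fraction=0.7, seed=0):
    """Shuffle a copy of the rows and cut it into training and testing sets."""
    shuffled = rows[:]
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def train_linear_regression(train):
    """Ordinary least squares for one feature; returns (slope, intercept)."""
    xs = [x for x, _ in train]
    ys = [y for _, y in train]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in train)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def score_rows(model, rows):
    """Predict the label for each row from its feature value."""
    slope, intercept = model
    return [intercept + slope * x for x, _ in rows]


def evaluate(rows, predictions):
    """Compute RMSE and R-squared on the given rows."""
    ys = [y for _, y in rows]
    mean_y = sum(ys) / len(ys)
    ss_res = sum((y - p) ** 2 for y, p in zip(ys, predictions))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return {"rmse": math.sqrt(ss_res / len(ys)), "r2": 1 - ss_res / ss_tot}


rows = make_synthetic_rows(200)        # select columns: one feature, one label
train, test = split_rows(rows)         # split into training and testing sets
model = train_linear_regression(train) # train
predictions = score_rows(model, test)  # score the held-out rows
metrics = evaluate(test, predictions)  # evaluate
print(metrics)
```

The model is evaluated only on the testing rows, which it never saw during training, so the metrics reflect how well it generalizes instead of how well it memorized.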