Data source: https://github.com/gdv/foundationsCS-2018/tree/master/ex-data/project
All groups and individuals must do the following:
- Convert the app sizes to a number
- Convert the number of installs to a number
- Transform “Varies with device” into a missing value
- Convert Current Ver and Android Ver into a dotted number (e.g. 4.0.3 or 4.2)
- Remove the duplicates
- For each category, compute the number of apps
- For each category, compute the average rating
- Create two dataframes: one for the genres and one bridging apps and genres, so that, for instance, the app Pixel Draw - Number Art Coloring Book appears twice in the bridging table: once for Art & Design and once for Creativity
- For each genre, create a new column in the original dataframe. The new columns must have boolean values (True if the app has that genre)
- For each genre, compute the average rating. Which genre has the highest average rating?
- For each app, compute the approximate income, obtained as the product of the number of installs and the price
- For each app, compute its minimum and maximum Sentiment_polarity
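The tasks above could be approached with pandas roughly as follows. This is a minimal sketch, not a reference solution. It rests on several assumptions that are not stated in this brief: the column names (App, Category, Rating, Size, Installs, Price, Genres), the raw formats ("2.5M" or "512k" for sizes, "100,000+" for installs, "$1.50" for prices, "4.0.3 and up" for versions), genres separated by ";", and sizes expressed in bytes with binary multipliers. The sample rows are invented for illustration (only the Pixel Draw genres come from the example above), so check every assumption against the real files.

```python
import numpy as np
import pandas as pd

# Made-up rows in the ASSUMED raw format; replace with the real CSV files.
apps = pd.DataFrame({
    "App": ["Pixel Draw - Number Art Coloring Book", "Demo Notes", "Demo Notes", "Demo Game"],
    "Category": ["ART_AND_DESIGN", "PRODUCTIVITY", "PRODUCTIVITY", "GAME"],
    "Rating": [4.5, 4.0, 4.0, 3.5],
    "Size": ["2.5M", "512k", "512k", "Varies with device"],
    "Installs": ["100,000+", "1,000+", "1,000+", "5,000,000+"],
    "Price": ["0", "$1.50", "$1.50", "0"],
    "Genres": ["Art & Design;Creativity", "Productivity", "Productivity", "Action;Creativity"],
    "Android Ver": ["4.0.3 and up", "4.2 and up", "4.2 and up", "Varies with device"],
})
reviews = pd.DataFrame({
    "App": ["Demo Notes", "Demo Notes", "Demo Game"],
    "Sentiment_polarity": [0.5, -0.25, 1.0],  # capitalization in the real file may differ
})


def size_to_bytes(size):
    """Convert '2.5M' or '512k' to bytes (assumed binary multipliers); else NaN."""
    multipliers = {"k": 1024, "M": 1024 ** 2}
    if isinstance(size, str) and size[-1:] in multipliers:
        try:
            return float(size[:-1]) * multipliers[size[-1]]
        except ValueError:
            return np.nan
    return np.nan


# "Varies with device" becomes a missing value, then exact duplicate rows are dropped.
clean = apps.replace("Varies with device", np.nan).drop_duplicates().reset_index(drop=True)

# Sizes, installs and prices as numbers.
clean["Size"] = clean["Size"].map(size_to_bytes)
clean["Installs"] = pd.to_numeric(
    clean["Installs"].str.replace(r"[+,]", "", regex=True), errors="coerce"
)
clean["Price"] = pd.to_numeric(clean["Price"].str.lstrip("$"), errors="coerce")

# Keep only the leading dotted number of a version string (same idea for Current Ver).
clean["Android Ver"] = clean["Android Ver"].str.extract(r"(\d+(?:\.\d+)*)", expand=False)

# Per-category number of apps and average rating.
apps_per_category = clean["Category"].value_counts()
rating_per_category = clean.groupby("Category")["Rating"].mean()

# One row per (app, genre) pair, then the genre table and the bridging table.
exploded = clean.assign(Genre=clean["Genres"].str.split(";")).explode("Genre")
bridge = exploded[["App", "Genre"]].reset_index(drop=True)
genres = pd.DataFrame({"Genre": sorted(bridge["Genre"].unique())})

# Average rating per genre and the genre with the highest average.
rating_per_genre = exploded.groupby("Genre")["Rating"].mean()
best_genre = rating_per_genre.idxmax()

# One boolean column per genre, added to the cleaned dataframe.
flags = clean["Genres"].str.get_dummies(sep=";").astype(bool)
clean = clean.join(flags)

# Approximate income: number of installs times price.
clean["Income"] = clean["Installs"] * clean["Price"]

# Minimum and maximum sentiment polarity per app.
polarity_range = reviews.groupby("App")["Sentiment_polarity"].agg(["min", "max"])
```

Splitting the genres once with `explode` and reusing the result for the bridging table and the per-genre averages keeps the two consistent. If app names are not unique in the real data, a separate app identifier would be a safer key for the bridging table than the name.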