
R Group character strings based on numeric value and word

I have a dataset in which each Name is a string containing a year, a pick order, and a grouping name, and each Name has an associated Value. I need to reorder the Names by ascending year and pick order within each grouping name, and assign each one the average Value of its group.

My issue is further complicated because some names spell out the position in words rather than using the number. For example, “2020 Mid 1st Rounder” should be treated as comparable to “2020 1.05-1.08 Draft Pick”.
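One way to make the numeric and spelled-out names comparable is to map the pick slot within a round to the same tier words the names use. The sketch below assumes 12 picks per round split into three tiers of four (1-4 Early, 5-8 Mid, 9-12 Last), which matches the groupings described later in this question; `slot_to_tier` is a hypothetical helper name, not part of any package.

```r
# Hypothetical helper: map a pick slot within a round (1..12) to a tier label.
# Assumption: 12 picks per round, four picks per tier.
slot_to_tier <- function(slot) {
  # cut() uses right-closed intervals by default: (0,4], (4,8], (8,12]
  as.character(cut(slot,
                   breaks = c(0, 4, 8, 12),
                   labels = c("Early", "Mid", "Last")))
}

# Slots taken from names such as "1.01", "1.05", "1.08", "1.09"
tiers <- slot_to_tier(c(1, 5, 8, 9))
print(tiers)
```

With this mapping, "2020 1.05 Draft Pick" and "2020 Mid 1st Rounder" both resolve to the tier "Mid" in round 1, so they can share a grouping key.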

Here is an example of my dataframe:

Data <- data.frame(Value = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,5),
 Name = c("2020 1.01 Draft Pick", "2020 1.04 Draft Pick", "2020 1.02 Draft Pick", "2020 1.03 Draft Pick",
"2020 1.06 Draft Pick","2020 1.04 Draft Pick","2020 1.05 Draft Pick","2020 1.04 Draft Pick",
"2020 Mid 1st Rounder","2020 1.04 Draft Pick","2020 1.08 Draft Pick","2020 1.03 Draft Pick",
"2020 Last Round","2020 1.04 Draft Pick","2020 1.07 Draft Pick","2020 Early 1st Rounder"))

The only way I can think of to accomplish this would require a lot of manual replacement (e.g. str_replace(Data$Name, "1.05", "Mid 1st Rounder")) and then splitting the string to reorder it, and I know there must be a better way. Thanks!

EDIT: Using @Akrun's method, I get the following output:
[Image: output using @Akrun's method]
This is very close to what I need, but I want the average grouped by (1.01-1.04 & Early 1st Round), (1.05-1.08 & Mid 1st Round), (1.09-1.12 & Last 1st Round), (2.01-2.04 & Early 2nd Round), etc.

The exact code I used is:

temp <- Output_table[!(Output_table$`Draft Pick.Name`==""), c('Player.Value', 'Draft Pick.Name')] %>%
  group_by('Player.Value', year = readr::parse_number(as.character(`Draft Pick.Name`))) %>% 
  mutate(averagePergroup = mean(as.numeric(str_replace(`Player.Value`, "^\\d+\\s+([0-9.]+)\\s+.*", "\\1")), na.rm = TRUE))
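For comparison, here is a minimal base R sketch of the grouping I am after, run against the example Data from above rather than Output_table. It assumes 12 picks per round with four picks per tier, and that spelled-out names follow the pattern "<year> <Early|Mid|Last> <Nth> ...". The helper `extract` and the columns Round, Tier, and AvgPerGroup are hypothetical names introduced only for illustration. A name such as "2020 Last Round" carries no round number, so its Round stays NA and it ends up in a group of its own.

```r
Data <- data.frame(
  Value = c(1,1,1,1,2,2,2,2,2,3,3,3,3,3,3,5),
  Name = c("2020 1.01 Draft Pick", "2020 1.04 Draft Pick", "2020 1.02 Draft Pick",
           "2020 1.03 Draft Pick", "2020 1.06 Draft Pick", "2020 1.04 Draft Pick",
           "2020 1.05 Draft Pick", "2020 1.04 Draft Pick", "2020 Mid 1st Rounder",
           "2020 1.04 Draft Pick", "2020 1.08 Draft Pick", "2020 1.03 Draft Pick",
           "2020 Last Round", "2020 1.04 Draft Pick", "2020 1.07 Draft Pick",
           "2020 Early 1st Rounder"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: return the first capture group, or NA when the
# pattern does not match the whole string.
extract <- function(pattern, x) {
  out <- rep(NA_character_, length(x))
  hit <- grepl(pattern, x, perl = TRUE)
  out[hit] <- sub(pattern, "\\1", x[hit], perl = TRUE)
  out
}

year <- as.integer(extract("^(\\d{4}) .*$", Data$Name))

# Numeric form, e.g. "2020 1.06 Draft Pick": round before the dot, slot after.
num_round <- extract("^\\d{4} (\\d+)\\.\\d{2} .*$", Data$Name)
num_slot  <- as.integer(extract("^\\d{4} \\d+\\.(\\d{2}) .*$", Data$Name))

# Spelled-out form, e.g. "2020 Mid 1st Rounder".
word_tier  <- extract("^\\d{4} (Early|Mid|Last) .*$", Data$Name)
word_round <- extract("^\\d{4} (?:Early|Mid|Last) (\\d+)(?:st|nd|rd|th) .*$", Data$Name)

# Assumption: slots 1-4 are Early, 5-8 Mid, 9-12 Last.
slot_tier <- as.character(cut(num_slot, breaks = c(0, 4, 8, 12),
                              labels = c("Early", "Mid", "Last")))

Data$Year  <- year
Data$Round <- as.integer(ifelse(is.na(num_round), word_round, num_round))
Data$Tier  <- ifelse(is.na(word_tier), slot_tier, word_tier)

# paste() turns a missing round into the text "NA", so such rows form their own group.
key <- paste(Data$Year, Data$Round, Data$Tier)
Data$AvgPerGroup <- ave(Data$Value, key, FUN = mean)

# Order by year, then round, then tier (Early < Mid < Last); NA rounds sort last.
Data <- Data[order(Data$Year, Data$Round,
                   match(Data$Tier, c("Early", "Mid", "Last"))), ]
print(Data)
```

On the example data this puts the ten "Early" round-1 rows in one group and the five "Mid" round-1 rows in another. The same key could be passed to group_by() and mutate() in the dplyr pipeline above instead of ave().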
