6 Arranging Columns with Across

I’m trying to arrange calculated columns immediately after source columns when using dplyr column-wise operations.

https://stackoverflow.com/questions/67076974/arrange-calculated-columns-immediately-after-source-columns-when-using-dplyr-col/67077122#67077122

https://community.rstudio.com/t/arrange-calculated-columns-immediately-after-source-columns-when-using-dplyr-column-wise-operations/101726

library(dplyr)

Example data

df <- tibble(
  id         = c(1, 2, 2),
  id_row     = c(1, 1, 2),
  name_first = c("John", "Jane", "Jane"),
  city       = c("NY", "DAL", "HOU"),
  x          = c(0, 1, 0)
)

df

## # A tibble: 3 × 5
##      id id_row name_first city      x
##   <dbl>  <dbl> <chr>      <chr> <dbl>
## 1     1      1 John       NY        0
## 2     2      1 Jane       DAL       1
## 3     2      2 Jane       HOU       0

Within ID, the values of name_first and city should be constant. The values for id_row and x need not be constant. I want to test for and inspect differing values of name_first and city within-id.

One convenient way to do this is with mutate() and across().

df %>% 
  group_by(id) %>% 
  mutate(
    across(
      .cols  = c(name_first, city),
      .fns   = ~ length(unique(.x)) == 1,
      .names = "{col}_all_match"
    )
  )

## # A tibble: 3 × 7
## # Groups:   id [2]
##      id id_row name_first city      x name_first_all_match city_all_match
##   <dbl>  <dbl> <chr>      <chr> <dbl> <lgl>                <lgl>         
## 1     1      1 John       NY        0 TRUE                 TRUE          
## 2     2      1 Jane       DAL       1 TRUE                 FALSE         
## 3     2      2 Jane       HOU       0 TRUE                 FALSE

The issue is that the “all_match” (calculated) columns are added to the far right of the data frame rather than immediately after their source column. This makes it difficult to visually inspect differing values for the columns of interest.

Of course, in this small data frame, we could easily rearrange the columns using select().

df %>% 
  group_by(id) %>% 
  mutate(
    across(
      .cols  = c(name_first, city),
      .fns   = ~ length(unique(.x)) == 1,
      .names = "{col}_all_match"
    )
  ) %>% 
  select(id, id_row, starts_with("name_first"), starts_with("city"), x)

## # A tibble: 3 × 7
## # Groups:   id [2]
##      id id_row name_first name_first_all_match city  city_all_match     x
##   <dbl>  <dbl> <chr>      <lgl>                <chr> <lgl>          <dbl>
## 1     1      1 John       TRUE                 NY    TRUE               0
## 2     2      1 Jane       TRUE                 DAL   FALSE              1
## 3     2      2 Jane       TRUE                 HOU   FALSE              0

The issue with that approach is that it quickly becomes pretty cumbersome with more columns. A more tractable approach would be to sort the names alphabetically…

df %>% 
  group_by(id) %>% 
  mutate(
    across(
      .cols  = c(name_first, city),
      .fns   = ~ length(unique(.x)) == 1,
      .names = "{col}_all_match"
    )
  ) %>% 
  select(sort(names(.)))

## # A tibble: 3 × 7
## # Groups:   id [2]
##   city  city_all_match    id id_row name_first name_first_all_match     x
##   <chr> <lgl>          <dbl>  <dbl> <chr>      <lgl>                <dbl>
## 1 NY    TRUE               1      1 John       TRUE                     0
## 2 DAL   FALSE              2      1 Jane       TRUE                     1
## 3 HOU   FALSE              2      2 Jane       TRUE                     0

…but in my situation I need to preserve the original column order. I’d also prefer to stick with Tidyverse solutions if possible.

Any ideas are appreciated!

6.1 Solution by LMc:

# Use select because it allows for more complex column selection when working 
# with more complex data frames.
inspect_cols <- df %>% select(name_first, city) %>% names()

# Set column order ahead of time. This assumes that you know the names of each 
# of the columns you want to inspect
col_order <- purrr::map(
  names(df), 
  function(x) {
    if (x %in% inspect_cols) {
      c(x, paste0(x, "_all_match"))
    } else {
      x
    }
  }
) %>% 
  unlist()

df %>% 
  group_by(id) %>% 
  mutate(
    across(
      .cols  = all_of(inspect_cols),
      .fns   = ~ length(unique(.x)) == 1,
      .names = "{col}_all_match"
    )
  ) %>% 
  dplyr::select(all_of(col_order))

## # A tibble: 3 × 7
## # Groups:   id [2]
##      id id_row name_first name_first_all_match city  city_all_match     x
##   <dbl>  <dbl> <chr>      <lgl>                <chr> <lgl>          <dbl>
## 1     1      1 John       TRUE                 NY    TRUE               0
## 2     2      1 Jane       TRUE                 DAL   FALSE              1
## 3     2      2 Jane       TRUE                 HOU   FALSE              0