Finding possible class schedules

2016-08-02

Over the weekend my brother wanted to figure out his class schedule for the next semester. He is a veterinary medicine and zootechnology student at UNAM. For this upcoming semester there is a set of classes he has to take, and each has 8 or so instructor options. The website where he finds the class times lists about 8 pre-constructed class schedules. He normally finds one he likes quite a bit, and then manually checks whether he can swap instructor X for instructor Y in a given class. He does this based on the referrals and information he has gathered about the instructors, and he also factors in whether the result would be a better schedule overall. For example, he might prefer a packed Tuesday if that means he can leave early on Friday and avoid classes on Saturday.

The problem is that it’s very easy to make a mistake. You (well, he) get all excited thinking you’ve found the perfect schedule, only to realize that two classes conflict, or that the practical portion of a class takes place one hour away from the university, so the selected schedule won’t work. This process is very frustrating.

While watching him, I started to wonder whether I could help with some code. It turned out to be straightforward to write code that finds which options are valid. Once I wrote a test case, it took us about half an hour to fill out the data. He and his classmates start registering for classes tomorrow, so this information might help them too.

First, I define some helper functions. These are rather straightforward but I’ll be using them later on. For example, dias() is just there for typing less.

## Helper functions
dias <- function(d, i) {
    paste0(d, i)
}

extract <- function(m, p) {
    m[[p]]
}

extract_names <- function(m, p) {
    names(m)[p]
}

Next comes the input information. I organized it in a set of nested list objects. The schedule of each instructor is stored as a character vector. For example, Lucia Eliana’s class meets on Wednesdays (Miércoles in Spanish, hence the code 'm') from 9 to 11 am. Each entry stands for the hour that starts at that time, so I only keep the starting hours (9 and 10 am); including 11 would make the code miss valid options that include another class starting at 11 am. For classes that are 1 hour away from the university, we included 1 hour before and 1 hour after the class.
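
As a minimal sketch of this encoding, the hypothetical helper `slots()` below (it is not part of the original code, and the `buffer` argument is my own assumption about how the travel time was added) turns a class that runs from `start` to `end` into the hourly labels used in the input lists.

```r
## dias() as defined in the helper functions above
dias <- function(d, i) {
    paste0(d, i)
}

## Hypothetical helper: label each hour of a class by its starting time.
## The end hour is excluded so a class that starts at `end` does not clash.
## `buffer` adds that many hours before and after (assumed travel time).
slots <- function(d, start, end, buffer = 0) {
    dias(d, seq(start - buffer, end - 1 + buffer))
}

## A Wednesday class from 9 to 11 am
slots('m', 9, 11)
## [1] "m9"  "m10"

## A Friday class from 9 am to 12 pm with 1 hour of travel on each side
slots('v', 9, 12, buffer = 1)
## [1] "v8"  "v9"  "v10" "v11" "v12"
```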

## Input class/prof info and schedule
materias <- list(
    repro = list(
        'lucia eliana' = c(dias('m', 9:10), dias('j', 9:10), dias('v', 8:13)),
        'esquivel lacroix' = c(dias('l', 14:15), dias('ma', 12:18), dias('m', 14:15)),
        'ismael porras' = c(dias('l', 9:10), dias('m', 8:14), dias('j', 9:10)),
        'esquivel lacroix 2' = c(dias('l', 14:15), dias('m', 14:15), dias('v', 12:18)),
        'salvador galina' = c(dias('ma', 8:13), dias('j', 9:10), dias('v', 9:10)),
        'alberto balcazar' = c(dias('l', 15:16), dias('j', 12:18), dias('v', 14:15)),
        'ana myriam boeta' = c(dias('l', 8:13), dias('m', 10:11), dias('v', 9:10)),
        'rafael eduardo paz' = c(dias('l', 11:17), dias('j', 14:15), dias('v', 16:17)),
        'juan heberth' = c(dias('ma', 9:10), dias('m', 11:17), dias('v', 11:12)),
        'vicente octavio mejia' = c(dias('ma', 8:17))
    ),
    economia = list(
        'valentin efren espinoza' = c(dias('l', 8:10), dias('ma', 9:11)),
        'maria del pilar velazquez' = c(dias('l', 16:18), dias('m', 16:18)),
        'arturo alonso pesado' = c(dias('l', 11:13), dias('j', 11:13)),
        'laura mendez' = c(dias('ma', 13:15), dias('m', 18:20)),
        'laura mendez 2' = c(dias('j', 11:13), dias('v', 11:13)),
        'manuela garcia' = c(dias('l', 17:19), dias('v', 16:18)),
        'francisco alejandro' = c(dias('ma', 7:9), dias('j', 7:9)),
        'isaac reyes' = c(dias('m', 13:15), dias('v', 13:15)),
        'jose luis tinoco' = c(dias('ma', 12:14), dias('m', 9:11)),
        'isaac reyes 2' = c(dias('l', 14:16), dias('ma', 14:16))
        
    ),
    bacterianas = list(
        'jose luis gutierrez' = dias('s', 8:11),
        'rodrigo mena' = c(dias('ma', 18:19), dias('j', 18:19)),
        'beatriz arellano' = c(dias('l', 7:8), dias('ma', 10:11)),
        'de la pena, ramirez ortega' = c(dias('j', 18:19), dias('v', 18:19)),
        'ramirez ortega' = c(dias('m', 7:8), dias('j', 7:8)),
        'rodrigo mena 2' = c(dias('ma', 16:17), dias('m', 16:17)),
        'de la pena' = dias('s', 8:11),
        'efren diaz aparicio' = dias('s', 8:11),
        'lucia del carmen favila' = dias('s', 8:11)
    ),
    parasitarias = list(
        'cintli martinez' = c(dias('j', 16:17), dias('v', 18:20)),
        'osvaldo froylan' = c(dias('ma', 18:19), dias('j', 18:20)),
        'maria quintero, agustin perez' = c(dias('ma', 13:14), dias('m', 7:9)),
        'maria quintero' = c(dias('m', 16:18), dias('j', 16:17)),
        'evangelina romero' = c(dias('ma', 7:8), dias('v', 7:9)),
        'guadarrama 01' = c(dias('m', 7:8), dias('j', 11:13)),
        'guadarrama 03' = c(dias('ma', 13:15), dias('v', 7:8)),
        'guadarrama 04' = c(dias('l', 16:17), dias('ma', 18:20)),
        'guadarrama 05' = c(dias('l', 7:9), dias('j', 7:8))
    ),
    diagnosticas = list(
        '1701' = c(dias('l', 11:13), dias('m', 11:16)),
        '1702' = c(dias('j', 13:15), dias('v', 13:18)),
        '1703' = c(dias('ma', 7:9), dias('v', 8:13)),
        '1704' = c(dias('l', 18:20), dias('j', 13:18)),
        '1705' = c(dias('l', 11:13), dias('m', 7:11)),
        '1706' = c(dias('ma', 15:17), dias('m', 15:19)),
        '1707' = c(dias('ma', 10:12), dias('j', 10:15)),
        '1708' = c(dias('l', 18:20), dias('ma', 15:19)),
        '1709' = c(dias('l', 11:13), dias('j', 8:13)),
        '1711' = c(dias('j', 13:15), dias('v', 10:13))
    )
)

Now that the input information is complete, I use expand.grid() to find out all the different possible options.

## Get all the options
options <- expand.grid(lapply(materias, seq_along))
dim(options)
## [1] 81000     5
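
To see what expand.grid() produces, here is a small sketch on made-up data. The `toy` list, its class names, and its instructor names are invented for illustration and are not part of the real input.

```r
## Toy input: 2 classes with 2 and 3 instructor options (made-up data)
toy <- list(
    clase_a = list('prof 1' = c('l9', 'l10'), 'prof 2' = c('ma9', 'ma10')),
    clase_b = list('prof 3' = c('l10', 'l11'), 'prof 4' = c('m9', 'm10'),
        'prof 5' = 'j9')
)

## One column per class, one row per combination of instructor indices
toy_options <- expand.grid(lapply(toy, seq_along))
toy_options
##   clase_a clase_b
## 1       1       1
## 2       2       1
## 3       1       2
## 4       2       2
## 5       1       3
## 6       2       3

## The number of rows is the product of the number of options per class
nrow(toy_options) == prod(lengths(toy))
## [1] TRUE
```

The same product explains the real count: 10 x 10 x 9 x 9 x 10 instructor options give 81,000 combinations.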

There are 81,000 of them, including the classes that meet on Saturday. You can see why it’s frustrating to find which combinations of classes work when doing this manually.

Next, I explore all these options to find those that are valid, meaning that none of the classes overlap. I do this by finding which options have no duplicated hours from the character vectors defined earlier. Nothing fancy.

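valid <- apply(options, 1, function(input) {
    ## SIMPLIFY = FALSE always returns a list, so unlist() yields a plain
    ## vector and duplicated() compares individual hours
    info <- mapply(extract, materias, input, SIMPLIFY = FALSE)
    !any(duplicated(unlist(info)))
})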
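
The duplicated-hours check can be illustrated on a small made-up example. The `toy` data and the `is_valid()` wrapper are hypothetical names introduced for this sketch. I pass `SIMPLIFY = FALSE` to mapply() as a precaution: when every selected class has the same number of hours, mapply() would otherwise return a matrix, and duplicated() on a matrix compares whole rows rather than individual hours.

```r
extract <- function(m, p) {
    m[[p]]
}

## Toy input: 2 classes with 2 and 3 instructor options (made-up data)
toy <- list(
    clase_a = list('prof 1' = c('l9', 'l10'), 'prof 2' = c('ma9', 'ma10')),
    clase_b = list('prof 3' = c('l10', 'l11'), 'prof 4' = c('m9', 'm10'),
        'prof 5' = 'j9')
)

## Hypothetical wrapper: TRUE when no hour appears in more than one class
is_valid <- function(input, classes) {
    info <- mapply(extract, classes, input, SIMPLIFY = FALSE)
    !any(duplicated(unlist(info)))
}

## 'prof 1' and 'prof 3' both teach on Monday at 10: conflict
is_valid(c(1, 1), toy)
## [1] FALSE

## 'prof 1' and 'prof 4' share no hours: valid
is_valid(c(1, 2), toy)
## [1] TRUE

## Check every combination, as done for the real data
toy_options <- expand.grid(lapply(toy, seq_along))
toy_valid <- apply(toy_options, 1, is_valid, classes = toy)
sum(toy_valid)
## [1] 5
```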

Now that I have the valid options, I can find the names of the instructors for them. There are 2,847 valid schedules in the end, out of the 81,000. That’s 3.5 percent!

valid_prof <- apply(options[valid, ], 1, function(input) {
    mapply(extract_names, materias, input)
})
ncol(valid_prof)
## [1] 2847

You can search the interactive version here to select only the options with a given instructor. For example, in my brother’s case there are 30 valid options once he decided to prioritize two instructors as shown in the non-interactive table below.

## Ideally, this code would create an interactive table, but it doesn't work for some reason:
#library('DT')
#datatable(t(valid_prof), options = list(pagingType='full_numbers', pageLength=10), rownames = FALSE)
valid_prof <- t(valid_prof)
rownames(valid_prof) <- seq_len(nrow(valid_prof))
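## Keep the options with his two preferred instructors
top_options <- valid_prof[valid_prof[, 1] == 'lucia eliana' &
    valid_prof[, 2] %in% c('isaac reyes', 'isaac reyes 2'), ]
## kable() comes from the knitr package
library('knitr')
kable(top_options, format = 'markdown', row.names = TRUE)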
| | repro | economia | bacterianas | parasitarias | diagnosticas |
|---|---|---|---|---|---|
| 5 | lucia eliana | isaac reyes 2 | jose luis gutierrez | cintli martinez | 1701 |
| 11 | lucia eliana | isaac reyes 2 | rodrigo mena | cintli martinez | 1701 |
| 14 | lucia eliana | isaac reyes 2 | beatriz arellano | cintli martinez | 1701 |
| 19 | lucia eliana | isaac reyes 2 | ramirez ortega | cintli martinez | 1701 |
| 25 | lucia eliana | isaac reyes 2 | de la pena | cintli martinez | 1701 |
| 31 | lucia eliana | isaac reyes 2 | efren diaz aparicio | cintli martinez | 1701 |
| 37 | lucia eliana | isaac reyes 2 | lucia del carmen favila | cintli martinez | 1701 |
| 46 | lucia eliana | isaac reyes 2 | jose luis gutierrez | osvaldo froylan | 1701 |
| 50 | lucia eliana | isaac reyes 2 | beatriz arellano | osvaldo froylan | 1701 |
| 58 | lucia eliana | isaac reyes 2 | ramirez ortega | osvaldo froylan | 1701 |
| 67 | lucia eliana | isaac reyes 2 | de la pena | osvaldo froylan | 1701 |
| 76 | lucia eliana | isaac reyes 2 | efren diaz aparicio | osvaldo froylan | 1701 |
| 85 | lucia eliana | isaac reyes 2 | lucia del carmen favila | osvaldo froylan | 1701 |
| 123 | lucia eliana | isaac reyes 2 | jose luis gutierrez | guadarrama 01 | 1701 |
| 130 | lucia eliana | isaac reyes 2 | rodrigo mena | guadarrama 01 | 1701 |
| 134 | lucia eliana | isaac reyes 2 | beatriz arellano | guadarrama 01 | 1701 |
| 137 | lucia eliana | isaac reyes 2 | de la pena, ramirez ortega | guadarrama 01 | 1701 |
| 144 | lucia eliana | isaac reyes 2 | de la pena | guadarrama 01 | 1701 |
| 151 | lucia eliana | isaac reyes 2 | efren diaz aparicio | guadarrama 01 | 1701 |
| 158 | lucia eliana | isaac reyes 2 | lucia del carmen favila | guadarrama 01 | 1701 |
| 209 | lucia eliana | isaac reyes 2 | jose luis gutierrez | guadarrama 05 | 1701 |
| 217 | lucia eliana | isaac reyes 2 | rodrigo mena | guadarrama 05 | 1701 |
| 222 | lucia eliana | isaac reyes 2 | de la pena, ramirez ortega | guadarrama 05 | 1701 |
| 232 | lucia eliana | isaac reyes 2 | de la pena | guadarrama 05 | 1701 |
| 242 | lucia eliana | isaac reyes 2 | efren diaz aparicio | guadarrama 05 | 1701 |
| 252 | lucia eliana | isaac reyes 2 | lucia del carmen favila | guadarrama 05 | 1701 |
| 872 | lucia eliana | isaac reyes 2 | jose luis gutierrez | guadarrama 05 | 1704 |
| 885 | lucia eliana | isaac reyes 2 | de la pena | guadarrama 05 | 1704 |
| 894 | lucia eliana | isaac reyes 2 | efren diaz aparicio | guadarrama 05 | 1704 |
| 903 | lucia eliana | isaac reyes 2 | lucia del carmen favila | guadarrama 05 | 1704 |

Reproducibility

## Reproducibility info
library('devtools')
session_info()
## Session info --------------------------------------------------------------
##  setting  value                       
##  version  R version 3.3.0 (2016-05-03)
##  system   x86_64, mingw32             
##  ui       RStudio (0.99.902)          
##  language (EN)                        
##  collate  English_United States.1252  
##  tz       America/Mexico_City         
##  date     2016-08-02
## Packages ------------------------------------------------------------------
##  package   * version date       source        
##  devtools  * 1.12.0  2016-06-24 CRAN (R 3.3.1)
##  digest      0.6.9   2016-01-08 CRAN (R 3.3.0)
##  evaluate    0.9     2016-04-29 CRAN (R 3.3.0)
##  formatR     1.4     2016-05-09 CRAN (R 3.3.0)
##  highr       0.6     2016-05-09 CRAN (R 3.3.0)
##  knitr     * 1.13    2016-05-09 CRAN (R 3.3.0)
##  magrittr    1.5     2014-11-22 CRAN (R 3.3.0)
##  memoise     1.0.0   2016-01-29 CRAN (R 3.3.0)
##  rsconnect   0.4.3   2016-05-02 CRAN (R 3.3.0)
##  stringi     1.1.1   2016-05-27 CRAN (R 3.3.0)
##  stringr     1.0.0   2015-04-30 CRAN (R 3.3.0)
##  withr       1.0.2   2016-06-20 CRAN (R 3.3.1)

Want more?

Check other @jhubiostat student blogs at Bmore Biostats as well as topics on #rstats.

Federico Sánchez Rodríguez 1950-2016

2016-04-04


Today the UNAM community at large mourns the passing of Federico Sánchez Rodríguez. He got his bachelor’s degree from the School of Chemistry - UNAM and his master’s and PhD degrees from Biomedicas - UNAM, did a postdoc at UCSF, was a member of CIFN-UNAM, now called CCG-UNAM (it’s his affiliation in this 1983 paper), and worked most of his career at IBT-UNAM.


I’m sure that he made many friends, trained many students at all levels, and had a highly productive academic career as evidenced on his homepage where he lists many papers, patents, etc. A PubMed author search includes his papers but be careful to not confuse him with other authors that shared his name.

I’ll remember Federico fondly for the time we shared at LCG-UNAM. He was a great teacher and motivated me to ask as many questions as I could think of. My background in biology was weaker than my classmates’, and I loved to ask questions of the sort: if X biological process is possible, could Y happen in the cell? He let my imagination run wild. Federico was very supportive of an elective class a few of us organized in our 4th year. You could always tell that he fed off the energy of enthusiastic students. He was always there if you needed some advice. At the end of my time at LCG-UNAM, he enjoyed how I argued against other members of the LCG academic committee. He always supported me in my non-traditional choices. Federico knew many things about plants, but also about the best food and the best drinks, and he was always eager to share his knowledge with you.

In one of my last interactions with him, at FedeFest in August 2014, I asked him a question in the style I used to during class. I asked him how he would evaluate results from single-cell sequencing and verify that they were indeed biological and not technical. A month later, I sent him this paper.

Fede, I will always remember you fondly.

Best, Leonardo

LCG-UNAM 2005-2009