Here's one solution using functions from `dplyr` and `tidyr`:

    library(dplyr)
    library(tidyr)

    df <- data.frame(ExamData = rep(1:2, each = 8),
                     Question = rep(1:4, 4),
                     Student = rep(1:4, 2, each = 2),
                     Option = sample(1:5, 16, replace = TRUE),
                     stringsAsFactors = FALSE)

    head(df)
    #   ExamData Question Student Option
    # 1        1        1       1      1
    # 2        1        2       1      3
    # 3        1        3       2      5
    # 4        1        4       2      4
    # 5        1        1       3      1
    # 6        1        2       3      5

    df %>%
      group_by(Question, Option) %>%
      # Count rows per Question/Option pair; tally(Option) would sum the Option values instead
      tally() %>%
      # Convert counts to percentages within each Question
      mutate(n = (n / sum(n)) * 100) %>%
      pivot_wider(id_cols = Question,
                  names_from = Option,
                  values_from = n,
                  values_fill = list(n = 0),
                  names_prefix = "Global_")
    # # A tibble: 4 x 6
    # # Groups:   Question [4]
    #   Question Global_1 Global_4 Global_3 Global_5 Global_2
    #      <int>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
    # 1        1       75       25        0        0        0
    # 2        2        0       50       25       25        0
    # 3        3       50        0        0       50        0
    # 4        4        0       50       25        0       25
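
In dplyr, `tally()` with no argument counts rows, whereas passing a column (as in `tally(Option)`) uses it as a weight and sums it, so the two give different percentages. As a rough cross-check of that distinction outside R, here is a minimal plain-Python sketch. The `answers` list and the `option_shares` helper are hypothetical names made up for illustration, not part of the solution above.

```python
from collections import Counter, defaultdict

# Hypothetical (question, option) pairs chosen for illustration.
answers = [(1, 1), (1, 1), (1, 1), (1, 4),
           (2, 3), (2, 4), (2, 4), (2, 5)]

def option_shares(rows, weighted=False):
    """Return {question: {option: percent}}.

    weighted=False counts rows per option (what tally() does);
    weighted=True sums the option values (what tally(Option) does).
    """
    totals = defaultdict(Counter)
    for question, option in rows:
        totals[question][option] += option if weighted else 1
    return {
        q: {opt: 100 * n / sum(c.values()) for opt, n in c.items()}
        for q, c in totals.items()
    }

counted = option_shares(answers)
summed = option_shares(answers, weighted=True)
print(counted[1])  # {1: 75.0, 4: 25.0}
print(summed[1])   # about 42.9 for option 1 and 57.1 for option 4
```

With three answers of option 1 and one of option 4, counting gives 75% and 25%, while summing the option values gives 3/7 and 4/7.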
Out of curiosity, I looked at what happens under the hood by running [dtruss/strace][1] on each test.
C++

    ./a.out < in
    Saw 6512403 lines in 8 seconds. Crunch speed: 814050

syscalls `sudo dtruss -c ./a.out < in`

    CALL                 COUNT
    __mac_syscall            1
    <snip>
    open                     6
    pread                    8
    mprotect                17
    mmap                    22
    stat64                  30
    read_nocancel        25958

Python

    ./a.py < in
    Read 6512402 lines in 1 seconds. LPS: 6512402

syscalls `sudo dtruss -c ./a.py < in`

    CALL                 COUNT
    __mac_syscall            1
    <snip>
    open                     5
    pread                    8
    mprotect                17
    mmap                    21
    stat64                  29
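
The `read_nocancel` row in the C++ trace is the number of read calls the process issued. How many such calls a program needs for the same input depends on how many bytes it asks for each time. Here is a minimal sketch of that relationship. It assumes a hypothetical 1 MiB temporary file and two chunk sizes picked for illustration; neither comes from the tests above, and `count_reads` is a made-up helper name.

```python
import os
import tempfile

def count_reads(path, bufsize):
    """Read the file at `path` with raw os.read() calls of `bufsize` bytes.

    Returns (number of os.read calls, total bytes read). Each os.read call
    corresponds to one read system call on the file descriptor.
    """
    fd = os.open(path, os.O_RDONLY)
    calls = total = 0
    try:
        while True:
            chunk = os.read(fd, bufsize)
            calls += 1
            if not chunk:  # an empty result signals end of file
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return calls, total

# Hypothetical 1 MiB input file, created only for this demonstration.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * (1 << 20))
    path = tmp.name

try:
    small = count_reads(path, 4096)      # many small requests
    large = count_reads(path, 1 << 16)   # fewer, larger requests
finally:
    os.remove(path)

print(small)
print(large)
```

Both runs read the same number of bytes, but the 4096-byte run should need roughly sixteen times as many calls as the 65536-byte run, which is the kind of difference a syscall count from `dtruss -c` or `strace -c` makes visible.
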
[1]: http://en.wikipedia.org/wiki/Strace