CopyPastor

Detecting plagiarism made easy.

Score: 0.8144299983978271; Reported for: String similarity

Possible Plagiarism

Plagiarized on 2020-07-01
by Dunois

Original Post

Original - Posted on 2012-03-11
by 2mia



            
Present in both answers; Present only in the new answer; Present only in the old answer;

Here's one solution using functions from `dplyr` and `tidyr`:

    library(dplyr)
    library(tidyr)

    df <- data.frame(ExamData = rep(1:2, each = 8),
                     Question = rep(1:4, 4),
                     Student = rep(1:4, 2, each = 2),
                     Option = sample(1:5, 16, replace = TRUE),
                     stringsAsFactors = FALSE)

    head(df)
    #   ExamData Question Student Option
    # 1        1        1       1      1
    # 2        1        2       1      3
    # 3        1        3       2      5
    # 4        1        4       2      4
    # 5        1        1       3      1
    # 6        1        2       3      5

    df %>%
      group_by(Question, Option) %>%
      tally(Option) %>%
      mutate(n = (n/sum(n))*100) %>%
      pivot_wider(id_cols = Question,
                  names_from = Option,
                  values_from = n,
                  values_fill = list(n = 0),
                  names_prefix = "Global_")
    # # A tibble: 4 x 6
    # # Groups:   Question [4]
    #   Question Global_1 Global_4 Global_3 Global_5 Global_2
    #      <int>    <dbl>    <dbl>    <dbl>    <dbl>    <dbl>
    # 1        1     42.9     57.1      0        0        0
    # 2        2      0       50       18.8     31.2      0
    # 3        3     16.7      0        0       83.3      0
    # 4        4      0       61.5     23.1      0       15.4

Just out of curiosity I've taken a look at what happens under the hood, and I've used [dtruss/strace][1] on each test.
C++

    ./a.out < in
    Saw 6512403 lines in 8 seconds. Crunch speed: 814050

syscalls `sudo dtruss -c ./a.out < in`

    CALL                COUNT
    __mac_syscall           1
    <snip>
    open                    6
    pread                   8
    mprotect               17
    mmap                   22
    stat64                 30
    read_nocancel       25958

Python

    ./a.py < in
    Read 6512402 lines in 1 seconds. LPS: 6512402

syscalls `sudo dtruss -c ./a.py < in`

    CALL                COUNT
    __mac_syscall           1
    <snip>
    open                    5
    pread                   8
    mprotect               17
    mmap                   21
    stat64                 29
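The benchmark scripts themselves are not shown here. As a rough illustration only, the following is a minimal sketch of a line-counting script that could produce output in the same format as the Python run above. The names `count_lines` and `report` are hypothetical, and treating a sub-second run as having an LPS equal to the line count is an assumption made to avoid dividing by zero. It is not the script that was actually measured. Saved as an executable file with a final `print(report())` line, it could be run as `./a.py < in`.

```python
import sys
import time


def count_lines(stream):
    """Count lines by iterating over a text stream (hypothetical helper)."""
    count = 0
    for _ in stream:
        count += 1
    return count


def report(stream=sys.stdin):
    """Time the count and format it like the output shown above."""
    start = time.time()
    count = count_lines(stream)
    seconds = int(time.time() - start)  # whole seconds, as in the sample output
    # Assumption: when the run takes under a second, report the raw count
    # as the lines-per-second figure instead of dividing by zero.
    lps = count // seconds if seconds else count
    return "Read {0} lines in {1} seconds. LPS: {2}".format(count, seconds, lps)
```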
[1]: http://en.wikipedia.org/wiki/Strace

        