CopyPastor

Detecting plagiarism made easy.

Score: 0.8025040626525879; Reported for: String similarity

Possible Plagiarism

Reposted on 2023-06-03
by Massimo

Original Post

Original - Posted on 2023-06-03
by Massimo



            
Present in both answers; Present only in the new answer; Present only in the old answer;

If the data you are receiving is text, it is probably encoded in UTF-8 (or another text encoding).
If it is raw data, there is an encoding that maintains every byte value from 0 to 255 (ASCII itself only covers 0 to 127):
> data = data.decode("latin1")
This changes the data type from bytes to str, preserving the characters AS-IS.
It isn't a brilliant solution, because creating a new object consumes CPU time and memory, but it is the only one.
**It is a nuisance there isn't an instruction in Python to just change the data type, from bytes to str, without processing.**
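As a minimal sketch of the point above (the `raw` payload here is made up for illustration and is not taken from the original question), the following shows that UTF-8 rejects arbitrary bytes, while Latin-1 maps each byte value to the code point with the same number, so the conversion can be reversed without loss:

```python
# Hypothetical payload: all 256 possible byte values, standing in for raw binary data.
raw = bytes(range(256))

# UTF-8 cannot decode this payload, because 0x80 is not a valid start byte.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    utf8_ok = False

# Latin-1 maps byte value n to code point n, so decoding never fails.
text = raw.decode("latin1")

# Encoding with the same codec gives back the original bytes.
round_trip = text.encode("latin1")

print(utf8_ok, type(text).__name__, round_trip == raw)
```

The resulting `str` holds one character per input byte; it is only meaningful as a container for the raw values and should be encoded back with `latin1` before being written out as binary.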
The PDF payload obviously isn't UTF-8 encoded, nor encoded in any other text encoding. It is raw data, not any form of text.
BUT there is an encoding that maintains every byte value from 0 to 255:
data = data.decode("latin1")
This changes the data type from bytes to str.
It isn't a brilliant solution, because creating a new object consumes CPU time and memory, but it is the only one.
**It is a nuisance there isn't an instruction in Python to just change the data type, from bytes to str, without processing.**

        