Message from Python discussions
December 2018
— Yeah a little.
I'll just make a file and save it as separate copies with different encodings. I'll tell you in a bit lol
— Ok, thanks
— Can I post a screenshot?
— I do not know hehe
— Ok. I made a text file.
Encoding ANSI:
Size: 4 325 bytes
Size on disk: 8 192 bytes
Encoding UTF-8:
Size: 4 328 bytes
Size on disk: 8 192 bytes
Encoding UNICODE BIG ENDIAN:
Size: 8 652 bytes
Size on disk: 12 288 bytes
— There is a difference
— Same text content
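A rough sketch of where those three sizes could come from, reading the reported figures as bytes. It assumes the text is 4 325 single-byte (ASCII) characters, that "ANSI" means Windows-1252, that the UTF-8 copy starts with a 3-byte BOM, and that "Unicode big endian" is UTF-16 BE with a 2-byte BOM. The `encoded_sizes` helper and the sample text are made up for illustration, not the actual file.

```python
import codecs


def encoded_sizes(text):
    """Return the byte length of `text` under three encodings (hypothetical helper)."""
    return {
        # Assumption: "ANSI" is Windows-1252, one byte per character here.
        "ansi": len(text.encode("cp1252")),
        # "utf-8-sig" prepends the 3-byte UTF-8 BOM.
        "utf-8": len(text.encode("utf-8-sig")),
        # "utf-16-be" writes no BOM by itself, so the 2-byte BOM is added by hand.
        "utf-16-be": len(codecs.BOM_UTF16_BE + text.encode("utf-16-be")),
    }


# Stand-in for the test file: 4 325 ASCII characters.
sample = "a" * 4325
for name, size in encoded_sizes(sample).items():
    print(f"{name}: {size} bytes")
# ansi: 4325 bytes
# utf-8: 4328 bytes
# utf-16-be: 8652 bytes
```

Under those assumptions, the UTF-8 copy is only 3 bytes larger, while the UTF-16 copy is double the size plus 2 bytes.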
— I'll now enlarge the file content
— Thanks
— I just can't understand.
I am reading a big CSV file (1 GB),
working with it, cleaning it, then writing a new CSV.
The new one weighs 17 GB.
Weird
— Bigger file:
ANSI:
Size: 7 266 000 bytes
Size on disk: 7 266 304 bytes
UNICODE BIG ENDIAN:
Size: 14 532 002 bytes
Size on disk: 14 532 608 bytes
A significant difference: the big-endian Unicode copy is about twice the size of the ANSI one. So don't use a wider encoding on large files unless you really need it.
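The "size on disk" figures look like the file size rounded up to a whole number of allocation units. A small sketch of that arithmetic, reading the figures as bytes and assuming a 4 096-byte cluster (a guess that happens to fit the numbers; the real cluster size of that drive is not stated). `size_on_disk` is a hypothetical helper name.

```python
CLUSTER = 4096  # assumed allocation unit size in bytes


def size_on_disk(size_bytes, cluster=CLUSTER):
    """Round a file size up to the next multiple of the cluster size."""
    clusters_needed = -(-size_bytes // cluster)  # ceiling division on integers
    return clusters_needed * cluster


for size in (4325, 4328, 8652, 7266000, 14532002):
    print(size, "->", size_on_disk(size))
# 4325 -> 8192
# 4328 -> 8192
# 8652 -> 12288
# 7266000 -> 7266304
# 14532002 -> 14532608
```

This is why the small ANSI and UTF-8 files take the same space on disk even though their sizes differ by 3 bytes: both fit in two clusters.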