Why Do Files Look Different on My Computer?

The concept of a text file goes well beyond files whose names end in .txt. Any file that contains text that can be edited using a text editor is considered a text file. For example, a C++ source file is a text file.

An operating system needs a convention for how a line is ended in a text file. Usually, a single character or a two-character sequence is used. The ASCII character set contains two characters, line-feed (LF) and carriage-return (CR), that each indicate, in some way, an end of line. (In a C++ program, '\n' is the line-feed character and '\r' is the carriage-return character.) Here are three conventions for how lines are ended.

    Linux and Unix:                LF
    Windows:                       CR LF
    Classic Mac OS (before OS X):  CR

Notice that a Windows text file uses a two-character sequence to end a line, while Linux uses a single character. If you look at a text file that is in Linux format with an older version of the Windows text editor Notepad (before Windows 10 version 1809), it will all appear to be on one line, because those versions of Notepad only recognize the two-character sequence CR LF as the end of a line. WordPad and most other text editors recognize both formats.

If you look at a file that uses Windows format on a Linux system, you will sometimes see an extra character (the CR, often displayed as ^M) at the end of each line.

There are Linux commands dos2unix and unix2dos that convert from Windows to Linux format and from Linux to Windows format, respectively.
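
The idea behind such a conversion can be sketched in C++. The two functions below, crlfToLf and lfToCrlf, are names invented for this illustration. They are not how the Linux commands are implemented, and the real commands handle cases that this sketch ignores. The sketch assumes that the whole file content has already been read into a string without any line-ending translation.

```cpp
#include <string>

// Illustrative sketch: converts Windows line endings (CR LF) to Linux
// line endings (LF). A CR that is not followed by LF is left unchanged.
std::string crlfToLf(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\r' && i + 1 < text.size() && text[i + 1] == '\n') {
            continue;   // skip the CR; the LF is copied on the next iteration
        }
        result += text[i];
    }
    return result;
}

// Illustrative sketch: converts Linux line endings (LF) to Windows line
// endings (CR LF). An LF that already follows a CR is left alone, so
// applying the function twice does not produce CR CR LF.
std::string lfToCrlf(const std::string& text)
{
    std::string result;
    result.reserve(text.size());
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        if (text[i] == '\n' && (i == 0 || text[i - 1] != '\r')) {
            result += '\r';   // insert the missing CR before the LF
        }
        result += text[i];
    }
    return result;
}
```

For example, crlfToLf("a\r\nb\r\n") yields "a\nb\n", and lfToCrlf("a\nb\n") yields "a\r\nb\r\n".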


Reading and writing text files in Windows

If your program opens a file for reading in text mode (the default), then, when the sequence CR LF is read, the program is given only the LF. Similarly, when a file is open for writing in text mode, writing LF causes two characters, CR and LF, to be written into the file.

If you open files in binary mode, no such translation is done.
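
In C++, binary mode is requested by passing std::ios::binary when the file stream is opened. The sketch below shows one way to write and read a file without translation. The function names writeRaw and readRaw are invented for this illustration.

```cpp
#include <fstream>
#include <iterator>
#include <string>

// Illustrative helper: writes 'bytes' to the file exactly as given.
// std::ios::binary turns off line-ending translation.
// Returns true if the write succeeded.
bool writeRaw(const std::string& path, const std::string& bytes)
{
    std::ofstream out(path, std::ios::binary);
    out.write(bytes.data(), static_cast<std::streamsize>(bytes.size()));
    return static_cast<bool>(out);
}

// Illustrative helper: returns the exact bytes stored in the file,
// including every CR and LF. Returns an empty string if the file
// cannot be opened.
std::string readRaw(const std::string& path)
{
    std::ifstream in(path, std::ios::binary);
    return std::string(std::istreambuf_iterator<char>(in),
                       std::istreambuf_iterator<char>());
}
```

With these helpers, writing "one\r\ntwo\n" and reading it back gives the same nine characters on both Windows and Linux, because neither CR nor LF is added or removed along the way.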