File Transfer Protocol (FTP)

My first experience with FTP, or File Transfer Protocol, was in 1992, when I tried to download some freeware from the US. I was very excited one afternoon when the university computer laboratory was empty. This allowed me to use all 30+ PCs in the room to download 386BSD onto 3.5" floppy disks. There were no graphical user interfaces for FTP back then, so I could only use a command line FTP client to fetch files from the server. Time flies and floppy disks are history, but FTP remains a useful protocol today. In fact, a command line FTP client still ships with Windows and many UNIX operating systems.

FTP is the language that computers on a TCP/IP network use to transfer files to and from each other. It is built on a client-server architecture. As far as I know, FTP is one of the few widely used protocols that relies on two TCP ports: a 'command' port (also known as the control port) and a 'data' port. Traditionally these are TCP port 21 for the command port and TCP port 20 for the data port.

FTP users authenticate themselves using a clear-text sign-in protocol, normally in the form of a username and password, but can connect anonymously if the server is configured to allow it. Plain FTP transmissions are sent in clear text and are not encrypted. The usernames, passwords, commands and data can be easily read by anyone able to perform packet capture (sniffing) on the network.
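To make the clear-text point concrete, here is a minimal sketch of what the sign-in looks like on the control connection. The function names `login_commands` and `sniffed_password` are my own illustrations, not part of any library, and the example only builds and inspects bytes; it does not touch the network.

```python
from typing import List, Optional


def login_commands(user: str, password: str) -> List[bytes]:
    """Return the raw lines a client sends on the control connection to sign in."""
    # FTP commands are plain ASCII lines terminated by CR LF.
    return [
        f"USER {user}\r\n".encode("ascii"),
        f"PASS {password}\r\n".encode("ascii"),
    ]


def sniffed_password(captured: bytes) -> Optional[str]:
    """Recover the password from captured control-connection bytes, if present."""
    for line in captured.split(b"\r\n"):
        if line.upper().startswith(b"PASS "):
            return line[5:].decode("ascii")
    return None


# Hypothetical credentials, joined the way they would appear in a capture.
wire = b"".join(login_commands("alice", "s3cret"))
```

Anyone who can capture `wire` needs nothing more than a string search to read the password, which is exactly the weakness described above.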

FTP may run in active or passive mode, which determines how the data connection is established. In active mode, the client makes a TCP control connection to the FTP server's port 21, which remains open during the transfer process. In response, the FTP server opens a second connection, the data connection, from the server's port 20 to the client.

The main problem with active mode FTP falls on the client side. The FTP client does not make the actual connection to the data port of the server. It simply tells the server what port it is listening on, and the server connects back to the specified port on the client. Because this looks like an outside system initiating a connection to an internal client, the connection will probably be blocked by a firewall protecting the client.
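In active mode, the client tells the server where to connect by sending a PORT command that carries its IP address and listening port as six comma-separated numbers. The sketch below shows that encoding. The helper name `format_port_command` and the sample address are assumptions for illustration only.

```python
def format_port_command(ip: str, port: int) -> str:
    """Build the PORT command a client sends in active mode.

    The argument is h1,h2,h3,h4,p1,p2 where the port equals p1 * 256 + p2.
    """
    octets = ip.split(".")
    if len(octets) != 4 or not all(o.isdigit() and int(o) <= 255 for o in octets):
        raise ValueError(f"not an IPv4 address: {ip!r}")
    if not 0 < port < 65536:
        raise ValueError(f"port out of range: {port}")
    high, low = divmod(port, 256)  # split the 16-bit port into two bytes
    return "PORT " + ",".join(octets + [str(high), str(low)]) + "\r\n"


# Hypothetical client at 192.168.1.10 listening on port 5001.
command = format_port_command("192.168.1.10", 5001)
```

A private address like the one in this example is also why active mode tends to fail behind firewalls and address translation: the server is asked to connect to a host it usually cannot reach.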

To resolve the issue of the server initiating the connection to the client, a different method for FTP connections was developed. It is known as passive mode, or PASV, after the command the client uses to ask the server to enter passive mode. In this mode, the client sends a PASV command over the control connection and receives a server IP address and port number in reply. The client then opens a data connection from an arbitrary client port to that address and port. In passive mode FTP the client initiates both connections to the server, which solves the problem of firewalls filtering the incoming data connection from the server to the client.
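The server's answer to PASV is a "227" reply containing the address and port as six comma-separated numbers. Here is a rough sketch of how a client could decode it. The function name `parse_pasv_reply` and the sample reply text are assumptions for illustration, and real servers vary in the wording around the numbers.

```python
import re
from typing import Tuple

# Six comma-separated numbers: four address bytes, then two port bytes.
_PASV_NUMBERS = re.compile(r"(\d+),(\d+),(\d+),(\d+),(\d+),(\d+)")


def parse_pasv_reply(reply: str) -> Tuple[str, int]:
    """Extract (host, port) from a 227 reply to the PASV command."""
    if not reply.startswith("227"):
        raise ValueError(f"not a passive mode reply: {reply!r}")
    match = _PASV_NUMBERS.search(reply)
    if match is None:
        raise ValueError(f"no address found in reply: {reply!r}")
    numbers = [int(group) for group in match.groups()]
    host = ".".join(str(n) for n in numbers[:4])
    port = numbers[4] * 256 + numbers[5]  # high byte first, then low byte
    return host, port


# Sample reply of the usual shape; the address is made up.
host, port = parse_pasv_reply("227 Entering Passive Mode (192,168,1,2,195,80)")
```

The client would then open an ordinary outgoing TCP connection to `host` and `port`, which is why firewalls on the client side rarely object.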

Many FTP sites are protected and require a valid username and password. However, an FTP server that offers anonymous FTP services does not require users to identify themselves to the server. This is a very common practice in the world of open-source and freely distributed software. The process involves the "foreign" user (someone not on the system itself) creating an FTP connection and logging into the system as the user anonymous, with an arbitrary password (by convention, an email address):

Name (foo.bar.com): anonymous
Password: myemail@email.com
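The same anonymous sign-in can be scripted with Python's standard `ftplib` module. This is only a sketch: the helper name `anonymous_listing` is my own, and the host in the usage comment is the placeholder from the session above, not a real server.

```python
from ftplib import FTP
from typing import List


def anonymous_listing(ftp: FTP, email: str) -> List[str]:
    """Sign in as the user 'anonymous' and return the names in the current directory."""
    # The password is arbitrary; an email address is the usual courtesy.
    ftp.login(user="anonymous", passwd=email)
    return ftp.nlst()


# Usage against a server that allows anonymous access (placeholder host):
#
#     with FTP("foo.bar.com", timeout=10) as ftp:
#         print(anonymous_listing(ftp, "myemail@email.com"))
```

Passing in an already-connected `FTP` object keeps the helper free of connection details, so the caller decides the host, timeout and whether passive mode is used.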

File transfers over FTP take two different forms, ASCII and binary. ASCII, the American Standard Code for Information Interchange, is a set of 128 characters that virtually any computer can represent. When files are transferred in ASCII mode, the transferred data is assumed to contain only ASCII text. The receiving machine is responsible for translating the received text into a format that is compatible with its operating system.

The most common example of this pertains to the way Windows and UNIX handle newlines, since different operating systems use different codes to represent line breaks. On a Windows computer, pressing the "Enter" key inserts two characters in an ASCII text document: a carriage return (which moves the cursor to the beginning of the line) and a line feed (which moves the cursor to the line below the current one). On UNIX systems, only a line feed is used. ASCII text formatted for UNIX systems may not display properly when viewed on a Windows system, and vice versa. As such, when a file is transferred from Windows to a UNIX-based server, ASCII mode strips out the CR (carriage return) character found at the end of each line. ASCII mode should be used when transferring text files. Some of the most common file types that should be transferred in ASCII mode include:

* txt - Plain text files.
* htm, html, css - Files containing HTML or CSS mark-up.
* asp, vbs, js - Files containing scripting delivered through HTTP.
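The newline translation that ASCII mode performs can be sketched in a few lines. This is a simplification of what real clients and servers do, and the function names `windows_to_unix` and `unix_to_windows` are assumptions for illustration.

```python
def windows_to_unix(data: bytes) -> bytes:
    """Strip the carriage return from each CR LF pair, leaving a bare line feed."""
    return data.replace(b"\r\n", b"\n")


def unix_to_windows(data: bytes) -> bytes:
    """Turn each line feed into a CR LF pair."""
    # Normalize first so that existing CR LF pairs are not doubled into CR CR LF.
    return windows_to_unix(data).replace(b"\n", b"\r\n")


windows_text = b"first line\r\nsecond line\r\n"
unix_text = windows_to_unix(windows_text)
```

For text files the conversion is harmless in both directions, since converting back restores the original line endings.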

Binary mode refers to transferring files as a binary stream of data. Where ASCII mode may use special control characters to format data, the "Binary" transfer mode of FTP copies files exactly, byte for byte. In this way, the file is transferred in its exact original form.

If a file containing binary data is sent using ASCII mode, it will most likely end up being corrupted. If you're having problems with corrupted file transfers, try using binary mode when transferring the file. Some common file formats that are sometimes mistakenly transferred in ASCII mode include:

* pdf - PDF files can contain embedded binary data such as images.
* doc - Microsoft Word documents are binary formatted files.

In general, all audio, video, and image file formats are binary.
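Here is a small sketch of why ASCII mode damages binary files, together with a helper that picks a mode from the file extension. The names `choose_mode` and `strip_carriage_returns` are hypothetical, and the extension set simply mirrors the text file types listed earlier. The PNG file signature makes a good test case because it happens to contain a CR LF pair.

```python
import os

# Extensions treated as text, taken from the list of ASCII-mode file types.
ASCII_EXTENSIONS = {"txt", "htm", "html", "css", "asp", "vbs", "js"}


def choose_mode(filename: str) -> str:
    """Return 'ascii' for known text extensions and 'binary' for everything else."""
    extension = os.path.splitext(filename)[1].lstrip(".").lower()
    return "ascii" if extension in ASCII_EXTENSIONS else "binary"


def strip_carriage_returns(data: bytes) -> bytes:
    """Simplified stand-in for the Windows-to-UNIX ASCII mode translation."""
    return data.replace(b"\r\n", b"\n")


# The first eight bytes of every PNG file.
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"

# An ASCII mode transfer would silently drop one byte from the signature.
damaged = strip_carriage_returns(PNG_SIGNATURE)
```

Defaulting to binary for unknown extensions is a deliberate choice in this sketch: a text file sent in binary mode only ends up with unfamiliar line endings, while a binary file sent in ASCII mode may be unusable.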

FTP is a very useful service for transferring files on the Internet. However, using FTP across the Internet, or other untrusted networks, exposes you to certain security risks. It is important to note that FTP does not have any special handling for encrypted communication of any kind. When clients log in to FTP servers, they send their usernames and passwords in clear text! This means that anyone with a packet sniffer between the client and server could capture passwords sent to the server. Potential attackers could also monitor the entire conversation on the FTP control connection, as well as the contents of the data transfers themselves.
