How Does HTTPS Work?
What is HTTP and HTTPS
HTTP (Hypertext Transfer Protocol) and HTTPS (Hypertext Transfer Protocol Secure) are application-layer protocols for distributed information systems that are open to common use.
HTTP is a communication protocol at the application layer that defines the rules and methods by which information is transferred between servers and end users on the Internet. It is used to view websites and perform various operations on them.
HTTP by Version
In HTTP/0.9 and 1.0, the connection is closed after a single "request-response" step. HTTP/1.1 introduced the "keep-alive" mechanism, where one connection is reused for more than one request. Such persistent connections reduce perceived latency because the client does not have to re-establish the TCP connection after the first request is sent. Version 1.1 also makes better use of bandwidth than HTTP/1.0. Another improvement in version 1.1 is "byte serving", where the server transmits only the part of the resource explicitly requested by the client. This also saves bandwidth.
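Byte serving can be illustrated with a short sketch. It assumes a resource held in memory and a single range of the form `bytes=start-end`; the function name `serve_range` is hypothetical, and real servers accept more range forms than this.

```python
def serve_range(resource: bytes, range_header: str):
    """Return (status, headers, body) for a single 'bytes=start-end' range.

    Simplified: suffix ranges, open-ended ranges and multi-range
    requests are not handled in this sketch.
    """
    unit, _, spec = range_header.partition("=")
    start_text, _, end_text = spec.partition("-")
    if unit != "bytes" or not start_text.isdigit() or not end_text.isdigit():
        # Unsupported form: fall back to sending the whole resource.
        return 200, {"Content-Length": str(len(resource))}, resource

    start, end = int(start_text), int(end_text)
    if start > end or start >= len(resource):
        # The requested range cannot be satisfied.
        return 416, {"Content-Range": f"bytes */{len(resource)}"}, b""

    end = min(end, len(resource) - 1)      # the end position is inclusive
    body = resource[start:end + 1]
    headers = {
        "Content-Range": f"bytes {start}-{end}/{len(resource)}",
        "Content-Length": str(len(body)),
    }
    return 206, headers, body              # 206 Partial Content


status, headers, body = serve_range(b"Hello, HTTP world!", "bytes=7-10")
print(status, headers["Content-Range"], body)
```

Only the four requested bytes travel over the network instead of the whole resource, which is where the bandwidth saving comes from.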
An HTTP session is a sequence of "request-response" exchanges on the network. The HTTP client makes the request by establishing a TCP (Transmission Control Protocol) connection to a specific port on the server (usually port 80). The HTTP server listening on that port waits for the client's request message. When the request arrives, the server returns a status line, for example "HTTP/1.1 200 OK", possibly followed by the body of the requested resource, an error message or some other information.
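The shape of these messages can be sketched without touching the network. The helper names `build_request` and `parse_status_line`, the host `example.com` and the canned response are illustrative assumptions, not part of any library.

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Build a minimal HTTP/1.1 request message (no body)."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",                      # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(response: bytes):
    """Split the first line of a response into version, code and reason."""
    status_line = response.split(b"\r\n", 1)[0].decode("ascii")
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


request = build_request("GET", "/index.html", "example.com")
# A canned response standing in for what a server might send back.
response = b"HTTP/1.1 200 OK\r\nContent-Length: 5\r\n\r\nhello"
print(request.decode("ascii"))
print(parse_status_line(response))
```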
HTTP defines nine methods that indicate the action to be performed on the identified resource.
GET: Requests a representation of the specified resource.
HEAD: Asks for the same response as a "GET" request, but without the response body. It is used to read the information in the response headers without transferring all the content.
OPTIONS: Returns the HTTP methods that the server supports for the specified URL.
POST: Submits data to be processed by the identified resource. The data is carried in the body of the request.
PUT: Uploads a representation of the specified resource.
DELETE: Deletes the specified resource.
TRACE: Echoes the received request so that clients can see whether intermediate servers have made any changes or additions.
CONNECT: Converts the request connection to a transparent TCP/IP tunnel, usually to allow SSL (Secure Sockets Layer) encrypted communication through an unencrypted HTTP proxy.
PATCH: Used to apply partial changes to the resource.
HTTP servers must implement at least the GET and HEAD methods and, where possible, the OPTIONS method as well.
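How GET, HEAD and OPTIONS differ can be sketched with a tiny in-memory dispatcher. The `RESOURCES` table and the `handle` function are hypothetical names chosen for this illustration, not a real server API.

```python
RESOURCES = {"/hello": b"Hello, world!"}   # hypothetical in-memory resources
SUPPORTED = ("GET", "HEAD", "OPTIONS")


def handle(method: str, path: str):
    """Return (status, headers, body) for a tiny subset of HTTP methods."""
    if method == "OPTIONS":
        # Report which methods this server supports.
        return 204, {"Allow": ", ".join(SUPPORTED)}, b""
    if method not in SUPPORTED:
        return 405, {"Allow": ", ".join(SUPPORTED)}, b""   # Method Not Allowed
    if path not in RESOURCES:
        return 404, {}, b""
    body = RESOURCES[path]
    headers = {"Content-Length": str(len(body))}
    # HEAD returns the same headers as GET, but without the body.
    return 200, headers, body if method == "GET" else b""


for method in ("GET", "HEAD", "OPTIONS", "DELETE"):
    print(method, handle(method, "/hello"))
```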
Basic Concepts To Understand HTTPS:
OSI (Open System Interconnection) Model
The OSI model emerged as a worldwide standard in 1984. It was introduced to ensure that electronic devices from different manufacturers communicate on a common basis and that communication gaps are prevented.
When two devices communicate through a network device in between, the data passes through the Physical, Data Link and Network layers of that network device on the way to its target.
The data is encapsulated layer by layer on the way down the sender's stack and unpacked layer by layer on the way up the receiver's stack.
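The wrapping and unwrapping can be sketched as follows. This is a simplified illustration that uses bracketed text labels in place of real protocol headers and covers only three of the seven layers; `encapsulate` and `decapsulate` are hypothetical names.

```python
LAYERS = ["transport", "network", "data-link"]   # simplified subset of the stack


def encapsulate(payload: str) -> str:
    """Going down the stack: each layer wraps the data in its own header."""
    for layer in LAYERS:
        payload = f"[{layer}]{payload}"
    return payload


def decapsulate(frame: str) -> str:
    """Going up the stack: each layer strips the header added by its peer."""
    for layer in reversed(LAYERS):
        header = f"[{layer}]"
        if not frame.startswith(header):
            raise ValueError(f"expected {header} header")
        frame = frame[len(header):]
    return frame


frame = encapsulate("GET /")
print(frame)                # [data-link][network][transport]GET /
print(decapsulate(frame))   # GET /
```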
1. Physical Layer
- This layer covers the medium, such as the cable, over which data is sent as bits.
- The unit of data here is the bit, the smallest unit of data, which has the value 0 or 1.
- This layer defines how the data bits are transmitted to the other party, for example over copper cable, fiber optic cable or radio signals.
- The hub, one of the simplest network devices, operates on this layer and repeats the incoming 1 and 0 bits to its other ports.
2. Data Link Layer
- It is the layer that determines how to access the physical layer.
- Data is sent from the network layer to the physical layer in this layer.
- This layer runs on the network card.
- Identifying the other computers on the network and determining which device is currently using the cable are carried out here.
- The sender and receiver MAC addresses are added to the data in this layer.
- Switches work in this layer. In short, a switch lets the devices connected to it communicate with each other by recognizing their MAC addresses.
3. Network Layer
- The addresses of packets that will go to a different network are found in this layer.
- IP protocol works in this layer. Sender and recipient addresses are processed in this layer.
- In this layer, data transfer between two networks is ensured in the most economical way.
- Operations such as routing, network traffic are done here.
- The router is an example of a device working in this layer.
- A router transmits data between different networks and acts as a gateway.
4. Transport Layer
- Divides the data from the upper layer into chunks that fit the network packet size.
- Each of these chunks is called a "segment".
- Port information and data size are added in this layer.
- TCP and UDP work in this layer.
- The error control mechanism is in this layer: it checks whether the data is delivered on time and without errors.
- Segments arriving from the lower layer are reassembled here and passed on to the upper layers.
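Segmentation and reassembly can be sketched in a few lines. The sequence numbers here are simple counters and the function names are hypothetical; real TCP numbers bytes rather than segments and adds much more (ports, checksums, acknowledgements).

```python
import random


def segment(data: bytes, size: int):
    """Split data into (sequence_number, chunk) segments of at most `size` bytes."""
    return [(seq, data[i:i + size])
            for seq, i in enumerate(range(0, len(data), size))]


def reassemble(segments):
    """Put segments back in order by sequence number and join the chunks."""
    ordered = sorted(segments, key=lambda item: item[0])
    return b"".join(chunk for _, chunk in ordered)


message = b"data from the upper layer"
segments = segment(message, 8)
random.shuffle(segments)             # segments may arrive out of order
print(reassemble(segments) == message)
```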
5. Session Layer
- Allows devices to make multiple connections at the same time.
- Data sent from the presentation layer are separated from each other in different sessions.
- Protocols such as NetBIOS, RPC, Sockets and AppleTalk work here.
6. Presentation Layer
- It is similar in structure to the session layer.
- It is the layer where the transmitted information is encoded and decoded.
- It is the layer where the data is put into a form the other party can understand.
- Image formats such as GIF, TIFF and JPEG belong to this layer.
- Character encodings like ASCII also work on this layer.
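A short sketch shows why both parties must agree on the encoding. The UTF-8 and Latin-1 part is an added illustration beyond the ASCII case mentioned above.

```python
text = "HTTPS"
encoded = text.encode("ascii")       # sender: characters -> bytes
print(list(encoded))                 # [72, 84, 84, 80, 83]
print(encoded.decode("ascii"))       # receiver: bytes -> characters

# If the two sides disagree on the encoding, the text is garbled.
sent = "é".encode("utf-8")           # two bytes: 0xC3 0xA9
misread = sent.decode("latin-1")     # receiver wrongly assumes Latin-1
print(misread)                       # Ã©
```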
7. Application Layer
- Communication between device applications and the network is established here.
- It is the point where the user and the device meet.
- Applications that use protocols such as SSH, Telnet, HTTP, DNS and FTP (browsers, PuTTY, etc.) run on this layer.
SSL & TLS
SSL (Secure Sockets Layer) is a security standard used to create an encrypted connection between the server and the client.
An SSL certificate is tied to a key pair: a public key, which is contained in the certificate, and a private key, which stays on the server. These keys are used to create a secure connection between the server and the client. In addition, certificates contain information about the owner of the website.
To obtain a certificate, the web server operator must create a "Certificate Signing Request (CSR)". Generating the CSR also creates the public and private keys on the server. The CSR file, which contains the server's public key, is sent to a trusted certificate provider. The certificate authority creates the certificate for this CSR, and the certificate is then installed on the web server.
SSL consists of four protocols.
- SSL Record Protocol
- SSL Change Cipher Spec Protocol
- SSL Alert Protocol
- SSL Handshake Protocol
Two important concepts of SSL are the SSL session and the SSL connection. SSL Connection: A transient, peer-to-peer communication link. Every connection is associated with one session.
SSL Session: An association between a client and a server. Sessions are created with the Handshake Protocol. A session defines a set of cryptographic security parameters that can be shared across multiple connections. Sessions exist to avoid the costly negotiation of new security parameters for every connection.
TLS (Transport Layer Security) is a security protocol, first published in 1999, that provides data privacy and data integrity between two communicating applications.
TLS evolved from Netscape's SSL protocol, which encrypts data between client and server to ensure secure communication with websites.
SSL and TLS are protocols that form the security basis of HTTPS.
Features of TLS:
- Provides secure communication between the Web Browser and the server.
- Uses asymmetric cryptography to authenticate the parties and establish keys, and then encrypts the data with those keys.
- Secures many application protocols, such as HTTP, SMTP, POP3 and FTP.
- Secures operations we perform every day over the network without noticing, such as VPN connections, VoIP and instant messaging, in web browsers and other applications.
TLS consists of two layers.
- TLS Record Protocol
- TLS Handshake Protocol
The server and, optionally, the user are authenticated with the Handshake Protocol. The Handshake Protocol negotiates the encryption algorithms and encryption keys before any data is exchanged, while the Record Protocol uses them to keep the connection secure.
SKE (Symmetric Key Encryption)
The same key is used in symmetric encryption and decryption steps. AES, DES, 3DES, RC4 are the main symmetric encryption methods.
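The defining property, one key for both directions, can be shown with a toy cipher. This XOR scheme is an illustrative stand-in only: it is not AES or any of the algorithms named above and must not be used to protect real data.

```python
import secrets


def xor_cipher(data: bytes, key: bytes) -> bytes:
    """XOR every byte with the key, repeating the key as needed (toy cipher)."""
    return bytes(b ^ key[i % len(key)] for i, b in enumerate(data))


key = secrets.token_bytes(16)                 # the single shared secret key
ciphertext = xor_cipher(b"attack at dawn", key)
plaintext = xor_cipher(ciphertext, key)       # the same key and operation decrypt
print(ciphertext.hex())
print(plaintext)
```

Because the same function and the same key both encrypt and decrypt, anyone who learns the key can do either, which is the root of the key storage and distribution problems.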
Symmetric encryption has its advantages. Its main advantages are as follows:
- Encryption and decryption processes are fast, easy to perform with hardware.
- Confidentiality of communication between the parties is ensured.
- A degree of data integrity is provided: without the key, an attacker cannot change the original text in a controlled way.
In addition, there are some difficulties in the symmetric encryption process. The main difficulties of symmetric encryption can be listed as follows:
- Key is hard to hide. (Key Storage Problem)
- For a system with n users, [n * (n-1) / 2] keys should be stored. It is not scalable.
- Reliable key distribution is difficult. (Key Distribution Problem)
- It does not provide authenticity. The data may have been encrypted by anyone who holds the same key.
- It does not guarantee integrity on its own. The encrypted data may still have been altered by a person in the middle.
- It does not provide non-repudiation, as it provides neither authentication nor integrity.
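The scalability problem follows directly from the n * (n-1) / 2 formula, as this small sketch shows (the function name is hypothetical):

```python
def symmetric_keys_needed(n: int) -> int:
    """Number of distinct pairwise keys so that every pair of n users shares one."""
    return n * (n - 1) // 2


for users in (2, 10, 100, 1000):
    print(users, "users ->", symmetric_keys_needed(users), "keys")
```

Going from 10 to 1000 users raises the key count from 45 to 499500, which is why pairwise symmetric keys alone do not scale.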
AKE (Asymmetric Key Encryption)
In asymmetric encryption, different keys are used for encryption and decryption. These keys are referred to as the public key and the private (secret) key.
There are a number of things to know about the two keys:
- The public key is used for encryption and for verifying signatures, while the private key is used for decryption and signing.
- Data encrypted with the recipient's public key can only be opened with the recipient's private key. For this reason, the sender encrypts the data with the receiver's public key. This ensures that the encrypted data can only be read by the recipient, who holds the private key.
- Data signed with the sender's private key can be verified by the receiver and by anyone else who has the sender's public key. For this reason, the sender signs the data with its own private key. This ensures that the signed data could only have come from the sender, who holds the private key.
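Both uses of the key pair can be sketched with textbook RSA on tiny numbers. This is an assumption-laden toy: the primes 61 and 53 are far too small to be secure, and real systems add padding and sign a hash of the data rather than the data itself.

```python
# Toy RSA key pair built from tiny textbook primes (insecure by design).
p, q = 61, 53
n = p * q                    # modulus, part of both keys
phi = (p - 1) * (q - 1)
e = 17                       # public exponent
d = pow(e, -1, phi)          # private exponent: modular inverse (Python 3.8+)

public_key = (e, n)
private_key = (d, n)


def apply_key(value: int, key) -> int:
    """Raise value to the key's exponent modulo the key's modulus."""
    exponent, modulus = key
    return pow(value, exponent, modulus)


message = 65

# Confidentiality: encrypt with the public key, decrypt with the private key.
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)

# Authenticity: sign with the private key, verify with the public key.
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key) == message

print(recovered, verified)
```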
Public Key Infrastructure
- PKI provides the necessary infrastructure for the generation, distribution and authentication of digital certificates used for Authenticity in HTTPS.
- In PKI, third-party trusted organizations called CA (Certificate Authority) generate digital certificates for all parties (client, server) that want to communicate on the insecure network. The Public Key of the parties to which the certificate is issued is placed in these certificates. After this point, a party sends its own certificate to all parties it communicates with, and its Public Key is shared with the other party.
- A client that wants to verify whether a certificate received over the insecure network really belongs to the declared party follows the chain of signatures on the certificate up through the hierarchy to the root certificate.
In this chain, the root certificate at the top is self-signed. Root certificates are preinstalled on operating systems by operating system manufacturers. When Chrome and Internet Explorer check a site's certificate, they check whether the root certificate they arrive at is present in the operating system's certificate store. Firefox ships with its own set of root certificates.
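Walking the chain up to a trusted root can be sketched as below. The certificate names are made up, and the sketch only follows issuer links; it deliberately omits the cryptographic signature check that a real validator performs at every step.

```python
from typing import Dict, List, Optional, Set

# Hypothetical certificates, stored as subject -> issuer. All names are made up.
CERTS: Dict[str, str] = {
    "www.example.com": "Example Intermediate CA",
    "Example Intermediate CA": "Example Root CA",
    "Example Root CA": "Example Root CA",      # self-signed root
}
TRUSTED_ROOTS: Set[str] = {"Example Root CA"}  # stands in for the OS/browser store


def chain_to_root(subject: str, max_depth: int = 10) -> Optional[List[str]]:
    """Follow issuer links upward; return the chain if it ends at a trusted root."""
    chain = [subject]
    for _ in range(max_depth):
        issuer = CERTS.get(chain[-1])
        if issuer is None:
            return None                        # unknown issuer: chain is broken
        if issuer == chain[-1]:                # self-signed: reached a root
            return chain if issuer in TRUSTED_ROOTS else None
        chain.append(issuer)
    return None                                # chain too long, give up


print(chain_to_root("www.example.com"))
print(chain_to_root("unknown.example"))
```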
Diffie-Hellman Key Exchange
The purpose of this method, which is one of the public key exchange systems, is to let two parties arrive at a key known only to them while exchanging nothing but values that anyone may see.
The working logic of the system is based on a simple mathematical fact: $g^{ab} = g^{ba}$
Both parties agree on the numbers $p = 23$ and $g = 5$ (these numbers are known to both sides and are public)
Alice chooses the private key $a = 6$ and sends Bob $g^a \bmod p$
- $5^6 \bmod 23 = 8$
Bob chooses the private key $b = 15$ and sends Alice $g^b \bmod p$
- $5^{15} \bmod 23 = 19$
Alice calculates $(g^b \bmod p)^a \bmod p$
- $19^6 \bmod 23 = 2$
Bob calculates $(g^a \bmod p)^b \bmod p$
- $8^{15} \bmod 23 = 2$
In the end, the only values that travel back and forth are 8 and 19, and like the public parameters p and g they may be seen by everyone. However, only Alice and Bob can compute the shared key 2, because doing so requires combining the received value with one's own private key in the formula above.
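The same exchange can be reproduced with Python's built-in three-argument `pow`, which performs modular exponentiation. The numbers are the toy values from the text; real deployments use far larger parameters.

```python
p, g = 23, 5                  # public parameters, known to everyone

a = 6                         # Alice's private key
b = 15                        # Bob's private key

A = pow(g, a, p)              # Alice sends 8 to Bob
B = pow(g, b, p)              # Bob sends 19 to Alice

alice_secret = pow(B, a, p)   # Alice computes 19^6 mod 23
bob_secret = pow(A, b, p)     # Bob computes 8^15 mod 23

print(A, B, alice_secret, bob_secret)   # 8 19 2 2
```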
Having reviewed the six background topics above, we can move on to HTTPS itself.
HTTPS is the combination of HTTP with the SSL/TLS (Secure Sockets Layer / Transport Layer Security) protocols for encrypted communication and secure identification. By default, it uses port 443. In short, HTTPS means wrapping the standard HTTP protocol in TLS encryption.
The main purpose of HTTPS is to create a secure channel over an insecure communication network. In theory, this provides adequate protection against anyone trying to eavesdrop on the line. It is preferred for bank web pages and other applications requiring high security.
HTTPS security is based on the major certificate authorities whose certificates come pre-installed with the browser software. This amounts to saying, "I trust the certificate authority to tell me whom to trust." Therefore, an HTTPS connection to a web page can only be trusted if all of the following are true:
- The user trusts that the browser software correctly implements HTTPS with correctly pre-installed certificate authorities.
- The user trusts the certificate authority to vouch only for legitimate web pages without misleading names.
- The web page provides a valid certificate signed by a trusted authority. (Most browsers show a warning for an invalid certificate.)
- The certificate properly identifies the Web page.
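The last two conditions can be sketched as simple checks on a certificate record. The dictionary layout, the names and the dates are hypothetical simplifications; real certificates are X.509 structures, and real validation also verifies signatures and handles wildcard names.

```python
from datetime import datetime, timezone

# Hypothetical, simplified certificate record.
cert = {
    "subject": "www.example.com",
    "issuer": "Example CA",
    "not_before": datetime(2024, 1, 1, tzinfo=timezone.utc),
    "not_after": datetime(2025, 1, 1, tzinfo=timezone.utc),
}
TRUSTED_ISSUERS = {"Example CA"}   # stands in for the pre-installed authorities


def certificate_problems(cert: dict, hostname: str, now: datetime) -> list:
    """Return a list of reasons not to trust the certificate (empty if none)."""
    problems = []
    if cert["issuer"] not in TRUSTED_ISSUERS:
        problems.append("untrusted issuer")
    if not cert["not_before"] <= now <= cert["not_after"]:
        problems.append("outside validity period")
    if cert["subject"] != hostname:
        problems.append("name mismatch")
    return problems


when = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(certificate_problems(cert, "www.example.com", when))
print(certificate_problems(cert, "bank.example", when))
```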
Basically two main goals are achieved with HTTPS:
- Making sure that the address we reached is actually the address we are trying to reach,
- Encrypting the traffic between the server and the client so that it can only be decrypted by these two parties.
To achieve these goals, the first thing to do is to establish a TLS connection. The client we use for HTTP (our browser in most cases) also mediates the establishment of the TLS connection. The parties agree between them as follows:
- The client sends a "client hello" message to the server. This message contains the version of the protocol supported by the client and the list of preferred cryptographic algorithms.
- The server receiving the first message sends a "server hello" message to the client. This message contains the cryptographic algorithm and session ID selected from the list received from the client. The server also sends its digital certificate. In cases where the server needs to verify the client, a “client certificate request” message is sent, but we will assume that this is not required.
- In order to continue beyond this stage, the client must approve the server certificate. A server certificate contains the following information:
- Certificate holder,
- The domain or machine name where the certificate is valid,
- The server's public key,
- The date range for which the certificate is valid,
- Digital signature of the certificate.
- If this certificate is generated by a Certificate Authority (CA) that the client trusts, or if the client finds the certificate secure at this stage, the validation step of the certificate is passed. Otherwise, communication will not continue. This is where we see pages where our browsers warn us "the connection is not secure".
- Keep in mind that the actual traffic we intend to exchange between the server and the client has not started yet. At this point, the client has confirmed that it can trust the certificate it received, but the server has not yet proven that it is the real owner of this certificate. Since the server sends this certificate to everyone who communicates with it, any party that has obtained the certificate could be impersonating the server, so ownership needs to be verified. This verification relies on the server's public key included in the certificate: when the client sends data encrypted with this public key, the server can only decrypt it if it holds the private key associated with that public key.
- Now the client and server can generate a shared session key, using this public key and a key exchange algorithm, to encrypt the data sent and received throughout the session.
- After the key to be used in symmetric encryption is generated, the client sends the server a "finished" message encrypted with this key. From this point on, the parties use one of the symmetric encryption algorithms between them.
- The server receiving the message sends its own "finished" message to the client, encrypted with the same key. Thus, both parties have agreed on a shared key to be used for encryption, and the server has been authenticated.
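In practice, a TLS library performs all of these steps. The following is a minimal sketch using Python's standard `ssl` module; the function `negotiated_parameters` is a hypothetical helper that needs network access and is therefore defined but not called here.

```python
import socket
import ssl

# The default context loads the system's trusted CA certificates and
# enables certificate and hostname verification.
context = ssl.create_default_context()
context.minimum_version = ssl.TLSVersion.TLSv1_2   # refuse older protocol versions


def negotiated_parameters(host: str, port: int = 443):
    """Perform a TLS handshake and report what was negotiated (needs network)."""
    with socket.create_connection((host, port), timeout=10) as tcp:
        # wrap_socket runs the handshake: hello messages, certificate
        # validation, key exchange and the finished messages.
        with context.wrap_socket(tcp, server_hostname=host) as tls:
            return tls.version(), tls.cipher()[0], tls.getpeercert()["subject"]


print(context.verify_mode == ssl.CERT_REQUIRED, context.check_hostname)
```

Calling `negotiated_parameters("example.com")` on a machine with Internet access would return the protocol version, the chosen cipher suite and the subject of the server's certificate. If validation fails, `wrap_socket` raises an exception instead of continuing, which corresponds to the browser warning described above.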
After all this effort, the client still has not sent even a single HTTP GET request to the server; only a secure tunnel has been created between the client and the server. From this point on, the parties communicate as in standard HTTP, but since all the data going back and forth is encrypted, third parties cannot see the data or even the full address we request, unlike in HTTP.
As a result:
- HTTPS is the secure version of HTTP.
- Its purpose is to hide the data carried over the transport layer.
- HTTPS allows the client to authenticate the server.
- Privacy is ensured by encrypting the data at the transport layer; in terms of content (GET, PUT, POST, etc.) it is no different from HTTP.
- The client first establishes a TCP connection with the server, then performs the SSL/TLS handshake, and only then starts the HTTP exchange.
- Since the SSL / TLS mechanism works independently, it can be used between any two protocols under appropriate conditions.
- While the data prepared in the Application layer is transmitted to the Transport layer, it is encrypted and sent with SSL/TLS. Since the receiver will decode SSL/TLS in the Transport layer, the Application layer does not need to be aware of the encryption logic.