Authentication and Authorization concepts you must know
Your last 5000 words on authentication
What is Authentication?
Authentication is the way we check if someone is who they say they are. In the digital world, we use things like passwords, biometrics, or security tokens to do this.
Passwords have been the most common method since the 1960s, but technology has brought us better ways, like fingerprint or facial recognition, multi-factor authentication (MFA), and two-factor authentication (2FA).
These improvements help make security stronger and protect against the weaknesses of just using passwords.
What is Authorization?
Authorization is the process of giving or not giving access to a user, program, or process. It decides what actions a user can do or what resources they can use after their identity is checked.
For example, a system administrator might get top-level access to a resource, while a regular business user might get limited access or no access to the same resource.
People often mix up authorization and authentication. Authentication checks a user's identity, while authorization decides what the user can do or access after their identity is confirmed.
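To make the difference concrete, here is a minimal sketch in Python. The user store, role names, and the `authenticate` and `authorize` helpers are all made up for illustration; a real system would use a proper user database and policy engine.

```python
import hashlib
import hmac
import os

def _hash_password(password: str, salt: bytes) -> bytes:
    # Salted, slow hash so stored passwords are not kept in plain text.
    return hashlib.pbkdf2_hmac("sha256", password.encode(), salt, 100_000)

def make_user(password: str, role: str) -> dict:
    salt = os.urandom(16)
    return {"salt": salt, "hash": _hash_password(password, salt), "role": role}

# Hypothetical in-memory user store: username -> salt, password hash, role.
USERS = {
    "alice": make_user("correct horse", "admin"),
    "bob": make_user("battery staple", "viewer"),
}

# Hypothetical mapping from role to allowed actions.
PERMISSIONS = {
    "admin": {"read", "write", "delete"},
    "viewer": {"read"},
}

def authenticate(username: str, password: str) -> bool:
    """Authentication: is this user who they claim to be?"""
    user = USERS.get(username)
    if user is None:
        return False
    candidate = _hash_password(password, user["salt"])
    return hmac.compare_digest(candidate, user["hash"])

def authorize(username: str, action: str) -> bool:
    """Authorization: may this (already authenticated) user do this?"""
    role = USERS[username]["role"]
    return action in PERMISSIONS.get(role, set())
```

Here `authenticate("bob", "battery staple")` succeeds, yet `authorize("bob", "delete")` still fails: knowing who Bob is does not mean Bob may do everything.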
Single Sign-On (SSO)
What?
SSO is a way for users to safely sign in to many apps and websites using only one set of login details. With SSO, a user signs in once and can use multiple systems without needing to log in again for each one.
Why was it introduced?
Before Single Sign-On (SSO), users had to remember many usernames and passwords for different apps. This was not just inconvenient but also a security risk, as users often used weak or repeated passwords. It also made things harder for IT departments, causing higher costs and more time spent on password resets and account management. SSO was created to make this process easier by letting users sign in once and access many apps, improving user experience, lowering administrative work, and increasing security.
Before SAML
At first, SSO used cookies within a single domain. When a user logged into an app, a cookie was made, and this cookie helped the user access other apps in the same domain.
But, this method had problems. It didn't work across different domains because of security rules that stop one domain from using cookies from another domain.
So, standards like SAML (Security Assertion Markup Language) were developed. SAML lets web domains securely share user authentication and authorization info.
Identity Provider
An Identity Provider is a service that stores and manages digital identities:
Verifying Identities: It authenticates user credentials and confirms that a user is who they claim to be.
Issuing Authentication Assertions: Once the user is successfully authenticated, the IdP gives out a token or assertion (such as a SAML assertion or an OIDC ID token, which is a JWT). The user can then use this to access other services.
Centralizing User Management: The IdP works as a central system for handling user identities, passwords, and other details, like roles, permissions, and personal information.
Service Provider
A Service Provider (SP) is a system or application that offers services to users and relies on an Identity Provider to authenticate these users:
Delegating Authentication: Instead of managing authentication on its own, the SP gives this job to an IdP. When a user tries to use the service, they are sent to the IdP to be authenticated.
Validating Authentication Assertions: After the IdP checks the user and gives an authentication token, the SP checks this information. This makes sure the user is verified and can use the service.
Providing Access Based on Authentication: After validating the authentication assertion from the IdP, the SP grants the user access to its services or resources.
Generalized Flow of SSO
The user attempts to access an application that is part of the SSO setup.
The application (Service Provider) sends a request to the IdP to authenticate the user.
The IdP checks if there's an existing authentication session for the user. If not, the user must enter their credentials.
Once the user is authenticated, the IdP sends an assertion or token back to the service provider.
The service provider verifies the assertion and grants the user access to the application.
If the user then tries to access another application within the SSO setup, steps 2-5 are repeated. However, since the user is already authenticated with the IdP, they won't need to enter their credentials again.
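The flow above can be simulated in a few lines of Python. This is only a toy model: the `IdentityProvider` and `ServiceProvider` classes are invented for illustration, the "assertion" is a JSON string signed with a shared HMAC key (real deployments use SAML or OIDC), and browser redirects are replaced by direct function calls.

```python
import hashlib
import hmac
import json
import secrets

# Assumption: the IdP and every SP share this key out of band.
SHARED_KEY = secrets.token_bytes(32)

class IdentityProvider:
    def __init__(self, users):
        self.users = users      # username -> password (plain text for brevity only)
        self.sessions = {}      # IdP session id -> username

    def login(self, username, password):
        """Check credentials once and open an IdP session."""
        if self.users.get(username) != password:
            raise PermissionError("bad credentials")
        session_id = secrets.token_urlsafe(16)
        self.sessions[session_id] = username
        return session_id

    def issue_assertion(self, session_id, audience):
        """Issue a signed assertion for an existing session."""
        username = self.sessions[session_id]   # KeyError if there is no session
        body = json.dumps({"sub": username, "aud": audience}, sort_keys=True)
        sig = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
        return body, sig

class ServiceProvider:
    def __init__(self, name):
        self.name = name

    def accept(self, body, sig):
        """Verify the assertion before granting access."""
        expected = hmac.new(SHARED_KEY, body.encode(), hashlib.sha256).hexdigest()
        if not hmac.compare_digest(expected, sig):
            raise PermissionError("invalid signature")
        claims = json.loads(body)
        if claims["aud"] != self.name:
            raise PermissionError("assertion issued for another service")
        return claims["sub"]

idp = IdentityProvider({"alice": "s3cret"})
session = idp.login("alice", "s3cret")      # credentials are entered once
mail, wiki = ServiceProvider("mail"), ServiceProvider("wiki")
mail_user = mail.accept(*idp.issue_assertion(session, "mail"))
wiki_user = wiki.accept(*idp.issue_assertion(session, "wiki"))  # no second login
```

The second application gets its own assertion from the existing IdP session, which is why the user never sees a second login prompt.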
SAML (Security Assertion Markup Language)
What?
Security Assertion Markup Language (SAML) is an XML-based open standard that helps exchange authentication and authorization information between an identity provider (IdP) and a service provider (SP). It's used to enable single sign-on (SSO).
How does it work?
SAML helps move a user's identity from the IdP to the SP. It uses signed XML documents for this. When a user tries to use a service, the SP sends a SAML request to the IdP. The IdP checks the user and sends a SAML response, saying the user is authenticated. The SP then allows access based on this information.
Flow of SAML
User attempts to access a resource at the SP.
SP sends a SAML request to the IdP.
User is authenticated by the IdP.
IdP sends a SAML response with an assertion to the SP.
SP verifies the assertion and grants access to the user.
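As a rough illustration of the second step, this is how an SP might build a bare-bones SAML 2.0 AuthnRequest and pack it into a redirect URL using the HTTP-Redirect binding (raw DEFLATE, then Base64, then URL encoding). The function names and the example.com URLs are placeholders; the request is unsigned, the values are not XML-escaped, and production code should rely on a maintained SAML library instead.

```python
import base64
import urllib.parse
import uuid
import zlib
from datetime import datetime, timezone

def build_authn_request(sp_entity_id: str, acs_url: str, idp_sso_url: str) -> str:
    """Return the XML of a minimal, unsigned SAML 2.0 AuthnRequest."""
    issue_instant = datetime.now(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
    return (
        '<samlp:AuthnRequest'
        ' xmlns:samlp="urn:oasis:names:tc:SAML:2.0:protocol"'
        ' xmlns:saml="urn:oasis:names:tc:SAML:2.0:assertion"'
        f' ID="_{uuid.uuid4().hex}" Version="2.0"'
        f' IssueInstant="{issue_instant}"'
        f' Destination="{idp_sso_url}"'
        f' AssertionConsumerServiceURL="{acs_url}">'
        f'<saml:Issuer>{sp_entity_id}</saml:Issuer>'
        '</samlp:AuthnRequest>'
    )

def redirect_url(idp_sso_url: str, xml: str) -> str:
    """SP side: raw DEFLATE, then Base64, then URL-encode."""
    compressor = zlib.compressobj(wbits=-15)   # negative wbits = no zlib header
    deflated = compressor.compress(xml.encode()) + compressor.flush()
    saml_request = base64.b64encode(deflated).decode()
    return idp_sso_url + "?" + urllib.parse.urlencode({"SAMLRequest": saml_request})

def decode_saml_request(url: str) -> str:
    """IdP side: reverse the encoding to recover the XML."""
    query = urllib.parse.parse_qs(urllib.parse.urlparse(url).query)
    deflated = base64.b64decode(query["SAMLRequest"][0])
    return zlib.decompress(deflated, wbits=-15).decode()

xml = build_authn_request(
    "https://sp.example.com/metadata",
    "https://sp.example.com/acs",
    "https://idp.example.com/sso",
)
url = redirect_url("https://idp.example.com/sso", xml)
```

The IdP's response travels back the other way, typically as a signed XML document posted to the SP's assertion consumer URL.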
Why is SAML better than the cookie-based approach?
Cross-Domain Authentication: Unlike cookies that work only on one domain, SAML helps with single sign-on across various domains.
Security: SAML offers a safer method for managing authentication because it uses digital signatures and encryption to protect the data being transferred between the identity provider and service provider.
Standardization: SAML makes it easier for identity providers and service providers to talk to each other using a standard method. This standard way helps different systems and organizations work together more easily, making identity sharing simpler.
Reduced Session Handling: With SAML, the service provider doesn't have to store or check user credentials because the identity provider takes care of authentication. The service provider still keeps a local session once it has accepted an assertion, but the workload and security risks of managing logins move to the identity provider.
Limitations of SAML that led to OAuth
Limited Application Scope: SAML was mainly created for web-based login processes and works best for business applications with a central user database. It isn't a good fit for communication between apps, modern API services, or connecting with outside identity providers not controlled by the company.
Complexity: SAML uses XML and has more complex rules, which can make it harder to set up and use. This complexity can cause mistakes in the setup and make it difficult for developers to add SAML to their applications.
Performance: Again, due to XML, messages can be long and result in bigger data sizes. This can affect performance, particularly in systems with many authentication requests.
OAuth
Motivation behind OAuth
SAML mainly helps with authentication by checking a user's identity and making single sign-on (SSO) possible across different systems. It uses XML-based assertions to safely share a user's authentication status and attributes.
SAML's architecture is more suited for straightforward identity assertions, not for scenarios where a user needs to grant specific permissions to third-party applications.
As the digital world evolved, the need to grant specific permissions to third-party applications became more important. For example, you're using a photo editing app and want to import photos from your Instagram account. OAuth allows the photo app to access your Instagram photos without needing your Instagram username and password.
However, it's important to note that if the user is not already authenticated, the service redirects them to an authentication service (SAML or OIDC).
Clarifying the difference between OAuth and SAML
User Consent and Delegated Access: OAuth begins when a user grants a third-party application permission to access their data on a service.
Token-based Access: OAuth uses access tokens for authorization. These tokens grant the client application limited access without exposing user credentials.
OAuth vs. SAML Flow: Unlike SAML's identity assertions, OAuth's access tokens do not share information about the user's identity. OAuth tokens are about granting permissions, not asserting identity.
Full Flow of OAuth
You may or may not have to authenticate before granting access. Oftentimes, if you're already logged in, you won't have to. This would mean skipping the authentication step below:
Initiate Access Request: The user attempts to access a service or perform an action that requires authentication and authorization.
Redirect to Authentication Service: If the user is not already authenticated, the service redirects them to the authentication service (SAML or OIDC).
User Authentication: The user logs in using their credentials. This step verifies the user's identity.
Authentication Success and Redirect Back: When the user successfully logs in, the authentication service checks their identity and sends them back to the original service or app, usually with an authorization code or token.
Request for Authorization: The service then requests an access token from the authorization server using the received code or token.
Granting Access Tokens: The authorization server validates the request and issues an access token to the application.
Access Granted: The app uses the access token to ask for access to resources or do things for the user. The token shows what permissions the user has given to the app.
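Here is a toy version of the last few steps, written in the style of the authorization code exchange. Everything lives in memory, and the class names, the `photos.read` scope, and the `photo-editor` client ID are made up; user login and consent are assumed to have already happened.

```python
import secrets

class AuthorizationServer:
    def __init__(self):
        self.codes = {}    # authorization code -> (client_id, user, scope)
        self.tokens = {}   # access token -> (user, scope)

    def authorize(self, client_id, user, scope):
        """Called after the user has logged in and given consent."""
        code = secrets.token_urlsafe(16)
        self.codes[code] = (client_id, user, scope)
        return code

    def exchange(self, client_id, code):
        """Swap a one-time authorization code for an access token."""
        issued_to, user, scope = self.codes.pop(code)   # pop: codes are single use
        if issued_to != client_id:
            raise PermissionError("code was issued to another client")
        token = secrets.token_urlsafe(32)
        self.tokens[token] = (user, scope)
        return token

class ResourceServer:
    def __init__(self, auth_server, photos):
        self.auth_server = auth_server
        self.photos = photos   # user -> list of photo names

    def list_photos(self, token):
        user, scope = self.auth_server.tokens[token]   # KeyError if unknown token
        if "photos.read" not in scope:
            raise PermissionError("token lacks the photos.read scope")
        return self.photos[user]

auth = AuthorizationServer()
api = ResourceServer(auth, {"alice": ["beach.jpg", "cat.jpg"]})

code = auth.authorize("photo-editor", "alice", {"photos.read"})
access_token = auth.exchange("photo-editor", code)
photos = api.list_photos(access_token)
```

Notice that the photo editor never handles Alice's password. It only ever sees the code and the scoped access token.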
Key Points in This Process
Separation of Concerns: Authentication and authorization are treated as separate steps, even though they are part of the same overall process. First, find out who the user is (authentication), and then figure out what they can do or access (authorization).
OAuth's Role: In this process, OAuth mainly focuses on the authorization part. It takes care of permissions and access levels after the user's identity is confirmed.
Enhanced Security: By splitting authentication and authorization, and not giving the user's details to third-party services, this method makes things more secure. The third-party service only gets a token with limited permissions, not the user's information.
Pros of OAuth
Flexibility and Scalability: It's well-suited for modern applications, including mobile and SPAs.
Fine-grained Access Control: Allows users to specify exact permissions for third-party applications.
Enhanced Security: Reduces the risk associated with sharing credentials.
Limitations of OAuth that led to OAuth 2.0
Complex Signatures: OAuth 1.0 required each request to be cryptographically signed. This process was complicated and easy to get wrong, especially for developers not specialized in security. The signature was needed to prove the request was really coming from the correct app (the client application) and that no one had changed the request while it was on its way to the server.
Not Flexible: It was hard to adapt OAuth 1.0 to different types of apps or websites. It was also not great at handling various ways to manage access permissions.
Difficult for Mobile Apps: OAuth 1.0 had security needs that were hard to meet in mobile apps.
Not User-Friendly: The process for users to log in or grant access was often long and confusing, which was not great for a smooth experience, especially on websites that update without reloading or on mobile apps.
Inconsistent Across Services: Different websites and apps used OAuth 1.0 in their own ways, which made it hard for developers to create apps that needed to work with many different services.
OAuth 2.0
OAuth 2.0 is a complete rewrite of OAuth 1.0.
Here are the things it introduces that solve the limitations of 1.0:
Simplicity: OAuth 2.0 simplifies the process by removing the need for clients to cryptographically sign their requests, making it easier for developers to implement.
Support for Non-Browser Clients: OAuth 2.0 adds special authorization methods for various app types, like web apps, desktop apps, mobile phones, smart devices, and non-browser apps like API-based services.
Token Management: OAuth 2.0 introduces the concept of refresh tokens, which can be used to obtain new access tokens without requiring the resource owner to re-authenticate. This simplifies token management.
Security: Instead of signing each request, OAuth 2.0 relies on TLS (HTTPS) to protect tokens in transit. TLS is already implemented on virtually every client and server platform.
Performance: OAuth 2.0 improves performance by introducing short-lived access tokens and long-lived refresh tokens, reducing the number of tokens that need to be stored and processed.
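The access token and refresh token pairing can be sketched like this. The `TokenService` class and the 300-second lifetime are assumptions for the example; a real authorization server would also authenticate the client and could rotate or expire refresh tokens.

```python
import secrets
import time

ACCESS_TTL = 300  # seconds; an assumed short lifetime for access tokens

class TokenService:
    def __init__(self):
        self.refresh_tokens = {}  # refresh token -> user
        self.access_tokens = {}   # access token -> (user, expires_at)

    def _new_access_token(self, user, now):
        token = secrets.token_urlsafe(32)
        self.access_tokens[token] = (user, now + ACCESS_TTL)
        return token

    def login(self, user, now=None):
        """Issue an access token plus a long-lived refresh token."""
        now = time.time() if now is None else now
        refresh_token = secrets.token_urlsafe(32)
        self.refresh_tokens[refresh_token] = user
        return self._new_access_token(user, now), refresh_token

    def refresh(self, refresh_token, now=None):
        """Get a new access token without asking the user to log in again."""
        now = time.time() if now is None else now
        user = self.refresh_tokens[refresh_token]   # KeyError if revoked or unknown
        return self._new_access_token(user, now)

    def is_valid(self, access_token, now=None):
        now = time.time() if now is None else now
        entry = self.access_tokens.get(access_token)
        return entry is not None and now < entry[1]

service = TokenService()
access, refresh = service.login("alice", now=0)
renewed = service.refresh(refresh, now=301)   # the first access token has expired by now
```

Deleting an entry from `refresh_tokens` is how this sketch would cut a client off: the current access token keeps working only until it expires.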
JSON Web Tokens (JWT)
What
JWT is like a note that says who you are and what you can do, which the website reads and understands. It's made in a way that it’s really hard for someone to fake or change it without the website knowing. This way, you can move around the site or use different parts of an app without having to log in again and again.
In more technical terms: A simple and complete way to safely send information between parties using a JSON object.
JWT is an open standard (RFC 7519).
Why was it introduced
Before the introduction of JSON Web Tokens (JWT), there were several challenges in the world of web authentication and authorization.
Stateful Sessions: Traditional session-based authentication needed the server to save session info, usually in a central database. This method had problems when more users joined, and it didn't work well for systems with many servers that had to check user details.
Session Hijacking and CSRF: Session-based systems could be attacked through session hijacking and Cross-Site Request Forgery (CSRF). In session hijacking, an attacker might steal a user's session ID and pretend to be them. CSRF attacks trick the victim into sending a harmful request, using the victim's identity and privileges to do something they don't want.
Lack of Fine-Grained Control: Session-based systems often lacked fine-grained control over user permissions. Once a user was logged in, they were usually treated as a fully authenticated user until their session expired.
Cross-Domain Authentication: Session-based systems struggled with cross-domain authentication. As web applications evolved to become more distributed and service-oriented, there was a need for a way to authenticate users across different domains and services.
JWT addresses these issues by providing a stateless, secure way to share data between parties. Being self-contained, JWTs eliminate the need for server-side session storage, making them scalable and suitable for distributed systems. They offer granular user permission control through claims and enable cross-domain authentication.
There are also other benefits when you compare JWT to SAML assertions (based on XML).
Where are they used
Believe it or not, they're literally used everywhere. To name a few places:
OAuth 2.0
OIDC
Single Page Applications
REST APIs
Server to Server communication
Components
A JWT consists of three main parts: a header, a payload, and a signature.
Header: The header usually has two parts: the token type (which is JWT) and the signing algorithm used (such as HS256 or RS256). This part of the token describes the cryptographic actions applied to the token and its type. The header is then encoded with Base64Url to make the first part of the JWT.
Payload: The second part of the token is the payload, which has the claims. Claims are statements about something and extra information. There are three kinds of claims: registered, public, and private claims. The payload is also encoded with Base64Url to make the second part of the JWT.
Signature: To create the signature part you have to take the encoded header, the encoded payload, a secret, the algorithm specified in the header, and sign that. The signature is used to verify that the sender of the JWT is who they say they are and to ensure that the message wasn't changed along the way. The signature is then appended to the JWT.
These three parts are concatenated with dots to form the complete JWT.
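To see the three parts come together, here is a hand-rolled HS256 token built with only the Python standard library. The `encode_jwt` helper and the `demo-secret` key are for illustration; in real code you would normally reach for a maintained JWT library.

```python
import base64
import hashlib
import hmac
import json

def b64url(data: bytes) -> str:
    # Base64Url without padding, as used by JWTs.
    return base64.urlsafe_b64encode(data).rstrip(b"=").decode()

def encode_jwt(payload: dict, secret: bytes) -> str:
    header = {"alg": "HS256", "typ": "JWT"}
    # Encode header and payload, then join them with a dot.
    signing_input = ".".join(
        b64url(json.dumps(part, separators=(",", ":")).encode())
        for part in (header, payload)
    )
    # Sign "<header>.<payload>" with HMAC-SHA256 and append the signature.
    signature = hmac.new(secret, signing_input.encode(), hashlib.sha256).digest()
    return f"{signing_input}.{b64url(signature)}"

token = encode_jwt({"sub": "1234567890", "name": "John Doe"}, b"demo-secret")
header_b64, payload_b64, signature_b64 = token.split(".")
```

Splitting the result on the dots gives back the encoded header, payload, and signature, in that order.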
Claims
A claim in a JWT (JSON Web Token) is like a piece of information about a user or a system. It's a key-value pair in a JSON object. For example, if we have "name": "John Doe," the claim key is "name," and the value is "John Doe." The value of a claim can be any JSON value, such as a string, a number, an array, or an object.
There are three types of claims:
Registered Claims: These are suggested, but not required, claims that help create useful and compatible information. Examples include:
iss (Issuer): Identifies who issued the token.
exp (Expiration Time): The time after which the token is no longer valid.
sub (Subject): The subject of the token, often the user ID.
aud (Audience): The intended recipients of the token.
Public Claims: These are custom claims created to share information between parties. They should be uniquely named to avoid collision with other claims.
Private Claims: These are custom claims to share information between parties that have agreed on using them. They are not registered or public claims and are often specific to a particular use case.
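As a sketch of how a server might check the registered claims, assuming the token's signature has already been verified and the payload decoded into a dict. The `validate_claims` function is hypothetical, and it treats `exp` as mandatory even though the JWT spec makes it optional.

```python
import time

def validate_claims(claims: dict, issuer: str, audience: str, now=None) -> None:
    """Raise ValueError if a registered claim fails a check."""
    now = time.time() if now is None else now
    if claims.get("iss") != issuer:
        raise ValueError("unexpected issuer")
    # "aud" may be a single string or a list of strings.
    aud = claims.get("aud")
    audiences = aud if isinstance(aud, list) else [aud]
    if audience not in audiences:
        raise ValueError("token not intended for this audience")
    # "exp" is a Unix timestamp in seconds.
    exp = claims.get("exp")
    if exp is None or now >= exp:
        raise ValueError("token expired")

claims = {
    "iss": "https://idp.example.com",   # hypothetical issuer
    "sub": "user-42",
    "aud": ["photo-app", "billing-app"],
    "exp": 2000,
    "role": "editor",                   # a private claim
}
validate_claims(claims, "https://idp.example.com", "photo-app", now=1000)
```

Private claims like `role` are not checked here; what they mean is up to the parties that agreed on them.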
Why do we need claims?
Compact Information Sharing: Claims give a way to share important information about the user and the token in a small format. This is helpful when we need to save bandwidth and improve performance, like in mobile or single-page apps.
Stateless Authentication: Claims let the token be self-contained. Servers can check the token and know the user's context without having to look in a database or keep track of session state. This makes it easier to scale and lowers the load on the server.
Fine-Grained Access Control: With detailed info about user permissions and roles in the claims, JWTs help create precise access control. This lets servers decide who can access what based on the token's content.
Security: Claims like exp (expiration time) help keep tokens secure by making sure they're only valid for a specific time.
Signing Tokens: Symmetric vs Asymmetric
In JWT (JSON Web Tokens), tokens can be signed using either symmetric or asymmetric algorithms. Each has its use cases, advantages, and disadvantages.
Symmetric Signing (e.g., HMAC SHA256)
How it Works
Both the signing and verification of the JWT are done with the same secret key.
Pros
Simplicity: Easier to implement as it requires managing only one key.
Speed: Generally faster than asymmetric signing.
Cons
Key Sharing: The same key must be securely shared between parties, which can be a risk if the key is exposed.
Less suitable for public APIs or distributed systems where sharing a secret key is not possible.
Use Case
Best for situations where the issuer and verifier of the token are the same entity or have a secure channel to share the secret key (e.g., internal applications).
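The key-sharing drawback is easy to demonstrate: with a shared secret, anyone who can verify a token can also create one. The sketch below uses raw HMAC-SHA256 over a message rather than a full JWT, and the key and payloads are made up.

```python
import hashlib
import hmac

def sign(message: bytes, key: bytes) -> bytes:
    return hmac.new(key, message, hashlib.sha256).digest()

def verify(message: bytes, signature: bytes, key: bytes) -> bool:
    # compare_digest avoids leaking information through timing differences.
    return hmac.compare_digest(sign(message, key), signature)

shared_key = b"hypothetical-shared-secret"

# The issuer signs a payload...
issued = b'{"sub":"alice","role":"viewer"}'
issued_sig = sign(issued, shared_key)

# ...but a verifier holding the same key can mint its own valid payload.
forged = b'{"sub":"alice","role":"admin"}'
forged_sig = sign(forged, shared_key)
```

Both signatures pass `verify`, which is why symmetric signing only makes sense when every key holder is trusted to issue tokens.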
Asymmetric Signing (e.g., RSA SHA256)
How it Works
The JWT is signed with a private key, and verified with the corresponding public key.
Pros
Enhanced Security: The private key used for signing does not need to be shared, reducing the risk of exposure.
Suitable for Public Exposure: The public key can be shared openly, making it suitable for scenarios where the issuer and verifier are different entities.
Cons
Complexity: Requires managing two keys (public and private), adding complexity.
Slower: Usually, it's slower than symmetric signing because of the extra calculations needed.
Use Case
Ideal for distributed environments or third-party APIs where the verifier needs to confirm the token's authenticity without having access to a secret key (e.g., microservices, public-facing APIs).
Weaknesses
Lack of Encryption: JWTs are not encrypted by default, so the data inside them is visible in plain text. They can be signed to make sure the data is accurate, but sensitive information in a JWT stays unencrypted. This can be a big worry when sending private user data.
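Weak Revocation Method: Because a JWT is verified without contacting the issuer, it can't be revoked before it expires unless the server keeps extra state, such as a denylist of token IDs. This can be a security problem if someone steals the token.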
Increased Attack Surface: The complexity of JSON parsing and validation in JWTs can lead to an increased attack surface, making them vulnerable to various exploits and attacks.
Poor Implementation and Configuration: Misconfiguring JWT libraries, using weak secret keys, and poor implementation of API clients can lead to security vulnerabilities and exploitation.
Client-Side Storage Vulnerabilities: JWTs are typically stored in client-side storage mechanisms, such as browser cookies or local storage, which can expose them to various client-side vulnerabilities, such as cross-site scripting (XSS) attacks.
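The "Lack of Encryption" point is worth seeing in action: reading a JWT's payload takes no key at all. In this sketch the token is hypothetical and its signature segment is just a placeholder, since `peek_payload` never looks at it.

```python
import base64
import json

def b64url_decode(segment: str) -> bytes:
    # Restore the padding that JWTs strip before decoding.
    return base64.urlsafe_b64decode(segment + "=" * (-len(segment) % 4))

def peek_payload(token: str) -> dict:
    """Decode the payload WITHOUT verifying the signature."""
    _header, payload, _signature = token.split(".")
    return json.loads(b64url_decode(payload))

# Build a hypothetical token; the last segment is a placeholder, not a real signature.
payload_segment = base64.urlsafe_b64encode(
    json.dumps({"sub": "alice", "email": "alice@example.com"}).encode()
).rstrip(b"=").decode()
token = f"eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9.{payload_segment}.signature"

claims = peek_payload(token)
```

Anyone who gets hold of the token can do the same, so treat the payload as public and never trust it before the signature has been checked.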
Best Practices
JWT as Access Token: JWTs can help stop unwanted access to protected resources. They are often used as Bearer tokens, and the API checks and confirms them before giving a response.
Storage: Store JWTs in cookies marked HttpOnly and Secure for better security. HttpOnly cookies are not accessible via JavaScript, and they are automatically sent to the server. Because they are sent automatically, pair them with the SameSite attribute or another CSRF defense.
Always Use HTTPS: Ensure that JWTs are only transmitted over HTTPS so they can't be read or changed in transit.
Security Considerations: Choose a strong algorithm and key for signing and verifying tokens. Keep the key in a safe place, and remember that the security of the whole system depends on the algorithm and key selection.
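As one possible way to apply the storage advice, this sketch builds a `Set-Cookie` header value with Python's standard `http.cookies` module. The cookie name `access_token`, the 15-minute lifetime, and the `SameSite=Strict` choice are assumptions for the example, not requirements.

```python
from http.cookies import SimpleCookie

def session_cookie_header(jwt: str, max_age: int = 900) -> str:
    """Return the value for a Set-Cookie header carrying the token."""
    cookie = SimpleCookie()
    cookie["access_token"] = jwt          # hypothetical cookie name
    morsel = cookie["access_token"]
    morsel["httponly"] = True             # not readable from JavaScript
    morsel["secure"] = True               # only sent over HTTPS
    morsel["samesite"] = "Strict"         # not sent on cross-site requests
    morsel["path"] = "/"
    morsel["max-age"] = max_age           # lifetime in seconds
    return morsel.OutputString()

header_value = session_cookie_header("aaa.bbb.ccc")
```

A web framework would normally set these flags through its own response API; the point is simply which attributes should end up on the cookie.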
OpenID Connect (OIDC)
Motivation behind OIDC
OAuth 2.0 was a great step forward, but it had limitations:
Lack of User Authentication: OAuth 2.0 is made to let applications use user resources, but it doesn't have a way to check the user's identity. It assumes that the application or another method takes care of authentication.
No Standardized Way to Retrieve User Information: OAuth 2.0 doesn't have a standard method for getting basic user profile information. Different service providers use different ways to access user data, which can cause inconsistencies and make integration more complicated.
Ambiguity in User Identification: Even after a user allows access, the app might not have a dependable way to get the user's identity, like a unique identifier. This makes it hard to personalize the user experience or handle user-specific data.
What is OIDC
Think of OpenID Connect (OIDC) as an add-on to the existing OAuth 2.0 system, which is widely used for giving permissions to apps to do things on your behalf. While OAuth 2.0 is great at letting apps access your data without sharing your password, it doesn't tell the app who you are. That's where OIDC comes in.
What does it do?
OIDC Tells Apps Who You Are: It offers a method for apps to not only access your data but also to identify who you are. Think of it as presenting your ID card while granting someone permission to pick up a package on your behalf.
Adds an Extra Layer: OIDC builds on top of OAuth 2.0 by adding an "identity layer." This means it adds a way for the app to understand your identity, which OAuth 2.0 doesn't do by itself.
Uses Tokens: Derived from OAuth 2.0, it uses access and refresh tokens. However, it introduces a new type of token: ID Token.
ID Token
OIDC brings in the idea of an ID token, a security token with details about the user's authentication and other information requested by the client. The ID token is a JSON Web Token (JWT) that provides a standard method to obtain basic user profile information. This fixes the inconsistency and complexity in OAuth 2.0 when getting user data.
Clients can validate the ID token themselves, which is a significant improvement over OAuth 2.0. In OAuth 2.0, access tokens are opaque to clients and are only meant to be checked by the resource server. This OIDC feature allows applications to reliably know the user's identity, such as a unique identifier, solving the uncertainty in user identification found in OAuth 2.0.
It's a bit confusing. I know because I was confused myself lol.
Let's dive into a real flow to understand the whole picture and the role of ID tokens.
Flow of OIDC
Imagine you're using a photo-sharing app (the client) that lets you log in and access photos from your social media account (the Identity Provider, IdP).
Authentication and ID Token
User Action: You click "Log in with Social Media" in the photo-sharing app.
Redirect and Login: The app redirects you to the social media platform for authentication. You enter your credentials there.
ID Token Issued: Upon successful authentication, the social media platform issues an ID token.
Role of ID Token: The photo-sharing app receives the ID token and decodes it. The ID token contains your social media profile details (like user ID, name, etc.).
Purpose of ID Token: It verifies your identity to the photo-sharing app. Without this token, the app wouldn't know who you are or if you're legitimately logged in.
User Experience: The app uses information from the ID token to personalize your experience (like displaying your name).
Authorization and Access Token
Permission Request: The app requests permission to access your social media photos.
User Consent: You agree to these permissions.
Access Token Issued: The social media platform issues an access token to the app.
Role of Access Token: This token is then used by the photo-sharing app to access your social media photos.
Purpose of Access Token: It authorizes the app to access your photos without exposing your social media credentials. Without this token, the app cannot access or import your photos.
Data Retrieval: The app makes API calls to the social media platform, including the access token, to fetch and display your photos.
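From the photo-sharing app's side, the two ends of this flow might look roughly like the sketch below: building the login redirect, and later checking the claims of the ID token that comes back. The function names, endpoint URL, and client ID are placeholders, and the sketch assumes the ID token's signature has already been verified by a JWT library.

```python
import secrets
from urllib.parse import urlencode

def build_auth_request(authorize_endpoint, client_id, redirect_uri):
    """Return (url, state, nonce) for an OIDC authorization code request."""
    state = secrets.token_urlsafe(16)   # ties the callback to this request
    nonce = secrets.token_urlsafe(16)   # must be echoed back inside the ID token
    query = urlencode({
        "response_type": "code",
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": "openid profile",      # the "openid" scope is what makes this OIDC
        "state": state,
        "nonce": nonce,
    })
    return f"{authorize_endpoint}?{query}", state, nonce

def check_id_token_claims(claims, issuer, client_id, nonce, now):
    """Check claims from an ID token whose signature was already verified."""
    aud = claims["aud"] if isinstance(claims["aud"], list) else [claims["aud"]]
    return (
        claims["iss"] == issuer
        and client_id in aud
        and claims["nonce"] == nonce
        and now < claims["exp"]
    )

url, state, nonce = build_auth_request(
    "https://social.example.com/authorize",
    "photo-app",
    "https://photos.example.com/callback",
)
```

The app keeps `state` and `nonce` in the user's session so it can compare them when the user is redirected back.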
Recap
The ID token simplifies the authentication process by providing a standardized way to confirm a user's identity. When a user logs in using an OIDC provider (like Google or Facebook), the ID token that the provider issues contains information about the user's identity.
OIDC is based on OAuth 2.0. By using the ID token for authentication, OIDC creates a more complete and unified process. This lets apps get permission to use resources (through OAuth 2.0) and also confirm the user's identity (using OIDC).
Pros
Unified Authentication & Authorization: Integrates authentication with OAuth 2.0's authorization framework.
Enhanced Security: Uses secure ID tokens for user identity verification.
Single Sign-On (SSO): Enables convenient SSO across multiple applications.
Standardized User Info: Provides a standardized way to access user profile data.
Flexible Implementation: Supports different client types and use cases.
Interoperability: Widely adopted, ensuring compatibility across platforms.
When to use OIDC
User Authentication Needed: Use OIDC when you need to authenticate users, not just authorize them to access resources.
Single Sign-On (SSO): Ideal for implementing SSO across various applications.
Standardized User Info: When you require a standard way to obtain user profile information.
If you only need authorization, then plain OAuth 2.0 should be sufficient. This might happen if you already have an established way of authenticating.
Best Practices
Use ID Tokens for Authentication Only: Verify user identity with ID tokens. Do not misuse them. ID tokens often contain sensitive user information. If not handled securely, this information can be exposed, leading to privacy breaches.
Use Access Tokens for Authorization: Grant access to resources with access tokens, limiting their scopes.
Secure Token Storage and Transmission: Store tokens securely in HttpOnly cookies. This attribute prevents JavaScript access to the cookie, reducing the risk of token theft through cross-site scripting (XSS) attacks. Always use HTTPS to transmit them.
Minimize Sensitive Data in Tokens: Avoid storing unnecessary personal information in tokens.