When developing web applications, rendering user-supplied data directly into HTML pages is extremely risky. It opens the door to XSS (Cross-Site Scripting) attacks: if a malicious user submits data containing <script> tags and that data is rendered as-is in another user's browser, the attacker can steal session cookies or run malicious code in that user's session.

Django provides a robust toolkit known as django.utils.html to fundamentally block such security threats and safely handle HTML. 🛡️


1. The Core of XSS Defense: escape()



This is the most basic yet essential function of this module. escape() converts specific HTML special characters within a string into HTML entities, making the browser recognize them as plain text rather than tags.

  • < becomes &lt;
  • > becomes &gt;
  • ' (single quote) becomes &#x27; (&#39; in Django versions before 3.0)
  • " (double quote) becomes &quot;
  • & becomes &amp;

Example:

from django.utils.html import escape

# Malicious user input
malicious_input = "<script>alert('XSS Attack!');</script>"

# Escape processing
safe_output = escape(malicious_input)

print(safe_output)
# Output:
# &lt;script&gt;alert(&#x27;XSS Attack!&#x27;);&lt;/script&gt;
# (Django versions before 3.0 emit &#39; for the single quote.)

The converted string will not be executed as a script in the browser; instead, <script>alert('XSS Attack!');</script> will be displayed as plain text.

[Important] Automatic Escaping in Django Templates

Fortunately, the Django template engine automatically escapes all variables by default.

{{ user_input }}

Therefore, the escape() function is mainly used when manually processing HTML outside of templates (e.g., during view logic or API response generation).


2. Removing All HTML Tags: strip_tags()

Sometimes you want to go beyond escaping and remove every tag so that only plain text remains, for example when the body of a blog post contains HTML but you need a plain-text summary for search results.

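strip_tags() performs this function. Its result is not guaranteed to be safe HTML, so it should still be escaped (or left to template autoescaping) before it is displayed.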

Example:

from django.utils.html import strip_tags

html_content = "<p>This is a <strong>very important</strong> <em>notice</em>.</p>"

plain_text = strip_tags(html_content)

print(plain_text)
# Output:
# This is a very important notice.
# (Only the tags are removed; the spaces come from the original text.)

3. Safely Generating HTML: format_html()



This is one of the most powerful and important functions in the module.

Sometimes you need to generate HTML dynamically in Python code (e.g., in views.py or models.py). For example, a model method might return a formatted link for display in the admin.

If you assemble such strings with Python's f-strings or the + operator, nothing is escaped, and any user-controlled value becomes an XSS vector.

format_html(format_string, *args, **kwargs) escapes every argument (but not format_string itself, which must therefore be a trusted literal) before inserting it into the string. The result is marked as safe (as with mark_safe), so it renders without further escaping in a template.

Example: (Creating an edit link in a model method for the admin page)

from django.db import models
from django.utils.html import format_html

class Post(models.Model):
    title = models.CharField(max_length=100)

    def get_edit_link(self):
        # [Bad Example] f-string: if self.title contains <script> it will cause XSS
        # return f'<a href="/admin/blog/post/{self.id}/change/">{self.title}</a>'

        # [Good Example] using format_html
        # Both arguments (url and self.title) are escaped automatically.
        url = f"/admin/blog/post/{self.id}/change/"
        return format_html(
            '<a href="{}">{} (edit)</a>',
            url,
            self.title  # A title of "My<script>..." is rendered as "My&lt;script&gt;..."
        )

4. Text Formatting Helpers: linebreaks and urlize

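These are the Python functions behind the template filters (|linebreaks, |urlize), and they are useful when converting plain text to HTML. Unlike the filters, the functions default to autoescape=False, so pass autoescape=True (or escape the text yourself first) when the input is untrusted.

  • linebreaks(text): Converts newlines in plain text to HTML: a single newline (\n) becomes <br>, and a blank line starts a new <p> paragraph. It's useful for displaying text entered by users in a textarea while maintaining formatting.
  • urlize(text): Wraps URL patterns like http://..., https://..., and www... in <a> tags within the text. It expects plain text, so apply it before adding any other markup.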

Example:

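from django.utils.html import linebreaks, urlize

raw_text = """Hello.
Testing django.utils.html.

Visit site: https://www.djangoproject.com"""

# 1. Apply URL links first, while the input is still plain text.
#    autoescape=True escapes the text around the links, and
#    nofollow=True adds rel="nofollow" (both default to False).
text_with_links = urlize(raw_text, nofollow=True, autoescape=True)
# Output (approximately):
# ...
# Visit site: <a href="https://www.djangoproject.com" rel="nofollow">https://www.djangoproject.com</a>

# 2. Apply line breaks to the already-escaped result
html_output = linebreaks(text_with_links)
# Output (approximately):
# <p>Hello.<br>Testing django.utils.html.</p>
#
# <p>Visit site: <a href="https://www.djangoproject.com" rel="nofollow">https://www.djangoproject.com</a></p>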

5. Safely Combining Multiple Items into HTML: format_html_join()

While format_html() formats a single item, format_html_join() is used to safely combine multiple items (like lists or tuples) into HTML.

It is used in the format: format_html_join(separator, format_string, args_list).

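  • separator: String placed between items (e.g., '\n'). It is escaped like any other argument, so an HTML separator such as <br> must be wrapped in mark_safe()
  • format_string: HTML format to apply to each item (e.g., <li>{}</li>)
  • args_list: An iterable of tuples; each tuple supplies the positional arguments for one use of format_string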

Example: (Converting a Python list into <ul> tags)

from django.utils.html import format_html, format_html_join

options = [
    ('item1', 'Item 1'),
    ('item2', '<strong>Risky Item 2</strong>'),
]

# Each tuple in the data list is unpacked into format_string's positional arguments:
# {0} refers to the first element of the tuple, and {1} refers to the second.
# Every element is escaped automatically, so '<strong>...' is rendered as text.
list_items = format_html_join(
    '\n',  # Separate each item by new lines
    '<li><input type="radio" value="{0}">{1}</li>', # Format for each item
    options  # Data list
)

# list_items will become a 'safe' HTML fragment.
final_html = format_html('<ul>\n{}\n</ul>', list_items)

# When rendering final_html in Django template with {{ final_html }}...

Output (HTML source):

<ul>
<li><input type="radio" value="item1">Item 1</li>
<li><input type="radio" value="item2">&lt;strong&gt;Risky Item 2&lt;/strong&gt;</li>
</ul>

6. Safely Passing Data as a <script> Tag: json_script()

There are many occasions where you need to pass Python data to JavaScript variables within Django templates. In these cases, using json_script(data, element_id) is both convenient and safe.

This function converts a Python dictionary or list into a JSON string and embeds it within a <script> tag of type application/json.

Example: (Passing data from a view)

# views.py
from django.shortcuts import render
from django.utils.html import json_script

def my_view(request):
    user_data = {
        'id': request.user.id,
        'username': request.user.username,
        'isAdmin': request.user.is_superuser,
    }
    # Insert user_data transformed to JSON inside <script id="user-data-json">
    context = {
        'user_data_json': json_script(user_data, 'user-data-json')
    }
    return render(request, 'my_template.html', context)

Template (my_template.html):

{{ user_data_json }}

<script>
    const dataElement = document.getElementById('user-data-json');
    const userData = JSON.parse(dataElement.textContent);

    console.log(userData.username); // "admin"
</script>

This approach avoids both the syntax errors and the XSS vulnerabilities that can arise when data is inserted by hand, as in var user = {{ user_data }};, where a " or ' character in the data can break the script.


7. [Advanced] Explicitly Indicate HTML is Safe: mark_safe() / html_safe

Sometimes a developer generates HTML deliberately and, being confident that it is completely safe, wants to bypass Django's automatic escaping.

Functions like format_html() or json_script() automatically perform this processing internally.

  • mark_safe(s): Returns the string s with a 'safe label' stating "This is safe HTML, do not escape it". This function itself does no escaping. Therefore, it should never be used on untrusted data.

  • @html_safe (decorator): A class decorator, not a method decorator. The class must define __str__(); the decorator marks the output of __str__() as safe and adds a matching __html__() method, so instances render without escaping in templates. Nothing is escaped for you, so __str__() itself must produce safe HTML. Applying it to a method raises ValueError.

Example: (A model method and an @html_safe class)

from django.conf import settings
from django.db import models
from django.utils.html import format_html, html_safe

class UserProfile(models.Model):
    user = models.OneToOneField(settings.AUTH_USER_MODEL, on_delete=models.CASCADE)
    bio = models.TextField()

    # This method is safe because it uses format_html (recommended)
    def get_username_display(self):
        return format_html("<strong>{}</strong>", self.user.username)

# Advanced: the whole class is declared to render as safe HTML.
@html_safe
class ProfileCard:
    def __init__(self, profile):
        self.profile = profile

    def __str__(self):
        # An f-string here would be vulnerable to XSS if bio contained <script>,
        # because @html_safe trusts whatever __str__ returns.
        # Escape every value yourself; format_html does that here.
        return format_html(
            "<div>{}</div><p>{}</p>",
            self.profile.user.username,
            self.profile.bio,
        )

Summary

The django.utils.html module is an essential tool that enables the implementation of Django's core security philosophy (Autoescaping) at the Python code level.

  • To prevent XSS, use escape(). (Templates do this automatically.)
  • To remove all tags, use strip_tags().
  • To safely generate HTML in Python code, always use format_html().
  • To combine list data into HTML, use format_html_join().
  • To pass Python data to JavaScript, json_script() is the safest and standard method.
  • mark_safe and @html_safe bypass automatic escaping, so prefer format_html unless they are absolutely necessary.

By correctly understanding and using these tools, you can create Django applications with robust security.