🧠 Understanding re.sub, Raw Strings, and \.md$ in Python (A Practical Learning Post)
April 8, 2026 at 12:21 pm #6343
If you’re working with file handling in Django (like your Wiki project), you’ll often see code like this:
return list(sorted(re.sub(r"\.md$", "", filename) for filename in filenames if filename.endswith(".md")))
At first glance, it looks complex. But once you understand the pieces, it becomes very logical and powerful.
Let’s break it down step by step 👇
🔍 1. What is re.sub()?
The function re.sub() comes from Python’s re module.
👉 Meaning:
sub = substitute (replace)
Syntax:
re.sub(pattern, replacement, string)
Example:
re.sub(r"\.md$", "", "Python.md")
➡️ Output:
Python
👉 It removes .md from the end of the filename.
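Here is a minimal runnable sketch of the call above. The sample filenames are only illustrations.

```python
import re

# re.sub(pattern, replacement, string) returns a new string;
# the original string is left unchanged.
name = "Python.md"
stripped = re.sub(r"\.md$", "", name)
print(stripped)   # Python

# If the pattern does not match, the input comes back as-is.
unchanged = re.sub(r"\.md$", "", "README.txt")
print(unchanged)  # README.txt
```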
🧩 2. Understanding the Regex: r"\.md$"
This is the most important part.
| Part | Meaning |
| --- | --- |
| \. | literal dot (. is special in regex) |
| md | the letters md |
| $ | end of string |
👉 So it means:
“Match .md only if it appears at the end”
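To see the effect of the $ anchor, here is a small sketch using made-up filenames. Only a .md sitting at the very end of the name is removed.

```python
import re

pattern = r"\.md$"

# Hypothetical filenames chosen to show where the match can occur.
at_end = re.sub(pattern, "", "notes.md")         # ".md" is at the end
in_middle = re.sub(pattern, "", "notes.md.bak")  # ".md" is not at the end
twice = re.sub(pattern, "", "notes.md.md")       # only the final ".md" matches

print(at_end)     # notes
print(in_middle)  # notes.md.bak
print(twice)      # notes.md
```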
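⚠️ Why not just use .md?
Because in regex an unescaped
"."
👉 means:
any single character (except a newline), not only a literal dot
❌ Wrong:
re.sub(".md$", "", "Python.md")
This still returns Python here, but the same pattern also matches endings such as:
- xmd
- _md
- etc.
👉 Unexpected results!
✅ Correct:
re.sub(r"\.md$", "", "Python.md")
👉 Matches exactly .md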
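The difference is easiest to see on names that do not really end in .md. In this sketch the test names are hypothetical.

```python
import re

unescaped = ".md$"   # "." matches any single character
escaped = r"\.md$"   # "\." matches only a literal dot

# Hypothetical names that do NOT have a real .md extension.
wrong_cmd = re.sub(unescaped, "", "cmd")         # whole string matches
right_cmd = re.sub(escaped, "", "cmd")           # no literal dot, no match
wrong_notes = re.sub(unescaped, "", "notes_md")  # "_md" is stripped
right_notes = re.sub(escaped, "", "notes_md")    # left alone

print(repr(wrong_cmd), repr(right_cmd))      # '' 'cmd'
print(repr(wrong_notes), repr(right_notes))  # 'notes' 'notes_md'
```

In the original one-liner the endswith(".md") filter hides this problem, but the escaped pattern is correct on its own.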
🔐 3. Why do we use \?
Because:
| Symbol | Meaning |
| --- | --- |
| . | any character |
| \. | actual dot |
👉 So \ escapes the special meaning of .
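🧠 4. Why do we need r"" (raw string)?
This is where many learners get confused.
There are two systems involved:
1️⃣ Python string parser
2️⃣ Regex engine
Problem without raw string:
"\n"
Python converts it into:
👉 a newline character
With raw string:
r"\n"
👉 Passed as:
\ + n
Note that "\." is not a Python escape sequence, so "\.md$" without the r prefix happens to produce the same pattern, but recent Python versions warn about the invalid escape. The raw string avoids relying on that.
💡 So:
| Thing | Purpose |
| --- | --- |
| \ | tells regex “treat special char literally” |
| r"" | tells Python “don’t interpret backslashes” |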
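Here is a short sketch of the two layers in action. The \b case is an extra illustration, not from the Wiki code: Python reads "\b" as a backspace character, while the regex engine reads r"\b" as a word boundary.

```python
import re

# Regular string: Python turns \n into one newline character.
regular = "\n"
# Raw string: the backslash and the n stay as two separate characters.
raw = r"\n"
print(len(regular), len(raw))  # 1 2

# A raw string is equivalent to doubling every backslash by hand.
same_pattern = r"\.md$" == "\\.md$"
print(same_pattern)  # True

# Without r"", "\b" is a backspace character, so nothing matches.
no_match = re.search("\bcat\b", "a cat")
# With r"", the regex engine sees \b and treats it as a word boundary.
match = re.search(r"\bcat\b", "a cat")
print(no_match is None, match is not None)  # True True
```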
🔄 5. Full Line Explained
return list(sorted(
    re.sub(r"\.md$", "", filename)
    for filename in filenames
    if filename.endswith(".md")
))
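Step-by-step:
✅ 1. Filter:
if filename.endswith(".md")
👉 Only .md files
✅ 2. Transform:
re.sub(r"\.md$", "", filename)
👉 Remove .md
✅ 3. Sort:
sorted(...)
👉 Returns a new, alphabetically sorted list
✅ 4. Convert to list:
list(...)
👉 sorted() already returns a list, so this outer call is redundant but harmless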
🧪 Example
filenames = ["Python.md", "Django.md", "README.txt"]
Output:
["Django", "Python"]
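To try the example yourself, here is a self-contained sketch. The function name strip_md_names is made up for illustration, and it takes a plain list rather than whatever storage call supplies the filenames in your project. Since sorted() already returns a list, the sketch drops the outer list().

```python
import re

def strip_md_names(filenames):
    """Return the sorted names of the .md files, without the extension."""
    return sorted(
        re.sub(r"\.md$", "", filename)  # transform: drop the trailing .md
        for filename in filenames
        if filename.endswith(".md")     # filter: keep only .md files
    )

filenames = ["Python.md", "Django.md", "README.txt"]
entries = strip_md_names(filenames)
print(entries)  # ['Django', 'Python']
```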
💡 6. Simpler Alternative (No Regex)
You could also write:
filename[:-3]
👉 Removes the last 3 characters, whatever they are, so use it only after the endswith(".md") check has confirmed they are .md
Example:
"Python.md"[:-3]
➡️
"Python"
⚖️ Regex vs Simpler Method
| Method | When to use |
| --- | --- |
| re.sub() | complex patterns |
| slicing ([:-3]) | simple fixed endings |
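One more option, not mentioned above: assuming Python 3.9 or newer, str.removesuffix removes a fixed ending only when it is actually present. This sketch compares it with slicing on illustrative filenames.

```python
name = "Python.md"
other = "README.txt"

# Slicing removes the last three characters no matter what they are.
sliced_md = name[:-3]
sliced_txt = other[:-3]
print(sliced_md, sliced_txt)  # Python README.

# str.removesuffix (Python 3.9+) removes the suffix only when it is there.
suffix_md = name.removesuffix(".md")
suffix_txt = other.removesuffix(".md")
print(suffix_md, suffix_txt)  # Python README.txt
```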
🚀 Real-World Use Case (Django / CS50W Wiki)
This logic is used to:
- list all wiki entries
- strip the .md extension
- display clean page names
🧾 Final Takeaways
✔ re.sub() = replace using patterns
✔ \. = literal dot (not wildcard)
✔ $ = end of string
✔ r"" = prevents Python from misinterpreting \
✔ Always use raw strings with regex
🧠 One-Line Summary
Use re.sub(r"\.md$", "", filename) to safely remove .md from filenames by correctly handling both regex rules and Python string behavior.
