Python escape slash in filename

Let me preface this by saying I'm not exactly sure what is happening with my code; I'm fairly new to programming.

I've been working on creating an individual final project for my python CS class that checks my teacher's website on a daily basis and determines if he's changed any of the web pages on his website since the last time the program ran or not.

The step I'm working on right now is as follows:

def write_pages_files():
    '''
    Writes the various page files from the website's links 
    '''
    links = get_site_links()
    for page in links:
        site_page = requests.get(root_url + page)
        soup = BeautifulSoup(site_page.text)
        with open(page + ".txt", mode='wt', encoding='utf-8') as out_file:
            out_file.write(str(soup))

The links look similar to this:

/site/sitename/class/final-code

And the error I get is as follows:

with open(page + ".txt", mode='wt', encoding='utf-8') as out_file:
FileNotFoundError: [Errno 2] No such file or directory: '/site/sitename/class.txt'

How can I write the site pages with these types of names (/site/sitename/nameofpage.txt)?

Goto start of series

Once upon a time there was a beautiful Windows programmer named Red Ridinghood.

One day, Red’s supervisor told her that they were going to start building a new application called GrandmasHouse. The feature list for the application was so long that they would never have attempted to get to GrandmasHouse if they hadn’t learned about a shortcut through Python Woods that would make the journey much shorter.

So Red started working her way through Python, and indeed found the going quick and easy. She loved the woods, and was happy to be traveling in them.

There was only one problem. Her programs did a lot of file manipulation, and so she had to do a lot of coding of filenames. Windows filenames used a backslash as a separator, but within Python the backslash had the magic power of an escape character, so every time she wanted a backslash in a filename she had to code two backslashes, like this:

myServer = "\\\\aServer" # ==> \\aServer
myFilename = myServer + "\\aSharename\\aDirName\\aFilename"

This feature of Python got very old very quickly. Red started calling it The Wolf, and it was the one part of Python that she hated.

One day as she was walking through the forest, she came to a clearing. In the clearing was a charming little pub, and inside the pub she met a tall, dark, and handsome stranger named Rawstrings.

Rawstrings said he could save her from The Wolf. All she had to do, he said, was to put an “r” in front of her quoted string literals. This would change them from escaped strings into raw strings. The backslash would lose its powers as an escape character, and become just an ordinary character. For example, with raw strings, you could code

r"\t"

and you wouldn’t get a string contining a single tab character — you would get a string containing the backslash character followed by “t”.

So instead of coding

myServer = "\\\\aServer"

Red could just code

myServer = r"\\aServer"

Red was seduced by the things that Rawstrings was telling her, and she began to spend a lot of time in his company.

Then one day, she coded

myDirname = r"c:\aDirname\"

and her program blew up with the following message:

myDirname = r"c:\aDirname\" ^ SyntaxError: invalid token 

After some experimenting, she discovered that — contrary to what Rawstrings had told her — the backslash seemingly hadn’t lost all of its magic powers after all. For example, she could code:

aString = r"abc\"xyz"
print aString

When she did this, it seemed perfectly legal. The double-quote just before “xyz” did not close the raw string at all. Somehow the backslash seemed to protect it — it wasn’t recognized as the closing delimiter of the raw string, but was included in the string. When she coded

print aString

she got

abc\"xyz

It was this protective character that the backslash had acquired that made

myDirname = r"c:\aDirname\"

blow up. The final backslash was protecting the closing double-quote, so it was not being recognized as a closing quote. And since there was nothing after the double-quote, the raw string was not closed, and she got an error. She tried coding the raw string with two backslashes at the end — as if the backslash was an escape character —

myDirname = r"c:\aDirname\\"

but that didn’t do it either. Instead of getting the single closing backslash that she wanted, she got two backslashes at the end:

c:\aDirname\\

She was in despair. She couldn’t figure out any way to use raw strings to put a single backslash at the end of a string, and she didn’t want to have to go back to fighting The Wolf.

Fortunately, at this point she confided her troubles to Paul Woodman, a co-worker who had started exploring Python a few months earlier. Here is what he told her.

In raw strings, backslashes do not have the magical power of escape characters that they do in regular strings. But they don’t lose all of their magical powers.

In raw strings — as you discovered — backslashes have the magical power of protection characters. Basically, this means that a backslash protects any character that follows it from being recognized as the closing delimiter of the raw string.

Coming from a Windows programming background, you assumed that support for raw strings was a feature whose purpose was to make the work of coding Windows filenames easier by removing the magical escape character powers from the backslash. And you were surprised to discover that raw strings aren’t truly raw in the way that you expected — raw in the sense that the backslash had no magical powers.

The reason for the special powers of backslashes in raw strings is that — contrary to what you assumed — raw strings were not developed to make it easier for Windows programmers to code filenames containing backslash characters. In fact, raw strings were originally developed to make the work of coding regular expressions easier. In raw strings, the backslash has the magical power of a protection character because that is just the kind of behavior it needs to have in order to make it easier to code regular expressions. The feature that you can’t end a raw string with a single backslash is not a bug. It is a feature, because it is not legal to end a regular expression with a single backslash (or an odd number of backslashes).

Unfortunately for you, this power makes it impossible to create a raw string that ends in a single backslash, or in an odd number of backslashes. So raw expressions won’t do what you want them to, namely save you from The Wolf.

But don’t despair! There is a way…

In Python, there are a number of functions in the os.path module that change forward slashes in a string to the appropriate filename separator for the platform that you are on. One of these function is os.path.normpath()The trick is to enter all of your filename strings using forward slashes, and then let os.path.normpath() change them to backslashes for you, this way.

myDirname = os.path.normpath("c:/aDirname/")

It takes a bit of practice to get into the habit of specifying filenames this way, but you’ll find that you adapt to it surprisingly easily, and you’ll find it a lot easier than struggling with The Wolf.

Red was super happy to hear this. She transferred to Woodman’s project team, and they all coded happily ever after!

How do you escape a slash in Python?

Learn More. In Python strings, the backslash "\" is a special character, also called the "escape" character. It is used in representing certain whitespace characters: "\t" is a tab, "\n" is a newline, and "\r" is a carriage return. Conversely, prefixing a special character with "\" turns it into an ordinary character.

Do I need to escape a forward slash in Python?

Programming languages, such as Python, treat a backslash (\) as an escape character. For instance, \n represents a line feed, and \t represents a tab. When specifying a path, a forward slash (/) can be used in place of a backslash. Two backslashes can be used instead of one to avoid a syntax error.

Can you have a slash in a filename?

There is no way to put slashes in filenames, it's absolutely forbidden by the kernel. You can patch your filesystem via block device access, or use similarly-looking characters from the Unicode, but those aren't solutions.

How do you escape a character in Python?

To insert characters that are illegal in a string, use an escape character. An escape character is a backslash \ followed by the character you want to insert.