Does python use compiler or interpreter?

Its a big confusion for people who just started working in python and the answers here are a little difficult to comprehend so i'll make it easier.

When we instruct Python to run our script, there are a few steps that Python carries out before our code actually starts crunching away:

  • It is compiled to bytecode.
  • Then it is routed to virtual machine.

When we execute some source code, Python compiles it into byte code. Compilation is a translation step, and the byte code is a low-level platform-independent representation of source code.

Note that the Python byte code is not binary machine code (e.g., instructions for an Intel chip).

Actually, Python translate each statement of the source code into byte code instructions by decomposing them into individual steps. The byte code translation is performed to speed execution. Byte code can be run much more quickly than the original source code statements. It has.pyc extension and it will be written if it can write to our machine.

So, next time we run the same program, Python will load the .pyc file and skip the compilation step unless it's been changed. Python automatically checks the timestamps of source and byte code files to know when it must recompile. If we resave the source code, byte code is automatically created again the next time the program is run.

If Python cannot write the byte code files to our machine, our program still works. The byte code is generated in memory and simply discarded on program exit. But because .pyc files speed startup time, we may want to make sure it has been written for larger programs.

Let's summarize what happens behind the scenes. When Python executes a program, Python reads the .py into memory, and parses it in order to get a bytecode, then goes on to execute. For each module that is imported by the program, Python first checks to see whether there is a precompiled bytecode version, in a .pyo or .pyc, that has a timestamp which corresponds to its .py file. Python uses the bytecode version if any. Otherwise, it parses the module's .py file, saves it into a .pyc file, and uses the bytecode it just created.

Byte code files are also one way of shipping Python codes. Python will still run a program if all it can find are.pyc files, even if the original .py source files are not there.

Python Virtual Machine (PVM)

Once our program has been compiled into byte code, it is shipped off for execution to Python Virtual Machine (PVM). The PVM is not a separate program. It need not be installed by itself. Actually, the PVM is just a big loop that iterates through our byte code instruction, one by one, to carry out their operations. The PVM is the runtime engine of Python. It's always present as part of the Python system. It's the component that truly runs our scripts. Technically it's just the last step of what is called the Python interpreter.

A simple explanation of how Python code is executed differently than older programming languages.

image credits — https://encrypted-tbn0.gstatic.com/images?q=tbn:ANd9GcQAQ5hOZAjAWsKwFbOXNONYWW-Mg4dxL7cWc-1gufkYvviTnvH8SA&s

As a Machine Learning Engineer, I have been using Python for more than a year. Recently, I have also started learning C++, for fun. It made me realize how easy and intuitive Python is. I got more curious about how Python is different from other languages and its working. In this blog, I try to scratch Python’s inner working.

Python started as a hobby project by Guido Van Rossum and was first released in 1991. A general purpose language, Python is powering large chunk of many companies like Netflix and Instagram. In an interview, Guido compares Python to languages like Java or Swift and says that while the latter two are a great choice for software developers — people whose day job is programming, but Python was made for people whose day job has nothing to do with software development but they code mainly to handle data.

When you read about Python, quite often you come across words like — compiled vs interpreted, bytecode vs machine code, dynamic typing vs static typing, garbage collectors, etc. Wikipedia describes Python as

Python is an interpreted, high-level, general-purpose programming language. It is is dynamically typed and garbage-collected.

Interpreted Languages

When you write a program in C/C++, you have to compile it. Compilation involves translating your human understandable code to machine understandable code, or Machine Code. Machine code is the base level form of instructions that can be directly executed by the CPU. Upon successful compilation, your code generates an executable file. Executing this file runs the operations in your code step by step.

For the most part, Python is an interpreted language and not a compiled one, although compilation is a step. Python code, written in .py file is first compiled to what is called bytecode (discussed in detail further) which is stored with a .pyc or .pyo format.

Instead of translating source code to machine code like C++, Python code it translated to bytecode. This bytecode is a low-level set of instructions that can be executed by an interpreter. In most PCs, Python interpreter is installed at /usr/local/bin/python3.8. Instead of executing the instructions on CPU, bytecode instructions are executed on a Virtual Machine.

Why Interpreted?

One popular advantage of interpreted languages is that they are platform-independent. As long as the Python bytecode and the Virtual Machine have the same version, Python bytecode can be executed on any platform (Windows, MacOS, etc).

Dynamic typing is another advantage. In static-typed languages like C++, you have to declare the variable type and any discrepancy like adding a string and an integer is checked during compile time. In strongly typed languages like Python, it is the job of the interpreter to check the validity of the variable types and operations performed.

Disadvantages of Interpreted languages

Dynamic typing provides a lot of freedom, but simultaneously it makes your code risky and sometimes difficult to debug.

Python is often accused of being ‘slow’. Now while the term is relative and argued a lot, the reason for being slow is because the interpreter has to do extra work to have the bytecode instruction translated into a form that can be executed on the machine. A StackOverflow post makes it easier to understand using an analogy —

If you can talk in your native language to someone, that would generally work faster than having an interpreter having to translate your language into some other language for the listener to understand.

What exactly is Garbage Collection?

In older programming languages, memory allocation was quite manual. Many times when you use variables that are no longer in use or referenced anywhere else in the program, they need to be cleaned from the memory. Garbage Collector does that for you. It automatically frees up space without you doing anything. The memory management works in two ways —

In a simplified way, it keeps track of the number of references to an object. When that number goes down to zero, it deletes that object. This is called reference counting. This cannot be disabled in Python.

In cases where object references itself or two objects refer each other, a process called generation garbage collection helps. This is something traditional reference counting cannot take care of.

What is __pycache__ ?

Many times in your personal project or on GitHub, you might have seen a folder named __pycache__ being created automatically.

/folder   - __pycache__       - preprocess.cpython-36.pyc   - preprocess.py

As you can see, the filename is the same as the one outside __pycache__ folder. The .pyc extension tells us that the file contains bytecode for preprocess.py. The names cpython denotes the type of interpreter. CPython means that the interpreter was implemented in C language. Similarly, JPython is a Python interpreter implemented in Java.

But why is the folder created in the first place? Well, it slightly increases the speed of the Python program. Unless you change your Python code, recompilation to bytecode is avoided, thereby saving time.

I have started my personal blog and I don’t intend to write more amazing articles on Medium. Support my blog by subscribing to thenlp.space

Does Python use both interpreter and compiler?

Python is Both Compiled as well as Interpreted While running the code python generates a byte code internally this byte code is then converted using a python virtual machine (p.v.m) to generate the output.

Is interpreter used in Python?

The Python interpreter is a bytecode interpreter: its input is instruction sets called bytecode. When you write Python, the lexer, parser, and compiler generate code objects for the interpreter to operate on.

Does Python language use a compiler?

Python is one of the most famous programming languages. Python is an interpreted programming language and has different execution environments. It has a wide range of compilers to execute the python programs eg.

Why compiler is not used in Python?

Compiled bytecode, which is ran through an interpreter because Python is an interpreted language. Python is a compiled language. It just isn't compiled to a language for which a hardware implementation exists; it is executed by a virtual machine.