Python read zip file csv

I'm trying to get data from a zipped CSV file. Is there a way to do this without unzipping the whole file? If not, how can I unzip the file and read it efficiently?

Burhan Ali

asked Nov 15, 2014 at 4:16

I used the zipfile module to read the CSV from the ZIP directly into a pandas DataFrame. Say the file is "intfile.csv" and it sits inside an archive named "THEZIPFILE.zip":

import pandas as pd
import zipfile

zf = zipfile.ZipFile('C:/Users/Desktop/THEZIPFILE.zip') 
df = pd.read_csv(zf.open('intfile.csv'))

ZygD

answered May 8, 2016 at 13:25

Yaron

If you aren't using pandas, it can be done entirely with the standard library. Here is Python 3.7 code:

import csv
from io import TextIOWrapper
from zipfile import ZipFile

with ZipFile('yourfile.zip') as zf:
    with zf.open('your_csv_inside_zip.csv', 'r') as infile:
        # newline='' lets the csv module handle line endings itself
        reader = csv.reader(TextIOWrapper(infile, encoding='utf-8', newline=''))
        for row in reader:
            # process the CSV here
            print(row)

answered Jun 25, 2019 at 21:12

volker238

A quick solution is the code below:

import pandas as pd

# pandas can read a zip archive directly, provided it contains exactly one data file
df = pd.read_csv("/path/to/file.csv.zip")

answered Oct 4, 2019 at 10:58

Hari Prasad

zipfile also supports the with statement.

So, adding onto Yaron's answer using pandas:

import zipfile
import pandas as pd

# zf avoids shadowing the built-in zip()
with zipfile.ZipFile('file.zip') as zf:
    with zf.open('file.csv') as csv_file:
        df = pd.read_csv(csv_file)

answered May 22, 2017 at 16:43

I thought Yaron had the best answer, but I wanted to add code that iterates through multiple files inside a zip archive and then concatenates the results:

import os
import pandas as pd
import zipfile

curDir = os.getcwd()
zf = zipfile.ZipFile(curDir + '/targetfolder.zip')
text_files = zf.infolist()
list_ = []

print ("Uncompressing and reading data... ")

for text_file in text_files:
    print(text_file.filename)
    df = pd.read_csv(zf.open(text_file.filename))
    # do df manipulations
    list_.append(df)

df = pd.concat(list_)

Xukrao

answered Sep 13, 2017 at 18:14

Yes. You want the zipfile module.

You open the zip file itself with zipfile.ZipFile(file[, mode[, compression[, allowZip64]]])

You can then use ZipFile.infolist() to enumerate each file within the zip, and extract it with ZipFile.open(name[, mode[, pwd]])
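The steps described above can be sketched roughly as follows. This is only an illustration, not a definitive implementation: the archive is built in memory so the example runs on its own, and the helper names (build_demo_archive, read_members) and the member names and contents are made up for the demo.

```python
import io
import zipfile


def build_demo_archive():
    """Return an in-memory zip with two small CSV members (made-up demo data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w", zipfile.ZIP_DEFLATED) as zf:
        zf.writestr("a.csv", "id,name\n1,alpha\n")
        zf.writestr("b.csv", "id,name\n2,beta\n")
    buffer.seek(0)
    return buffer


def read_members(archive):
    """Map each member name to its decoded text without extracting to disk."""
    contents = {}
    with zipfile.ZipFile(archive) as zf:
        # infolist() yields one ZipInfo per entry in the archive
        for info in zf.infolist():
            if info.is_dir():
                continue  # directory entries carry no data
            # open() accepts a name or a ZipInfo and returns a binary stream
            with zf.open(info) as member:
                contents[info.filename] = member.read().decode("utf-8")
    return contents


members = read_members(build_demo_archive())
for name, text in members.items():
    print(name, len(text))
```

With a real file you would pass its path to read_members instead of the in-memory buffer. Only the members you open are decompressed, so nothing is written to disk.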

answered Nov 15, 2014 at 4:30

brycem

This is the simplest approach, and the one I always use:

import pandas as pd
df = pd.read_csv("Train.zip", compression='zip')

SHR

answered Nov 4, 2020 at 11:01

Suppose you are downloading a zip file that contains a CSV and you don't want to use temporary storage. A sample implementation looks like this:

#!/usr/bin/env python3

from csv import DictReader
from io import TextIOWrapper, BytesIO
from zipfile import ZipFile

import requests

def all_tickers():
    url = "https://simfin.com/api/bulk/bulk.php?dataset=industries&variant=null"
    r = requests.get(url)
    zip_ref = ZipFile(BytesIO(r.content))
    for name in zip_ref.namelist():
        print(name)
        with zip_ref.open(name) as file_contents:
            reader = DictReader(TextIOWrapper(file_contents, 'utf-8'), delimiter=';')
            for item in reader:
                print(item)

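This takes care of all Python 3 bytes/str issues: BytesIO turns the downloaded bytes into a file-like object for ZipFile, and TextIOWrapper decodes the binary stream returned by ZipFile.open() into text for the csv module.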
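The same bytes-to-text chain can be tried without a network call. The sketch below assumes the zip is already in memory as a bytes object (standing in for r.content); the function name rows_from_zip_bytes, the member name, and the column names are hypothetical and chosen only for the demo.

```python
import csv
import io
import zipfile


def rows_from_zip_bytes(payload, member, delimiter=","):
    """Parse one CSV member of a zip held in memory into a list of dicts."""
    with zipfile.ZipFile(io.BytesIO(payload)) as zf:
        with zf.open(member) as binary_stream:
            # Decode bytes to str; newline="" leaves line handling to csv
            text_stream = io.TextIOWrapper(
                binary_stream, encoding="utf-8", newline=""
            )
            return list(csv.DictReader(text_stream, delimiter=delimiter))


# Stand-in for a downloaded response body: a small archive built in memory.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("industries.csv", "id;sector\n100001;Industrials\n")
payload = buffer.getvalue()

rows = rows_from_zip_bytes(payload, "industries.csv", delimiter=";")
print(rows)
```

Collecting the rows into a list before returning matters here, because the member stream is closed as soon as the with block exits.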

answered Feb 2, 2021 at 3:09

hughdbrown

If you have a file named my_big_file.csv and you zip it under the same base name, my_big_file.zip, you can simply do this:

import pandas as pd

df = pd.read_csv("my_big_file.zip")

Note: check your pandas version first; older versions do not support this.

answered Mar 9, 2021 at 16:29

adhg

How do I read a CSV file from a ZIP file in Python?

How to read a zipped CSV file in Python:

import pandas as pd
import zipfile

zf = zipfile.ZipFile('C:/Users/Desktop/THEZIPFILE.zip')
# if you want to see all files inside the zip archive
zf.namelist()
# now read your csv file
df = pd.read_csv(zf.open('intfile.csv'))

Can Python read ZIP files?

Python can work directly with data in ZIP files. You can list the items in the archive's directory and work with the data files themselves, for example to report the name and content length of every file in the archive.
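A listing of that kind might look like the sketch below. It is a minimal example under stated assumptions: the archive is created in memory with made-up members, and list_zip_contents is a hypothetical helper name, not part of the zipfile module.

```python
import io
import zipfile


def list_zip_contents(archive):
    """Return (name, uncompressed size in bytes) for each archive member."""
    with zipfile.ZipFile(archive) as zf:
        return [(info.filename, info.file_size) for info in zf.infolist()]


# Demo archive with made-up contents, so the example needs no file on disk.
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as zf:
    zf.writestr("data.csv", "x,y\n1,2\n")
    zf.writestr("notes.txt", "hello")
buffer.seek(0)

listing = list_zip_contents(buffer)
for name, size in listing:
    print(f"{name}: {size} bytes")
```

ZipInfo.file_size is the uncompressed size; ZipInfo.compress_size gives the size as stored in the archive.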

Can pandas read zipped CSV?

Yes, but only when the archive contains a single file. Pandas cannot directly read data from a zip archive that contains multiple files; in that case, use Python's zipfile module to select the file you want. The zipfile module offers two routes for reading zip data: the ZipFile and Path classes.
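Both routes can be sketched with the standard library alone. The example below is illustrative rather than authoritative: the archive and its member names are made up, the helper names are hypothetical, and zipfile.Path requires Python 3.8 or later. The binary stream from the first route could equally be handed to pd.read_csv, as the answers above do.

```python
import csv
import io
import zipfile


def build_archive():
    """Create an in-memory zip holding two CSV members (made-up demo data)."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as zf:
        zf.writestr("jan.csv", "day,total\n1,10\n")
        zf.writestr("feb.csv", "day,total\n1,20\n")
    buffer.seek(0)
    return buffer


def read_with_zipfile(archive, member):
    """Route 1: ZipFile.open() returns a binary stream, so wrap it as text."""
    with zipfile.ZipFile(archive) as zf:
        with zf.open(member) as binary_stream:
            text = io.TextIOWrapper(binary_stream, encoding="utf-8", newline="")
            return list(csv.reader(text))


def read_with_path(archive):
    """Route 2: zipfile.Path walks the archive like a directory tree."""
    tables = {}
    with zipfile.ZipFile(archive) as zf:
        for entry in zipfile.Path(zf).iterdir():
            if entry.is_file() and entry.name.endswith(".csv"):
                text = entry.read_text(encoding="utf-8")
                tables[entry.name] = list(csv.reader(io.StringIO(text)))
    return tables


jan_rows = read_with_zipfile(build_archive(), "jan.csv")
all_tables = read_with_path(build_archive())
print(jan_rows)
print(sorted(all_tables))
```

Route 1 suits the case where you already know the member name; route 2 is convenient when you want to filter members by name before reading them.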

How do I extract a ZIP file in pandas?

Use the read_csv() method. When the compression argument of read_csv() is set to 'zip', pandas first decompresses the archive and then creates the DataFrame from the CSV file it contains.