Ethereum: Reading a ZIP file from a URL raises a bad ZIP file error
====================================================================
Introduction
------------
When you try to download historical crypto data from www.data.binance.vision, an error occurs when you read the ZIP files using the `pd.read_csv` method. This problem is probably due to the way ZIP files are handled when they are read from a URL.
In this article, we will explore why a ZIP file containing historical Binance data could be considered bad and how to solve this problem using Python.
Why bad zip files?
-------
A bad ZIP file is an incorrect or malformed ZIP archive. This can happen when the ZIP file is corrupt, not correctly compressed, or has an unrecognized signature. In our case, we suspect that the ZIP files served by the Binance servers are not well formed.
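To make this failure mode concrete, here is a minimal sketch of how Python's standard `zipfile` module reacts to bytes that are not a valid archive. The helper names `is_valid_zip` and `build_sample_zip` and the sample payloads are illustrative assumptions, not data taken from Binance.

```python
import io
import zipfile


def is_valid_zip(data: bytes) -> bool:
    """Return True if the bytes look like a ZIP archive (hypothetical helper)."""
    return zipfile.is_zipfile(io.BytesIO(data))


def build_sample_zip() -> bytes:
    """Create a tiny, well-formed ZIP archive in memory for comparison."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("sample.csv", "1,2,3\n")
    return buffer.getvalue()


good = build_sample_zip()
# Made-up payload standing in for a response body that is not a ZIP file
bad = b"<html>404 Not Found</html>"

print(is_valid_zip(good))
print(is_valid_zip(bad))

try:
    zipfile.ZipFile(io.BytesIO(bad))
except zipfile.BadZipFile as exc:
    print(f"BadZipFile raised: {exc}")
```

Checking the payload with `zipfile.is_zipfile` before opening it is one way to tell a damaged download apart from a response that was never a ZIP file in the first place.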
Solution
----
To solve this problem, we must make sure that the ZIP files downloaded from the Binance server are correct and well formed. One way to do this is to use the `requests` library to download the ZIP file directly and then extract its content using Python.
Here is a snippet of example code that shows how to achieve this goal:
```python
import io
import os
import zipfile

import pandas as pd
import requests


def download_data_from_binance(url):
    """Download a ZIP archive and extract its CSV files to the current directory."""
    response = requests.get(url, timeout=30)
    response.raise_for_status()

    extracted = []
    # ZipFile needs a seekable file-like object, so wrap the raw bytes in BytesIO
    with zipfile.ZipFile(io.BytesIO(response.content), "r") as zip_file:
        for name in zip_file.namelist():
            # Skip anything that does not have a .csv extension
            if not name.endswith(".csv"):
                continue
            # Extract the CSV file from the ZIP archive and remember its path
            extracted.append(zip_file.extract(name, os.getcwd()))
    return extracted


# Specify the URL of the ZIP file on the Binance server
url = ""

# Download the historical crypto data from the specified URL
downloaded_files = download_data_from_binance(url)
if downloaded_files:
    print("Download succeeded. The following files were extracted:")
    for filepath in downloaded_files:
        print(filepath)
        # Load the extracted CSV file and show its first rows
        print(pd.read_csv(filepath).head())
else:
    print("Unable to download crypto historical data.")
```
In this code snippet:
- We use the `requests` library to download the ZIP file from the Binance server.
- We then open its content using `zipfile.ZipFile`.
- We iterate over each file name in the ZIP archive and check whether it has a `.csv` extension. If it does not, we skip it.
- For each CSV file found, we read the corresponding file from the ZIP archive and save it to disk.
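If the extracted files are not needed on disk, the steps above can also be sketched entirely in memory. This is an illustrative variant only: the helper name `csv_members_as_text` is hypothetical, the sample archive is built locally so the sketch runs without network access, and UTF-8 encoding of the CSV files is an assumption.

```python
import io
import zipfile


def csv_members_as_text(data: bytes, encoding: str = "utf-8") -> dict[str, str]:
    """Map each CSV member name in a ZIP payload to its decoded text.

    Hypothetical helper; assumes the CSV files are UTF-8 encoded.
    """
    contents = {}
    with zipfile.ZipFile(io.BytesIO(data)) as archive:
        for name in archive.namelist():
            if not name.endswith(".csv"):
                continue  # skip members that are not CSV files
            contents[name] = archive.read(name).decode(encoding)
    return contents


# Build a small archive in memory (made-up file names and contents)
buffer = io.BytesIO()
with zipfile.ZipFile(buffer, "w") as archive:
    archive.writestr("prices.csv", "open,close\n1.0,1.5\n")
    archive.writestr("README.txt", "not a CSV file")

for name, text in csv_members_as_text(buffer.getvalue()).items():
    print(name, len(text.splitlines()), "lines")
```

With a real download, `data` would be the `content` attribute of the `requests` response, and each text value could be handed to pandas with `pd.read_csv(io.StringIO(text))`.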
Note that this approach assumes that the historical cryptocurrency data files are located in the root directory of the ZIP archive. It may be necessary to adjust the code if the files are stored elsewhere.
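One possible adjustment is sketched below: it selects CSV members at any depth of the archive and derives a flat base name for each one. The helper `find_csv_members` and the sample member names are hypothetical; ZIP member names use forward slashes, which is why `posixpath` is used here.

```python
import posixpath


def find_csv_members(member_names):
    """Return (member_name, base_name) pairs for CSV files at any depth.

    Hypothetical helper; member_names would typically come from ZipFile.namelist().
    """
    matches = []
    for name in member_names:
        if name.endswith("/"):
            continue  # directory entry, not a file
        if name.lower().endswith(".csv"):
            matches.append((name, posixpath.basename(name)))
    return matches


# Made-up member names mixing nested and top-level files
names = ["data/", "data/nested/prices.csv", "top-level.csv", "notes.txt"]
for member, base in find_csv_members(names):
    print(member, "->", base)
```

The member name can then be passed to `ZipFile.read` or `ZipFile.open`, while the base name is a reasonable choice for the output file name.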
Conclusion
----------
In conclusion, reading ZIP files from a URL can raise bad ZIP file errors when the archives are corrupted or malformed. By downloading the file directly and checking for the correct file extensions, you should be able to solve this problem when downloading historical cryptocurrency data from the Binance server using Python.