I have a problem with downloading products in a month: for example, only six products are downloaded in a month, although there are more available in that month. In the downloader_history table, there is a 'Server returned HTTP response code: 403' error log for the products that did not download.
I set these values: Scope = Query and download, Fetch mode = Overwrite, Max connections = 8, Local root = null, Download path = /mnt/archive/dwn_def/s2/default/, Max retries = 73, Retry interval minutes = 378. For the User and Password section, I filled in the credentials of my account on the ‘https://scihub.copernicus.eu/dhus/#/home’ site.
Season start = 2020-11-01, Season mid = 2020-12-01, Season end = 2021-01-31
Could you please try setting Max connections to 1 or 2 (1 if you have also configured the S1 datasource for SciHub, 2 if the S2 datasource is the only one using SciHub)?
Please note that SciHub limits downloads to a maximum of 2 simultaneous connections.
If you put a value of 8, the system will try to open more connections than allowed and you will end up with the 403 errors from SciHub.
I’m attaching my question here because I have a similar problem and I don’t want to open a new topic.
I set up the data sources (Landsat 8, Sentinel-1, Sentinel-2) with default parameters and enabled them all. I set up a site and a season for this site, enabling the three sensors, the season, and the site. But no download seems to take place.
In the “monitoring” tab I’m reading:
Download statistics: Estimated number of products to download: 0
I found these folders and files in the default download paths:
\dwn_def\l8\default\<site_name>\failed_queries\<json_name>.json
\dwn_def\s1\default\<site_name>\failed_queries\<json_name>.json
\dwn_def\s2\default\<site_name>\failed_queries\<json_name>.json
Not quite. Could you please provide more logs, not only the initialization ones? We are interested in the parts where the system is trying to perform the queries to SciHub.
That’s what I did before extracting the log: I started the machine, then I created a brand-new site and a season (active processors were set to L2A, L3B, L2-S1, LPIS) using the GUI.
Enabled data sources are Sentinel-1 and Sentinel-2. Data sources have these parameters:
Scope: Query and download
Fetch mode: Overwrite
Local root: /
Max connections: 1
Download path: /mnt/archive/dwn_def/<sensor>/default
Max retries: 72
Retry interval minutes: 30
After exporting the log I have two .json files in these two folders:
/mnt/archive/dwn_def/<sensor>/default/<site>/failed_queries
Is Max retries = 72 the right choice? This site is very small and intersects only one Sentinel-2 tile, and the season covers 13 months, all in the past (some of them also need Sentinel data from the Long Term Archive).
Is Retry interval minutes = 30 the right choice? I can’t figure out how Sen4CAP queries SciHub. If both the Sentinel-1 and Sentinel-2 queries are performed almost simultaneously, I think it’s very likely they will be rejected because there are too many of them.
How long does the querying keep going?
Looking at the log: why is “Sensor Sentinel3 is not supported in Sen2Agri” logged periodically? What are the query “pages”? Do the “Actual products to download” counters increment only when a query is successful? Why are these counters 0 even though some queries returned products? A lot of lines contain a “code ‘NOOP’” string: what does it stand for?
You could consider switching from dhus (which is the implicit query endpoint in the SciHub plugin) to apihub. It seems that dhus recently introduced limits on the number of queries that can be performed in a certain amount of time, a limitation that apihub does not have. To do this, you could check this link:
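To sketch what the endpoint switch amounts to (this is illustrative Python against the public SciHub OpenSearch API, not Sen4CAP’s own downloader code; `build_query` and `search_page` are hypothetical helper names): apihub is reached under `/apihub` instead of `/dhus`, with the same account credentials and the same query syntax.

```python
import json
import urllib.parse
import urllib.request

# apihub base URL, used with the same SciHub account as dhus:
APIHUB = "https://apihub.copernicus.eu/apihub"   # instead of .../dhus

def build_query(platform, footprint_wkt, date_from, date_to):
    """Assemble a SciHub OpenSearch full-text query string."""
    return (
        f"platformname:{platform} AND "
        f'footprint:"Intersects({footprint_wkt})" AND '
        f"beginposition:[{date_from} TO {date_to}]"
    )

def search_page(opener, query, start=0, rows=100):
    """Fetch one page of results; `opener` must carry HTTP basic auth."""
    params = urllib.parse.urlencode(
        {"q": query, "start": start, "rows": rows, "format": "json"})
    with opener.open(f"{APIHUB}/search?{params}") as resp:
        return json.load(resp)

# Example query for the season discussed above (the footprint is a placeholder):
query = build_query("Sentinel-2", "POLYGON((...))",
                    "2020-11-01T00:00:00.000Z", "2021-01-31T23:59:59.999Z")
```

Note that the connection limit from the previous answer applies to apihub as well; only the query-rate limitation differs.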
Concerning your questions:
Please note that these settings are chosen to cover a variety of possible configurations (e.g. running in NRT, having local products on a network share that might fail from time to time, etc.).
This is related to the first point. In any case, as these are configurable values, you can tune them as you prefer. Also, if you do not change the maximum connections (leave it at 1 for SciHub) there should be no problem, as the maximum allowed number of connections on SciHub is 2.
The retries are made for 3 days per product before giving up and marking it as aborted (72 retries × 60 minutes).
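In other words, the retry window is simply Max retries × Retry interval minutes; a quick sketch of the arithmetic (plain Python, not Sen4CAP code):

```python
def retry_window_days(max_retries, retry_interval_minutes):
    """Total time a product stays in retry before being marked aborted."""
    return max_retries * retry_interval_minutes / (60 * 24)

default_window = retry_window_days(72, 60)  # the 3-day behaviour described above
poster_window = retry_window_days(72, 30)   # 1.5 days with a 30-minute interval
```

So halving the retry interval, as in the configuration quoted earlier, also halves the time before a product is abandoned.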
The system is implemented as a pluggable architecture (also based on Sen2Agri), and some future components might support other satellites’ images as well. The queries are paginated because not all data sources allow fetching all the products at once. The “actual products to download” counter is a little complicated due to the various use cases (for example, a user can have a site that intersects 100 S2 tiles but want to filter on only 1 tile). NOOP codes are usually given by the S1 pre-processing when there is no product to be processed.
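The pagination mentioned above can be illustrated with a small generic sketch (hypothetical helper names, not the actual Sen4CAP implementation): the client keeps asking for the next `rows` results until a page comes back shorter than `rows`, which is why the log shows several query “pages” per data source.

```python
def fetch_all(fetch_page, rows=100):
    """Collect every product across pages.

    `fetch_page(start, rows)` returns one page of results; a page shorter
    than `rows` (possibly empty) signals that there is nothing more to fetch.
    """
    products, start = [], 0
    while True:
        page = fetch_page(start, rows)
        products.extend(page)
        if len(page) < rows:
            return products
        start += rows

# Fake catalogue with 250 products to show the paging behaviour:
catalogue = [f"product_{i}" for i in range(250)]

def fake_page(start, rows):
    return catalogue[start:start + rows]

all_products = fetch_all(fake_page, rows=100)  # 3 queries: 100 + 100 + 50
```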
Thanks a lot: I executed the script you suggested, and soon the queries were executed and the download started; it is still in progress. But…
I found a Sentinel-1-related .json file in the Sentinel-2 “failed_queries” folder.
The size of each downloaded file is only a few kB, while the “Network History” in the “System Monitor” tool shows about 10 MiB/s being received. This happens for all the downloaded .zip archives, even those marked as “Download completed” in the log. These .zip archives are invalid.
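One quick way to confirm that the archives are invalid (a generic Python check, independent of Sen4CAP): a few-kB “download” is typically an HTML error page saved with a .zip extension, and it fails a ZIP integrity test.

```python
import io
import zipfile

def is_valid_zip(data: bytes) -> bool:
    """Return True if `data` is a readable, uncorrupted ZIP archive."""
    try:
        with zipfile.ZipFile(io.BytesIO(data)) as z:
            return z.testzip() is None  # None means no corrupt member found
    except zipfile.BadZipFile:
        return False

# An HTML error page saved as .zip fails the check:
error_page = b"<html><body>403 Forbidden</body></html>"

# A genuine archive passes (the member name here is just an illustration):
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as z:
    z.writestr("MTD_MSIL1C.xml", "<placeholder/>")
real_archive = buf.getvalue()

bad = is_valid_zip(error_page)     # False
good = is_valid_zip(real_archive)  # True
```

On disk, the same check can be run with `zipfile.is_zipfile(path)` followed by `ZipFile(path).testzip()`.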
No product is available for this site according to the tab “products” in the GUI.
About the “monitoring” tab: in “current downloads” there are always 2 products listed (one Sentinel-1, one Sentinel-2), whose progress is increasing; in “download statistics” the sum of “in progress” and “completed” doesn’t match the declared estimated number of products to download.
The “statistics” view, filtered on Sentinel-2 and on all the months, doesn’t show the charts as filled for the months in which some products have been “downloaded”.
For some products the log reports “is not online”. Judging by their dates, these are probably the products held in the Long Term Archive. What happens to them? Are they requested to be brought online again? This is paramount for my purposes because I have to run an analysis for a season in the past, and I need to know whether the Sen4CAP “query and download” system is suitable for that. In other words: will I ever get even the “offline” Sentinel products through Sen4CAP?
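For background on the “is not online” messages, here is a hedged sketch of how the SciHub side behaves (illustrative Python with hypothetical helper names, not Sen4CAP’s internals): an LTA product is reported with Online = false, and a download request for it answers HTTP 202 Accepted, which queues the retrieval from the Long Term Archive so the product can be fetched on a later attempt.

```python
import urllib.request

DHUS = "https://scihub.copernicus.eu/dhus"

def classify_download_status(status_code):
    """Interpret the HTTP status of a product download request (sketch)."""
    if status_code == 200:
        return "online: product body is being streamed"
    if status_code == 202:
        return "offline: retrieval from the Long Term Archive was triggered"
    if status_code == 403:
        return "rejected: quota or connection limit exceeded"
    return "error"

def request_product(opener, uuid):
    """Ask for the product body via OData; for an offline product this
    triggers the LTA retrieval rather than returning data immediately."""
    req = urllib.request.Request(f"{DHUS}/odata/v1/Products('{uuid}')/$value")
    with opener.open(req) as resp:
        return classify_download_status(resp.status)
```

This is why a downloader with a long enough retry window can eventually obtain offline products: the failed attempt itself requests that the product be brought back online.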
Please, can you help me interpret the log correctly so I know what is happening? I’m attaching it here; it follows the last one I uploaded: Sen4CAP_test_5000_download_log_2.txt (1.0 MB)