Problem with downloading products

Hello,

I have a problem downloading products: for a given month, only six products are downloaded even though more are available for that month. In the downloader_history table, the products that were not downloaded have a 'Server returned HTTP response code: 403' error logged.

How can I fix it?

Hello Partow,

In order to help you, we’d need some more information:

  1. What satellite products is it about: Sentinel-1, Sentinel-2, Landsat-8?
  2. What is your datasource configuration?
  3. What is your season date interval?

Regards,
Cosmin.

Dear Cosmin,

  1. Sentinel-2
  2. I set these values: Scope = Query and download, Fetch Mode = overwrite, Max connections = 8, Local root = null, Download path = /mnt/archive/dwn_def/s2/default/, Max retries = 73, Retry interval minutes = 378. For the User and Password fields, I entered my account credentials for the ‘https://scihub.copernicus.eu/dhus/#/home’ site.
  3. Season start = 2020-11-01, Season mid = 2020-12-01, Season end = 2021-01-31

Thank you for your help and support,
Partow

Dear Partow,

Could you please try setting Max connections to 1 or 2 (1 if you have also configured the S1 datasource for SciHub, 2 if the S2 datasource is the only one on SciHub)?
Please note that SciHub allows a maximum of 2 simultaneous download connections.
With a value of 8, the system will try to open more connections than allowed and you will end up with 403 errors from SciHub.
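
To illustrate why the limit matters, here is a minimal Python sketch (not the Sen4CAP implementation) that caps the number of simultaneous downloads with a semaphore. The `fake_download` function and the product names are hypothetical placeholders; only the limit of 2 comes from the explanation above.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor

MAX_CONNECTIONS = 2  # SciHub limit quoted above

_slots = threading.BoundedSemaphore(MAX_CONNECTIONS)
_lock = threading.Lock()
_active = 0
peak_active = 0  # highest number of downloads observed running at once


def fake_download(product):
    """Hypothetical stand-in for a real download; it sleeps instead of fetching."""
    global _active, peak_active
    with _slots:  # blocks while MAX_CONNECTIONS downloads are already running
        with _lock:
            _active += 1
            peak_active = max(peak_active, _active)
        time.sleep(0.05)
        with _lock:
            _active -= 1
    return product


def download_all(products, workers=8):
    # Even with 8 worker threads, at most MAX_CONNECTIONS run at the same time.
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(fake_download, products))


results = download_all([f"product_{i}" for i in range(6)])
```

Without such a cap, 8 workers would open 8 connections at once, which is the situation that produces the 403 responses described above.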

Please let us know if that helps.

Best regards,
Cosmin

Dear Cosmin,

Thank you for your solution, it helped. All products now download completely.

Best regards,
Partow

Dear all,

I’m posting my question here because I have a similar problem and I don’t want to open a new topic.
I set up the data sources (Landsat8, Sentinel1, Sentinel2) with default parameters, and I enabled them all. I set up a site and a season for this site, enabling the three sensors, enabling the season, and enabling the site. But no download seems to take place.

In the “monitoring” tab I’m reading:

  • Download statistics: Estimated number of products to download: 0

I found these folders and files in the default download paths:
\dwn_def\l8\default\<site_name>\failed_queries\<json_name>.json
\dwn_def\s1\default\<site_name>\failed_queries\<json_name>.json
\dwn_def\s2\default\<site_name>\failed_queries\<json_name>.json

What’s wrong?

Thanks in advance

Hello,

Could you please provide the Sen4CAP services logs with, for example:

sudo journalctl -fu sen2agri-services --since yesterday

It looks like there are some query issues, which can occur for various reasons (authentication, issues with SciHub, proxies etc.).

Best regards,
Cosmin

Dear Cosmin,

here is the log (please consider I had to turn off the machine slightly before your reply):

-- Logs begin at Mon 2021-03-15 16:54:28 CET. --
Mar 15 16:54:40 CentOS7-x64 systemd[1]: Started Services for Sen2Agri.
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: Installed services: LPIS/GSAA Service,Object Storage Service,Progress Reporting Service,S1 Backscatter/Coherence S2-gridded Service,S1 Pre-processing Service,Sen2Agri Services
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: Configuration files will be read from /usr/share/sen2agri/sen2agri-services/config
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: Active logging levels: org.esa.sen2agri -> TRACE, org.esa.sen4cap -> TRACE, ro.cs.tao -> TRACE, root -> ERROR,
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: LOGBACK: No context given for c.q.l.core.rolling.SizeAndTimeBasedRollingPolicy@1976870338
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: Sen4CAP Services v2.1.0
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:42.860 INFO  [main      ] o.esa.sen4cap.ServicesStartup  - Starting ServicesStartup v2.1.0 on CentOS7-x64 with PID 2486 (/usr/share/sen2agri/sen2agri-services/modules/sen4cap-startup-2.1.0.jar started by sen2agri-service in /usr/share/sen2agri/sen2agri-services/bin)
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:42.862 DEBUG [main      ] o.esa.sen4cap.ServicesStartup  - Running with Spring Boot v2.2.0.RELEASE, Spring v5.2.0.RELEASE
Mar 15 16:54:42 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:42.862 INFO  [main      ] o.esa.sen4cap.ServicesStartup  - The following profiles are active: server
Mar 15 16:54:44 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:44.698 INFO  [main      ] r.c.t.s.c.ServletConfiguration - Sen4CAP Services version: 2.1.0 (2020-12-27T18:24:58Z)
Mar 15 16:54:44 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:44.698 INFO  [main      ] r.c.t.s.c.ServletConfiguration - TAO Services version: 1.0.3.5 (2020-12-17T10:03:42Z)
Mar 15 16:54:44 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:44.698 INFO  [main      ] r.c.t.s.c.ServletConfiguration - Using server port 8080
Mar 15 16:54:47 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:47.959 DEBUG [main      ] o.e.s.db.PersistenceManager    - Sensor Sentinel3 is not supported in Sen2Agri
Mar 15 16:54:47 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:47.983 DEBUG [main      ] o.esa.sen2agri.commons.Config  - Initialized datasource [Scientific Data Hub,S2] with 1 max connections and timeout 9000s
Mar 15 16:54:47 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:47.983 DEBUG [main      ] o.esa.sen2agri.commons.Config  - Initialized datasource [USGS,L8] with 1 max connections and timeout 9000s
Mar 15 16:54:47 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:47.983 DEBUG [main      ] o.esa.sen2agri.commons.Config  - Initialized datasource [Scientific Data Hub,S1] with 1 max connections and timeout 9000s
Mar 15 16:54:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:48.471 INFO  [main      ] o.e.s.services.ScheduleManager - Enabled sites:
Mar 15 16:54:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:48.541 DEBUG [main      ] o.e.s.services.ScheduleManager - Found scheduled job types: Lookup,Reports,ObjectStorage,S1,Retry
Mar 15 16:54:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:48.569 INFO  [main      ] o.e.s.services.ScheduleManager - Scheduled new job 'Reports.Reports' (next run: 2021-03-15T16:55:48, repeat after 1440 minutes)
Mar 15 16:54:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:48.574 INFO  [main      ] o.e.s.services.ScheduleManager - Setting scheduled.object.storage.move.enabled is disabled
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.144 INFO  [main      ] org.esa.sen2agri.CoreLauncher  - Network connection timeout initialized at 30 seconds
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.145 INFO  [main      ] org.esa.sen2agri.CoreLauncher  - Database configuration polling is disabled
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.151 INFO  [main      ] org.esa.sen2agri.CoreLauncher  - Batch notification initialized at 60 minutes with message limit of 0
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.153 DEBUG [main      ] o.esa.sen4cap.ServicesStartup  - Spring initialization completed
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.168 DEBUG [pool-3-thread-1] o.esa.sen4cap.ServicesStartup  - Cleaning up working directory '/mnt/archive/s1_preprocessing_work_dir'
Mar 15 16:54:50 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:54:50.598 INFO  [main      ] o.esa.sen4cap.ServicesStartup  - Started ServicesStartup in 8.292 seconds (JVM running for 10.052)
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.582 INFO  [DefaultQuartzScheduler_Worker-1] o.e.s.scheduling.ReportJob     - Starting job 'Reports.Reports'
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.593 DEBUG [DefaultQuartzScheduler_Worker-1] o.e.s.scheduling.ReportJob     - Report for S1 pre-processing added new 0 rows
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.600 DEBUG [DefaultQuartzScheduler_Worker-1] o.e.s.scheduling.ReportJob     - Report for S1 pre-processing added new 0 rows
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.605 DEBUG [DefaultQuartzScheduler_Worker-1] o.e.s.scheduling.ReportJob     - Report for S1 pre-processing added new 0 rows
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.605 INFO  [DefaultQuartzScheduler_Worker-1] o.e.s.scheduling.ReportJob     - Job 'Reports.Reports' completed
Mar 15 16:55:48 CentOS7-x64 start.sh[2476]: 2021-Mar-15 16:55:48.607 DEBUG [DefaultQuartzScheduler_Worker-1] o.e.s.services.ScheduleManager - Trigger 'Reports.Reports' completed with code 'NOOP'

Dear @cudroiu,

is the log I shared useful for identifying the problem?

Hello,

Not quite. Could you please provide more logs, not only the initialization one? We are interested in the parts when the system is trying to perform the queries to scihub.

Best regards,
Cosmin

Hello,

thank you for your quick answer. Here is the log: Sen4CAP_test_5000_download_log.txt (57.8 KB)

That’s what I did before extracting the log: I started the machine, then I created a brand new site and a season (active processors were set to L2A, L3B, L2-S1, LPIS) using the GUI.
Enabled data sources are Sentinel-1 and Sentinel-2. Data sources have these parameters:
Scope: Query and download;
Fetch mode: Overwrite;
Local root: /
Max connections: 1
Download path: /mnt/archive/dwn_def/<sensor>/default
Max retries: 72
Retry interval minutes: 30

After exporting the log I have two .json files in these two folders:
/mnt/archive/dwn_def/<sensor>/default/<site>/failed_queries

  1. Is Max retries = 72 a sensible choice? This site is very small and intersects only one Sentinel-2 tile, and the season covers 13 months, all in the past (some of them also need Sentinel data from the Long Term Archive).

  2. Is Retry interval minutes = 30 a sensible choice? I can’t figure out how Sen4CAP queries SciHub. If the Sentinel-1 and Sentinel-2 queries are performed almost at the same time, I think it’s very likely they will be rejected because there are too many of them.

  3. For how long does the querying go on?

  4. Looking at the log… Why is “Sensor Sentinel3 is not supported in Sen2Agri” logged periodically? What are the query’s “pages”? Do the “Actual products to download” counters increment only when a query is successful? Why are these counters 0 even though some queries returned some products? A lot of lines contain a “code ‘NOOP’” string: what does it stand for?

Thank you, best regards

Hi,

You could consider switching from dhus (the default query source in the SciHub plugin) to apihub. It seems that dhus recently introduced a limit on the number of queries that can be performed in a given amount of time, a limit that apihub does not have. To do this, you could check this link:

Concerning your questions:

  1. Please note that these settings were chosen to cover a variety of possible configurations (e.g. running in NRT, or having local products on a network share that might fail from time to time).
  2. This is related to the first point. In any case, these values are configurable, so you can tune them as you prefer. Also, if you do not change the maximum connections (and leave it at 1 for SciHub) there should be no problem, as the maximum number of connections allowed on SciHub is 2.
  3. Retries for a product continue for 3 days (72 x 60 minutes) before the system gives up and marks it as aborted.
  4. The system is implemented as a pluggable architecture (also based on sen2agri), and some future components might also support images from other satellites. The queries are paginated because not all datasources allow retrieving all the products at once. The count of actual products to download is a little complicated because of the various use cases (for example, a user can have a site that intersects 100 S2 tiles but want to filter down to only 1 tile). NOOP codes are usually returned by the S1 pre-processing when there is no product to be processed.
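
The "72 x 60 minutes" arithmetic in point 3 can be checked with a short Python sketch. It assumes the retry window is simply Max retries multiplied by Retry interval minutes, as the figure above suggests; the actual scheduler may behave differently, and `retry_window` is a hypothetical helper, not part of Sen4CAP.

```python
from datetime import timedelta


def retry_window(max_retries, retry_interval_minutes):
    """Total time a product keeps being retried, assuming evenly spaced retries."""
    return timedelta(minutes=max_retries * retry_interval_minutes)


default_window = retry_window(72, 60)  # values quoted in point 3
custom_window = retry_window(72, 30)   # values from the question above

print(default_window)  # 3 days, 0:00:00
print(custom_window)   # 1 day, 12:00:00
```

Under that assumption, lowering the retry interval to 30 minutes while keeping 72 retries halves the window to 36 hours.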

Best regards,
Cosmin

Dear Cosmin,

thanks a lot: I executed the script you suggested, and soon afterwards the queries ran and the download started; it is still in progress. But…

  1. I found a Sentinel-1-related .json file in the Sentinel-2 “failed_queries” folder.

  2. Each downloaded file is only a few kB in size, while the “Network History” in the “System Monitor” tool shows about 10 MiB/s of incoming traffic. This happens for all the downloaded .zip archives, even for those marked as “Download completed” in the log. These .zip archives are invalid.

  3. No product is available for this site according to the tab “products” in the GUI.

  4. About the “monitoring tab”: in “current downloads” there are always 2 products listed (one Sentinel-1, another Sentinel-2), whose progress is increasing; in “download statistics” the sum of “in progress” and “completed” doesn’t match the declared estimated number of products to download.

  5. The “statistics” view, filtered on Sentinel-2 and on all the months, does not fill in the charts for the months in which some products have been “downloaded”.

  6. For some products the log reports “is not online”. Judging by their dates, these are probably the ones stored in the Long Term Archive. What happens to them? Are they requested to be brought back online? This is paramount for my purposes because I have to analyse a season in the past, and I need to know whether the Sen4CAP “query and download” system is suitable for me or not. In other words: will I ever get the “offline” Sentinel products through Sen4CAP?

  7. Please, can you help me to correctly interpret the log to know what happens? I’m attaching it here, it follows the last one I uploaded: Sen4CAP_test_5000_download_log_2.txt (1.0 MB)
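
For reference on point 2, the archives can be checked with a small Python sketch like the one below. It only uses the standard `zipfile` module; `find_invalid_zips` is a hypothetical helper name, and the check is a generic archive test, not something Sen4CAP provides.

```python
import zipfile
from pathlib import Path


def find_invalid_zips(folder):
    """Return the .zip files under `folder` that cannot be opened or fail the CRC check."""
    invalid = []
    for path in sorted(Path(folder).rglob("*.zip")):
        try:
            with zipfile.ZipFile(path) as archive:
                # testzip() returns the name of the first corrupt member, or None.
                if archive.testzip() is not None:
                    invalid.append(path)
        except zipfile.BadZipFile:
            # The file is not a ZIP archive at all (for example an error page saved as .zip).
            invalid.append(path)
    return invalid
```

It could be called on a download path such as `find_invalid_zips("/mnt/archive/dwn_def/s2/default")` to list the archives that need to be fetched again.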

Thanks a lot. Best regards