Transferring products into a fresh install

Dear Sen4CAP team, @cudroiu, @Philippe_Malcorps,

Is it possible to transfer files that are downloaded/processed by one Sen4CAP installation to another? Can I import products into the catalog of a fresh installation, but keep retrieving future data from SciHub as the season goes on?

We will have a dedicated server for Sen4CAP in a couple of months, but we would like to start processing the data now on a temporary machine to get ready for the season.

If it is feasible to transfer L1/L2 products between installations, we would deeply appreciate instructions on how it can be done.

Thank you in advance!

Hello,
Yes, sure, you can transfer the production from one machine to another, and it is quite simple.

You will have to dump all the tables from the old server (not all of them strictly need to be imported, as some never change their content, but dumping everything is simpler) and then import them into the new one's database with something like this:
Export:

pg_dump -U postgres -Fc sen4cap -t <table_name> > /path_to_dump/<table_name>.dump

Import:

pg_restore --clean -U postgres -d sen4cap -t <table_name> /path_to_dump/<table_name>.dump
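If you go table by table, the per-table commands can be generated in a small loop. A sketch that prints the dump commands for review before running them; the table list here is only an illustrative subset, not the full schema:

```shell
#!/bin/sh
# dump_cmds: print one pg_dump command per table so they can be reviewed
# before execution. The table names below are examples only -- extend the
# list to cover every table you need.
dump_cmds() {
    dir=$1
    for t in site season downloader_history product; do
        echo "pg_dump -U postgres -Fc sen4cap -t $t > $dir/$t.dump"
    done
}

dump_cmds /path_to_dump        # inspect the generated commands first
# dump_cmds /path_to_dump | sh # then execute them
```

The same pattern works for the pg_restore side by swapping the echoed command.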

Or, simply:

echo "Backing up database"
sudo -u postgres pg_dumpall > /tmp/db.sql

and then, on the new machine (here I think you will also have to drop the sen4cap database on the target machine first, as it does not contain any product/site data yet):

echo "Restoring database backup"
psql -U postgres -f /tmp/db.sql

Before doing the import you should also copy the products from the old server to the new one, in the same locations:

/mnt/archive/dwn_def/
/mnt/archive/maccs_def/
/mnt/archive/grassland_mowing_files/
/mnt/archive/agric_practices_files/
/mnt/archive/marker_database_files/
/mnt/archive/lpis/
All directories /mnt/archive/<site_short_name>
And optionally the gipp_maja, srtm and swbd directories (though if the new system is already installed, you had to provide these during the installation).

Of course, you will have to make sure that all the above directories are writable by the sen2agri-service user, otherwise the new installation will not be able to write into them.
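For the copy itself, something like rsync preserves permissions and can be re-run incrementally. A sketch that prints the commands for review; the host name newserver is a placeholder, and the /mnt/archive/&lt;site_short_name&gt; directories must be added per site:

```shell
#!/bin/sh
# sync_cmds: print one rsync command per archive directory, to review before
# piping to sh. "newserver" is a placeholder host; append your per-site
# /mnt/archive/<site_short_name> directories to the list as well.
sync_cmds() {
    host=$1
    for d in dwn_def maccs_def grassland_mowing_files \
             agric_practices_files marker_database_files lpis; do
        echo "rsync -a /mnt/archive/$d/ $host:/mnt/archive/$d/"
    done
}

sync_cmds newserver        # inspect the commands first
# sync_cmds newserver | sh # then execute them
```

After copying, running `sudo chown -R sen2agri-service: /mnt/archive` on the new machine restores the ownership mentioned above.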

Please note that both servers should run the same system version when you do the switch, otherwise you can get unpredictable results (so if the new one runs 2.0 and the old one runs 1.3, you should first upgrade the 1.3 installation to 2.0).

Hope this helps.

Best regards,
Cosmin


Thank you very much for the fast reply!
We will make sure to follow these instructions when the time comes. Much appreciated!

@cudroiu,

We just set up version 2.0 and there seems to be a problem with data querying from the Scientific Data Hub.
Here is a fragment of the JSON file which is generated in the "failed queries" folder of S2 dwn_def:

{"user":null,"password":null,"id":"d036500f-5395-4014-84e0-7727dc3b2d1e"

It seems like the problem might be due to my credentials not being saved in the config. I already entered the correct credentials in the web interface and restarted the sen2agri-services, but no change. Maybe there is a manual way to edit the config?

Context: We launched the 2021 season, which starts on 15.04, but we wanted to gather all S2 data for MAJA beforehand.

Thanks in advance!

Hello,
Could you please provide some more logs of the sen4cap services, after restart?
Is this a fresh Sen4CAP install or an updated one?

Cosmin

I have added the log as an attachment. It appears that the reason for the queries to fail is “Too Many Requests”.

This was a 'fresh' installation, meaning we used the uninstall instructions in the manual to remove the old one and then set up 2.0.

sen2agri_services_log.zip (1.3 MB)

Dear Harijsi,

It seems that this issue is due to some limitations introduced by SciHub recently (in the last 1-2 weeks). Apparently, they introduced some constraints on the number of queries allowed in a given amount of time (or maybe it is just a temporary issue on their side).
For now, a quick solution would be to change the configuration of the SciHub plugin to use apihub instead of dhus:

  • Copy the /usr/share/sen2agri/sen2agri-services/lib/tao-datasources-scihub-1.0.X.X.jar into a distinct folder
  • unzip/extract it (the jar is actually a zip archive)
  • edit ro/cs/tao/datasource/remote/scihub/scihub.properties and switch the comments as below:
    scihub.search.url = https://scihub.copernicus.eu/apihub/search
    #scihub.search.url = https://scihub.copernicus.eu/dhus/search
  • Put the modified file back into the jar, in the same location
  • Copy the modified jar file back into /usr/share/sen2agri/sen2agri-services/lib/
  • Restart the services
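The comment switch in the scihub.properties step can also be scripted with sed rather than edited by hand. A sketch, assuming the two URL lines appear exactly as quoted above:

```shell
#!/bin/sh
# switch_to_apihub: in an extracted scihub.properties, comment out the dhus
# search URL and uncomment the apihub one. Takes the path to the file.
switch_to_apihub() {
    sed -i \
        -e 's|^scihub.search.url = https://scihub.copernicus.eu/dhus/search|#&|' \
        -e 's|^#\(scihub.search.url = https://scihub.copernicus.eu/apihub/search\)|\1|' \
        "$1"
}
```

The modified file then goes back into the jar; running e.g. `zip tao-datasources-scihub-1.0.X.X.jar ro/cs/tao/datasource/remote/scihub/scihub.properties` from the extraction directory updates that entry in place.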

Or, you can use the following script to do this automatically: change_scihub_jar_cfg.zip (891 Bytes)

The script can be invoked to simply switch from dhus to apihub:

./change_scihub_jar_cfg.sh
./change_scihub_jar_cfg.sh -s apihub

or, to switch back to dhus:

./change_scihub_jar_cfg.sh -s dhus

Please note that at some point we decided to use dhus instead of apihub due to some large timeouts that SciHub had on apihub when querying S1 (I hope they have solved this issue).

Please let me know.

Best regards,
Cosmin

Thank you for the help, @cudroiu!

We changed the datasource to apihub and the query is successful, but there seems to be a different issue with downloading. Only some of the available images are downloaded; the rest are queried and shown as "In progress" under the "monitoring" tab in the web interface, but are not being retrieved (screenshot added below).

(screenshot: no_download)

Only a few (2-3) tiles are downloaded per date, so it doesn’t seem to be a storage space issue.

I have also attached a system log if that helps with troubleshooting. I did not notice any peculiarities there, however.
sen2agri_services_log.txt (1.7 MB)

Dear Harijsi,

Could you try the following operation:

  • go to /mnt/archive/dwn_def/ and search for any failed_queries subdirectories (they are actually located several levels down, like /mnt/archive/dwn_def/<sat_id>/default/<site_name>/failed_queries/) and delete them or their contents
  • execute:

psql -U admin sen4cap -c "delete from downloader_history where status_id in (1,3,4)"

  • run the attached script to force querying from the start of the season: force_download_restart.zip (1.3 KB). You can launch it first with the following parameter to print your sites:

./force_download_restart.py -p

And then:

./force_download_restart.py -s <site_id>

Please let me know if this solves your issue.

Best regards,
Cosmin

Dear Cosmin,

Thank you for the help so far! We tried your suggestions and the system downloaded two additional tiles from 15.02, but then stopped. The missing data is now labeled as retriable, but no further downloads proceed.

I have added a log since we applied the procedure.
sen2agri_services_log_2.txt (527.4 KB)

Dear Harijsi,

Could you tell me if you changed the "Max connections" setting in the datasource for SciHub?
If so, you should set it to 1, as the maximum number of concurrent downloads allowed by SciHub is 2 (1 will be used for S2 and 1 for S1).

Please let me know.

Best regards,
Cosmin

Dear @cudroiu,

Thank you for all the help! After some monitoring, it appears that S-2 query and download are now operational with ApiHub and Max Connections set to "1" (I had previously changed it for troubleshooting). However, I have noticed that the downloader skips tiles for some dates. Are there any eligibility criteria for download as well (e.g., cloud cover)?

Best,
Harijs

Dear Harijsi,

If you are referring to the L2A products created by MAJA (what you see on the website, Products tab), then yes: MAJA does not process products that have cloud coverage > 90%.

Best regards,
Cosmin

Dear Cosmin,

I was referring to the L1C data downloaded from SciHub, since the data for March 15th appears to be missing from the downloads. Is there a quick way to re-query products from the start of the season?

Best,
Harijs

Dear Harijsi,

Yes, you can force querying from the start of the season as described here:


Or you can use this script:
force_download_restart.zip (1.3 KB)

Best regards,
Cosmin

Dear @cudroiu,

Thank you for all the help so far. We have finally set up the dedicated system and now are ready to transfer the products. Before doing so, I wanted to clarify one thing.

We have already copied the downloaded and processed data from /mnt/archive to the new server with a fresh install.
However, the new system does not yet have any sites defined. Before doing the db dump/import procedure, should we create a new site and season with the exact same name as on the old server and only then import the databases?

Huge thanks again!

Best,
Harijs

Dear Harijsi,

I think that after you install the Sen4CAP system on the new machine (and have already copied /mnt/archive from the old one to /mnt/archive on the new one), you can directly dump the database on the old system and import it on the new one, without needing to create sites and seasons (as you will dump/import the site and season tables too).
Just one remark: please make sure that in your downloader_history and product tables all the paths point to /mnt/archive and are not resolved paths (if /mnt/archive is a symlink on the old machine). Otherwise you can end up with paths on the new system that do not exist there. If that happens, you can simply run an SQL replace command on the new machine to correct the issue.
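A hypothetical version of that SQL replace command, for illustration: the product.full_path column reflects my understanding of the sen4cap schema, and '/data/real_archive' is a placeholder for whatever the symlink actually resolved to, so verify both with a SELECT before running anything.

```shell
#!/bin/sh
# Build the UPDATE statement; review it, then feed it to psql yourself.
# '/data/real_archive' is a placeholder for the resolved symlink target,
# and the table/column names are assumptions to check against your schema.
OLD_PREFIX='/data/real_archive'
SQL="UPDATE product SET full_path = replace(full_path, '$OLD_PREFIX/', '/mnt/archive/') WHERE full_path LIKE '$OLD_PREFIX/%'"
echo "$SQL"   # when it looks right: psql -U postgres sen4cap -c "$SQL"
# Repeat the same replace for the downloader_history table, which also
# stores product paths.
```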

Hope this helps.

Best regards,
Cosmin

Dear @cudroiu,

Thank you so much for the very fast response!
So as I understand - we dump the entirety of the psql sen4cap database from the old server (not just a couple of tables, as mentioned before) and then import the entire db into the new, fresh install? Is that right?

Best,
Harijs

Yes, I think it will be safer to dump the whole database in order to keep all the ids (for sites, seasons etc.) as they were on the original machine; otherwise you could end up on the target machine with some invalid ids (for site_id, for example). I updated my post from February to specify that all tables should be imported, for safety.
If you prefer, you can have a look at the update.sh script in the installation package, where you will notice the operations needed:

sudo systemctl stop sen2agri-executor sen2agri-orchestrator sen2agri-http-listener sen2agri-demmaccs sen2agri-demmaccs.timer sen2agri-monitor-agent sen2agri-scheduler sen2agri-services

echo "Backing up database"
sudo -u postgres pg_dumpall > /tmp/db.sql

Here I think you will also have to drop the sen4cap database on the target machine (as it does not contain any product/site data yet):

echo "Restoring database backup"
psql -U postgres -f /tmp/db.sql

sudo systemctl start sen2agri-executor sen2agri-orchestrator sen2agri-http-listener sen2agri-demmaccs sen2agri-demmaccs.timer sen2agri-monitor-agent sen2agri-scheduler sen2agri-services
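The sequence for the new machine, with the drop step included, can be sketched as a list of commands to review first. The systemctl lines are abbreviated here to the sen2agri-services unit; stop and start the full set of units as shown above, and the sketch assumes /tmp/db.sql was already copied over from the old server.

```shell
#!/bin/sh
# migrate_cmds: print the restore sequence for the new machine, including
# the database drop mentioned above. Review the lines, then run them
# manually. Dropping sen4cap destroys only the empty database created by
# the fresh install, so do this before any production happens on it.
migrate_cmds() {
    cat <<'EOF'
sudo systemctl stop sen2agri-services
sudo -u postgres psql -c "DROP DATABASE sen4cap;"
psql -U postgres -f /tmp/db.sql
sudo systemctl start sen2agri-services
EOF
}

migrate_cmds
```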

Hope this helps.

Best regards,
Cosmin


Dear @cudroiu,

We have successfully migrated the system and the database; for now it seems to be working as intended (regarding data acquisition and processing). The MAJA errors I mentioned in another thread are also absent at the moment. There is some suspicion that some clear-sky dates might still be getting skipped by the L2A processing, but for consistency's sake I will keep that out of this thread.

Thank you for assisting us with the system switch; your help is immensely appreciated!

Best,
Harijs