No predicted classes created in L4A processor

An issue we have encountered a few times now:

We have been running into an issue with the Sen4CAP L4A Crop Type processor, both with several clients and on our own systems. We are using the latest version (1.2) and have made sure that the LUT and declaration data are in the right format. All data is imported without errors. After we run the L4A classifier there are no errors and the expected data is created, so everything seems to work. The confusion matrices etc. are created and make sense, so the RF classifier does what it needs to do.

The issue is that we never get any predicted classes: the columns CT_decl, CT_pred_1, CT_conf_1, CT_pred_2 and CT_conf_2 are all empty. Initially I thought the issue was related to the size of the test data (we had two areas with about 5k and 20k parcels), but we just came from a call with a client who has the exact same problem with a 200k sample size. From the forum I also got the impression that more users have this problem.

Would anybody be able to give us a helping hand in figuring out what the problem is?

Hello Sibrant,

We have had this same issue. As of now we have not come up with a solution, but a workaround exists:

By running the Python script crop-type-wrapper.py manually, as Philippe instructed here, we were able to produce and access L4A results successfully. We only needed to join them manually to the declaration data set.

As this is the case, we suspect the issue lies with the final exporting step of the L4A processor.
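The manual join mentioned above could look roughly like the following. This is only a sketch: the function name is made up for illustration, and it assumes that both files are comma-separated CSVs sharing a parcel identifier column, here called NewID. Substitute whatever key your files actually share.

```python
import csv

PREDICTION_COLUMNS = ["CT_decl", "CT_pred_1", "CT_conf_1", "CT_pred_2", "CT_conf_2"]


def join_predictions(declarations_csv, predictions_csv, output_csv, key="NewID"):
    """Left-join the prediction columns onto the declaration rows.

    `key` is an assumed shared identifier column, not a documented name.
    Returns the number of declaration rows that found a matching prediction.
    """
    # Index the prediction rows by parcel identifier.
    with open(predictions_csv, newline="") as f:
        predictions = {row[key]: row for row in csv.DictReader(f)}

    with open(declarations_csv, newline="") as f:
        reader = csv.DictReader(f)
        fieldnames = list(reader.fieldnames or [])
        declarations = list(reader)

    # Add only the prediction columns the declaration file does not already have.
    extra = [c for c in PREDICTION_COLUMNS if c not in fieldnames]
    matched = 0
    with open(output_csv, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=fieldnames + extra)
        writer.writeheader()
        for row in declarations:
            pred = predictions.get(row[key])
            if pred is not None:
                matched += 1
            for c in PREDICTION_COLUMNS:
                # Parcels without a prediction keep their existing (or empty) value.
                row[c] = pred.get(c, "") if pred else row.get(c, "")
            writer.writerow(row)
    return matched
```

Comparing the returned count with the number of declared parcels gives a quick indication of how many parcels actually received a prediction.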

Best regards,
Sakari

Hello,

If it’s a new install it might be a known issue, caused by a change in a library that the classification script uses. On some systems it’s crashing when computing the accuracy statistics.

If you don’t mind trying a patch, I’m attaching an updated version of the classification script (can be placed in /usr/bin): crop_type.zip (6.7 KB).

Hello @sibrant,

As Sakari mentioned, can you check the Parcels_all_with_predictions_%.csv and Parcels_classified_with_predictions_%.csv files? Are the predictions also empty in these files (CT_decl, CT_pred_1, CT_conf_1, CT_pred_2 and CT_conf_2)? If this is not the case, it must be a problem in the export of the shp.
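A quick way to perform this check without opening the files by hand is sketched below. It assumes the CSVs are comma-separated with a header row containing the five column names; the function names are illustrative, and treating the string NA as empty is also an assumption.

```python
import csv

PREDICTION_COLUMNS = ("CT_decl", "CT_pred_1", "CT_conf_1", "CT_pred_2", "CT_conf_2")
EMPTY_VALUES = {"", "NA"}  # assumption: missing values appear as blank or NA


def count_filled(csv_path, columns=PREDICTION_COLUMNS):
    """Return {column: number of rows holding a non-empty value}."""
    counts = {c: 0 for c in columns}
    with open(csv_path, newline="") as f:
        for row in csv.DictReader(f):
            for c in columns:
                if (row.get(c) or "").strip() not in EMPTY_VALUES:
                    counts[c] += 1
    return counts


def all_empty(csv_path, columns=PREDICTION_COLUMNS):
    """True when none of the prediction columns contains any value."""
    return not any(count_filled(csv_path, columns).values())
```

If `all_empty` returns False for the CSV files while the shapefile columns are empty, that would point to the export step rather than to the classification itself.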

Another observation: it is not a blocking point, but with small datasets the parameters of the strategy for selecting the calibration parcels for each crop type should be adapted. These parameters are explained in the ATBD for L4A crop type mapping 1.2 (http://esa-sen4cap.org/content/technical-documents), pp. 30-31. With the default parameters (--best-s2-pix 10 --pa-min 30 --pa-train-h 4000 --pa-train-l 1333 --sample-ratio-h 0.25 --sample-ratio-l 0.75 --smote-target 1000):

  • Only crop types with more than 30 parcels containing at least 10 S2 pixels will be classified
  • For crop types with more than 4000 parcels (containing at least 10 S2 pixels), 25% of the parcels are used for the calibration
  • For crop types with fewer than 4000 but more than 1333 parcels (containing at least 10 S2 pixels), 1000 parcels will be used for the calibration (which represents between 25% and 75% of the parcels)
  • For crop types with fewer than 1333 parcels (containing at least 10 S2 pixels), 75% of the parcels are used for the calibration and the SMOTE process is then applied to artificially create new parcels and reach 1000 parcels for the calibration

These parameters can be changed in the config table, OR when launching an L4A crop type job via the custom jobs tab by showing and modifying the advanced parameters, OR when launching an L4A crop type job manually.
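To see how a given crop type would be treated under the rules listed above, the selection logic can be sketched as follows. This is an illustration, not the code used by the processor. The function name, the handling of counts exactly equal to a threshold, the numbering of the strategies in list order, and the derivation of the fixed 1000 as pa_train_h * sample_ratio_h are all assumptions.

```python
def calibration_plan(n_parcels, pa_min=30, pa_train_h=4000, pa_train_l=1333,
                     sample_ratio_h=0.25, sample_ratio_l=0.75, smote_target=1000):
    """Describe how a crop type with `n_parcels` eligible parcels is sampled.

    `n_parcels` counts only parcels that already meet the minimum S2 pixel
    requirement (--best-s2-pix). Strategy numbers are assumed to follow the
    order of the list above, with 0 meaning "not classified".
    """
    if n_parcels <= pa_min:
        # Too few parcels: the crop type is left out of the classification.
        return {"strategy": 0, "calibration": 0, "smote_to": None}
    if n_parcels > pa_train_h:
        # Large classes: a fixed fraction of the parcels.
        return {"strategy": 1,
                "calibration": round(n_parcels * sample_ratio_h),
                "smote_to": None}
    if n_parcels > pa_train_l:
        # Medium classes: a fixed number of parcels (assumed derivation).
        return {"strategy": 2,
                "calibration": round(pa_train_h * sample_ratio_h),
                "smote_to": None}
    # Small classes: a larger fraction, then SMOTE up to the target.
    return {"strategy": 3,
            "calibration": round(n_parcels * sample_ratio_l),
            "smote_to": smote_target}


if __name__ == "__main__":
    for n in (20, 400, 2000, 8000):
        print(n, calibration_plan(n))
```

For a test area with only a few thousand parcels, most crop types will fall into the last two cases or be dropped entirely, which is why lowering the thresholds can make sense for small datasets.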

Best regards,

Philippe

A last element: in the outputs you can also find a Crop_types_summary_%.csv file, which shows, for each crop type, the strategy that was applied (1, 2, 3 or 0 (not classified)).

Best regards,

Philippe

In my case there are no files named Parcels_all_with_predictions_.csv and Parcels_classified_with_predictions_.csv in the L4A data that is produced, see screenshot

I will run the L4A processor again using the updated script posted by @lnicola

OK, what you should see are the following outputs:

[image: screenshot of the expected output files]

Can you replace the crop_type.R script (in /usr/bin/) with the one from Laurentiu and run the crop type processor again?

Thank you,

Philippe

Ok, I ran the classifier twice. Both times the same result:

So no improvement I am afraid.

The S4CCropType processor gave the following error:

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: lattice
Loading required package: ggplot2
Loading required package: proto
Warning message:
no DISPLAY variable so Tk is not available
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ tibble 3.0.3 ✔ stringr 1.4.0
✔ tidyr 1.1.0 ✔ forcats 0.5.0
✔ purrr 0.3.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ purrr::lift() masks caret::lift()

Attaching package: ‘data.table’

The following object is masked from ‘package:purrr’:

    transpose

The following objects are masked from ‘package:dplyr’:

    between, first, last

Warning message:
NAs introduced by coercion
Warning message:
NAs introduced by coercion
Error: Wrong argument: choose out of (1,2,3,4,5,0,‘All’,‘12’,‘123’,‘1234’)
Execution halted

Hello,

It seems that the --lc argument is not set correctly in your command: by default it is 1234, and it can only take the values listed in your error message. Did you launch crop-type-wrapper.py manually? If so, can you check this?

Philippe

Hi Philippe,

I ran crop-type-mapper.py from the command line as well, and got the same error when it tried to run the crop_type.R script, see below. It seems the R script does not parse the --lc argument correctly.

[eouser@sen4cap-200901 ~]$ crop_type.R /mnt/archive/orchestrator_temp/s4c_l4a/1688/13408-product-formatter// /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-features.csv /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/optical-features.csv /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-temporal.csv /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/parcels.csv CTnumL4A 1234 Area_meters 3 1 30 10 4000 1333 0.25 0.75 Smote 1000 5 300 10 /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/lut.csv

Attaching package: ‘dplyr’

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: lattice
Loading required package: ggplot2
Loading required package: proto
── Attaching packages ─────────────────────────────────────── tidyverse 1.3.0 ──
✔ tibble 3.0.3 ✔ stringr 1.4.0
✔ tidyr 1.1.0 ✔ forcats 0.5.0
✔ purrr 0.3.4
── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag() masks stats::lag()
✖ purrr::lift() masks caret::lift()

Attaching package: ‘data.table’

The following object is masked from ‘package:purrr’:

    transpose

The following objects are masked from ‘package:dplyr’:

    between, first, last

 [1] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13408-product-formatter//"
 [2] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-features.csv"
 [3] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/optical-features.csv"
 [4] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-temporal.csv"
 [5] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/parcels.csv"
 [6] "CTnumL4A"
 [7] "1234"
 [8] "Area_meters"
 [9] "3"
[10] "1"
[11] "30"
[12] "10"
[13] "4000"
[14] "1333"
[15] "0.25"
[16] "0.75"
[17] "Smote"
[18] "1000"
[19] "5"
[20] "300"
[21] "10"
[22] "/mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/lut.csv"
Warning message:
NAs introduced by coercion
Warning message:
NAs introduced by coercion
Error: Wrong argument: choose out of (1,2,3,4,5,0,‘All’,‘12’,‘123’,‘1234’)
Execution halted

I did a bit more research on the crop_type.R script myself and noticed that the error occurs because of a missing InputOptRe variable, which is a table of extracted red-edge data. For some reason this data is not extracted in the extraction phase before the R script is run.

I set this variable to 0 in the command sent to R and ran the script again, and then I got results, see below. So if this OptRe issue is fixed, all should work OK.

This is the command as I sent it:

crop_type.R /mnt/archive/orchestrator_temp/s4c_l4a/1688/13408-product-formatter// /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-features.csv /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/optical-features.csv 0 /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/sar-temporal.csv /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/parcels.csv CTnumL4A 1234 Area_meters 3 1 30 10 4000 1333 0.25 0.75 Smote 1000 5 300 10 /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/lut.csv
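One possible reading of why the failing command produced the "Wrong argument" message is that the script reads its arguments purely by position: if it expects a red-edge features path in the fourth slot and none is supplied, every later value moves up by one. The sketch below only illustrates that shift. The slot index for the land-cover filter is an assumption inferred from the two commands in this thread, and the paths are shortened.

```python
# Values accepted for the land-cover filter, copied from the error message.
ALLOWED_LC = {"1", "2", "3", "4", "5", "0", "All", "12", "123", "1234"}

# Assumption: the patched script reads the land-cover filter from the 8th
# positional argument (index 7), i.e. after the red-edge features path.
LC_INDEX = 7


def lc_argument(args):
    """Return the value sitting in the slot assumed to hold the --lc filter."""
    return args[LC_INDEX] if len(args) > LC_INDEX else None


# Leading arguments of the failing command (paths shortened for readability).
failing = ["out/", "sar-features.csv", "optical-features.csv",
           "sar-temporal.csv", "parcels.csv", "CTnumL4A", "1234",
           "Area_meters", "3"]

# Same command with a placeholder 0 in the red-edge slot, as in the working call.
working = failing[:3] + ["0"] + failing[3:]

if __name__ == "__main__":
    for name, args in (("failing", failing), ("working", working)):
        value = lc_argument(args)
        status = "valid" if value in ALLOWED_LC else "rejected"
        print(name, repr(value), status)
```

Under this assumption the failing command hands Area_meters to the land-cover check, which would also be consistent with the "NAs introduced by coercion" warnings appearing when text values land in numeric slots.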

Hello,

Thanks for that. Indeed, I encountered the same problem on another machine. I will discuss it with the team. Meanwhile, you can check whether the /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/optical/ folder contains this file: optical-features-re.csv. If it does, you can move it to the /mnt/archive/orchestrator_temp/s4c_l4a/1688/13407-s4c-crop-type/features/ folder and pass it to the crop_type.R script, so that the red-edge features are also used in the classification.

Philippe

Hello,

Sorry for the delay. In fact, there was a mismatch in the patch shared by Laurentiu, which already integrates some improvements that we will make in the next version of Sen4CAP (to speed up the process). In version 1.2, you should use this patch (crop_type.zip (6.6 KB) ), because the red-edge features are calculated at the same time as the other optical features and are included in the same csv. You can check them in the /features/optical-features.csv file (b5, b6 and b7).

Can you replace the crop_type.R script (again, sorry) and do a test? It would be great if you could let us know the results.

Thank you!

Philippe

Hi Philippe,

Good news: I got a complete result with all the expected outputs when I used the script you sent.

The only thing I noticed is that I initially got an error saying the system could not find the ‘crop_type.R’ file, but I solved that with a dos2unix command. Apparently it was saved in a DOS environment.
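For anyone who runs into the same "file not found" message: a script saved with DOS (CRLF) line endings typically fails this way because the carriage return becomes part of the interpreter name on the first line. A small sketch for detecting and fixing this without dos2unix is given below; the function names are illustrative, and it is worth keeping a backup of the script before rewriting it.

```python
from pathlib import Path


def has_crlf(path):
    """True when the file contains DOS-style (CRLF) line endings."""
    return b"\r\n" in Path(path).read_bytes()


def to_unix_line_endings(path):
    """Rewrite the file in place with LF line endings.

    Returns True when the file was changed, False when it was already clean.
    """
    p = Path(path)
    data = p.read_bytes()
    converted = data.replace(b"\r\n", b"\n")
    if converted == data:
        return False
    p.write_bytes(converted)
    return True
```

Calling `has_crlf("/usr/bin/crop_type.R")` before running the processor shows whether the conversion is needed at all.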

Hello Sybrand,

Thank you for testing it and for finding the remaining DOS -> UNIX problem.

The final version of the patch, converted for use in a UNIX environment, can be found here: crop_type_unix.zip (6.6 KB).

With this patch, there should be no further problems obtaining all the outputs of the L4A crop type processor in Sen4CAP version 1.2.

Philippe

Hello Philippe,

I have the same issue: no predictions in the crop type results. I have already replaced the crop_type.R script and restarted the sen2agri services, but I still get empty prediction columns. I might have missed some details in this thread. Are there other changes I need to make, apart from replacing the R script, to get a proper result? By the way, I hope I replaced the right file: /usr/bin/crop_type.R.

regards,
Jonathan

@Jonathan

Did you run it as a custom job or from the command line? What I observed is that it went well when I ran the script from the terminal, but I still get 0 in the prediction column for some crops when I run it as a custom job.

Good luck.

Henry

Hello Henry,

I ran the job via custom jobs indeed. Thanks for your suggestion to run it from terminal. I will try that.

Jonathan

Hello everybody,

I started the L4A processor from the console. At the end of processing I get a warning (below) and the crop_type.csv result file is missing. Can you make sense of that warning message? Any other ideas?

Have a nice weekend everybody!

Warning message:
NAs introduced by coercion
`summarise()` ungrouping output (override with `.groups` argument)
Warning message:
NAs introduced by coercion
Error in `levels<-`(`*tmp*`, value = as.character(levels)) :
  factor level [23] is duplicated
Calls: factor
Execution halted
Exit code: 1

Content of the output path:

Accuracy_metrics_1106_1400.csv    Data_calibration_final_after_smote_1106_1355.csv
Confusion_matrix_1106_1400.csv    Data_calibration_final_before_smote_1106_1355.csv
Confusion_producer_1106_1400.csv  Data_validation_final_1106_1355.csv
Confusion_user_1106_1400.csv      Random_Forest_Model_1106_1400.rds
Crop_types_summary_1106_1355.csv

Hello Jonathan,

Did you try again after upgrading the system to the new version 1.3?

Philippe