My question is about the selection of the site extent. If I understand correctly, the L4A random forest classifies the time series of the different input values (S1 and S2 products). How important is it to split up sites according to differences in growing season across a region or country? Since L4A builds weekly time series, small differences might not matter, but at what point should I create a new site instead of keeping one large site? Do you have any experience with this from your tests with the pilot countries?
Is there a way of knowing which input had the biggest impact on the classification of certain crops? The classifier seems to distinguish correctly between grazed and mowed areas (80% accuracy for grazed areas). I mostly wonder which factor matters most here, given that the classifier works so well even though both classes are grasslands.
Hello Bastian,
Regarding the extent of the sites (or stratification) for the L4A crop type processor:
- Pilot countries experience: to give you an idea, we ran a single classification for the whole country in the Netherlands, Lithuania and the Czech Republic. For Spain (only Castilla y Leon), France and Italy, the regions to monitor were geographically separated (not contiguous), so we created one site per region. For Romania, we had only one site, but we created 6 strata based on the biogeographic regions and the differences in agricultural practices; we manually selected the parcels belonging to each stratum and manually launched separate classifications.
- It is currently not possible to upload a stratification to a site and have the processor perform separate classifications based on it. If you want to stratify and the regions you monitor are geographically separated (not contiguous), I would advise creating one site per region. If the regions are contiguous, I would advise creating a single site, so that the system does not preprocess the same data twice where it covers more than one region, and then manually running the different classifications with the script provided by the system.
- A small point about the Random Forest classification: when building the model, the classifier grows many decision trees, and different branches can lead to the same crop type. Small differences in agricultural practices within a crop type can therefore be handled by the classifier, as long as that crop type remains clearly different from the other crop types.
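To illustrate the manual per-stratum workflow from the second point, here is a minimal Python sketch. It is not the script provided by the system: the parcel fields (`id`, `stratum`, `crop`) and the `train_and_predict` callback are hypothetical names, and the stub classifier only stands in for a real L4A run on each subset of parcels.

```python
from collections import Counter, defaultdict


def split_by_stratum(parcels, stratum_key="stratum"):
    """Group parcel records by their stratum label."""
    groups = defaultdict(list)
    for parcel in parcels:
        groups[parcel[stratum_key]].append(parcel)
    return dict(groups)


def classify_per_stratum(parcels, train_and_predict, stratum_key="stratum"):
    """Run one independent classification per stratum and merge the results."""
    results = {}
    groups = split_by_stratum(parcels, stratum_key)
    for stratum, members in sorted(groups.items()):
        # train_and_predict is a placeholder for whatever launches the
        # classification on this subset of parcels (assumption).
        predictions = train_and_predict(members)
        for parcel, label in zip(members, predictions):
            results[parcel["id"]] = (stratum, label)
    return results


def majority_stub(members):
    """Stand-in classifier: predicts the most common declared crop."""
    counts = Counter(p["crop"] for p in members)
    top_crop = counts.most_common(1)[0][0]
    return [top_crop] * len(members)


# Toy parcels; in practice these would come from the declaration dataset.
parcels = [
    {"id": 1, "stratum": "north", "crop": "barley"},
    {"id": 2, "stratum": "north", "crop": "barley"},
    {"id": 3, "stratum": "north", "crop": "grass"},
    {"id": 4, "stratum": "south", "crop": "wheat"},
    {"id": 5, "stratum": "south", "crop": "wheat"},
]

results = classify_per_stratum(parcels, majority_stub)
```

The point of the design is that preprocessing happens once for the whole site, while training and prediction are repeated on each subset of parcels.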
As for knowing which input had the biggest impact on the classification of certain crops: yes, that would be very interesting. To my knowledge there is currently no way to do this, but I will investigate.
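Outside the system, one generic option is permutation importance: shuffle one input feature at a time and measure how much the accuracy drops. The sketch below is model-agnostic and is not a Sen4CAP feature; it assumes you can export per-parcel feature rows and call some `predict` function on them. The feature values and the threshold rule are made up for illustration.

```python
import random


def accuracy(predict, rows, labels):
    """Fraction of rows for which predict(row) matches the label."""
    hits = sum(1 for row, label in zip(rows, labels) if predict(row) == label)
    return hits / len(rows)


def permutation_importance(predict, rows, labels, n_repeats=10, seed=0):
    """Mean accuracy drop per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, rows, labels)
    importances = []
    for j in range(len(rows[0])):
        drops = []
        for _ in range(n_repeats):
            column = [row[j] for row in rows]
            rng.shuffle(column)
            # Rebuild the rows with only feature j permuted across parcels.
            shuffled = [
                row[:j] + [value] + row[j + 1:]
                for row, value in zip(rows, column)
            ]
            drops.append(baseline - accuracy(predict, shuffled, labels))
        importances.append(sum(drops) / n_repeats)
    return importances


# Toy data: feature 0 separates the classes, feature 1 is unrelated noise.
rows = [
    [0.1, 5.0], [0.2, 1.0], [0.3, 4.0], [0.4, 2.0],
    [0.6, 5.0], [0.7, 1.0], [0.8, 4.0], [0.9, 2.0],
]
labels = ["grazed"] * 4 + ["mowed"] * 4


def predict(row):
    """Stand-in for a trained model: a threshold on feature 0 only."""
    return "grazed" if row[0] < 0.5 else "mowed"


importances = permutation_importance(predict, rows, labels)
```

Here the second feature scores exactly zero because the stand-in model never reads it. With a real model, computing the accuracy drop on the grazed and mowed parcels only would show which inputs matter for that particular distinction.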
As for the difference between grazed and mowed areas: we never had this information before (whether a parcel is grazed or mowed), so we could never test it. We are therefore very interested in your results, if you can share them in one way or another. I am happily surprised that the classifier performs well at distinguishing these two agricultural practices. As for what might explain this, I will discuss it with colleagues who are more specialized in the topic and come back with an answer.
Thank you for the answer, Philippe! It is interesting to know the different approaches taken in the pilots. I think that in the case of Sweden we need to consider the biogeographical regions. Maybe we will create one large site, so that we only need to do the preprocessing once, and then just change the application file to run classifications for the different regions.
I am sure I can share some of the findings but I would like to do some more testing and especially compare the classification results with the L4B module to get an idea of the activity on these parcels. It would be great to understand a bit better what kind of factors are important, so that we can get a better understanding of grazed areas.