Deriving a per-field land use and land cover map in an agricultural mosaic catchment

Detailed data on land use and land cover constitute important information for Earth system models, environmental monitoring and ecosystem services research. Global land cover products are evolving rapidly; however, there is still a lack of information particularly for heterogeneous agricultural landscapes. We censused land use and land cover field by field in the agricultural mosaic catchment Haean in South Korea. We recorded the land cover types with additional information on agricultural practice. In this paper we introduce the data, their collection and the post-processing protocol. Furthermore, because it is important to quantitatively evaluate available land use and land cover products, we compared our data with the MODIS Land Cover Type product (MCD12Q1). During the studied period, a large portion of dry fields was converted to perennial crops. Compared to our data, the forested area was underrepresented and the agricultural area overrepresented in MCD12Q1. In addition, linear landscape elements such as waterbodies were missing in the MODIS product due to its coarse spatial resolution. The data presented here can be useful for earth science and ecosystem services research. The data are available at the public repository Pangaea (doi:10.1594/PANGAEA.823677). Published by Copernicus Publications.


Introduction
Agricultural land use affects ecosystem services such as the provisioning of drinking water or the control of soil erosion. Inappropriate agricultural practice can lead to serious soil loss and to pollution of surface water and groundwater by agrochemicals. Detailed data on land use and land cover (LULC) in an agricultural landscape constitute basic information for environmental monitoring and pollution control (Conrad et al., 2010; Potgieter et al., 2007; Pittman et al., 2010).
In general, precise information on land cover is required for running Earth system models (Ottlé et al., 2013), as land use changes directly affect numerous climate [...] widely used land cover databases such as GlobCover or Moderate Resolution Imaging Spectroradiometer (MODIS) land cover have only a few crop-related classes (Loveland et al., 2000; Bontemps et al., 2011; US Geological Survey, 2012). Especially for heterogeneous arable zones (e.g., irrigated fields studied by Conrad et al., 2010), land cover products based on remote sensing are underdeveloped (Colditz et al., 2011).
Furthermore, the spatial resolution of LULC data is often restricted. This limitation is particularly pronounced in heterogeneous landscapes such as mixed farming areas due to the complex mosaic of crop/non-crop land use and land cover types (Schulp and Alkemade, 2011). Unlike a homogeneous landscape (e.g. a plantation farm), this type of agricultural mosaic needs a comprehensive number of LULC classes in a relatively small area. Therefore, a spatial resolution of up to several hundred meters might be too coarse for this type of landscape. Longitudinal land cover data are also an important element when agricultural land use changes rapidly. The MODIS Land Cover Type (MCD12Q1) is the only product that provides annual information. It has been widely used for analysing land cover changes (Loveland et al., 2000).
As a consequence, for an agricultural mosaic landscape with a frequently changing land use, the only way to obtain detailed land cover information is surveying the study area.

In our study we address some of the above-mentioned problems and provide thematically and spatially rich land use and land cover data. We censused a small agroecosystem with a complex agricultural land use and the data is now available at the public repository Pangaea (Seo et al., 2013). In this paper we introduce the data, its collection and post-processing protocol.

Material and methods

Study area
The study area, the Haean catchment, is located at the border between North and South Korea (128°1′33.101″ E, 38°28′6.231″ N). It is a small agricultural catchment (64.4 km²) with rice paddies, annual and perennial dry fields and orchard farms. Approximately 1200 inhabitants live in Haean, mostly commercial farmers running their own small farms in the catchment. Agricultural fields in this area are typically smaller than 40 ha and agricultural practice is intensive in terms of fertilisation and tillage.
The altitudes in the Haean catchment range from approximately 500 m to 1200 m. Due to its characteristic bowl shape, the land use changes from predominantly rice paddies at the valley bottom to dry field farming on moderate slopes. The higher altitudes are covered by deciduous and mixed forests.
The average annual air temperature is approximately 8 °C and the mean annual precipitation ranges from 1200 mm to 1300 mm, with more than 60 % of rainfall occurring during the summer monsoon between June and August (Korean Meteorological Administration, http://web.kma.go.kr/eng). Between 1999 and 2010 the maximum daily rainfall during summer reached up to 223 mm. This area has been studied intensively as it shows a typical conflict between agriculture and environmental protection (Martin et al., 2013; Poppenborg and Koellner, 2013; Thanh Nguyen et al., 2012; Zhao and Lüers, 2012; [...] al., 2013b; Meusburger et al., 2013). Concerning this conflict, the local government pursued different policy measures such as subsidising perennial crops, which caused rapid LULC changes.

Preparation of data collection
Prior to the field campaign, we collected pre-existing information to create an initial "base map". It served as a field template and was particularly useful for finding access to isolated patches. We used a SPOTMaps image (Astrium Services, http://www.astrium-geo.com), a mosaic of multiple SPOT 5 images, with a ground resolution of 2.5 m. Furthermore, we worked with aerial photographs and a land cover map from the Korean Ministry of Environment (KME) (http://egis.me.go.kr) to complement the SPOTMaps. From the KME land cover map, we extracted vector-based linear elements such as road and stream networks. An additional land use map by the Research Institute For Gangwon (http://gdri.re.kr) from 2007 provided information on previously surveyed agricultural land use. The data sources are summarised in Table 1.
The images selected for the base map were only moderately well georeferenced. The SPOTMaps image, for example, had an approximate location error of 10-15 m according to the specification, and the other spatial data also revealed a substantial location error. Therefore, we georeferenced them again using 14 ground control points (GCPs) distributed over the entire catchment. They were established along linear elements like roads and defined by the GPS coordinates averaged over several measurements. After georeferencing by a first-order polynomial (affine) transformation, the horizontal root-mean-squared error (RMSE) of the final base map image was 9.62 m.
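The georeferencing step can be illustrated with a small least-squares sketch. It fits a first-order polynomial (affine) transformation to ground control points and reports the horizontal RMSE. The function names and the control-point coordinates below are hypothetical, and the GIS software actually used for this step is not stated in the text, so this is only a minimal illustration of the technique.

```python
import math

def solve3(a, b):
    """Solve the 3x3 system a*x = b by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = (m[r][3] - tail) / m[r][r]
    return x

def fit_affine(src, dst):
    """Least-squares fit of X = a0 + a1*x + a2*y and Y = b0 + b1*x + b2*y."""
    rows = [(1.0, x, y) for x, y in src]
    # Normal equations (A^T A) p = A^T t, shared by both output coordinates.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]

    def solve_for(targets):
        atb = [sum(r[i] * t for r, t in zip(rows, targets)) for i in range(3)]
        return solve3(ata, atb)

    return solve_for([X for X, _ in dst]), solve_for([Y for _, Y in dst])

def apply_affine(params, point):
    """Transform one (x, y) point with the fitted coefficients."""
    (a0, a1, a2), (b0, b1, b2) = params
    x, y = point
    return a0 + a1 * x + a2 * y, b0 + b1 * x + b2 * y

def horizontal_rmse(params, src, dst):
    """Root-mean-squared horizontal distance between transformed and target points."""
    squared = []
    for point, (tx, ty) in zip(src, dst):
        px, py = apply_affine(params, point)
        squared.append((px - tx) ** 2 + (py - ty) ** 2)
    return math.sqrt(sum(squared) / len(squared))

# Hypothetical, noise-free control points: image coordinates and map coordinates.
image_xy = [(0, 0), (100, 0), (0, 100), (100, 100), (50, 20)]
map_xy = [(1000 + 2.5 * x + 0.1 * y, 5000 - 0.1 * x - 2.5 * y) for x, y in image_xy]
params = fit_affine(image_xy, map_xy)
rmse = horizontal_rmse(params, image_xy, map_xy)  # close to zero for noise-free points
```

With real GCPs the residuals do not vanish, and the RMSE summarises the remaining horizontal misfit in map units.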

Data collection
The main goal of the data collection campaign was to survey LULC information in the entire landscape. The term "census" is adopted here in contrast to the term "sampling" because we recorded LULC information from the whole study area and not from a subset of land parcels. Accordingly, we mapped the complete set of land parcels and documented land cover type together with additional information on data quality and spatial and temporal mixture of land use types (e.g. double dry field cropping per year or mixed dry fields). In contrast to 2009 and 2010, we were only able to map the northern half of the catchment in 2011 due to time and budget limitations. Therefore, we did not consider the 2011 data when calculating descriptive statistics or analysing land use change and only compared the years 2009 and 2010.
We classified landscape elements into two categories, namely patches and linear elements. The former included agricultural and non-agricultural fields, forest, water bodies and all other areal land cover types best represented by a polygon. In general, we visited patches once per year. However, patches with a spatially or temporally mixed land use type were inspected multiple times. Linear elements comprised roads, the stream network, field edges or any other element that can be represented by a polyline. They were investigated during the whole project period from 2009 to 2011 because of their large extent and relative temporal stability.
To record a landscape element we marked vertices and edges for each spatial entity as Global Positioning System (GPS) waypoints (WPs) and tracks. The WP IDs were written on the printed base map and the corresponding information in the field data book. GPS tracks were continuously stored in the device as we moved around and gave us complementary data for drawing polygon edges and polylines.
We used several GPS devices (Garmin CSX60, Garmin Colorado 300 and Garmin eTrex 30) simultaneously to retrieve location information. The use of multiple devices as a back-up secured the data against sudden power loss. For devices capable of loading custom maps, we loaded the base map in order to simultaneously review newly recorded WPs.

Digitising the field records
We digitised the field records and converted patches and linear elements into polygons and polylines, respectively. The base map served as a background and complemented the field records. Additionally, we stored LULC type and other descriptive information in an attribute table. Any quality issue that occurred was marked in the Quality Assurance (QA) column of the corresponding year. The digitised data were cross-checked with GCPs recorded later during the census period and corrected when needed.

Gap-filling
After digitising the field records, some gaps remained between polygons. They occurred mostly around patches that were irregularly shaped and therefore difficult to map. We filled these gaps using the KME land cover map (Table 1) and our own data on linear elements.
First, we added the main road and stream networks extracted from the KME land cover map. Subsequently, we created two major linear elements, namely semi-natural field edges and a stream network, from our own GPS track data. For this purpose, we converted the GPS tracks of field edges and non-paved agricultural pathways, which were initially polylines, into polygons by creating 6 m wide buffers encompassing the tracks. Similarly, based on the GPS tracks recorded along small streams, we created stream network buffers of 5 m width and assigned them to the existing "inland water" polygons.
Finally, we used the KME land cover map to fill the remaining gaps. Forest areas that were inaccessible due to military restrictions were the major part of the transferred land cover information.
We updated the QA information during the gap-filling procedure. If only original ob- [...] to filter out transferred land cover information. Because the LULC data were recorded yearly, the gaps differed from year to year. Therefore, we filled them separately for each year.
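The conversion of GPS tracks into buffer polygons can be sketched for the simplest case of one straight track segment. The sketch assumes projected coordinates in metres and uses a flat-capped rectangle; a GIS buffer tool would additionally round the ends and merge the buffers of consecutive segments. The function names are hypothetical.

```python
import math

def buffer_segment(p, q, width):
    """Return the four corners of a flat-capped buffer of total `width` around segment p-q."""
    (x1, y1), (x2, y2) = p, q
    length = math.hypot(x2 - x1, y2 - y1)
    if length == 0:
        raise ValueError("segment has zero length")
    half = width / 2.0
    # Unit normal to the segment, scaled to half the buffer width.
    nx = -(y2 - y1) / length * half
    ny = (x2 - x1) / length * half
    return [(x1 + nx, y1 + ny), (x2 + nx, y2 + ny),
            (x2 - nx, y2 - ny), (x1 - nx, y1 - ny)]

def polygon_area(vertices):
    """Planar polygon area by the shoelace formula."""
    total = 0.0
    for (xa, ya), (xb, yb) in zip(vertices, vertices[1:] + vertices[:1]):
        total += xa * yb - xb * ya
    return abs(total) / 2.0

# A hypothetical 100 m stretch of field edge buffered to 6 m width.
field_edge = buffer_segment((0.0, 0.0), (100.0, 0.0), 6.0)
field_edge_area = polygon_area(field_edge)  # length times width, in square metres
```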

Definition of LULC classes
We defined a LULC classification scheme with 67 land use/land cover classes to adequately represent the agricultural mosaic in the catchment. If several LULC types coexisted in one polygon, we assigned it to the LULC type with the largest portion and recorded mixture information in the attribute table. This scheme incorporates a large number of regional crop types as well as several natural and semi-natural land cover classes found in the area. In the following we call this detailed classification scheme S1.
For vegetative classes, we additionally recorded information on life form, life cycle and crop type following the Land Cover Classification System (LCCS) developed by FAO (Food and Agriculture Organization of the United Nations) (Di Gregorio, 2005). We categorised the life cycle of a class as "Perennial", "Annual" or their mixture "Annual/Perennial" based on the life cycle of the plant species and the local cultivation practice. In other words, if a perennial crop was harvested after one growing season we classified it as "Annual". We distinguished the life forms "Woody", "Herbaceous" and "Lichens/Mosses" or a combination of them. Crop type patches were further subdivided into 12 different crop types (Supplement Table S1). We assigned mixed crop type values for patches where various crop/non-crop vegetation coexisted.
In addition to the S1 scheme containing 67 classes, we reclassified the LULC information into three simpler schemes. First, we generated a locally optimised scheme with 10 classes (called S2 in the following) that reflects the edaphic and socio-economic conditions in the study area. It consists of the classes "Barren", "Dry field", "Forest", "Greenhouse", "Inland water", "Orchard field", "Paddy field", "Semi natural" and "Urban". Then, based on the FAO-LCCS, we regrouped the S1 classes into eight major types (Supplement Table S2). Finally, we classified our data according to the International Geosphere-Biosphere Programme (IGBP) Discover land cover system, which contains 17 classes, two of which are crop classes (Loveland et al., 2000; Loveland and Belward, 2010; Friedl et al., 2010). Thus, the schemes S1, S2, FAO-LCCS and IGBP differ in the total number of classes and the number of crop classes (Table 2).
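The rule for polygons with several coexisting LULC types can be sketched as follows. The function name, the input format (a mapping from LULC type to its share of the polygon) and the example shares are hypothetical; how ties were resolved in the survey is not described, so the sketch simply keeps the first type with the largest share.

```python
def dominant_lulc(portions):
    """Return (dominant type, other coexisting types) for one polygon.

    `portions` maps each LULC type found in the polygon to its share of the area.
    """
    if not portions:
        raise ValueError("polygon has no recorded LULC type")
    dominant = max(portions, key=portions.get)
    # The remaining types are kept as mixture information for the attribute table.
    mixture = sorted(t for t in portions if t != dominant)
    return dominant, mixture

# Hypothetical mixed dry field: 60 % potato, 40 % soybean.
label, mixture = dominant_lulc({"Potato": 0.6, "Soybean": 0.4})
```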

These reclassified LULC data can be used together with global products such as MODIS Land Cover Type (MCD12Q1) or GlobCover that follow the IGBP and FAO-LCCS schemes, respectively. The reclassification was done in GNU R v3.0.2 (R Core Team, 2013) and we provide the R code in the online Supplement.
We extracted the Land Cover Type 1 (IGBP) layer from the MODIS Land Cover Type product (MCD12Q1) and compared it to our LULC survey for the target period. We chose the MODIS product among the globally available land cover data, as it is the only one provided annually. Note that some of the perennial crops were reclassified as non-crop types (forest or shrub) to be consistent with the IGBP system (e.g., "Orchard" coded as "Open shrub").
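The regrouping of S1 classes into coarser schemes amounts to a lookup table. The sketch below is written in Python for illustration and is not the R code from the Supplement. Only the "Orchard" to "Open shrub" and the rice paddy to "Croplands" assignments are taken from the text; the remaining entries and the function name are assumptions chosen to show the mechanism.

```python
# Illustrative lookup table: S1 class -> class in each coarser scheme.
# Entries other than Orchard/IGBP and Rice paddy/IGBP are assumed for the example.
RECLASS = {
    "Rice paddy": {"S2": "Paddy field", "IGBP": "Croplands"},
    "Potato": {"S2": "Dry field", "IGBP": "Croplands"},
    "Orchard": {"S2": "Orchard field", "IGBP": "Open shrub"},
    "Deciduous forest": {"S2": "Forest", "IGBP": "Deciduous Broadleaf Forests"},
}

def reclassify(s1_class, scheme):
    """Translate one S1 class into the requested scheme, failing loudly on gaps."""
    try:
        return RECLASS[s1_class][scheme]
    except KeyError:
        raise KeyError(f"no {scheme} mapping for {s1_class!r}") from None

# Reclassify a short list of hypothetical polygon labels into the IGBP scheme.
igbp_labels = [reclassify(c, "IGBP") for c in ["Rice paddy", "Orchard", "Potato"]]
```

Raising an error for unmapped classes keeps silent gaps out of the reclassified map, which matters when the source scheme has as many as 67 classes.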

Local classification scheme S1
The field survey resulted in a vector GIS dataset with 67 LULC classes (S1). Overall, the study site can be characterised as an extremely heterogeneous agricultural landscape with a large number of LULC types in its central part (Fig. 1; proportions in Supplement Table S3). We provide more details about the LULC types in the meta information of the dataset (Supplement).
"Deciduous forest" on the steep hill slopes was stable during the studied period. It occupied more than half of the study area and was therefore the most dominant type (55.6 %, two-year average). The moderate slopes from the forest edges to the flat centre were dominated by dry field farms, which occupied 16.3 % (two-year average) of the total catchment. The major dry field crops among the 42 we recorded were soybean, ginseng, potato, radish, European and Chinese cabbages and maize. Rice paddies (8.3 %, two-year average) were prevalent in the central part and surrounded the small urban core (0.86 %, two-year average).
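Area proportions and two-year averages of the kind reported above can be derived from the polygon attribute table as sketched below. The function names and the polygon areas are hypothetical; the only figures taken from the text are the "Ginseng" shares of 1.26 % (2009) and 2.48 % (2010) reported in the following section.

```python
from collections import defaultdict

def class_proportions(polygons):
    """Percentage of total area per LULC class from (class, area) pairs."""
    totals = defaultdict(float)
    for lulc_class, area in polygons:
        totals[lulc_class] += area
    total_area = sum(totals.values())
    if total_area <= 0:
        raise ValueError("total area must be positive")
    return {c: 100.0 * a / total_area for c, a in totals.items()}

def two_year_average(first, second):
    """Average the percentages of two years; a class missing in one year counts as 0 %."""
    classes = set(first) | set(second)
    return {c: (first.get(c, 0.0) + second.get(c, 0.0)) / 2.0 for c in classes}

# Hypothetical polygon areas (any consistent unit).
shares = class_proportions([("Forest", 60.0), ("Dry field", 30.0), ("Forest", 10.0)])
# Two-year average for the ginseng shares given in the text.
ginseng_mean = two_year_average({"Ginseng": 1.26}, {"Ginseng": 2.48})["Ginseng"]
```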

Major changes in land use
During the study period, dry fields and rice farming decreased and orchards and ginseng cultivation increased (Supplement Table S3 and Table 6). In fact, "Ginseng" almost doubled from 2009 to 2010 (1.26 % to 2.48 %). This is consistent with the rapid ginseng expansion reported by Jun and Kang (2010), who suggested replacing annual dry crops with perennial crops to stabilise soils and thus prevent erosion. An expected reduction of soil erosion due to this land use change has been discussed elsewhere (Kettering et al., 2012; Arnhold et al., 2013; Ruidisch et al., 2013; Shope et al., 2013a).
Additionally, fallow fields increased in 2010 (4.8 %) compared to 2009 (1.9 %) and replaced a large number of dry fields. We attribute these changes partially to the subsidy for fallow fields and partially to corporate regulations requiring at least three years of fallow or organic farming before ginseng farming could start.
Compared to the patches, linear elements such as "Semi natural" (6.0 %), "Transportation" (0.78 %) and "Inland water" (0.32 %) were low in proportion in 2009 and 2010. Nevertheless, they covered the whole catchment (Fig. 1).
Field-level land use change was more pronounced than the change in proportions because of crop rotation, which is common for the annual crops in the region. The annual crops are rarely cultivated in successive years and the dry field crops commonly follow a three-year portfolio (e.g. potato-cabbage-soybean). This pattern can be seen from the displacement of colours (LULC types) between 2009 and 2010 in Fig. 1, most distinctively in the northern part of the arable zone. However, this displacement does not clearly appear in the proportions.
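Why crop rotation changes individual fields without changing the class proportions can be shown with a toy example. The three fields, their equal areas and the function names are hypothetical; the rotation sequence follows the potato-cabbage-soybean example in the text.

```python
def class_shares(lulc, areas):
    """Share of total area per class for one year (field id -> class)."""
    total = sum(areas.values())
    shares = {}
    for field, lulc_class in lulc.items():
        shares[lulc_class] = shares.get(lulc_class, 0.0) + areas[field] / total
    return shares

def changed_share(before, after, areas):
    """Area-weighted share of fields whose class differs between two years."""
    total = sum(areas[f] for f in before)
    changed = sum(areas[f] for f in before if after.get(f) != before[f])
    return changed / total

# Three hypothetical fields of equal size under a three-year rotation.
areas = {"f1": 1.0, "f2": 1.0, "f3": 1.0}
year_a = {"f1": "Potato", "f2": "Cabbage", "f3": "Soybean"}
year_b = {"f1": "Cabbage", "f2": "Soybean", "f3": "Potato"}

same_proportions = class_shares(year_a, areas) == class_shares(year_b, areas)
field_level_change = changed_share(year_a, year_b, areas)
```

In this constructed case every field changes its crop while each crop still covers one third of the area, which is the effect that a comparison of proportions alone cannot reveal.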

Life form and life cycle
For vegetated patches, "Herbaceous" vegetation dominated the central agricultural area in contrast to the surrounding forest, which was entirely "Woody" (Fig. 2). "Lichens/Mosses" type vegetation was not recorded. The life form did not change over the studied period (Table 3), possibly because land use changes mainly occurred within the "Herbaceous" category (i.e. in the agricultural area).
The distribution of life cycles changed from 2009 to 2010 (Table 4). "Annual" type vegetation dropped from 19.87 % to 17.45 % due to decreasing rice paddies and dry fields. In contrast, natural "Perennial" vegetation expanded over a larger area (61.53 % in 2009 to 62.78 % in 2010). These changes are clearly visible in the mid-western part of the area (Fig. 3) and are probably due to the governmental policy of replacing dry fields by perennial crops such as ginseng and orchards.

Crop types
In the Haean catchment, six of the 12 FAO-LCCS crop types were present: "Cereals and Pseudocereals", "Roots and Tubers", "Pulses and Vegetables", "Fruits and Nuts", "Fodder Crops", and "Industrial Crops" (Supplement Table S1 [...] of them if multiple crop types were identified on the same patch. Occasionally, "Mixed crops" was assigned when the combination was not precisely recorded.
For some crops, the most suitable type was difficult to find. Indeed, the LCCS manual classifies "Soybean" as an industrial crop, while in the region it is often used as a vegetable because the green part is popular in the local cuisine. "Wild Sesame" is another example of a crop with multiple purposes, namely "Pulses and Vegetables" and "Industrial Crops". In our study we defined "Soybean" and "Wild Sesame" as "Industrial Crops".
The three years of crop type information are shown in Fig. 4 and summarised in Table 5. "Cereals and Pseudocereals" and "Roots and Tubers" diminished as "Rice paddy", "White radish" and "Potato" cultivation decreased. In contrast, "Fruits and Nuts" and "Industrial Crops" increased because the orchards and a few other industrial crops such as "Ginseng" expanded due to the governmental promotion of perennial crops. Additionally, "Non-crop Vegetation" rose from 2009 to 2010 (69.1 % to 72.0 %) as a consequence of an increased number of fallow fields in preparation for future ginseng farming.

Classification schemes S2 and FAO-LCCS
The coarser classification scheme S2 summarises the main changes in land use in the study area (Fig. 5 and Table 6). "Dry field" dropped from 17.83 % in 2009 to 14.83 % in 2010 and the "Semi natural" type increased from 11.35 % to 14.18 %. We attribute the latter change to the spread of fallow fields.
The three most important FAO-LCCS types, namely "Natural and Semi-Natural Terrestrial Vegetation", "Cultivated and Managed Terrestrial Area" and "Cultivated Aquatic or Regularly Flooded Areas", covered 97.2 % (two-year average) of the total area. "Natural and Semi-Natural Terrestrial Vegetation" prevailed (70.6 %, two-year average) and increased from 2009 to 2010 (Table 7). In contrast, "Cultivated and Managed Terrestrial Area" and "Cultivated Aquatic or Regularly Flooded Areas" decreased, probably due to reduced dry field and rice farming, respectively.
When applying the FAO-LCCS scheme to our data, the classification of "Rice paddy" was challenging. In Haean, rice is sometimes irrigated with water from deep wells. However, although the "Cultivated Aquatic or Regularly Flooded Areas" class excludes irrigated cultivated areas (Di Gregorio, 2005), we assigned rice to this type as it is mostly rainfed.

IGBP classification scheme and comparison with MODIS
The upper row of Fig. 7 shows our study site classified according to the IGBP 17-class system and the lower row compares our observations to the MODIS Land Cover Type (MCD12Q1). The survey data show a general agreement with the MODIS product for "Croplands" and "Grasslands". In contrast, the area of "Deciduous Broadleaf Forests" is substantially greater and that of "Mixed Forests" lower in our survey data (Table 8). This suggests that forest classes differ between the two datasets. One possible explanation is that limited access to forested areas caused inaccuracies in our data. However, it is also possible that MODIS is less accurate due to its coarser resolution (500 m). Indeed, "Water Bodies" and "Urban and Built-Up Lands", for example, were not detected by MODIS, presumably because of the coarse spatial resolution.
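The effect of coarse resolution on narrow landscape elements can be illustrated by aggregating a fine class grid to larger cells with a majority rule. This is only a toy illustration under that assumption: the MODIS product is derived from satellite observations with its own classification algorithm, not by resampling a fine map. The grid, the class codes and the function name are hypothetical.

```python
from collections import Counter

def aggregate_majority(grid, factor):
    """Aggregate a 2-D class grid into blocks of factor x factor cells by majority class."""
    rows, cols = len(grid), len(grid[0])
    coarse = []
    for r0 in range(0, rows, factor):
        coarse_row = []
        for c0 in range(0, cols, factor):
            block = [grid[r][c]
                     for r in range(r0, min(r0 + factor, rows))
                     for c in range(c0, min(c0 + factor, cols))]
            coarse_row.append(Counter(block).most_common(1)[0][0])
        coarse.append(coarse_row)
    return coarse

# Fine grid: cropland ("C") crossed by a one-cell-wide stream ("W").
fine = [list("CWCC") for _ in range(4)]
coarse = aggregate_majority(fine, 4)
stream_survives = any("W" in row for row in coarse)
```

The stream occupies a quarter of the fine cells but never forms the majority of a coarse cell, so it disappears from the aggregated grid.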
We reclassified rice paddies as "Croplands" unlike in S2, which distinguishes "Paddy field" from the other agricultural types. Note that in Haean, as well as in South Korea in general, it may be important to distinguish (paddy) rice fields from the general cultivated areas due to their edaphic and socio-economic implications.
We provide an annual per-field land use and land cover data set for the agricultural mosaic catchment Haean (South Korea). During the study period many dry fields were converted to perennial crops such as ginseng and orchards, probably due to governmental policy measures. The comparison between our survey data and the MODIS land cover for the target period revealed that the limitation of MODIS land cover in identifying irrigated fields could be a substantial source of error. Linear elements such as water bodies were not identified in the remote sensing product due to its coarse spatial resolution. Due to its detailed information, our LULC data set could be used for environmental modelling as well as for ecosystem services research and decision-making analysis.

Table 2 (scheme; description; number of classes; crop classes):
S2; locally defined grouping; 10; "Dry field", "Paddy field" and "Orchard field"
FAO-LCCS; FAO-LCCS major land cover classes; 8; "Cultivated Terrestrial" and "Cultivated Aquatic"
IGBP; IGBP Discover system; 17; "Croplands" and "Cropland/Natural Vegetation Mosaics"

Supplement Table S2. These classes are defined by the stratified structure with three dichotomous levels: presence of vegetation, edaphic condition, and artificiality of cover.