1 Introduction

In this module, we will discuss the following concepts:

  1. How to generate a presence and absence dataset using high resolution imagery.
  2. How to generate randomly distributed points within a feature class layer to use as field sampling locations.
  3. How to filter your points based on environmental parameters to refine your sampling locations.


A Rocky Mountain Elk near an aspen stand in Washington State. Credit: U.S. Fish and Wildlife Service.

2 Background

It is well documented that herbivore grazing, largely by elk, can have a negative effect on aspen regeneration rates. Because aspens tend to grow in large monotypic stands, the regeneration rates in these stands can define the composition of the understory. Excluding elk, deer, and cattle grazing from an area has observable effects on aspen regrowth, but limited work has been done to understand how the presence of an aspen understory affects the overall biodiversity of an area, from primary producers to large mammals. Within this module we are going to use multiple datasets and one meter resolution imagery to develop sampling locations for a theoretical field survey. We will also build a presence/absence dataset that could be used to train a regionally specific model of aspen coverage. The process of creating such a model can be found in Module 7.

2.1 About the Data

The National Land Cover Database (NLCD) is a 30 m Landsat-derived land cover database available for four time periods (1992, 2001, 2006, and 2011). Data from 1992 are primarily based on an unsupervised classification of Landsat data. The other years rely on a decision-tree classification to identify the land cover classes.

The National Agriculture Imagery Program (NAIP) acquires aerial imagery during the agricultural growing seasons across the continental United States. NAIP projects are contracted each year based upon available funding and the Farm Service Agency imagery acquisition cycle. Beginning in 2003, NAIP was acquired on a 5-year cycle in most states. 2008 was a transition year, and a three-year cycle began in 2009. NAIP imagery is acquired at a one meter ground sample distance with a horizontal accuracy of within six meters of photo-identifiable ground control points used during image inspection. Depending on the location and time frame, NAIP imagery is often collected in four bands: blue, green, red, and near infrared. As we have seen in some of the previous modules, the near infrared band is helpful in distinguishing between different types of vegetation.

The USGS National Elevation Dataset 1/3 arc-second (NED) is an elevation dataset produced by the USGS. A seamless layer is available at 1/3 arc-second (~10 m cell size) spatial resolution across the lower 48 states, parts of Alaska, Hawaii, and U.S. Territories.



3 Developing Your Own Sampling Locations

We will start by developing our own potential field sampling locations based on the relative physical and ecological conditions.

3.1 Region of Interest

The geographic region for this module is the Grand Mesa in western Colorado. Considered to be the largest mesa in the world by area, the Grand Mesa rises from about 4,000 feet near Grand Junction, Colorado, to over 10,000 feet at its highest point. Historically, the Grand Mesa was glaciated, leaving the surface spotted with lakes. While heavily forested at higher elevations, there is a distinct transition between shrubland, aspen, and conifer forest along the slopes following a steep elevational gradient. As one of the westernmost extensions of the montane environment in Colorado, the Grand Mesa represents important habitat for many ungulate species.

Our first step is to open a new script in GEE. Create a region of interest that encompasses the Grand Mesa (you can search for it by name in the search bar at the top) using the geometry tool. Once you create the feature, rename it roi.



An example of the general region of interest to use for this module.

3.2 Loading NAIP

We will rely on NAIP imagery for multiple steps in this process. NAIP imagery is not collected every year, so it makes sense to load multiple years and determine what time frames are available. We will use a print statement to see which years are available for our area. In Module 1 we used NAIP imagery from 2011; now we will look at imagery from 2015 and 2017. Copy the following code to start your script.

// Call in NAIP imagery as an image collection.
var NAIP = ee.ImageCollection("USDA/NAIP/DOQQ")
  .filterBounds(roi)
  .filterDate("2015-01-01", "2017-12-31");
print(NAIP);


A print statement identifying the years of available imagery for this region of interest.

The print statement shows us that imagery is available for both 2015 and 2017, yet it would be difficult to determine the extent of coverage without visualizing the data. For now, we will add both collections to the map to see what the image availability looks like. Add the following code to your existing script.

// Filter the data based on date.
var naip2017 = NAIP
  .filterDate("2017-01-01", "2017-12-31");

var naip2015 = NAIP
.filterDate("2015-01-01", "2015-12-31");

// Define viewing parameters for multi band images.
var visParamsFalse = {bands:['N', 'R', 'G']};
var visParamsTrue = {bands:['R', 'G', 'B']};

// Add both sets of NAIP imagery to the map to compare coverage.
Map.addLayer(naip2015,visParamsTrue,"2015_true",false );
Map.addLayer(naip2017,visParamsTrue,"2017_true",false );


2015 NAIP imagery has captured the aspen forest across the extent of our study area.

The 2017 imagery (shown below) has been captured later in the season so we are seeing some yellow in the aspens of this image. A challenge with this image is that there are some distinct time lags between neighboring flight paths during the initial image collection. Notice the line of green through the middle of the area. While it is usually best practice to use images that closely match the time that you will be in the field, in this case the consistency of the 2015 imagery outweighs those temporal concerns.

2017 NAIP imagery has captured the yellowing of the aspen forest but shows a distinct difference in vegetation cover due to variability in the time at which the given images were captured.

We will also add a false color image of the 2015 data because this band combination is very helpful for distinguishing between deciduous and evergreen forests. We do this by adding the naip2015 object to the map with a different set of visualization parameters. Add the following code to your existing script.

// Add 2015 false color imagery.
Map.addLayer(naip2015, visParamsFalse, "2015_false", false);


If your imagery is slow to load, here is an example of how to comment out code.

3.3 Aspen Exclosures

In our hypothetical study, land managers have established some aspen exclosures on the southern extent of the Grand Mesa near Highway 65. The land managers did not have shapefiles of the exclosures, but they did have GPS locations of the four corners of each. We will use these data to add the exclosures to the map by creating a geometry feature within our script. In this case we are creating an ee.Geometry.MultiPolygon feature. Add the following code to your existing script.

// Creating a geometry feature.
var enclosure = ee.Geometry.MultiPolygon(
[
  [
    [-107.91079184,39.012553345],
    [-107.90828129,39.012553345],
    [-107.90828129,39.014070552],
    [-107.91079184,39.014070552],
    [-107.91079184,39.012553345]
  ],
  [
    [-107.9512176,39.00870162],
    [-107.9496834,39.00870162],
    [-107.9496834,39.00950196],
    [-107.95121765,39.00950196],
    [-107.95121765,39.00870162]
  ]
]);

print(enclosure);
Map.addLayer(enclosure, {}, "exclosures");

The structure of the ee.Geometry.MultiPolygon feature is a bit complex, but it is effectively a set of nested lists. There are three tiers of lists present:

//Nested list example.
['tier 1'
  [ 'tier 2'
    ['tier 3']
  ]
]
  • Tier 1 : A single list. One outer list holds all of the data; the ee.Geometry.MultiPolygon() function requires the input to be a list.

  • Tier 2 : A list for each polygon. Each polygon's set of coordinates needs to be held within its own list.

  • Tier 3 : A list for each x,y coordinate pair. Each polygon is composed of a series of x,y points, ending with a point that exactly repeats the first coordinate pair. This closing coordinate pair is essential and is what completes the feature.

If you are trying to make a geometry feature and are having trouble, you can always create one with the draw shape tool and look at the coordinates in a print statement to use as a template.
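
For reference, here is a minimal, hypothetical single-polygon sketch (the coordinates are made up for illustration) showing the same nesting and the repeated closing coordinate pair:

// Illustration only: a single polygon with one ring of coordinates.
var singlePolygon = ee.Geometry.Polygon(
[
  [
    [-107.95, 39.008],
    [-107.94, 39.008],
    [-107.94, 39.012],
    [-107.95, 39.012],
    [-107.95, 39.008] // repeats the first pair to close the polygon
  ]
]);
Map.addLayer(singlePolygon, {}, "single_polygon_example", false);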



A view of the enclosure object coordinate details from the print statement.

3.4 Determining Similar Areas for Sampling

Now that we have our aspen exclosures loaded, we are going to bring in some additional layers to help quantify the landscape characteristics of the exclosures. We will use those values to find similar areas nearby to use as sampling sites outside of the exclosures. By keeping the environmental conditions similar we can make stronger statements about the effects grazing has on biodiversity within the aspen forest.

We will use three datasets to quantify the conditions within the site:

  1. National Elevation Dataset (NED): Select areas in a similar elevation range. Elevation is correlated with many environmental conditions so we are using it as a proxy for features such as temperature, precipitation, and solar radiation.

  2. National Agriculture Imagery Program (NAIP): Calculate NDVI as a measure of vegetation productivity.

  3. National Land Cover Database (NLCD) : Select the deciduous forest class as a way to limit the location of sampling sites.

3.5 Loading in Datasets

First, we will load the NED by its unique ID. We can find this ID by searching for the dataset and reading through its metadata. Add the following code to your existing script.

// Load in the elevation dataset and clip it to the general area.
var elev = ee.Image("USGS/NED")
  .clip(roi);

print(elev, "elevation");
Map.addLayer(elev, {min: 1500, max: 3300}, "elevation", false);

We already have NAIP imagery loaded, but we need to convert it to an image and calculate NDVI. Because we have filtered our NAIP imagery to a single year, there is only one image per location. We could apply a reducer to convert the image collection to an image, but with only a single image per location a reducer is not necessary; the median of a single value is just that value. A more logical option is to apply the mosaic() function to convert the image collection to an image. Add the following code to your existing script.

// Apply mosaic, clip, then calculate NDVI.
var ndvi = naip2015
.mosaic()
.clip(roi)
.normalizedDifference(["N","R"])
.rename("ndvi");

Map.addLayer(ndvi, {min:-0.8, max:0.8}, "NDVI" , false);
print(ndvi,"ndvi");

The last dataset we are bringing in is the NLCD. Add the following code to your existing script.

// Add National Land Cover Database (NLCD) with color palette.
var dataset = ee.ImageCollection('USGS/NLCD');
print(dataset, "NLCD");


A print statement showing the structure of the NLCD image collection.

From the print statement we can see that Image 3 is the most recent dataset and that it has a band called landcover. Rather than trying to pull it out of the image collection, we will call the 2011 NLCD image by its unique ID and select the landcover band.

// Load the 2011 NLCD image by its unique ID and select the landcover band.
var landcover = ee.Image("USGS/NLCD/NLCD2011")
  .select("landcover")
  .clip(roi);
print(landcover, "Landcover Image");
Map.addLayer(landcover, {}, "Landcover" , false);

3.6 Generating Random Points

With our three datasets loaded we can now generate a series of potential survey sites. We will do this by generating random points within a given area. We want these sites to be accessible, near the two exclosures, and within the public land boundary. Let us create another geometry feature that we will use to contain the randomly generated points. Hover over the geometry import box and click + new layer. Be sure to name this second geometry feature sampleArea.



An example of what the second sample area geometry feature should look like.

With the geometry feature in place, we can create points using the randomPoints() function. Add the following code to your existing script.

// Generate random points within the sample area.
var points = ee.FeatureCollection.randomPoints({
  region: sampleArea,
  points: 1000,
  seed: 1234
});
print(points,"points");
Map.addLayer(points, {}, "Points" , false);

We use a dictionary to define the parameters of the function. The region parameter is the area in which the points are created; in our case, we set this to sampleArea. The points parameter defines the number of points to generate. The seed parameter indicates a specific random sequence of values; think of it as a unique ID for a set of random values. The seed number (1234 in this example) refers to an existing random list of values. Setting the seed is very helpful because you are still using random values, but the process is reproducible. If you are feeling the urge to learn more, you can check out this resource.
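
As a quick, optional check (not part of the module's workflow), you can generate additional point sets to see how the seed behaves: re-using seed 1234 reproduces exactly the same locations, while a different seed (42 here is an arbitrary choice) produces a different arrangement.

// Optional: demonstrate that the seed controls reproducibility.
var pointsRepeat = ee.FeatureCollection.randomPoints({
  region: sampleArea,
  points: 1000,
  seed: 1234 // same seed as above, so the points are identical
});
var pointsNewSeed = ee.FeatureCollection.randomPoints({
  region: sampleArea,
  points: 1000,
  seed: 42 // a different seed gives a different set of points
});
Map.addLayer(pointsRepeat, {color: 'yellow'}, "points_seed_1234_repeat", false);
Map.addLayer(pointsNewSeed, {color: 'orange'}, "points_seed_42", false);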

3.7 Extract Values to Points

To associate the landscape characteristics with the point locations, we are going to call the ee.Image.sampleRegions() function. This function is performed on a single image. Rather than calling it three times on our three separate images (elevation, NDVI, and NLCD), we are going to add those images together to create a multiband image so we can call the function a single time. Add the following code to your existing script.

// Combine NDVI, elevation, and landcover into a single multiband image.
var ndviElev = ndvi
  .addBands(elev)
  .addBands(landcover);
print(ndviElev, "Multi band image");

With the multiband image set, we will call the sampleRegions() function. Add the following code to your existing script.

// Extract values to points.
var samples = ndviElev.sampleRegions({
  collection: points,
  scale: 30 ,
  geometries: true
});
print(samples,'samples');
Map.addLayer(samples,{}, "samples" , false);

Within this function, the collection parameter is the feature collection to which the extracted values will be added; in this example it is our point dataset. The scale parameter refers to the pixel size at which the values are sampled. The geometries parameter states whether or not you want to maintain the x,y coordinate pairs for each element in the collection. The default is false, but we set it to true here because we eventually want to show these points on the map, as they will represent suitable sampling site locations.
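
If you want to confirm what sampleRegions() produced, an optional check is to print a single feature and inspect its properties (the extracted ndvi, elevation, and landcover values attached to the point geometry).

// Optional: inspect one sample feature and its extracted properties.
print(samples.first(), "first sample");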

Segue on Scale: Scale is very important in remote sensing, and there are multiple types of scale. Map scale, the relationship between a measured distance on a map and the actual distance on the landscape, is the most common in geographic work. The image extent is sometimes referred to as the scale of the image or the spatial scale. In the case of the scale argument in the function above, scale refers to the pixel size of the image. In our example the multiband image has one band with a 30 m pixel size (NLCD), one with a roughly 10 m pixel size (NED), and one with a one meter pixel size (NAIP NDVI). It is best practice to always use the largest pixel size when working with data at different scales; this means you are effectively upscaling the finer images to 30 m. You lose a lot of precision in your data, but you can be confident that the number in each upscaled cell is representative of the average value across all the cells it covers. If you go in the opposite direction and downscale an image, you effectively make up data to fill the gap. Because NAIP imagery has a 1 m pixel size, it contains 30 x 30 = 900 times more pieces of information (pixels) than a single 30 m pixel. Take a look at the example below to help visualize these processes.

Upscaling takes the available data and finds the average value.
| 3 | 7 | 8 |
| 4 | 2 | 2 |
| 1 | 3 | 2 |
average = (3 + 7 + 8 + 4 + 2 + 2 + 1 + 3 + 2) / 9 = 32 / 9 ≈ 3.56

Downscaling takes this single value (3.56) and tries to recover a value for every location in the grid.
| ? | ? | ? |
| ? | ? | ? |
| ? | ? | ? |

There are many different sets of values that average to 3.56, and the question becomes which set of values is correct. To add to this complexity, each of the values in a set can be placed in nine different positions. The likelihood of recovering the actual values when downscaling can be coarsely estimated from the number of potential combinations that lead to the known average and the factorial of the number of positions in which a value can be placed. With our data, we are talking about downscaling a single value into 900 different ones. This variability makes it clear that you cannot downscale without making up something about your data, and made-up data does not have much quantitative value.
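
Although we do not need it for this module, Earth Engine can perform this kind of upscaling (aggregation) explicitly. The sketch below, which assumes the ndvi image defined earlier, averages the 1 m NDVI pixels onto an approximately 30 m grid with reduceResolution(); treat it as an illustration rather than a required step.

// Illustration only: aggregate (upscale) the 1 m NDVI to ~30 m by averaging.
// The mosaic has no fixed projection, so we first declare one at a 1 m scale.
var ndvi30 = ndvi
  .setDefaultProjection({crs: 'EPSG:4326', scale: 1})
  .reduceResolution({reducer: ee.Reducer.mean(), bestEffort: true})
  .reproject({crs: 'EPSG:4326', scale: 30});
Map.addLayer(ndvi30, {min: -0.8, max: 0.8}, "NDVI_30m", false);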



The result of our sampling function has given us 1,000 potential sites to choose from. We will limit this pool by comparing the measurable data we have at these sites to the averages of those data from our exclosures.

From our print statement we can see that each of our 1,000 point locations has three properties: elevation, landcover, and ndvi. We want to use these values to filter out sites that do not match the conditions of the exclosures. The landcover data is categorical, so it is easy to filter: from the NLCD metadata we know that the deciduous forest class, which includes aspen, is a single value (41).

We will use the filterMetadata() function to select all sites that fall within deciduous forest. Add the following code to your existing script.

// Filter for sites that fall within the NLCD deciduous forest class.
var aspenSites = samples
.filterMetadata({name:"landcover", operator:"equals", value: 41});
print(aspenSites, "Sites");

From the print statement we can see this reduces the total number of potential sites from 1,000 to 578. For this study, we are hoping to get down to about 10 potential sites, so there is still a lot of trimming to do.

Filtering based on elevation and NDVI is a bit trickier because both of these variables are continuous. You want to find sites that have values similar to those in the exclosures, but they do not need to be exactly the same. For this example, we will say that if a value is within +/- 10% of the average value found within the exclosures, we will count it as similar. There are a couple of values that need to be calculated before we can filter our potential sampling sites:

  1. Average values within exclosures
  2. 10% above and below the average

We will work with the NDVI image first and then apply this process to the elevation dataset.

  • Step 1: Find the average value. We will apply a mean reducer over the area inside the exclosures. Add the following code to your existing script.
// Set the NDVI range.
var ndvi1 = ndvi
  .reduceRegion(
    {reducer:ee.Reducer.mean(),
    geometry: enclosure,
    scale: 1,
    crs:'EPSG:4326'}
    );
print(ndvi1, "Mean NDVI");

The reduceRegion() function takes in an image and returns a dictionary where each key is the name of a band and the value is the output of the reducer.

  • Step 2: Determine the acceptable variability around the mean.

We are relying on simple mathematical functions to find the +/- 10% values. Add the following code to your existing script.

// Generate a range of acceptable NDVI values.
var ndviNumber = ee.Number(ndvi1.get("ndvi"));
var ndviBuffer = ndviNumber.multiply(0.1);
var ndviRange = [ndviNumber.subtract(ndviBuffer),
  ndviNumber.add(ndviBuffer)];
print(ndviRange, "NDVI Range");

The first step in this process is all about understanding data types. The ndvi1 object is a dictionary, so we call the get() function to pull a value out based on a known key. We then cast that value to an ee.Number object so we can apply the math functions. Our output is a list with the minimum and maximum values (+/- 10% of the mean NDVI value).

3.8 Creating Your Own Function

We have determined an acceptable range for NDVI, but we also need to apply the same process to elevation. In this specific case we only apply the process twice, but it is a useful chunk of code that might come in handy down the road. Rather than copying and pasting the code again and again, we are going to create a function with flexible parameters so we can reuse this code efficiently. We will use functions extensively in Module 10.

A function has the following components: parameters, actions, and a return value. Here is an example of the structure of a function from the official Google Earth Engine documentation. In this case, each statement refers to an action.

Example Only! do not run in code editor

var myFunction = function(parameter1, parameter2, parameter3) {
  statement;
  statement;
  var value = statement;
  return value;
};

If we apply this structure to our goal of reducing the NDVI by area (with the code we created above) we would construct the following:

  • Parameters : an image, a geometry object, and pixelSize.
  • Actions : reduceRegion() function.
  • A return value : output of the reduceRegion() function.

When writing functions, it is important to use generic parameter names but give some indication of the data type that is required. We want this to be reproducible, so we provide some more information as comments when we define the function. Add the following code to your existing script.

/*
This function is used to determine the mean value of an image within a given area.
image: an image with a single band of ordinal or interval level data
geom: geometry feature that overlaps with the image
pixelSize: a number that defines the cell size of the image
Returns a dictionary with the mean value of each band; the key is the band name.
*/
var reduceRegionFunction = function(image, geom, pixelSize){
  var dict = image.reduceRegion(
    {reducer: ee.Reducer.mean(),
    geometry: geom,
    scale: pixelSize,
    crs: 'EPSG:4326'}
    );
  return(dict);
};

This defines the function; now we can call it on the elevation dataset. Add the following code to your existing script.

// Call function on elevation dataset.
var elev1 = reduceRegionFunction(elev, enclosure, 30);
print(elev1, "elev1 ");

This is a very clean way to go about coding when you need to apply a process multiple times. Let us define a second function to determine the acceptable variability around the mean.

  • Parameters : a dictionary, a key (band name), and a proportion
  • Actions : multiple steps
  • A return value : list

Add the following code to your existing script.

/*
Generate a range of acceptable values.
dictionary: a dictionary object
key: the key to the value of interest, must be a string
proportion: a proportion (for example, 0.1 for +/- 10%) that defines the range of values around the mean
Returns a list with a min and max value for the given range.
*/
var effectiveRange = function(dictionary, key, proportion){
  var number = ee.Number(dictionary.get(key));
  var buffer = number.multiply(proportion);
  var range = [number.subtract(buffer),
    number.add(buffer)];
  return(range);
};

Let us call the effectiveRange() function on the output of the reduceRegionFunction() function. Add the following code to your existing script.

// Call function on elevation data.
var elevRange = effectiveRange(elev1, "elevation", 0.1);
print(elevRange);

Now that we have an effective range for both our NDVI and elevation values, we can apply an additional set of filters to thin the list of potential sample sites. We do this by chaining multiple filterMetadata() calls together. Add the following code to your existing script.

// Apply multiple filters to get at potential locations.
var aspenSites2 = aspenSites
.filterMetadata({name:"ndvi", operator:"greater_than", value: ndviRange[0]})
.filterMetadata({name:"ndvi", operator:"less_than", value: ndviRange[1]})
.filterMetadata({name:"elevation", operator:"greater_than", value: elevRange[0]})
.filterMetadata({name:"elevation", operator:"less_than", value: elevRange[1]});

print(aspenSites2, "aspenSites2");
Map.addLayer(aspenSites2, {}, "aspenSites2" , false);

This filtering process brought us down to fewer than 100 of the 1,000 original sites. That number can be whittled down to roughly 10 either by looking at the imagery or by subjectively selecting sites based on your experience in the field. You could also apply a more restrictive range, perhaps +/- 5% on elevation and NDVI, to get a more limited group of locations, as sketched below.
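
For example, re-using the functions and dictionaries defined above, a tighter +/- 5% filter might look like the following (an optional refinement, not a required step):

// Optional: tighten the acceptable ranges to +/- 5% and filter again.
var ndviRange5 = effectiveRange(ndvi1, "ndvi", 0.05);
var elevRange5 = effectiveRange(elev1, "elevation", 0.05);
var aspenSites3 = aspenSites
  .filterMetadata({name: "ndvi", operator: "greater_than", value: ndviRange5[0]})
  .filterMetadata({name: "ndvi", operator: "less_than", value: ndviRange5[1]})
  .filterMetadata({name: "elevation", operator: "greater_than", value: elevRange5[0]})
  .filterMetadata({name: "elevation", operator: "less_than", value: elevRange5[1]});
print(aspenSites3, "aspenSites3");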

This informed site selection process is a good first step in ensuring you're comparing apples to apples when it comes to your field sampling.

4 Generating Your Own Training Dataset

As you have been examining this landscape, you may have noticed some misclassifications within the NLCD landcover layer. These types of misclassifications are to be expected in any land cover dataset. While the NLCD is trained to produce classifications of specific landcover assemblages across the United States, the aspen forest that we are examining is included within a much larger grouping of forest types. Another important point is that the NLCD shows landcover as it was detected in 2011. While forests are fairly stable over time, we can expect that some level of change has occurred. These realities may get you thinking about the possibility of generating your own aspen land cover product based on a remote sensing model for this specific region. There is a lot that goes into this process that we are not going to cover here. However, we are going to take the first step, which is generating our own presence/absence training dataset.

4.1 Ocular Sampling

Generating your own training data relies on the assumption that you can confidently identify your species of interest using high resolution imagery. We are using NAIP for this process because it is freely available and has a known collection date, allowing us to mark areas of aspen forest as ‘presence’ and areas without aspens as ‘absence’. These data can then be used to train a model of aspen occurrence on the landscape.

4.2 Adding Presence and Absence Points

First, we will need to create specific layers that will hold our ocular sampling points. Adding presence and absence layers is a rather straightforward process accomplished by creating and placing geometry features on representative locations on the map. Hover over the geometry import box and click + new layer. A new geometry feature, geometry, will be added below your existing geometry imports. Select the gear icon next to geometry and a pop-up will open. Change the Import as type to FeatureCollection and then press the Add property button. Fill in the boxes with presence | 1 and press 'OK' to save your feature.



An example of changing the parameters for the creation of the presence geometry feature.

Change the Feature Collection name to presence and select a color you enjoy.

Repeat this process to create an absence feature collection where the added property is presence | 0.

We will use the binary value in the presence column of both datasets to define what each location represents: presence = 1 means yes, this is aspen; presence = 0 means no, this is not aspen.
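
For reference, the drawn imports are equivalent to feature collections built in code like the sketch below (the coordinates are made up purely for illustration); in practice you will create them with the geometry tools as described above.

// Illustration only: what the presence/absence imports look like as code.
var presenceExample = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-107.93, 39.03]), {presence: 1}),
  ee.Feature(ee.Geometry.Point([-107.92, 39.04]), {presence: 1})
]);
var absenceExample = ee.FeatureCollection([
  ee.Feature(ee.Geometry.Point([-107.96, 39.05]), {presence: 0})
]);
print(presenceExample, "presenceExample");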

Once the feature collections are created, we can sample by selecting the specific feature collection (presence or absence) and using the marker tool to drop points on the imagery. The sampling methodology you use will depend on your study. In this example, green presence points represent aspen forest and blue points represent areas that are not aspen (absence).

Use the 2015 false color imagery in combination with the NDVI or NLCD to distinguish aspen from other land cover types. Aspen stands are brighter red than most other vegetation types and tend to have a more complex texture than herbaceous vegetation in the imagery. Drop some points in what you perceive to be aspen forest.



Examples of presence and absence locations on the NAIP imagery created using the marker tool. Do your best to select locations that look correct to you.

Feel free to sample as many locations as you would like. Again, the quality of this data will depend on the user’s ability to differentiate the multiple land cover classes present.

4.3 Exporting Points

Currently our point locations are stored in two different feature classes. Let us merge them into one feature class before exporting the data. We can merge the layers without issue because they share the same data type (point geometry features) and the same attribute data (presence, with a numeric value). Add the following code to your existing script.

// Merge presence and absence datasets.
var samples = presence.merge(absence);
print(samples, 'Samples');

Now that the sampling feature classes are merged, let us export the features to our Google Drive. When you run the code below, the Tasks tab on the upper right-hand side of the console will light up. Google Earth Engine does not run tasks without you directing it to execute the task from the Tasks tab. Add the following code to your existing script.

// Export presence points.
Export.table.toDrive({
  collection: samples,
  description:'presenceAbsencePointsForForest',
  fileFormat: 'CSV'
});

5 Conclusion

In this module we covered identifying locations with similar environmental characteristics and generating our own sampling data. Both processes are simple in theory but can be complex to carry out without access to all of your data in a single place. In both cases we are generating value-added products that are informed by remote sensing but are not inherently remote sensing processes. This ability to be creative with how you use remotely sensed data is part of the beauty of the Google Earth Engine platform.

5.1 Complete Code

// Call in NAIP imagery as an image collection.
var NAIP = ee.ImageCollection("USDA/NAIP/DOQQ")
  .filterBounds(roi)
  .filterDate("2015-01-01", "2017-12-31");
print(NAIP);

// Filter the data based on date.
var naip2017 = NAIP
  .filterDate("2017-01-01", "2017-12-31");

var naip2015 = NAIP
.filterDate("2015-01-01", "2015-12-31");

// Define viewing parameters for multi band images.
var visParamsFalse = {bands:['N', 'R', 'G']};
var visParamsTrue = {bands:['R', 'G', 'B']};

// Add both sets of NAIP imagery to the map to compare coverage.
Map.addLayer(naip2015,visParamsTrue,"2015_true",false );
Map.addLayer(naip2017,visParamsTrue,"2017_true",false );

// Add 2015 false color imagery.
Map.addLayer(naip2015, visParamsFalse, "2015_false", false);

// Creating a geometry feature.
var enclosure = ee.Geometry.MultiPolygon(
[
  [
    [-107.91079184,39.012553345],
    [-107.90828129,39.012553345],
    [-107.90828129,39.014070552],
    [-107.91079184,39.014070552],
    [-107.91079184,39.012553345]
  ],
  [
    [-107.9512176,39.00870162],
    [-107.9496834,39.00870162],
    [-107.9496834,39.00950196],
    [-107.95121765,39.00950196],
    [-107.95121765,39.00870162]
  ]
]);

print(enclosure);
Map.addLayer(enclosure, {}, "exclosures");

// Load in the elevation dataset and clip it to the general area.
var elev = ee.Image("USGS/NED")
  .clip(roi);

print(elev, "elevation");
Map.addLayer(elev, {min: 1500, max: 3300}, "elevation", false);

// Apply mosaic, clip, then calculate NDVI.
var ndvi = naip2015
.mosaic()
.clip(roi)
.normalizedDifference(["N","R"])
.rename("ndvi");

Map.addLayer(ndvi, {min:-0.8, max:0.8}, "NDVI" , false);
print(ndvi,"ndvi");

// Add National Land Cover Database (NLCD) with color palette.
var dataset = ee.ImageCollection('USGS/NLCD');
print(dataset, "NLCD");

// Load the 2011 NLCD image by its unique ID and select the landcover band.
var landcover = ee.Image("USGS/NLCD/NLCD2011")
  .select("landcover")
  .clip(roi);
print(landcover, "Landcover Image");

Map.addLayer(landcover, {}, "Landcover" , false);

// Generate random points within the sample area.
var points = ee.FeatureCollection.randomPoints({
  region: sampleArea,
  points: 1000,
  seed: 1234
});
print(points,"points");
Map.addLayer(points, {}, "Points" , false);

// Combine NDVI, elevation, and landcover into a single multiband image.
var ndviElev = ndvi
  .addBands(elev)
  .addBands(landcover);
print(ndviElev, "Multi band image");

// Extract values to points.
var samples = ndviElev.sampleRegions({
  collection: points,
  scale: 30 ,
  geometries: true
});
print(samples,'samples');
Map.addLayer(samples,{}, "samples" , false);

// Filter for sites that fall within the NLCD deciduous forest class.
var aspenSites = samples
.filterMetadata({name:"landcover", operator:"equals", value: 41});
print(aspenSites, "Sites");

// Set the NDVI range.
var ndvi1 = ndvi
  .reduceRegion(
    {reducer:ee.Reducer.mean(),
    geometry: enclosure,
    scale: 1,
    crs:'EPSG:4326'}
    );
print(ndvi1, "Mean NDVI");

// Generate a range of acceptable NDVI values.
var ndviNumber = ee.Number(ndvi1.get("ndvi"));
var ndviBuffer = ndviNumber.multiply(0.1);
var ndviRange = [ndviNumber.subtract(ndviBuffer),
  ndviNumber.add(ndviBuffer)];
print(ndviRange, "NDVI Range");

/*
This function is used to determine the mean value of an image within a given area.
image: an image with a single band of ordinal or interval level data
geom: geometry feature that overlaps with the image
pixelSize: a number that defines the cell size of the image
Returns a dictionary with the mean value of each band; the key is the band name.
*/
var reduceRegionFunction = function(image, geom, pixelSize){
  var dict = image.reduceRegion(
    {reducer: ee.Reducer.mean(),
    geometry: geom,
    scale: pixelSize,
    crs: 'EPSG:4326'}
    );
  return(dict);
};

// Call function on elevation dataset.
var elev1 = reduceRegionFunction(elev, enclosure, 30);
print(elev1, "elev1 ");

/*
Generate a range of acceptable values.
dictionary: a dictionary object
key: the key to the value of interest, must be a string
proportion: a proportion (for example, 0.1 for +/- 10%) that defines the range of values around the mean
Returns a list with a min and max value for the given range.
*/
var effectiveRange = function(dictionary, key, proportion){
  var number = ee.Number(dictionary.get(key));
  var buffer = number.multiply(proportion);
  var range = [number.subtract(buffer),
    number.add(buffer)];
  return(range);
};

// Call function on elevation data.
var elevRange = effectiveRange(elev1, "elevation", 0.1);
print(elevRange);

// Apply multiple filters to get at potential locations.
var aspenSites2 = aspenSites
.filterMetadata({name:"ndvi", operator:"greater_than", value: ndviRange[0]})
.filterMetadata({name:"ndvi", operator:"less_than", value: ndviRange[1]})
.filterMetadata({name:"elevation", operator:"greater_than", value: elevRange[0]})
.filterMetadata({name:"elevation", operator:"less_than", value: elevRange[1]});

print(aspenSites2, "aspenSites2");
Map.addLayer(aspenSites2, {}, "aspenSites2" , false);

// Merge presence and absence datasets.
var samples = presence.merge(absence);
print(samples, 'Samples');

// Export presence points.
Export.table.toDrive({
  collection: samples,
  description:'presenceAbsencePointsForForest',
  fileFormat: 'CSV'
});