ENVSCI 203


Cellular Automata (overview)

- In the previous lectures we have considered mathematical models
- Such models have tended to focus on time rather than space - how can/do we model systems where space and time are important?
Outcomes:
- An understanding of spatial modelling
- The role of feedbacks on pattern emergence
- How modelling deals with space and spatial variability

Why be interested in population growth?

- Project future populations - the human population is expected to reach 9.8 billion by 2050
- Conservation of species
- Sustainable use of resources

Stochastic modelling (overview)

- We know that models tend to depend on "parameters"
- But they tend to be deterministic ... whereas 'uncertainty' characterises most environmental systems
- How do we deal with this?
  - probability distributions
  - Monte Carlo and randomisation methods
Stochasticity?
- Stochastic: "randomly determined; that follows some random probability distribution or pattern, so that its behaviour may be analysed statistically but not predicted precisely" (OED)
- A stochastic model explicitly represents uncertainty in processes and data so that we can analyse it statistically → the way we account for uncertainty

Describe 5 questions we should ask when looking at a model:

1) What is the scale? 2) Is it a simple or detailed model? 3) Is the model empirical (data-driven) or mechanistic (process-based)? 4) What are you not looking at? 5) What are the critical assumptions?

3 critical steps in machine learning:

1) Get a data set
  o Visualize the variables and look for relationships between them; issues → outliers? grain size (large variability)
  o Clean the data (outliers? strange trends are complex - how do we know something is really an outlier?)
  o Look for patterns - maybe you don't need machine learning
  o Do variables need scaling?
2) Choose the algorithm
  o What is the size of the dataset? Can I get more data (if needed)?
  o Simple or complicated? Accurate or interpretable? Black, white or grey box?
  o Do I really need ML? Do we have time constraints?
3) Train and test the algorithm
  o Data is limited! You need to train AND test your model, and you can't use the same data for both.
  o How do we split the data? There is no clear indication (70-30%?) - trial and error?
  o Overfitting → trying too hard to reproduce trends in the data, so the prediction loses generality: the machine learns the training data too well and predicts poorly on new data. We must find a balance between under- and overfitting. A sketch of this train/test workflow is given below.
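A minimal sketch of the train-and-test step, assuming scikit-learn and a made-up synthetic dataset; the 70-30 split and the choice of a random forest are illustrative assumptions, not prescribed by the lecture.

```python
# Sketch of the train/test workflow described above.
# The synthetic dataset and the 70/30 split are illustrative assumptions.
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(42)
X = rng.uniform(0, 10, size=(200, 3))                        # three predictor variables
y = 2.0 * X[:, 0] - 0.5 * X[:, 1] + rng.normal(0, 1, 200)    # noisy response

# Never train and test on the same data: hold back 30% for testing.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)

model = RandomForestRegressor(n_estimators=100, random_state=0)
model.fit(X_train, y_train)

# A large gap between training and test error is a warning sign of overfitting.
print("train MSE:", mean_squared_error(y_train, model.predict(X_train)))
print("test MSE: ", mean_squared_error(y_test, model.predict(X_test)))
```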

Catastrophic Shifts in ecosystems - Scheffer et al. 2001

All ecosystems are exposed to gradual changes in climate, nutrient loading, habitat fragmentation or biotic exploitation. Nature is usually assumed to respond to gradual change in a smooth way. However, studies on lakes, coral reefs, oceans, forests and arid lands have shown that smooth change can be interrupted by sudden drastic switches to a contrasting state. Although diverse events can trigger such shifts, recent studies show that a loss of resilience usually paves the way for a switch to an alternative state. This suggests that strategies for sustainable management of such ecosystems should focus on maintaining resilience. The notion that ecosystems may switch abruptly to a contrasting alternative stable state emerged from work on theoretical models.
- Observed large-scale shifts in major ecosystems and their explanations.
- External conditions to ecosystems such as climate, inputs of nutrients or toxic chemicals, groundwater reduction, habitat fragmentation, harvest or loss of species diversity often change gradually, even linearly, with time [10,11]. The state of some ecosystems may respond in a smooth, continuous way to such trends (Fig. 1a). Others may be quite inert over certain ranges of conditions, responding more strongly when conditions approach a certain critical level (Fig. 1b). A crucially different situation arises when the ecosystem response curve is 'folded' backwards (Fig. 1c). This implies that, for certain environmental conditions, the ecosystem has two alternative stable states, separated by an unstable equilibrium that marks the border between the 'basins of attraction' of the states.
- When the ecosystem is in a state on the upper branch of the folded curve, it cannot pass to the lower branch smoothly. Instead, when conditions change sufficiently to pass the threshold ('saddle-node' or 'fold' bifurcation, F2), a 'catastrophic' transition to the lower branch occurs.
- Another important feature is that to induce a switch back to the upper branch, it is not sufficient to restore the environmental conditions of before the collapse (F2). Instead, one needs to go back further, beyond the other switch point (F1), where the system recovers by shifting back to the upper branch. This pattern, in which the forward and backward switches occur at different critical conditions, is known as hysteresis.
- The degree of hysteresis may vary strongly even in the same kind of ecosystem. For instance, shallow lakes can have a pronounced hysteresis in response to nutrient loading (Fig. 1c), whereas deeper lakes may react smoothly.
Effects of stochastic events:
- If there is only one basin of attraction, the system will settle back to essentially the same state after such events. However, if there are alternative stable states, a sufficiently severe perturbation of the ecosystem state may bring the system into the basin of attraction of another state.
- Following Holling [1], we here use the term 'resilience' to refer to the size of the valley, or basin of attraction, around a state, which corresponds to the maximum perturbation that can be taken without causing a shift to an alternative stable state.
- In systems with multiple stable states, gradually changing conditions may have little effect on the state of the ecosystem, but nevertheless reduce the size of the attraction basin (Fig. 3). This loss of resilience makes the system more fragile in the sense that it can easily be tipped into a contrasting state by stochastic events.
- A system that moves along a strange attractor fluctuates chaotically even in the absence of an external stochastic forcing. These fluctuations can lead to a collision with the boundary of the basin of attraction, and consequently induce a switch to an alternative state. Models indicate that such 'non-local bifurcations' [13] or 'basin boundary collisions' [14] may occur in ocean-climate systems [15] as well as various ecosystems [9]. In practice, it will often be a blend of internal processes and external forcing that generates fluctuations that can induce a state shift by bringing systems with reduced resilience over the boundary of an attraction basin. In view of these permanent fluctuations, the term 'stable state' is hardly appropriate for any ecosystem. Nonetheless, for the sake of clarity we use 'state' rather than the more correct term 'dynamic regime'.

Summary of emerging patterns:

All of these case studies suggest shifts between alternative stable states. Nonetheless, proof of multiplicity of stable states is usually far from trivial. Observation of a large shift per se is not sufficient, as systems may also respond in a nonlinear way to gradual change if they have no alternative stable states. Also, the power of statistical methods to infer the underlying system properties from noisy time series is poor. However, mere demonstration of a positive-feedback mechanism is also insufficient as proof of alternative stable states, because it leaves a range of possibilities between pronounced hysteresis and smooth response, depending on the strength of the feedback and other factors. Nonetheless, the case studies show some consistent patterns. First, the contrast among states in ecosystems is usually due to a shift in dominance among organisms with different life forms. Second, state shifts are usually triggered by obvious stochastic events such as pathogen outbreaks, fires or climatic extremes. Third, feedbacks that stabilize different states involve both biological and physical and chemical mechanisms. Perhaps most importantly, all models of ecosystems with alternative stable states indicate that gradual change in environmental conditions, such as human-induced eutrophication and global warming, may have little apparent effect on the state of these systems, but still alter the 'stability domain' or resilience of the current state and hence the likelihood that a shift to an alternative state will occur in response to natural or human-induced fluctuations.
Implications for management:
- Ecosystem state shifts can cause large losses of ecological and economic resources, and restoring a desired state may require drastic and expensive intervention - neglect has heavy costs for society.
- Attention tends to focus on precipitating events rather than on the underlying loss of resilience. For example, gradual changes in the agricultural watershed increased the vulnerability of Lake Apopka (Florida, USA) to eutrophication, but a hurricane wiped out aquatic plants in 1947 and probably triggered the collapse of water quality; a gradual increase in nutrient inputs and fishing pressure created the potential for algae to overgrow Caribbean corals, but overgrowth was triggered by a conspicuous disease outbreak among sea urchins that released algae from grazer control.
- Not surprisingly, prevention of perturbations is often a major goal of ecosystem management. This is unfortunate, not only because disturbance is a natural component of ecosystems that promotes diversity and renewal processes [56,57], but also because it distracts attention from the underlying structural problem of resilience.
- The main implication of the insights presented here is that efforts to reduce the risk of unwanted state shifts should address the gradual changes that affect resilience rather than merely control disturbance. The challenge is to sustain a large stability domain rather than to control fluctuations.

Would you model cliff erosion with a deterministic or a stochastic model?

Are we able to predict deterministically when a cliff will collapse? (No - it is effectively random) → if we want to model cliff erosion we must do it stochastically (giving us a range of possible outcomes); environmental systems are not easy to predict.

Example one Artificial Neural Network

Artificial Neural Networks
- Definition of Artificial Neural Network, fundamental concepts, types of Artificial Neural Networks, what they can do and where they fail.
What is an Artificial Neural Network?
- A neural network is a computational method inspired by studies of the brain and nervous systems in biological organisms.
- A computing system made of a number of simple, highly interconnected processing elements, which process information by their dynamic state response to external input.
Some applications:
- pattern association & classification (e.g. credit cards), regularity detection, image processing, speech analysis, quality assurance, stock market forecasting (I'd suggest not to do it ...), robot steering, optimization problems
- → if the data does not include all possibilities, predictions will be limited
- ANNs attempt to model the way the brain is structured: ~10 billion neurons that communicate via ~60 trillion connections; parallel rather than sequential processing.
- ANNs can be used to approximate any continuous function (one hidden layer) or any discontinuous function (two hidden layers).
- ANNs are composed of the following elements: inputs x, neurons H (in hidden layers), outputs y, weights w.
- Input layer → hidden layer (can consist of multiple nodes) → output layer. Multiple inputs may converge to one neuron → difficult to tell which variables are relevant.
- The number of neurons per layer (hidden units), the number of hidden layers, and the specified connections for each layer comprise the network architecture.

Banded vegetation: development under uniform rainfall from a simple cellular automaton model - Dunkerley (1997)

Banded vegetation communities are known from semi-arid and arid landscapes in many parts of the world, in grasslands, shrublands, and woodlands. The origin of the distinctive patterning has been the subject of speculation, a common view being that banding evolves through the decline of more complete vegetation cover because of climatic deterioration or through grazing disturbance. A simple model based on cellular automata is employed to test the hypothesis that plausible mechanisms of water partitioning in spatially unstructured plant communities can bring about the development of banding. It is shown that without any climatic change or external disturbance, strongly developed banding can emerge from an initially random distribution of plants. Physical processes underlying the water partitioning, some of which remain unresearched, are discussed, and management implications noted. The microtopography appears to reflect aggradation of the lower parts of the unvegetated zones, toward which water, sediment, and organic detritus are washed by surface runoff. Overall, therefore, the vegetated bands are steeper than the bare zones. Despite this, the vegetation creates a water sink, while the gentler bare areas are runoff sources. This comes about because of the strongly contrasting surface characteristics of these zones. Despite having very low gradients, the bare areas are runoff source areas because their surfaces are quite impermeable. Various characteristics contribute to this. Often, a surface veneer of stones occupies part of the surface, and this makes the covered fraction unavailable for water entry. More importantly, the unprotected surface develops surface sealing and crusting, as a result of disaggregation and puddling of the surface by raindrop impact. Water shed from the bare zones trickles toward their very flat downslope margins, where deposition of detritus often creates a subdued contour-parallel ridge. This impounds the runoff water, which gathers into shallow ponds that are laterally extensive across the slope. Even a few mm of rain has been observed to create such ponds in the landscape shown in Figure 1. From these ponds, water trickles downslope among the plants, where it is absorbed so strongly that none appears to escape from the downslope margin of the band. The development of ponding at the upslope margin of vegetated bands favours the distribution of available water among the plants occupying the upslope margin of the grove and thence, by seepage, among the plants within the grove. The vegetated bands, though they are on average steeper, absorb water strongly because they possess better-structured soils. This character arises because of the additional organic matter from the plants and soil fauna. Burrowing soil fauna may also increase soil porosity. Zones of highest infiltration capacity are located around plant bases, where water may be delivered by stemflow. Vegetated surfaces often have a rougher surface arising from enhanced shrink-swell phenomena and associated collapse pipes (crabholes), shrub mounds, tussock bases, and regolith mounds produced by animal burrowing. The general contour-parallelism of the bands, though, suggests in all cases that gravity-driven surface water movement plays a key role in pattern development. In cellular automaton models, the landscape is modelled as a tessellation or mosaic of rectangular cells.
The evolution of the cell properties, such as the presence or absence of vegetation, is required to follow a set of rules which reflect the properties of neighbouring cells. In the present case, for example, an empty cell can be set to increase the volume of runoff water received by the next cell downslope.
Model construction:
- The model contained a tessellation of 2500 cells (50 × 50).
- A variable initial fraction of these, located at random, was denoted as occupied by plants, the remainder being empty.
- The present work was developed with constant rainfall, except for some trials (described below) in which declining rainfall was introduced in order to investigate the effects of climatic deterioration.
- The rules for water partitioning were as follows: for a bare cell, 10% of the rainfall was absorbed, but no runon water, all of which was passed on to the cells downslope. The first few mm of rain wet up the regolith surface, all other water (including any later runoff water from upslope) passing across the surface with little or no absorption. For cells containing plants, complete water absorption was specified. This was shared among the two neighbouring cells along the contour on either side (i.e., among four neighbouring cells). The nearest neighbouring cells were each allocated 10% and the cells two removed were allocated 5% of the water received by the donor cell.
- Very minor edge effects arose in the model, since border cells lacked adjacent cells on one or other side, and thus received no water from those cells by seepage or ponding. These effects are very slight and do not significantly perturb the model operation over the bulk of the tessellation.
- In each iteration, all cells were inspected for soil wetness, represented by accumulated water depth. Cells too dry for plant growth were those with less than 1.2-3.5 times the annual rainfall. These became bare. Once an initial plant cover fraction was set out by random numbers, the model was free-running under the control of the water partitioning and plant survival rules just specified.
Results:
- The bands generally displayed quite sharp upslope margins, along which most cells contained plants. The downslope grove margins, in contrast, were relatively diffuse, with only a minority of cells containing plants.
- It was generally observed that the cells at the upslope border of the model became bare within a relatively few iterations, since they received no runon water. Thus, plant cover rapidly thinned out here, but the runoff water that was consequently shed downslope permitted plants receiving this water to survive.
- Banding was generally aligned across the slope (contour-parallel). In some model runs, however, departures were noted. Chance linking of growing patches sometimes formed banding that displayed some sinuosity, or bands that divided into two across the slope.
Lateral sharing:
- The significance of the lateral water sharing process built into the model was investigated by reducing the lateral sharing to the two cells immediately adjoining a vegetated cell and the single cell downslope. This had the effect of narrowing the resulting bands in the downslope direction, but did not inhibit the development of the contour-parallel banding.
- When lateral water sharing was excluded and only downslope flow was permitted (as in the model of Goodspeed and Winkworth (1978)), the result was entirely different.
No advantage exists under these rules for adjacent cells both occupied by vegetation, and no linking of cells to form bands was observed.
Effect of climate deterioration:
- Vegetation banding can develop and remain stable provided that the rainfall is also sustained. Declining rainfall is associated with the disintegration of the banding and with increasing band spacing, and is certainly not a factor driving the creation of the pattern, at least in this model.
Conclusions:
- The principal conclusion is that the enhanced absorption of surface water in the vicinity of plants, followed by the sharing of this water by seepage from surface ponds and by lateral subsoil seepage, is sufficient by itself to account for the remarkably ordered structure seen in banded vegetation. Some restrictions must be placed on the field conditions where this is likely to apply: the slope must be sufficiently gentle that the ponding can occur.
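A highly simplified Python sketch of a cellular automaton in the spirit of the water-partitioning rules above (bare cells absorb 10% of their supply and shed the rest downslope; vegetated cells absorb everything and share some water with contour neighbours; cells that stay too dry lose their plants). The grid size, sharing fractions, survival threshold, colonisation rule and wrap-around lateral edges are illustrative assumptions, not Dunkerley's published parameterisation.

```python
# Simplified sketch of a Dunkerley-style cellular automaton for banded vegetation.
# Parameter values and rules are illustrative assumptions only.
import numpy as np

rng = np.random.default_rng(1)
n = 50                       # 50 x 50 tessellation; row 0 is the upslope edge
rainfall = 1.0               # uniform rainfall per cell per iteration
threshold = 1.5 * rainfall   # cells wetter than this can support plants (assumed)

veg = rng.random((n, n)) < 0.4    # random initial plant cover (40%)

for step in range(200):
    absorbed = np.zeros((n, n))
    runoff = np.zeros(n)          # water running onto the current row from upslope
    for i in range(n):            # sweep downslope, row by row
        supply = rainfall + runoff
        # bare cells absorb 10% of their supply and shed the rest downslope;
        # vegetated cells absorb everything they receive
        absorbed[i] = np.where(veg[i], supply, 0.1 * supply)
        runoff = np.where(veg[i], 0.0, 0.9 * supply)
        # vegetated cells share part of their water with contour neighbours
        # (np.roll wraps the lateral edges - a simplification of the paper's edge handling)
        share = 0.1 * absorbed[i] * veg[i]
        absorbed[i] -= 2 * share
        absorbed[i] += np.roll(share, 1) + np.roll(share, -1)
    # plants survive (or establish) only where enough water accumulated
    veg = absorbed >= threshold

print("final vegetated fraction:", veg.mean())
```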

The Hypothetico-Deductive Method

Begin with the outside world, take observations → induction, make a hypothesis → deduce, predict → test with observations and experiment → a feedback loop which generates new knowledge if the tests do not support the hypothesis. The H-D method exposes alternative theories to facts, selects the best theory, i.e., that which agrees closest with fact, and gives it the name "law." Planning exposes alternative images of a future possible world to the decision-maker's values, or preferences, and selects the best image, i.e., that with the highest value. The essential difference is that science uses fact as its standard for selection, whereas planning uses values.

logistic population growth:

Begins with exponential growth, but also considers that the population cannot grow forever.
- Now we look at populations which do not grow forever but reach a carrying capacity (K). K represents the maximum population size that can be supported considering limiting factors such as food, shelter and space → a penalty on population growth depending on the number of individuals.
- Consider the term (1 − N/K) as a penalty on the growth of the population depending on the number of individuals in the community. Very crowded communities have a high penalty compared to ones with plenty of space and resources.
- At the point the population reaches the carrying capacity, N = K and the fraction N/K = 1, so the term (1 − N/K) collapses to 0 and the change in population is 0: the equation is multiplied by 0, so the population remains at size K → no change in population.
- If the population is very small, N is small relative to K, so the penalty is small. However, as we learned from exponential growth, a population grows in proportion to its size: a population of 1000 seabirds will produce more eggs than a population of 100. In the logistic growth equation the proportion added to the population decreases as the population grows, reaching 0 when N = K.
- The point at which a population is largest relative to the penalty for its size is at K/2: the growth rate of a population is fastest at half its carrying capacity. As it grows bigger than K/2 the penalty becomes stronger, but below K/2 the population is small and the proportion added to the population is small.
- Example: bacteria in a petri dish. If a population is higher than the carrying capacity, the population will decrease towards the carrying capacity.
- If a species is at capacity, its growth rate will increase to its maximum if you cut the population in half. Not all species are equal, so cutting all in half will not have an equal effect...
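A short numerical sketch of the logistic equation dN/dt = rN(1 − N/K), stepped forward with a simple Euler scheme; the values of r, K and N0 are arbitrary illustrations.

```python
# Numerical sketch of logistic growth, dN/dt = r*N*(1 - N/K).
# Parameter values (r, K, N0) are arbitrary illustrations.
import numpy as np

r, K, N0 = 0.5, 1000.0, 10.0
dt, steps = 0.1, 200

N = np.empty(steps + 1)
N[0] = N0
for t in range(steps):
    dNdt = r * N[t] * (1.0 - N[t] / K)   # growth penalised as N approaches K
    N[t + 1] = N[t] + dNdt * dt          # simple Euler step

# Growth is fastest when N = K/2 and goes to zero as N approaches K
peak_index = np.argmax(np.diff(N))
print("population at fastest growth:", round(N[peak_index]), "(K/2 =", K / 2, ")")
print("final population:", round(N[-1]))
```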

Bernoulli, Binomial and Poisson Distribution

Bernoulli Distribution
- Takes value 1 with success probability p and value 0 with failure probability q = 1 − p. Useful for representing processes like survival, or for data arising from presence-absence habitat surveys.
- Variables can have a variety of distributions.
Binomial Distribution
- Used to obtain the probability of observing x successes in N trials, with the probability of success on a single trial denoted by p. The binomial distribution assumes that p is fixed (i.e. the same) for all trials. Used in environmental management, quality control, consumer sampling.
- e.g. the probability of getting x heads when tossing a coin six times (each toss is a 'Bernoulli trial'): p = 0.5, N = 6, over 10 000 sets of six tosses.
Poisson Distribution
- Gives the probability of a number of events occurring in a fixed time if these events occur with a known average rate, and are independent of the time since the last event (e.g. times a web server is accessed per minute).
- λ is equal to the expected number of occurrences during a given interval.
- e.g. a Poisson distribution with λ = 1, based on 10 000 random samples.
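A minimal sketch drawing samples from the three distributions with NumPy, using the parameter values quoted in the notes (p = 0.5, N = 6, λ = 1, 10 000 samples).

```python
# Sampling the three distributions described above with NumPy.
# Parameter values follow the examples in the notes (p = 0.5, N = 6, lambda = 1).
import numpy as np

rng = np.random.default_rng(0)

# Bernoulli: 1 with probability p, 0 with probability 1 - p (e.g. presence/absence)
bern = rng.binomial(n=1, p=0.5, size=10_000)

# Binomial: number of successes (heads) in N = 6 trials with p = 0.5
binom = rng.binomial(n=6, p=0.5, size=10_000)

# Poisson: number of events per interval with mean rate lambda = 1
pois = rng.poisson(lam=1.0, size=10_000)

print("Bernoulli mean (≈ p):         ", bern.mean())
print("Binomial mean (≈ N*p = 3):    ", binom.mean())
print("Poisson mean (≈ lambda = 1):  ", pois.mean())
```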

How do you classify systems?

Classification of Systems
- Many ways to classify systems (e.g. Chorley and Kennedy, 1971):
  - morphological: emphasis on the relationships between components and their strength
  - cascading: components linked by flows of mass, energy or both (e.g. flow in/out of a lake)
  - process-response: hybrid; emphasises the relationship between system form and process (e.g. relationship between wave energy and beach slope angle)
  - control: process-response + human influence (e.g. process-response + coastal defence)

Scale: Grain vs. Extent

Components of 'Scale'
- Grain: the minimum resolution of the data in space or time; fine grain captures the exact location.
- Extent: the domain over which measurements are recorded - the overall 'area' encompassed by the study (in space and time) - how big is the domain? The size of the area we study matters: larger extents capture more interactions (e.g. the offshore extent of a model). The further you extend your domain, the better you capture processes.
Pattern & Scale
- Spatio-temporal interactions generate spatio-temporal pattern.
- Patterns alter with scale: as we change the scale measured we usually get different results.
- This scale-dependency limits conclusions to the scale of measurement.
- Perception of pattern depends upon scale - some features cannot be measured if the extent of measurement is not large enough to contain them. As we change the scale of modelling with different domains we will see different things; this applies to both space and time. Sometimes small scales are not useful in understanding how the system works.
The Grain-Extent Trade-off
- Nature: fine-grained, large extent.
- In environmental studies grain and extent are negatively correlated.
- As grain decreases, (measured) information content increases - how does a single point compare with the perimeter? Measurements at a single location are generally of little use. In environmental studies the level of detail keeps increasing, but more and more information does not necessarily help us understand how the system as a whole functions.
Scaling Environmental Models
- What is the smallest extent needed to address the question? What is the smallest grain needed to address the question?
- No single 'right' answer - it depends on the question! Practicality often trumps detail.

Describe pros and cons of process-based models (tough question)

Computer Models: some virtues (used as a planning tool)
- Controlled experiments in: time, space, boundaries
- Equation-based; should include all our knowledge
- Study of: sensitivity, processes
Thus, modeling's proper place is in the planning process for predicting alternative images of the future. As a planning tool, model calibration and sensitivity analyses are not only legitimate, but are demanded, for they can be expected to increase predictive capabilities. For example, the deer population model of Medin and Anderson (1979), although an inaccurate predictor by the standards of normal science, does predict essential information a manager needs to know to set harvests, namely, the number of deer in the future with and without harvest. Regression analysis would have trouble answering this question. Modeling can usually be made to give a direct but inaccurate answer to a question, whereas other planning tools like regression analysis can usually be made to give an indirect but more accurate answer. The model is an informed guess, a mixture of knowledge and error, about a process of nature. Running the model on a computer shows, by deduction, what the informed guess entails. Even when models are left unvalidated or fail to pass strict validation standards, modelers still believe that models are valuable for gaining insights and for sensitivity analysis, a process in which the changes in model outputs as a function of given changes in model parameters are gauged; the most critical parameters are then made the most important candidates for obtaining improved estimates. But a model of doubtful validity can only give doubtful ideas, whether they are ideas called insight or ideas about parameter sensitivity.
Computer models: some problems
- Stability, boundaries, initial conditions
- Which space and time scales can be addressed?
- Have all processes been considered?
- Need for calibration and parameterizations
It is almost certain that no models can ever predict well enough to be declared valid. Herein lies a dilemma: modeling seems to hold promise for science because models can incorporate many features of the natural world. What are felt to be the essential features of a deer herd (Medin and Anderson 1979) or, indeed, an ecosystem (Innis 1978) can be combined in 1 large set of interrelated equations; however, the more equations and parameters a model contains, the more computation is required, and the greater will be the propagation of errors in the parameters from model input to model output (Alonzo 1968). Errors in input to calculations must necessarily increase as calculations are made, and models linking scores and perhaps hundreds of equations representing process functions, fueled by as many or more estimated parameters, many often from expert guesses and all with systematic and random error, cannot predict well enough to be declared valid, except by chance, according to the standards of experimental science.
Summary:
- The wildlife science profession needs standards for validating the use of planning tools. Medin and Anderson (1979) suggest that computer simulation models be validated according to their utility to decision-makers. I think this is a correct concept, but it is worthless until the profession operationally defines utility.
- Wildlife science must try the H-D method. Without it, the ability to detect errors in pronouncements of laws, the self-correcting feature science must have, is fatally lacking.
All learning takes place in a feedback system in which ideas and reality interplay. The method of retroduction coupled with the H-D method is such a feedback system. Uncouple them and the ability to learn, to tell error from truth, is hindered, if not destroyed. By themselves, scientific methods are impotent. Skills in using methods are the catalysts of potency. If, in a half century, the H-D method has been tried and shown to be impotent, then its judges must show that the cause was not the impotency in the skills and dedication of those who tried it.

Describe "conceptual model" (examples/use reading)

Conceptual Models
- Visual or narrative summaries that describe the important components of a system and their interactions - they provide a 'mental picture' of the problem.
- A conceptual model shows: the empirical variables, and the underlying relationships and interactions between variables.
- Art or science? No matter which model you deal with, you will always have a conceptual phase → a way of expressing what we know and what we want to achieve; a mental picture of how the system works.
- Used at the beginning of the modelling stage → one modeller might have a very different mental picture of how the system works from another.
- From problem definition to model requirements; starts in the mind of the modeller or the client (or both); iterative in nature; independent of software; important role in communicating science.
Conceptual Models: useful?
- Understanding the problem situation
- Determining the modelling and general project objectives
- Identifying the model outputs (responses) and the model inputs (experimental factors)
- Determining the model content (scope and level of detail)
- Identifying any assumptions and simplifications
Examples:
- A simple conceptual model of the processes involved in the 'greenhouse effect' (arrows, boxes, pictures etc.), a rough sketch.
- They define the system; if done properly, half of the modelling is done → recognise the important parts of the system, model inputs and outputs.

Example: coral reefs

Coral reefs are known for their high biodiversity. However, many reefs around the world have degraded. A major problem is that corals are overgrown by fleshy macroalgae. Reef ecosystems seem to shift between alternative stable states, rather than responding in a smooth way to changing conditions. The shift to algae in Caribbean reefs is the result of a combination of factors that make the system vulnerable to events that trigger the actual shift [8]. These factors presumably include increased nutrient loading as a result of changed land-use and intensive fishing, which reduced the numbers of large fish and subsequently of the smaller herbivorous species. Consequently, the sea urchin Diadema antillarum, which competes with the herbivorous fish for algal food, increased in numbers. In 1981 a hurricane caused extensive damage to the reefs, but despite high nutrient levels, algae invading the open areas were controlled by Diadema, allowing coral to recolonize. However, in subsequent years, populations of Diadema were dramatically reduced by a pathogen. Because herbivorous fish had also become rare, algae were released from the control of grazers and the reefs rapidly became overgrown by fleshy brown algae. This switch is thought to be difficult to reverse because adult algae are less palatable to herbivores and the algae prevent settlement of coral larvae.

Data-driven models

Data-driven Models
- Fully based on observations (not theory or processes).
- Clear divide between people using this approach and those who don't - but at some level all models rely on parameterizations.
- Statistical models vs. Machine Learning - 2 types:
  1. Statistical modelling (hypothesis testing)
  2. Machine Learning (no hypothesis)
- Collect observations and generalise them (not theory based).
- Empirical models are observation-based: 'rules of thumb', regressions, neural networks, etc.

Describe "deduction" (examples/use reading)

Deductive Thought
- The existence of 'universal laws of nature' means we can deduce explanations and make predictions using logical arguments.
- The movement of thought from general to specific; a deductive argument is:
  1. built using valid rules of inference
  2. one in which the conclusion follows from the premises - for example:
     a) 'all humans are mortal, Kevin is human, thus Kevin is mortal' (modus ponendo ponens)
     b) 'a baby can be either a boy or a girl, the baby is not a girl, thus the baby is a boy' (disjunctive syllogism, modus tollendo ponens)
- A form of logic that identifies a particular by its resemblance to a set of accepted facts. If, like Robinson Crusoe, we come across footprints on the beach of a desert island, we can conclude from our knowledge of the human footprint that another human is or was on the island (deduction).
- We can explain things with rational, logical reasoning: follow the logic and deduce conclusions (two types of logic illustrated above). Relies on the premises being correct.
- Make predictions from the theories.

exponential population growth

Density-independent growth: dN/dt = rN
- This does not tell us the population size → however this can be calculated.
What is required for a population to grow? How many births and how many deaths?
N_(t+1) = N_t + B − D + I − E
- B = Births, D = Deaths, I = Immigration, E = Emigration
- If we assume that immigration and emigration are equal, then the change in population size is: ΔN = B − D
- More births than deaths: the population grows; more deaths than births: the population declines.
- The change in population (dN) over a very small interval of time (dt) can be described as: dN/dt = B − D
- Births and deaths are also described as rates: B = bN, where b = instantaneous birth rate [births / (individual × time)]; D = dN, where d = instantaneous death rate [deaths / (individual × time)]
- So the change in population over time can be described as dN/dt = (b − d)N
- If we let b − d become the constant r, the intrinsic rate of increase, we have the continuous exponential growth equation: dN/dt = rN
- The change in population is equal to the intrinsic rate of increase (r) multiplied by the population size (N); r defines how fast a population is growing or declining: r = 0 no growth, r > 0 positive growth, r < 0 negative growth.
- The differential equation tells us the growth rate, not the population size.
- Because growth is exponential, taking the natural logarithm of the population size makes the graphed lines straight.
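A short numerical check of the exponential model: dN/dt = rN has the solution N(t) = N0·e^(rt), so ln(N) plotted against t is a straight line with slope r. The parameter values below are arbitrary illustrations.

```python
# Exponential (density-independent) growth: dN/dt = r*N, solution N(t) = N0*exp(r*t).
# Parameter values are arbitrary illustrations.
import numpy as np

r, N0 = 0.2, 100.0
t = np.arange(0, 20, 0.1)
N = N0 * np.exp(r * t)

# On a natural-log scale the trajectory is a straight line with slope r
slope = np.polyfit(t, np.log(N), 1)[0]
print("fitted slope of ln(N) vs t:", round(slope, 3), "(should equal r =", r, ")")
```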

Example: deserts

Desertification - the loss of perennial vegetation in arid and semi-arid regions - is often cited as one of the main ecological threats facing the world today, although the pace at which it proceeds in the Sahara region may be less than previously thought. Various lines of evidence indicate that vegetated and desert situations may represent alternative stable states.
- A key factor in determining the stability of perennial plant cover is water infiltration [6,31]. Perennial vegetation allows precipitation to be absorbed by the topsoil and to become available for uptake by plants. When vegetation cover is lost, runoff increases, and water entering the soil quickly disappears to deeper layers where it cannot be reached by most plants.
- As a result, the desert state can be too harsh to be recolonized by perennial plants, even though perennial vegetation may persist once it is present, owing to the enhancement of soil conditions.
- On a much larger scale, a feedback between vegetation and climate may also lead to alternative stable states. The Sahel region seems to shift back and forth between a stable dry and a stable moister climatic regime. For example, every year since 1970 has been anomalously dry, whereas every year of the 1950s was unusually wet; in other parts of the world, runs of wet or dry years typically do not exceed 2-5 years.
- A new generation of coupled climate-ecosystem models demonstrates that Sahel vegetation itself may have a role in the drought dynamics, especially in maintaining long periods of wet or dry conditions. The mechanism is one of positive feedback: vegetation promotes precipitation and vice versa, leading to alternative states.

Empiricism vs. Mechanism Simplicity vs. Detail

Empiricism vs. Mechanism
- Empirical models are based solely on statistical relationships between variables: they acausally link independent to dependent variables (no cause and effect, e.g. machine learning); the model represents a response (what).
- Mechanistic models are based on understanding of how variables interact: how the independent variables affect the dependent one(s); the model represents a response and a mechanism (what and why).
- Empirical: what is the response. Mechanistic: what is the response and why.
Simple or Detailed?
- Need to find a balance between what we leave in and what we leave out ...
- Too little detail → high errors (sins of omission); too much detail → high errors (sins of commission - errors pile up).
- Put in enough but not too much detail. Human nature: we complicate things and try to over-describe processes.
- Population growth models can range from one state variable and two parameters to six state variables and eight+ parameters - it depends on the question.
- Determine the required detail subjectively based on its use... Increasing detail may increase realism but decrease reliability and robustness, by increasing the number of poorly known model parameters.
- Simplifying the representation of unresolved processes by stochastic parameterization (statistical distributions of values) improves model reliability and realism in both atmospheric and land-surface modelling.
- Increasing detail may actually increase the distance from the real world. Usefulness: just enough detail to generate a valuable model.

Summary of population growth:

- Exponentially growing populations grow in proportion to their size indefinitely.
- Logistically growing populations experience a penalty that increases with population size. Growth rate is greatest at half the carrying capacity, when the population size is at its largest relative to the density penalty.
- Stochasticity can have a serious effect on populations, and deterministic models do not account for stochasticity.
- There are differences between the continuous and discrete equations: different notation and descriptions of r, and mathematical differences.
- Many different population growth equations exist.

Feedbacks

Feedback Relationships
- In most systems change is not simple or linear.
- Environmental systems often contain branching and splitting events: multiple inputs influence single processes, and multiple outcomes are possible depending on certain conditions and feedbacks.
- Tiny differences may result in much larger consequences in space and time - the sum is larger than the parts.
Positive and Negative Feedback
- Feedbacks occur when a system self-regulates.
- Positive feedback is due to amplification-type processes which, once change starts, maintain it (e.g. fires → increase in temperature → more fires; permafrost: temperature rises, permafrost soil thaws, thawed soil can release CO2).
- Negative feedback is due to damping processes, which counteract the effect of change to maintain the system state (e.g. beach cusps).

Machine Learning:

From Big Data to Machine Learning
Why use ML?
- Unknown processes; we can use this approach to understand and predict; there is enough data to develop generalized predictors.
The future (?)
- One or more components of a model could be ML predictors → predictability & understanding.
Understand how ML works
- Data selection (important)
- Mathematical issues (overfitting, overtraining ...)
- Opening and lightening the "black box"
- Different approaches mimicking human cognition processes: Genetic Programming, Artificial Neural Networks
- The learning machine needs to be trained (using part of the dataset), calibrated (using another part of the dataset) and tested (using an independent dataset) → data intensive.
- Used in real life: credit cards, search engines, smart cars, medical applications.
- ML tries to learn from the data → black box: it makes good predictions but it is hard to infer understanding. People often avoid addressing the mathematics behind it; they use inputs and outputs but do not understand them.
- The future: growing fast → process-based models may have some elements which are machine-learning based. ML generates algorithms which are better at predicting, and the modelled relationships between variables are improving.
- Neural networks: algorithm-based models.

Population model:

Growth Without Limit
- This is also known as the exponential growth model: N increases with increasing t for all values of t; at t = 0, N = N_0.
- r = growth rate; N_t = no. of individuals at time t; N_0 = no. of individuals at time t = 0.
- Long history of the use of mathematical models in ecology: fisheries science, harvesting, ... stochastic or deterministic?
- The exponential model of population growth is a simple deterministic model: for the same r and N_0 we always see the same trajectory.
- Deterministic vs stochastic: K = carrying capacity; adding stochasticity (randomness) reflects that the real world is variable.

Example: The Flocking Model (BOIDS, Craig Reynolds)

In this model:
- agents represent birds
- agents move according to 3 rules:
  - separation: steer to avoid crowding local flockmates
  - alignment: steer towards the average heading of local flockmates
  - cohesion: steer to move toward the average position of local flockmates
- Simulates birds (triangles) that move as objects in space and time.
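A minimal 2-D sketch of Reynolds' three steering rules; the neighbourhood radius, rule weights, flock size and wrap-around arena are illustrative assumptions, not Reynolds' original parameterisation.

```python
# Minimal 2-D sketch of the three boid rules (separation, alignment, cohesion).
# Neighbourhood radius, rule weights and flock size are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(3)
n_boids, radius = 50, 10.0
pos = rng.uniform(0, 100, (n_boids, 2))   # positions in a 100 x 100 arena
vel = rng.uniform(-1, 1, (n_boids, 2))    # velocities

for step in range(100):
    new_vel = vel.copy()
    for i in range(n_boids):
        offsets = pos - pos[i]
        dist = np.linalg.norm(offsets, axis=1)
        neighbours = (dist < radius) & (dist > 0)
        if not neighbours.any():
            continue
        # separation: steer away from very close flockmates
        too_close = neighbours & (dist < 3.0)
        separation = -offsets[too_close].sum(axis=0) if too_close.any() else 0.0
        # alignment: steer towards the average heading of neighbours
        alignment = vel[neighbours].mean(axis=0) - vel[i]
        # cohesion: steer towards the average position of neighbours
        cohesion = pos[neighbours].mean(axis=0) - pos[i]
        new_vel[i] += 0.05 * separation + 0.05 * alignment + 0.01 * cohesion
    vel = new_vel
    pos = (pos + vel) % 100.0             # wrap around the arena edges

print("mean speed after 100 steps:", np.linalg.norm(vel, axis=1).mean())
```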

How does an ANN work?

In training a neural network:
Step 1: A training instance is selected and input into the neural network.
Step 2: The inputs are propagated through the network to the output layer.
Step 3: The squared error of the output is computed (desired output minus the network output).
Step 4: The error is propagated backwards through the network so that the weights can be updated (BACKPROPAGATION).
Step 5: Go to Step 1.
- This process is repeated until every training instance has been processed.
- Then the resulting ANN is checked against the validation set; if it is better than the previous best ANN, it is saved.
- The 5 steps are then repeated until an ANN is discovered that doesn't perform better than the current best ANN.
- Supervised learning: the model takes the difference between the predicted value and the desired output value and back-propagates the error → readjusts the weights of the connections (evaluating the error(s)), then runs the data through again. The network works forwards and backwards.
- We stop the network when it appears it is no longer improving performance → it may have overtrained → we must test it with NEW data.
Multiple Neurons - ANNs can be distinguished by:
- their type (e.g. feedforward)
- their structure (e.g. backpropagation)
- the learning algorithm they use in the optimization
- Hopfield network: a symmetric feedback network
- Feedforward-Backpropagation ANN
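A minimal sketch of a feedforward network with one hidden layer, trained by backpropagation (Steps 2-4 above). The layer sizes, learning rate, number of epochs and the XOR toy data are illustrative assumptions; real applications would also use separate training, validation and test sets as described above.

```python
# Minimal one-hidden-layer feedforward network trained by backpropagation.
# Layer sizes, learning rate and the XOR toy data are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)   # inputs x
y = np.array([[0], [1], [1], [0]], dtype=float)               # desired outputs

W1 = rng.normal(0, 1, (2, 4)); b1 = np.zeros(4)   # input -> hidden weights/biases
W2 = rng.normal(0, 1, (4, 1)); b2 = np.zeros(1)   # hidden -> output weights/biases

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 1.0
for epoch in range(10_000):
    # Step 2: forward propagation through the network
    H = sigmoid(X @ W1 + b1)          # hidden-layer activations
    out = sigmoid(H @ W2 + b2)        # network output
    # Step 3: error (desired output minus network output)
    error = y - out
    # Step 4: propagate the error backwards and update the weights
    d_out = error * out * (1 - out)
    d_hidden = (d_out @ W2.T) * H * (1 - H)
    W2 += lr * H.T @ d_out;  b2 += lr * d_out.sum(axis=0)
    W1 += lr * X.T @ d_hidden;  b1 += lr * d_hidden.sum(axis=0)

# predictions should approach [0, 1, 1, 0] (convergence depends on the random initialisation)
print("predictions:", out.ravel().round(2))
```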

induction and deduction:

Induction and deduction are two, usually different but never contradictory, approaches to problem solving. The problem must be solved by testing the validity of the conclusion or inference reached from either direction. Induction and deduction are thus valuable, often complementary, tools that facilitate problem solving.
- Connecting deductive and inductive reasoning: dividing models into conceptual, process-based, data-driven and physical; deductive (process-based) or inductive (data-driven).

Describe "induction" (examples/use reading): data

Inductive Thought
- Induction puts observation first, 'laws of nature' second.
- From particular to general: compile individual observations to arrive at a generalisation. For example: "we observe that ducks have webbed feet, geese have webbed feet, seals have webbed feet, therefore all animals that live in water have webbed feet [inference]".
- Criticism: are the phenomena studied general? Will they continue to behave the same?
- For example, if we examine enough feral cats we can generalize that feral cats are a rich source of fleas (induction).
- For example, if we observed over many trials that the amount of edge vegetation in fields was positively correlated with an index of game abundance, we would be using induction if we declared a law of association. The more trials observed, the more reliability we'd attribute to the law.
- Start from observations from the field or lab and generate predictions → theories. No logic, no deduction: we are unable to make a hypothesis → so we measure and take observations → make an inference, deriving something that is general.
- Issues: we take many measurements yet still reach only a generalised conclusion - are the observations still valid in changing environments at present? → the approach becomes weak in the face of future prediction.

discrete form of logistic growth:

Instead of telling us the change in a population at an infinitely small point in time, the discrete equation tells us the size of the population at a given time.
- The addition to the population is less than in the exponential model.
- The size of the population at the next time-step is equal to the size of the current population plus the current population multiplied by the discrete growth rate and the density penalty: N_(t+1) = N_t + r_d N_t (1 − N_t/K)
- Unlike the exponential model, the logistic model's growth rate is dependent on population size and reaches its peak at half of the carrying capacity.

Limitations of dune model:

Limitations:
- What is a slab/grid of sand? What is time? Time is not well defined (the distance sand travels is not always the same).
- Simplified aerodynamics (what is a shadow area when there are no dunes?)
- Simplified sediment transport - not detailed (how much sediment are we transporting from every cell to another?)
What are we learning:
- The role of wind (speed and direction), the role of sediment supply, the role of pre-existing morphology - how wind can affect the formation of dunes. Sediment supply can cause a switch from transverse to barchan dunes → depending on morphology and other conditions etc.
What else could we do:
- Add vegetation, assess anthropogenic effects, (change the model entirely?) - is the model too simple? What else could we add in?

Describe "types" of process-based model

Mathematical, computer, simulation.
- General statements using mathematical equations → parameters in natural systems (constraining these parameters is incredibly difficult).
Mathematical Models I
- Describe the system using the formalisms of mathematics; a range of forms; analytic or computational, agent-based, numerical.
- Example: a model for rip channel formation (continuity, momentum, energy, phase and sediment conservation). Analytical solutions might be impossible; numerical solutions are often very difficult. Elegant (in the virtual world), but how far from the real world? The key question: what are the key assumptions?
- Example 2: Growth Without Limit - also known as the exponential growth model.
Mathematical Models II
- Long history of the use of mathematical models in ecology: fisheries science, harvesting, ... stochastic or deterministic?
Simulation Models
- A simulation is basically an imitation, a model that imitates a real-world process or system.
- Simulation models attempt to recreate (virtually) the system of interest: many systems are not mathematically tractable, so we use simulation instead. Such models can be predictive or heuristic, and are often detailed, demanding significant investments in time, money, and data!
Computer Models (Mathematical? Simulation?)

Why do we care about stochasticity?

Mathematical/Theoretical/Deterministic Models
- Random elements are not considered: the real world is stochastic, but this type of model doesn't represent that.
- The exponential model is a simplification, but it might be good enough? Remember, utility depends on purpose.
Stochastic Models
- Generate random sequences → driven by random numbers; the model includes probability language (stochastic).
- e.g. What is the chance that it will rain tomorrow? The weather has two 'states': wet or dry. Dry today, dry tomorrow? Dry today, wet tomorrow?
- We formulate our model as a matrix with the probabilities of each type of weather following each other.
- Because the model is stochastic it generates a different series of 'weather conditions' each time we run it; we analyse these data statistically. A minimal sketch of this wet/dry model is given below.
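A small sketch of the two-state wet/dry weather chain described above; the transition probabilities in the matrix are made-up illustrations.

```python
# Two-state (wet/dry) stochastic weather model as a Markov chain.
# The transition probabilities below are made-up illustrations.
import numpy as np

rng = np.random.default_rng(7)
states = ["dry", "wet"]
# P[i, j] = probability that state j follows state i
P = np.array([[0.8, 0.2],    # dry today -> dry / wet tomorrow
              [0.4, 0.6]])   # wet today -> dry / wet tomorrow

today = 0                    # start dry
sequence = []
for day in range(365):
    today = rng.choice(2, p=P[today])
    sequence.append(today)

# Because the model is stochastic, each run gives a different series;
# we analyse it statistically, e.g. the long-run fraction of wet days.
print("fraction of wet days:", np.mean(sequence))
```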

What is a model? (less than 100 words)

Model: a stylised representation or abstraction used to analyse or explain something; a tool for the evaluation of hypotheses.
- Models are tools to evaluate hypotheses: a tool allowing a quantitative (usually) statement of a scientific hypothesis.
- Model state: a value describing a system at a given point in time.
- Model process: something that causes a state to change over time and/or space.
Examples of states: heavy metal concentration on a flat; no. of tuatara on Stewart Island; height of a dune; age-sex structure of a population.
Examples of processes: the things that influence the state variables above!

Modelling

Modelling
- Why do scientists need models? A key step in knowledge-building.
- Why does everyone need models? Today's world simply wouldn't function without them.
Complexity
- "We should continue to use simple models where they capture enough of the core underlying structure and incentives that they usefully predict outcomes. When the world we are trying to explain and improve, however, is not well described by a simple model, we must continue to improve our frameworks and theories so as to be able to understand complexity and not simply reject it." - Elinor Ostrom

How Do We Analyse Stochastic Processes? - monte carlo approach

Monte Carlo Methods
- One approach to dealing with stochastic models is Monte Carlo methods: a Monte Carlo simulation involves running a model many times, with each parameter taking a random deviate from some probability distribution.
- For each model parameter we define: the distribution type (e.g. uniform, normal, ...), the parameters required to specify this distribution (e.g. mean, variance, ...), and parameter bounds if necessary (i.e. min. and max. acceptable values).
- The user also selects the number of model runs (n) to be conducted.
- For each model instance: a deviate is generated for each parameter (random deviates are independent), and the model run is executed and saved.
- Models are run thousands of times; define the distribution for each model parameter; gives a conservative estimate. A minimal sketch is given below.
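A minimal Monte Carlo sketch: run a simple exponential growth model many times, drawing the parameter r from a normal distribution each time and bounding it. The distribution parameters, bounds and number of runs are illustrative assumptions.

```python
# Minimal Monte Carlo sketch: run a simple growth model many times,
# drawing the parameter r from a distribution each time.
# Distribution parameters and bounds are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(11)
n_runs, N0, t = 10_000, 100.0, 10.0

# Parameter definition: distribution type (normal), mean/sd, and acceptable bounds
r = rng.normal(loc=0.2, scale=0.05, size=n_runs)
r = np.clip(r, 0.0, 0.4)                 # enforce min/max acceptable values

N_final = N0 * np.exp(r * t)             # one model outcome per random deviate

# Analyse the ensemble of runs statistically
print("mean final population:", round(N_final.mean()))
print("5th-95th percentile:  ", np.percentile(N_final, [5, 95]).round())
```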

So, Why Use Models?

Nature and Value of Models
- Models are simplifications of the 'real' world - they support surrogative reasoning.
Uses of Models
- Provide a formal framework for: 1. synthesis and integration, 2. comparison across systems.
- Scenario evaluation - experimentation may be impossible or problematic; 'what-if?' counterfactuals; design of 'optimal' monitoring programs.
- Predictions about specific situations and/or scenarios.
- Models can teach us a lot about larger systems → a framework to compare and synthesise results.
- Scenario evaluation: storm generation → pathway, time period, likely effects, how will certain places be affected? Monitoring programmes, prediction of extreme events now and in the future.
- The impacts of such events are not equal across different environments in the same place. What affects the natural system affects humans → we are part of the environmental system.

Example: Lakes

One of the best-studied and most dramatic state shifts is the sudden loss of transparency and vegetation observed in shallow lakes subject to human-induced eutrophication. The pristine state of most shallow lakes is probably one of clear water and a rich submerged vegetation. Nutrient loading has changed this situation in many cases.
- Remarkably, water clarity often seems to be hardly affected by increased nutrient concentrations until a critical threshold is passed, at which the lake shifts abruptly from clear to turbid. With this increase in turbidity, submerged plants largely disappear.
- The associated loss of animal diversity and the high algal biomass make this state undesired. Reduction of nutrient concentrations is often insufficient to restore the vegetated clear state.
- Experimental work suggests that plants increase water clarity, thereby enhancing their own growing conditions [5]. This causes the clear state to be a self-stabilizing alternative to the turbid situation.
- The reduction of phytoplankton biomass and turbidity by vegetation involves a suite of mechanisms, including reduction of nutrients in the water column, protection of phytoplankton grazers such as Daphnia against fish predation, and prevention of sediment resuspension.
- Whole-lake experiments show that a temporary strong reduction of fish biomass as 'shock therapy' can bring such lakes back into a permanent clear state if the nutrient level is not too high.

Popper's Critical Rationalism

Popper's Critical Rationalism
- Karl Popper argued that:
  1. the scientific method can falsify, but not prove, hypotheses
  2. hypotheses are accepted as provisionally true because no one has disproven them
  3. only falsifiable hypotheses are open to scientific study
  4. hypotheses that can't be falsified may be true, but science can't assess this
- Popper's ridicule of induction was based on the premise that induction requires the observation of every instance of a given phenomenon for the generalization to be true - an obvious impossibility; the fact that all known crows are black, for example, doesn't prove that no white crows exist.
- His argument can also be used to make deduction useless, for it, too, is based on an incomplete set of known facts. Even if the identified fact resembles the members of the set, how can we be sure that every possible feature of either the unknown or the members of the set itself has been considered? As we will see in what follows, in many of the examples of the way science is practiced, induction is as much a part of this practice as is deduction or any system of logic that serves the purpose of advancing knowledge.
Popper's Critical Rationalism II
- Deductive: we go from theory to observation. We can't prove a theory by observations.
- Popper's approach is a framework in which rational judgements can be made about which theory to accept.
- The scientific method can only falsify, not prove, anything → the only way to move forward is to keep disproving things. All models are wrong, but they are helpful in aiding the disproving of hypotheses.

growth rates:

Population size increases exponentially, and the growth rate dN/dt plotted against population size increases in proportion to the population.
Exponential growth:
- Populations growing exponentially have a doubling time. The doubling time depends on the growth rate r (doubling time = ln(2)/r) and is not simply one year.
- Flaw: ... surely no species can grow forever exponentially → density dependence.

Random Variables and Probability Distributions

Probability Rules
- There are three fundamental rules (axioms) of probability:
  1. all probabilities are between zero and one (0 ≤ p ≤ 1)
  2. the sum of probabilities across all possible events is exactly one
  3. if two events i and j are mutually exclusive then the probability of i or j = p(i) + p(j)
- A random variable can be assigned any number between two extremes.
Random Variables
- Random variable: a variable (X) that has a single value, determined by chance, for each outcome of a procedure (a trial).
- Discrete random variables can take on only a countable number of distinct values such as 0, 1, 2, 3, 4, ...
- Continuous random variables can take an infinite number of possible values.
- Probability distribution: the probability of each value (x) of a random variable (X) (for a discrete variable: the number of favourable events over all possible outcomes).
Probability Distributions (Discrete)
- The probability distribution of a discrete random variable is a list of probabilities for each of its possible values.
- The sample space (Ω) is all outcomes of an experiment (e.g. Ω = {1, ..., 6} for a die).
- If there are n possible outcomes in Ω and m are favourable for event i, then P{i} = m/n.
- The probability distribution of a discrete random variable X is a function giving the probability p(x_i) for every subset of its state space (S). It must satisfy: the sum over all events is one, and the probability of any given event is between zero and one.

Differentials- Why Bother?

Rate of change → it is easier to model the change rather than the specific value - the derivative measures the change over an infinitesimally small interval •We measure variables in terms of: absolute size or rate of change - population size is measured in individuals (N) - rate of change is measured in individuals / time: dN/dt, the derivative •It is easier to model the factors that cause variables to change, rather than those that make them 'big' or 'small'. dN/dt? •Thus, it is best to measure change over small intervals (t and t+x): - Population growth rate = (N_(t+x) - N_t) / ((t+x) - t) - The derivative is simply this rate over an infinitesimally small interval: growth rate = dN/dt - change in population (dN) over time elapsed (dt)
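A small illustrative sketch of the finite-difference idea behind dN/dt: the rate (N_(t+x) - N_t)/x approaches the derivative as the interval x shrinks. The exponential form of N(t) and all parameter values are invented purely for illustration.

```python
# Sketch: approximating dN/dt = (N(t+x) - N(t)) / x for shrinking intervals x.
# Assumes a hypothetical exponentially growing population N(t) = N0 * exp(r t).
import math

N0, r, t = 100.0, 0.5, 2.0
N = lambda time: N0 * math.exp(r * time)

for x in (1.0, 0.1, 0.01, 0.001):
    rate = (N(t + x) - N(t)) / x          # average rate over the interval [t, t+x]
    print(f"x = {x:6}: growth rate ~ {rate:.3f}")

print("analytical dN/dt =", r * N(t))     # the value the finite differences converge to
```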

Example two Reconstructing Archaeological Change

Reconstructing The Past •Although it's intuitive to consider models as tools for prediction, they're also often used to reconstruct the past - in archaeology to explore human-environment interactions - in ecology to reconstruct 'baseline' ecosystem conditions - by climate scientists to explore palaeoclimatic conditions The Anasazi •Puebloan (meso-American) •Occupied parts of the SW of the USA c. 1800 BC - c. 1300 AD •Developed a rich culture in and around Long House Valley, AZ, USA •Rapid collapse and abandonment of these sites c. 1300 AD - Anasazi means "ancient enemy"; the correct name is Ancestral Pueblo - People stayed in the valley for a long time → around 1300 they left and the population collapsed → why was there a rapid abandonment of this site? The valley was a very closed system. Environmental Data •Palaeoenvironmental reconstructions and ethnographic data - surface geomorphology - archaeological evidence - dendrochronology - packrat middens •Estimates of - annual maize production - food requirements - hydrological conditions [Figure caption: Environmental and demographic variability on the southern Colorado Plateaus, AD 1-1970. A, hydrologic fluctuations and floodplain aggradation-degradation. B, primary (solid) and secondary (dashed) fluctuations in effective moisture as indicated by pollen data. C, decadal tree growth departures in standard deviation units. D, spatial variability in dendroclimate. E, relative population trends, AD 1-1450.]

Statistical models:

A regression line may be linear, non-linear, etc. - e.g. rainfall versus baseflow, or height versus shoe size (see the sketch below)
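A hedged sketch of fitting a simple regression line; the rainfall/baseflow numbers below are made up purely to show the mechanics, not real measurements.

```python
# Sketch: fitting a linear regression, e.g. rainfall vs. baseflow (invented data).
import numpy as np

rainfall = np.array([50, 80, 120, 160, 200, 260])     # mm
baseflow = np.array([5, 9, 14, 20, 24, 33])           # m3/s

slope, intercept = np.polyfit(rainfall, baseflow, 1)  # degree-1 polynomial = linear fit
predicted = slope * rainfall + intercept
r_squared = 1 - np.sum((baseflow - predicted) ** 2) / np.sum((baseflow - baseflow.mean()) ** 2)

print(f"baseflow ~ {slope:.3f} * rainfall + {intercept:.2f}  (R2 = {r_squared:.2f})")
```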

Decision Trees

Seeing the trees for the (random) forest •Variously known as Decision Trees or Classification and Regression Trees (CART) •Popular Machine Learning algorithm •Aim to split data into homogeneous subsets •In doing so, create rules that can be followed for making future predictions •Can be used for classification (predicting a discrete value) or regression (predicting a continuous value) tasks → keeps splitting the data to classify it better and better → creates rules for future prediction - Date from the pioneering work of Morgan and Sonquist (1963) in the social sciences - Powerful statistical tools for analyzing complex ecological datasets because they offer a useful alternative when modeling nonlinear data containing independent variables that are suspected of interacting in a hierarchical fashion - Used to develop habitat models for threatened birds (O'Connor et al. 1996), tortoise species (Anderson et al. 2000), and endangered crayfishes - A form of binary recursive partitioning, where classification and regression trees refer to the modeling of categorical and continuous response variables, respectively. E.g. beach erosion: split the data into accretion and erosion based on wave height (or wave period), then split on wave period < 12 seconds for erosion or accretion (generating a rule; see the sketch below). Trees may have a large number of branches. Random Forest •Why use just a single Decision Tree? •In life we commonly use a group to make important decisions, for example: •A jury •An interview panel •Because each member has specific expertise, that when combined can reach a more informed and accurate conclusion •We can apply this same principle by using multiple decision trees to form a Random Forest! Random forest: combining many trees. Every "expert" (tree) knows about a specific part of the problem. The algorithm keeps creating branches until it finds the best way to split the data. Different trees will look at different parts of the data set → combined to create one prediction.
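A rough sketch of the beach-erosion splitting idea. The wave height/period values are invented for illustration, and the use of scikit-learn is an assumption (the lecture does not prescribe a library).

```python
# Sketch: a decision tree and a random forest classifying erosion vs. accretion
# from wave height and period. The tiny dataset is invented for illustration only.
from sklearn.tree import DecisionTreeClassifier, export_text
from sklearn.ensemble import RandomForestClassifier

X = [[1.0, 8], [1.2, 9], [2.5, 14], [3.0, 13], [0.8, 7], [2.8, 15]]  # [wave height m, period s]
y = ["accretion", "accretion", "erosion", "erosion", "accretion", "erosion"]

tree = DecisionTreeClassifier(max_depth=2).fit(X, y)
print(export_text(tree, feature_names=["height", "period"]))   # the splitting rules

forest = RandomForestClassifier(n_estimators=100, random_state=0).fit(X, y)
print(forest.predict([[2.0, 12]]))   # combined vote of many trees
```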

Why do zebras have stripes?

Some hypotheses •Camouflage effect •Confusion effect •Social drivers •Insect effect •Temperature effect --> periodic upwelling and downwelling air flows above the black and white stripes

Example One: Desert dunes

Spatially Explicit Approaches - Discrete Space (Grids) - Cellular Automata - pattern in space. The numeric value in each cell is a function of: • the state of the cell • the state of surrounding cells • additional spatial data. A lateral grid is used and the wind blows from one fixed direction (a simplification). Additional data tell us the position of each cell relative to the larger-scale features that are developing. The rules depend on the previous state of the cell and on the surrounding cells. If too much sand accumulates at one cell it will spread out according to the angle of repose (30 degrees). Shadow zone - Over dunes, the wind picks up sand and transports it downwind; sand is more likely to be picked up on the exposed side of the dune and cannot be eroded within the shadow zone. Rules control each cell's state. The model is spatial because those rules depend on the states of other cells and on other spatial data. Desert dune formation: • process-based or data-driven? (process-based) • what type? (computer simulation) • simple or detailed? (simple, very simple) Some simple rules to simulate dune formation (a minimal sketch follows below): • sand slabs are picked up and transported in the wind direction (always unidirectional here) • sand is more likely to be deposited where sand is already present • sand cannot be eroded in "shadow" zones • the angle of repose should not be exceeded. Transverse dunes - if sand supply is infinite - over time, with the wind blowing in the same direction, sequences of highs and lows gradually develop → simulating the migration of the dunes; the amount of sand is infinite. Barchan dunes - if sand supply is limited → there is not enough sand, and sand cannot be transported from locations under or near rock. With less sediment available to build the dunes, their shape resembles more of a barchan (crescent) shape.
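A highly simplified, illustrative sketch of the slab-transport rules above, not the full dune model: wind is unidirectional, deposition is likelier on existing sand, and the shadow-zone and angle-of-repose rules are omitted for brevity. Grid size, hop length and probabilities are invented.

```python
# Highly simplified 1-D sketch of the slab-transport rules described above
# (wind always blows to the right; no shadow zones or angle-of-repose checks).
import random

cells = 100
hop = 5                                    # cells a slab travels per transport event
height = [4] * cells                       # initial flat bed of sand slabs

for _ in range(50_000):
    i = random.randrange(cells)
    if height[i] == 0:
        continue                           # nothing to erode here
    height[i] -= 1                         # wind picks up one slab
    j = i
    while True:
        j = (j + hop) % cells              # transported downwind (wrap-around domain)
        p_deposit = 0.6 if height[j] > 0 else 0.4   # deposition likelier on existing sand
        if random.random() < p_deposit:
            height[j] += 1
            break

# Crude text rendering of the resulting highs and lows (proto-dunes)
print("".join(" .:-=+*#"[min(h, 7)] for h in height))
```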

Supervised vs unsupervised algorithms:

Supervised: how does it work? I want a ML algorithm that predicts how long it will take me to go to work (by bus). 1. Develop a labeled dataset from experience •Weather •Time I leave home •Day of the week •Holiday? (y/n) •Time of arrival at work (output) 2. Choose an algorithm 3. Train the algorithm 4. Test the algorithm 5. Predict. Unsupervised: how does it work? Supermarket product organisation: 1. Collect a dataset (all are inputs, no outputs!) •All checkout receipts (2 years of data) 2. Choose an algorithm 3. Give the dataset to the algorithm 4. Analyze the patterns provided by ML: •diapers, beer and flowers •All sorts of purchase behaviour - Clustering (see the sketch below)
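A minimal sketch of the unsupervised (clustering) case. The tiny receipt matrix is invented, and the choice of k-means is an assumption, since the notes only say "clustering".

```python
# Sketch of the unsupervised case: clustering checkout receipts by what was bought.
from sklearn.cluster import KMeans
import numpy as np

# columns: [diapers, beer, flowers] - counts per receipt (invented for illustration)
receipts = np.array([
    [2, 3, 0], [1, 4, 0], [3, 2, 0],   # one purchase pattern
    [0, 0, 2], [0, 1, 3], [0, 0, 1],   # another pattern
])

labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(receipts)
print(labels)   # the algorithm groups receipts without ever being told the "answer"
```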

The virtual world

The "real world" learning feedback - real world --> abstractions --> implementation - The loop is a classical negative feedback whereby decision makers compare information about the state of the real world to various goals, perceive discrepancies between desired and actual states, and take actions that (they believe) will cause the real world to move towards the desired state - The feedback loop shown in figure 1 obscures an important aspect of the learning process. Information feedback about the real world is not the only input to our decisions. Decisions are the re- sult of applying a decision rule or policy to information about the world as we perceive it - Abstracting: reducing the system to a concept which may be coded on a computer. Examples of mistakes include the Ozone Layer → we had the instruments to discover it before however humans thought we would never see such low values. The instruments placed by NASA were throwing away measurements which were potentially useful. Implementation: writing the code, creating the model. However, data may be biased, may be unable to be reduced/coded. Process of abstraction generates a number of assumptions and errors However, the real world is complicated → interrelationships between components of the system, systems may be unpredictable or impossible to predict, time delays to seeing effects The real and the virtual world - impediments to learning - real world: • Unknown structure • Dynamic complexity• Time delays• Impossible experiments (data) - abstractions: • Selected• Missing• Biased• Ambiguous •Misperceptions• Unscientific• Biases• Defensiveness - implementation: • Structures/interests• Inconsistency • Inability to include all dynamics into models • How far is the Virtual World from the Real world? - The policies themselves are conditioned by institutional structures, organizational strategies, and cultural norms. These, in turn, are governed by the mental models of the real world we hold reality will always be separate

Modelling Population Growth

The Exponential Model •Mathematical models can provide simple descriptions of population growth •We can use a continuous differential equation, which describes population growth as the change in size (dN) over a very small period of time (dt): dN/dt = B - D - Change in size = birth rate - death rate [rates are measured per capita] •B and D are rates (measured per capita) - we need to know the number of births (b) and deaths (d) per unit time per individual (N), i.e. B = bN - Thus, we can express the model as: dN/dt = (b - d)N. The population grows at rate r (r = b - d) - the instantaneous rate of increase - r describes how the population grows •Population projection - we can use the model to predict population size at any time (if we know r and N0) - the analytical solution is Malthus' equation, N_t = N0 e^(rt) - and the 'doubling' time (t_double), which depends on the growth rate: t_double = ln(2)/r (see the sketch below)
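A small sketch of projecting the exponential model and its doubling time; the values of N0 and r are illustrative only.

```python
# Sketch: projecting exponential growth N(t) = N0 * exp(r t) and its doubling time.
import math

N0, r = 500.0, 0.03           # initial size and per-capita growth rate (per year)
t_double = math.log(2) / r    # doubling time, ln(2)/r

for t in (0, 10, 20, t_double):
    print(f"t = {t:5.1f} yr: N = {N0 * math.exp(r * t):8.1f}")
print(f"doubling time = {t_double:.1f} years")
```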

Population growth and collapse in a multiagent model of the Kayenta Anasazi in Long House Valley - Axtell et al. 2002

The archaeological record of Anasazi farming groups from anno Domini 200-1300 provides information on a millennium of sociocultural stasis, variability, change, and adaptation. We report on a multiagent computational model of this society that closely reproduces the main features of its actual history, including population ebb and flow, changing spatial settlement patterns, and eventual rapid decline. The agents in the model are monoagriculturalists, who decide both where to situate their fields as well as the location of their settlements. Nutritional needs constrain fertility. Agent heterogeneity, difficult to model mathematically, is demonstrated to be crucial to the high fidelity of the model. A major impediment to rigorous investigation in archaeology—the inability to conduct reproducible experiments—is one shared with certain other sciences, such as astronomy, geophysics, and paleontology. Computational modeling is providing a way around these difficulties. Social histories unfold in such models by "turning on" each agent periodically and permitting it to interact. Agent models offer intriguing possibilities for overcoming the experimental limitations of archaeology through systematic analyses of alternative histories. Changing the agents' attributes, their rules, and features of the landscape yields alternative behavioral responses to initial conditions, social relationships, and environmental forcing. The Anasazi pattern is defined by an emphasis on black-on-white painted ceramics, plain and textured gray cooking pottery, the development from pithouses to stone masonry and adobe pueblos, and the kiva as the principal ceremonial structure. Considerable spatial variability within the general pattern has led to the recognition of several geographic variants of Anasazi. Long House Valley falls within one of the western Anasazi configurations. The multiagent model is created by instantiating the landscape, reconstructed from paleoenvironmental variables, and then populating it with artificial agents that represent individual families, or households, the smallest social unit consistently definable in the archaeological record - Each family agent is defined by certain attributes (Table 1), including its age, size, composition, and amount of maize storage. Similarly, each agent has specific rules of behavior (Table 2). These rules determine how the households select their planting and dwelling locations. Once all agents are initialized, the model proceeds according to internal clocks (Table 3). Essentially, all agents engage in agricultural activity during each period (1 calendar year) and move their plots or dwellings or both based on their success in meeting nutritional needs. Time series plots and histograms illustrate annual simulated and actual population numbers, aggregation of population, location and size of residences by environmental zone, the simulated amounts of maize stored and harvested, and the number of households that fission, die out, or leave the valley. We modified this earlier model (10) to incorporate greater levels of both agent and landscape heterogeneity. In the previous model all agents had the same ages for the onset of fertility and death. Here, each agent gets a specific value for these ages when it is born, based on sampling from a uniform distribution. A similar procedure was applied to the household fission rate. These changes introduce six adjustable parameters, namely the endpoints of these uniform distributions. 
Optimizing the model with respect to the eight adjustable parameters yields distinct "best" configurations, based on which norm was used in the simulation. The agent model suggests that even the degraded environment of the 1270-1450 period could have supported a reduced but substantial population in small settlements dispersed across suitable farming habitats located primarily in areas of high potential crop production in the northern part of the valley. The fact that in the real world of Long House Valley, the supportable population chose not to stay behind but to participate in the exodus from the valley indicates the magnitude of sociocultural "push" or "pull" factors that induced them to move (20). Thus, comparing the model results with the actual history helps differentiate external (environmental) from internal (social) determinants of cultural dynamics. It also provides a clue (in the form of the population that could have stayed but elected to go) to the relative magnitude of those determinants. Richer treatments of household characteristics are possible. For example, in calculating mean household values for size, fissioning, and "death," we have envisioned disaggregating the households into individuals of varying ages in the life course. Similarly, the average caloric values used can be adjusted for age of individuals within the household. Nonuniform distributions can be explored. It is, however, interesting that even without implementing these refinements, the output from the current model closely reproduces the record of the archaeological survey. Another possibility that can be modeled in future simulations might be a combination of environmental, demographic, and epidemiological factors. That is, synergistic interactions between nutritional stress and precolonial epidemic disease might have decimated the population beyond what our model indicates. In addition, the depressed population may simply have been insufficient to maintain cultural institutions, precipitating a collective decision to leave the valley (26). These are ripe topics for future research. By this criterion, our strictly environmental account of the evolution of this society during this period goes a long way toward explaining this history.

Reading: Learning in and about complex systems

The greatest constant of modern times is change. The dizzying effects of accelerating change are not new. If people had a holistic world view, many argue, they would then act in consonance with the long-term best interests of the system as a whole. Indeed, for some, systems thinking is crucial for the survival of humanity. Successful approaches to learning about complex dynamic systems require: (1) tools to articulate and frame issues, elicit knowledge and beliefs, and create maps of the feedback structure of an issue from that knowledge; (2) formal models and simulation methods to assess the dynamics of those maps, test new policies, and practice new skills; and (3) methods to sharpen scientific reasoning skills, improve group processes, and overcome defensive routines for individuals and teams; that is, in the words of Don Schön (1983a), to raise the quality of the "organizational inquiry that mediates the restructuring of organizational theory-in-use." Systems approaches that fail on any of these dimensions will not prove useful in enhancing the capabilities of individuals or organizations to understand, operate effectively in, or improve the design of the systems we have created and in which we live, nor can they form the basis for the scientific study of complexity. Learning is a feedback process: - All learning depends on feedback. We make decisions that alter the real world, we receive information feedback about the real world, and using that information, we revise our understanding of the world and the decisions we make to bring the state of the system closer to our goals. For learning to occur, each link in the two feedback loops must work effectively, and we must be able to cycle around the loops quickly relative to the rate at which changes in the real world render existing knowledge obsolete. Yet in the real world, particularly the world of social action, these feedbacks often do not operate well. The costs of error are also asymmetric: it is better to be wrong with the crowd than wrong alone. Virtual worlds have several virtues. They provide low-cost laboratories for learning. The virtual world allows time and space to be compressed or dilated. Actions can be repeated under the same or different conditions. One can stop the action to reflect. Decisions that are dangerous, infeasible, or unethical in the real system can be taken in the virtual world. Thus controlled experimentation becomes possible, and the time delays in the learning loop through the real world are dramatically reduced. In the real world, the irreversibility of many actions and the need to maintain high performance often override the goal of learning by preventing experiments with untried possibilities ("If it ain't broke, don't fix it"). In the virtual world, one can try strategies that one suspects will lead to poor performance or even (simulated) catastrophe - Obviously, while the virtual world enables controlled experimentation, it does not require the learner to apply the principles of scientific method. Many participants in model-based workshops lack training in scientific method and awareness of the pitfalls in the design and interpretation of experiments. In practice, effective learning from models occurs best - perhaps only - when the decision makers participate actively in the development of the model. Complex dynamic systems present multiple barriers to learning. The challenge of bettering the way we learn about these systems is itself a classic systems problem.
Overcoming the barriers to learning requires a synthesis of many methods and disciplines, from mathematics and computer science to psychology and organizational theory. Theoretical studies must be integrated with field work. Interventions in real organizations must be subjected to rigorous follow-up research. There are many reasons for hope. Recent advances in interactive modeling, tools for representation of feedback structure, and simulation software make it possible for anyone to engage in the modeling process. Corporations, universities, and schools are experimenting vigorously. Much further work is needed to test the utility of the tools and protocols, evaluate their impact on individual and organizational learning, and develop effective ways to train others to use them. The more rigorously we apply the principles discussed here to our own theories and our own practices, the faster we will learn how to learn in and about complex systems.

Conceptual modelling for simulation Part I: definition and requirements - Robinson, 2006

The overarching requirement: keep the model simple. The central theme is one of aiming for simple models through evolutionary development; principles of modelling, methods of simplification and modelling frameworks. - simple models can be developed faster, - simple models are more flexible, - simple models require less data, - simple models run faster, - the results are easier to interpret since the structure of the model is better understood. Humans: - tendency to try and model every aspect of a system when a far simpler, more focused model would suffice. Principles of modelling - Providing a set of guiding principles for modelling is one approach to advising simulation modellers on how to develop (conceptual) models. For instance, one author describes six principles of modelling: 1. model simple; think complicated, 2. be parsimonious; start small and add, 3. divide and conquer; avoid megamodels, 4. use metaphors, analogies, and similarities, 5. do not fall in love with data, 6. modelling may feel like muddling through. With more complex models these advantages are generally lost. Indeed, at the centre of good modelling practice is the idea of resorting to the simplest explanation possible. Occam's razor puts this succinctly: 'plurality should not be posited without necessity'. A four-stage approach to conceptual model development, similar to that of Shannon: 1. collect authoritative information on the problem domain; 2. identify entities and processes that need to be represented; 3. identify simulation elements; and 4. identify relationships between the simulation elements. Conceptual modelling is probably the most important aspect of a simulation study. It is also the most difficult and least understood. Over 40 years of simulation research and practice have provided only limited information on how to go about designing a simulation conceptual model. This paper, the first of two, discusses the meaning of conceptual modelling and the requirements of a conceptual model. Founded on existing literature, a definition of a conceptual model is provided. Four requirements of a conceptual model are described: validity, credibility, utility and feasibility. The need to develop the simplest model possible is also discussed. Owing to a paucity of advice on how to design a conceptual model, the need for a conceptual modelling framework is proposed. Built on the foundations laid in this paper, a conceptual modelling framework is described in the paper that follows. Conceptual modelling is the process of abstracting a model from a real or proposed system. It is almost certainly the most important aspect of a simulation project. The main reason for this lack of attention is probably the fact that conceptual modelling is more of an 'art' than a 'science' and therefore it is difficult to define methods and procedures. What is conceptual modelling? Conceptual modelling is about abstracting a model from a real or proposed system. All simulation models are simplifications of reality. The issue in conceptual modelling is to abstract an appropriate simplification of reality. - Conceptual modelling is about moving from a problem situation, through model requirements, to a definition of what is going to be modelled and how. - Conceptual modelling is iterative and repetitive, with the model being continually revised throughout a modelling study. - The conceptual model is a simplified representation of the real system.
- The conceptual model is independent of the model code or software (while model design includes both the conceptual model and the design of the code). - The perspectives of the client and the modeller are both important in conceptual modelling. The conceptual model itself consists of four main components: 1. objectives 2. inputs (experimental factors) 3. outputs (responses) and 4. model content. Why do we need conceptual models if we have computer models? - Minimises the likelihood of incomplete, unclear, inconsistent and wrong requirements - Helps build the credibility of the model - Guides the development of the computer model - Forms the basis for model verification and guides model validation - Guides experimentation by expressing the objectives, experimental factors and responses - Provides the basis of the model documentation - Can act as an aid to independent verification and validation when it is required - Helps determine the appropriateness of the model or its parts for model reuse and distributed simulation.

Kuhn's View: The paradigm

Thomas Kuhn took a different view ... Normal Science •Kuhn argues most science is 'mopping up' - what he calls normal science •Occasionally there is a crisis (a revolution) and the main paradigm in the field changes •What we see as 'normal science' and what triggers 'revolution' is social as much as scientific

Can we trust a model?

UNCERTAINTIES - three kinds: - Unreliability: methodological, hard to quantify - Ignorance: epistemological, unknown unknowns - Inexactness: technical, can be studied • Verification (impossible) • Validation (consistency) • The human angle: professional, personal, societal, economic. We have technical uncertainties; methodological uncertainties (the process of abstraction, the hard-to-quantify choices made in data selection, the choice of formula affects how the natural environment is represented to respond); and ignorance (variables are ignored, which affects our ability to understand a system). Human aspect: it should be about models being more "useful", not "better".

Stochastic logistic growth:

Until now, everything we have looked at has been entirely deterministic - but is this true of the real world? Environmental stochasticity - Populations go through good and bad times and are not constant - We can represent this by adding variance to the growth rate r_d - Variability can also be included in K, the carrying capacity! We will focus on environmental stochasticity. Demographic stochasticity - By chance a population might have a run of births or a run of deaths - Demographic stochasticity includes the probability of births and deaths in the parameter r. Even if the average growth rate is positive, stochasticity can drive extinction (see the sketch below).
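A minimal sketch of environmental stochasticity in the logistic model: the growth rate is redrawn from a normal distribution each step, so some runs flourish and others crash. All parameter values are illustrative.

```python
# Sketch of stochastic logistic growth: the discrete growth rate r_d is drawn
# from a normal distribution each time step (environmental stochasticity).
import random

K, N = 1000.0, 50.0           # carrying capacity and initial population
r_mean, r_sd = 0.1, 0.15      # mean growth rate and its year-to-year variability

trajectory = [N]
for t in range(100):
    r_t = random.gauss(r_mean, r_sd)           # a different r every step
    N = max(N + r_t * N * (1 - N / K), 0.0)    # logistic update; extinction is possible
    trajectory.append(N)

print(trajectory[-1])   # rerunning gives a different outcome each time
```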

Cellular Automata Models

• "discrete" model (in time and space) • each cell is characterized by a finite number of states • simple "local" rules to define "states" (red=alive, white=dead) • same rules apply to every cell • rules usually depend on the previous "state" of the same and neighbouring cells Subcategory of simulation models - Many simplifications: we assume the cell can go through a finite number of states i.e. alive or dead. The state of the cell depends on the neighbouring cell and on the previous state of the cell (memory) → it can change states - Accounts for space and time

What is Machine Learning?

•"Field of study that gives computers the ability to learn without being explicitly programmed" (A. Samuel, 1959) •Learning is a process by which a system improves performance from experience •Machine Learning is concerned with computer programs that automatically improve their performance through experience (H. Simon, Nobel Prize in Economics 1978) •A branch of Artificial Intelligence Feed the machine with data and the output → model Why ML? •Develop algorithms that self-adapt and self-customize to individual users (Spotify) •Data mining. We now have large databases (big data!) and can use to "discover" (location of beer and diapers in supermarkets). •Ability to perform human-like tasks (recognise characters/images, drive cars) •Perform detailed and complicated (and pricy) operations •Very accurate (but in conflict with simplicity) •Do we always need to learn? Discover association between data Algorithms that immediately recognise the surrounding and can make human-like decisions Accurate, but not simple Why now? •A lot of data •Faster computers •Industry support •Better algorithms - algorithms are demanding mathematically, more support put into the study of machine learning → still don't get enough out of its advantages A more modern example: Self-driving cars (& Jazz) How far can this go? A human person still needs to make humans decisions. Not straightforward to set up and are not perfect i.e. cannot differentiate between a chihuahua dog and a blueberry muffin No panic, they are not perfect! •Need a lot of data •Not immediate to set up •Often difficult to interpret •Time consuming

Alternative Stable States

•A change in the conditions (a perturbation) can result in a rapid, often unpredictable, shift in the state of the system •How big does the perturbation need to be? •"Resilience" is the ability of the system to "resist" such rapid shifts •Tipping point: the moment when the system shifts - without clear warning, a significant change in the system - does not necessarily require a large perturbation. Nature of Change in Systems - One 'equilibrium' exists for any given environmental 'condition' - Multiple equilibria may exist for the same conditions; the dashed equilibrium is an unstable border between basins of attraction for two alternative states - as the conditions get worse, so does the state of the system (linear). In reality → there are sudden changes (the state is maintained), followed by a sudden drop following perturbation. The dashed line is an unstable border (this state does not persist in practice). Hysteresis: the effort required to bring the system back to normal (additional work to be done). Example: overfishing → pressure on the coral reef system increases with the amount of fishing (a fine balance between predation and environment). For one condition you can have multiple equilibria. A - If the system is close to F1, a small change in conditions may drive rapid change (forward shift). The backward shift only occurs if the system moves to F2 (hysteresis). B - A perturbation can drive the system to an alternative stable state if it is large enough to move it across the attraction basin. - Perturbation often results in an unwanted state.

What are model factors?

•A model's factors consist of constants, parameters, and variables. Constants: fixed values - e.g., the speed of light or pi - Parameters change from one situation to another. For linear models the change in y is proportional to the change in x - this is not the case for non-linear models. - Exponential change is still linear in this sense (the rate of change dN/dt is proportional to N).

"Systems"

•A system is: - "a set or assemblage of things connected, associated, or interdependent, so as to form a complex unity" - "a set of elements together with relations between the elements and their states" - "a non-random arrangement of matter-energy in a region of physical space-time which is non-randomly organized into co-acting interrelated subsystems or components" - How do we deal with the complexity of natural environments? → the systems approach: an ensemble of interacting units with exchanges of energy and matter. A system is: (i) composed of interacting sub-units, and (ii) has reasonably well-defined boundaries. The interconnections determine the behaviour. Characteristics of Systems •Hallmarks of systems are that they/their: - interdependent parts form a unit (the system) - components are structured or organised - exhibit functional and structural relationships between units - function implies flow or transfer of material, energy or information - require the presence of some driving force, or source of energy. The different interconnected components interact to form a function which is different from the function of the individual components. We would like to be able to close the system, and we try to isolate it to see how the system functions. However, deducing boundaries is difficult. Isolated systems are rare → a river margin is not a definite boundary. What are the components of the system which interact? A process of simplification.

Pros and cons of ANN

•Advantages: - Handle partial lack of system understanding - Create "adaptive" models (models that can "learn") • Limitations: - results are only as good as the training data set - complicated (a black box?) • Summary: - they come in all shapes and sizes - are powerful predictors: a strength - are powerful predictors: a limitation. The more data that comes in, the more it can learn → the network will get better and better. Neural networks need quality data in large quantities. The black box is complicated; it is difficult to untangle big networks. Neurons (nodes) are all placed in the hidden layer (this is where information converges and is transformed). In a tree, by contrast, nodes represent each group of observations, split into two child nodes, a process through which the original node becomes a parent node; the term "recursive" refers to the fact that the binary partitioning process can be applied repetitively. Thus, each parent node can give rise to two child nodes and, in turn, each of these child nodes may themselves be split, forming additional children. Inputs: the machine is given information (see the sketch below).
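A small sketch of an artificial neural network with one hidden layer of "neurons", trained and then tested on an invented nonlinear dataset. The scikit-learn implementation and all numbers are assumptions for illustration, not the networks discussed in the lecture.

```python
# Sketch: a small ANN (one hidden layer) learning a nonlinear relationship.
from sklearn.neural_network import MLPRegressor
import numpy as np

rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(200, 2))             # two input variables
y = np.sin(3 * X[:, 0]) + X[:, 1] ** 2           # a nonlinear relationship to learn

net = MLPRegressor(hidden_layer_sizes=(10,), max_iter=5000, random_state=0)
net.fit(X[:150], y[:150])                        # train on part of the data ...
print("test score:", net.score(X[150:], y[150:]))  # ... and test on the rest
```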

What controls the distribution of tropical forest and savanna? - Murphy and Bowman 2012

•Alternative states •Role of models/data -Forest and savanna biomes dominate the tropics, yet factors controlling their distribution remain poorly understood. - the transition from savanna to forest cannot be explained in terms of climate alone (Generally, forest replaces savanna when annual rainfall exceeds about 1,500 mm) - the primacy of a single factor in controlling forest and savanna distribution. Broadly, these factors divide into resource‐based (or 'bottom‐up') and disturbance‐based ('top‐down' ) controls, including the role of soils (Askew et al. 1970), topography and drainage (Beard 1953), and fire. single factors cannot explain the distribution of forest and savanna biomes at local and continental scales. Rather, the controls are emergent from a web of feedbacks amongst biological and environmental variables. Climate is clearly important, but extensive savannas in some high rainfall areas suggest a decoupling of climate and vegetation. In some situations edaphic factors are important, with forest often associated with high nutrient availability. Fire also plays a key role in limiting forest, with fire exclusion often causing a switch from savanna to forest. These observations can be captured by a broad conceptual model with two components: (1) forest and savanna are alternative stable states, maintained by tree cover‐fire feedbacks, (2) the interaction between tree growth rates and fire frequency limits forest development; any factor that increases growth (e.g. elevated availability of water, nutrients, CO2), or decreases fire frequency, will favour canopy closure. This model is consistent with the range of environmental variables correlated with forest distribution, and with the current trend of forest expansion, likely driven by increasing CO2 concentrations. Forests and savannas dominate the tropical landscapes of Africa, Australia, South America and Asia, covering about 15 and 20% of the Earth's land surface, respectively (Grace et al. 2006; Fig. 1). These biomes have strikingly different ecological structure and function. Tropical forests are characterised by dense tree cover, typically with a high diversity of trees, lianas, and epiphytes, and competition for light primarily drives dynamics and structural complexity of the vegetation. Savannas consist of an open tree layer with a continuous grassy ground layer, typically dominated by shade‐intolerant species possessing the C4 photosynthetic pathway. The grass biomass of savannas often supports high densities of large grazers and provides fuel for frequent fires. Tropical forests and savannas are globally important centres of biodiversity, reflecting complex evolutionary histories involving co‐evolution, niche differentiation, as well as vicariant speciation and convergent evolution amongst continents. The interface between these biomes has been postulated as an important theatre for evolutionary diversification, including our own ancestors - Determining the environmental controls of tropical forest and savanna biomes is a prerequisite for a comprehensive understanding of the global carbon cycle. 
- Tropical forests are one of the most carbon dense biomes on the planet, storing on average 320 t C ha−1, representing about 25% of the carbon stored in the world's vegetation and soil organic matter - In contrast, savannas store much less carbon, averaging 120 t C ha−1, representing about 15% of the global total - Tropical savannas are the most frequently and extensively burnt ecosystem on Earth, accounting for around 44% of global carbon emissions by fires (van der Werf et al. 2010). Although tropical forests are the least flammable of any biome, fires associated with tropical deforestation have contributed about 20% of anthropogenic radiative forcing since the industrial revolution. The question of what processes allow trees and grasses to co‐exist (in a savanna) is fundamentally similar to the question of what processes allow trees to form a closed canopy and exclude grasses (in a forest); metaphorically, these questions represent opposite sides of the same coin. - Propose a new conceptual model to explain the relative distribution of tropical forest and savanna, based on alternative stable state theory and tree growth‐fire interactions. Although humans have had a massive impact on tropical forest distribution - by directly removing or degrading forest canopies (e.g. deforestation, logging), or modifying other controls such as fire activity and resource availability (e.g. nutrient deposition, CO2) - here we specifically focus on relatively natural, intact systems and deliberately avoid a detailed review of the impacts of humans on forest and savanna distribution. Soils (resource-based control) - ability of vegetation to directly influence soils. This vegetation-soil feedback relates to the fundamentally different nutrient cycles, ecophysiology, hydrology and soil biota between the two biomes. Forest soils typically have higher organic matter and nitrogen content, both largely originating from the vegetation itself, than would develop under savanna on equivalent parent materials, because of greater inputs of litter and higher density of deep‐rooted trees which are able to access sub‐soil nutrients (McCulley et al. 2004). In savannas, the dominance of shallow‐rooted grasses and the higher frequency of fires and grazing results in more open nutrient cycles that are less efficient at accumulating nutrients and soil organic matter. Fire (disturbance-based control) - the role of fire in shaping forest distribution is most apparent where forest patches exist within a highly flammable savanna matrix. In such landscapes, forest is frequently restricted to topographic settings protected from fire and some ecologists have used this pattern to emphasise the primacy of fire in limiting forests. However, topography that confers fire protection is often highly confounded with 'bottom‐up' factors, especially water availability, making it extremely difficult to attribute ultimate causation. - Forest trees are generally considered more susceptible to both topkill (death of aboveground parts) and whole‐tree mortality following fire than savanna trees, although many forest species can resprout following a single fire --> Recently in Brazil, Hoffmann et al. (2009) found that forest and savanna trees adjacent to forest‐savanna boundaries had similar rates of whole‐tree mortality following fire, but forest trees were more likely to be topkilled than savanna trees of similar stem diameter.
They attributed this effect to greater bark thickness for a given stem diameter in savanna trees, with the implication that forest trees would take much longer to reach fire‐resistant sizes assuming similar growth rates. - Perhaps the most compelling evidence that fire controls the distribution of forest is the results of fire‐exclusion experiments that have shown forest species invading fire‐protected savanna, especially in high rainfall areas (San Jose & Farinas 1983; Moreira 2000; Woinarski et al. 2004), with a complete biome shift from grassy to closed canopy sometimes occurring within a few decades (Trapnell 1959; Swaine et al. 1992; Louppe et al. 1995). CO2 and Dynamic Forest-Savanna Boundaries - First, elevated CO2 would increase tree growth rates, allowing them to rapidly recover following disturbance such as fire. Second, though elevated CO2 increases carbon assimilation in plants possessing the C3 pathway (most woody plants), plants possessing the C4 pathway (most savanna grasses) may be relatively unresponsive. Third, elevated CO2 would increase whole plant water use efficiency, reducing transpiration by shallow‐rooted species and increasing percolation of soil water to deeper soil layers, favouring establishment and persistence of deep‐rooted woody plants. Conceptual Model - Forest and savanna as alternative stable states - sharp spatial boundaries between alternative states (Schröder et al. 2005). Indeed, this is one of the most conspicuous and fascinating characteristics of forest-savanna boundaries. Boundaries often span just a few metres, accompanied by extremely abrupt changes in tree cover, light availability, temperature, grass abundance and fire activity - alternative states are stabilised by strong biological feedbacks. In the case of forest and savanna, the feedbacks are numerous and well documented (Fig. 5). Foremost, closed forest canopies have a strong suppressive effect on fire, by: (1) limiting grassy fuel loads, by reducing light availability at ground level (Hoffmann et al. 2009) and (2) decreasing the severity of fire weather at ground level, by increasing relative humidity and decreasing temperature and wind speed (Cochrane 2003). In turn, fire has the effect of reducing tree cover, by killing individual stems (Hoffmann et al. 2009) and reducing tree growth rates (Murphy et al. 2010b) - abrupt shifts between ecosystem states should be possible if stabilising feedback processes are interrupted. This is consistent with the dynamic nature of forest‐savanna boundaries at a range of temporal scales. Conclusions: - The factors that control the distribution of tropical forest and savanna have puzzled ecologists for over a century, and yet there is still only limited consensus on the relative importance of 'bottom‐up' (resource‐dependent) and 'top‐down' (disturbance‐dependent) controls, and the exact mechanisms by which these factors operate. - The recent appreciation of the interplay between fire and tree cover, especially within the savanna biome, has led us to propose the 'tree growth‐fire interaction' model that has two key components. First, forest and savanna exist as alternative stable states, primarily held in check by a strong negative feedback between the forest canopy and fire activity. Strong evidence for this is provided by fire‐exclusion experiments that have led to biome switches from savanna to forest within a few decades.
Second, the interaction between tree growth rates and fire frequency determines the likelihood of a forest canopy forming to displace flammable savanna. Any factor that promotes tree growth, such as water or nutrient availability, will increase the likelihood of forest trees recruiting, maturing and forming a closed canopy in the interval between destructive savanna fires, as will any factor that increases the fire interval (such as topographic fire protection, low rainfall seasonality)

Challenges of "Space"

•Analysing dynamic spatial patterns and processes is difficult: - spatial data are costly - many numerical techniques aren't designed to deal with spatial variability - our understanding of spatial processes is poor. Extending point measurements to large spatial extents costs money, may not be feasible, and is challenging. Simulation Models •Many dynamic systems aren't analytically tractable, so we resort to simulation - try to virtually recreate the system of interest - often (not always) highly system-specific and highly detailed - sometimes highly general and very abstracted - may be general or specific

Conclusions:

•Building and thinking about models involves a suite of difficult trade-offs -what scale is the model focused on? -how mechanistic is it? in what detail are specific processes modelled? -how do process and pattern relate to each other in the model? top-down or bottom-up?

Assumptions of the logistic model:

•Constant carrying capacity (K) - the carrying capacity is constant (resources are always there and don't change) - to achieve an S-shaped (logistic) curve, K must be constant: resource limits don't change over time •Linear density-dependence - each individual added has the same effect on the per capita rate of growth - the per capita growth rate is fastest when N ~ 0 and declines to a minimum as N → K. The exponential and logistic models provide simple examples of how such mathematical models are used

Example Two Striped vegetation in semi-arid regions

•Dunkerly (1997) used a spatial model to explore the emergence of brousse tigrée - grid-based, with each cell representing a small area of ground: bare vs. vegetated - rainfall each year (constant or variable) - water flows downslope and is captured or runs on depending on the cell's state. Bare areas are nearly horizontal in the elevation profile: ponding occurs there when it rains, infiltration is poor and no plants can grow → water trickles down from the edge and is absorbed by the steeper, vegetated areas → a feedback. A Simple Spatial Model •Dunkerly (1997) explores the formation of brousse tigrée using a spatial model - a model to explore why the brousse tigrée emerges → partitioning water between cells - a grid with a certain slope; over this grid you can have cells which are bare or vegetated. Each cell has a balance of water inside, plus water which runs down from the cell above and possibly rainfall - the water which flows down tends to be captured by vegetation → bare areas do not absorb much water - when water runs onto a vegetated cell, 55% of this water is absorbed (sustaining vegetation) → the rest of the water is spread around, i.e. 10% to each neighbouring cell and 5% to 'cousin' cells. Bare cells: only 10% is infiltrated •This is how a vegetated cell accumulates water •Bare cells: no run-on, 10% infiltration. Darker areas are vegetated, whiter areas are bare soil → plants can self-organise into this pattern. Bare areas are slightly flatter, with a slight edge developing (vegetated areas have steeper elevation profiles in between); where there is a lot of infiltration, the soils have adapted. Base model: vegetated stripes form over time → when the amount of water passed to neighbouring cells was varied (an altered neighbourhood), we can still see the pattern but lose some of its regularity and continuity. If the passing of water to lateral cells is removed completely, the pattern does not appear → no stripes form (no lateral feedback). Rather than looking at the pattern through a climate-driven, top-down approach, we look at it from a process-based approach (see the sketch below).
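A rough sketch of the water-partitioning idea, not Dunkerly's actual code: rain falls everywhere, vegetated cells capture 55% of their water and bare cells 10% (the fractions given in the notes), and the remainder is passed to downslope neighbours. Grid size, vegetation cover and the equal three-way split of runoff are assumptions.

```python
# Rough 2-D sketch of the water-routing feedback behind vegetation stripes.
import random

rows, cols, rain = 30, 30, 1.0
vegetated = [[random.random() < 0.3 for _ in range(cols)] for _ in range(rows)]

absorbed = [[0.0] * cols for _ in range(rows)]
runoff = [[0.0] * cols for _ in range(rows)]

for i in range(rows):                       # sweep downslope, row by row
    for j in range(cols):
        water = rain + runoff[i][j]
        frac = 0.55 if vegetated[i][j] else 0.10   # capture fractions from the notes
        absorbed[i][j] = frac * water
        if i + 1 < rows:                    # pass the remainder to downslope neighbours
            for dj in (-1, 0, 1):
                runoff[i + 1][(j + dj) % cols] += (1 - frac) * water / 3

# Vegetated cells end up capturing more water than bare cells: the feedback behind stripes
veg_cells = [(i, j) for i in range(rows) for j in range(cols) if vegetated[i][j]]
veg_mean = sum(absorbed[i][j] for i, j in veg_cells) / len(veg_cells)
print("mean water captured by a vegetated cell:", round(veg_mean, 2))
```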

Systems' boundaries

•Environmental systems often don't have clear boundaries •Truly isolated systems are the exception - deep-sea vents, lakes, the Earth, the Universe are possible examples. Others? System Boundary Properties •Open: transfer of matter and energy - e.g. free-living populations of animals or plants •Closed: transfer of energy only - e.g. a bacterial population growing in a beaker •Isolated: no energy or matter transferred into or out of the system - truly isolated systems don't exist in reality? - the concept can serve as a useful model...

What is a "physical" model?

•Hardware models are scaled physical reconstructions: flumes, wind-tunnels, etc. - Replication of the real world: difficult to do on large scales in laboratories - Vegetation effects - Estuarine evolution: distance from the real world is huge, replicating the movement of sediment is difficult

The Logistic Model

•In reality, there is a 'saturation point' for every species in a given environment, called its carrying capacity (K) •At this point (K) the population stops growing (i.e. dN = 0) and thus b = d •Population growth switches from being exponential to logistic. The population grows fast at the beginning → as it approaches the saturation value it slows down, and the growth rate decreases to zero once it has reached the carrying capacity. The number of individuals is STILL increasing, just at a slower rate •Building on the exponential model we can describe population growth using the logistic model: - dN/dt = rN(1 - N/K) where: K = carrying capacity, r = intrinsic growth rate, N = population size - Carrying capacity → the population cannot grow anymore (births = deaths) - Logistic approach → the variation of the population size over time = the growth rate, times the population size, times one minus the ratio of the population size to the carrying capacity. The population initially grows rapidly before slowing as N → K - Note that at K, dN/dt = 0 (i.e. no population growth) (see the sketch below)
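A minimal sketch stepping the logistic equation forward with a simple Euler scheme; the parameter values and the time step are illustrative only.

```python
# Sketch: deterministic logistic growth dN/dt = r N (1 - N/K), stepped with Euler.
K, r, N, dt = 1000.0, 0.5, 10.0, 0.1

for step in range(1, 201):
    N += r * N * (1 - N / K) * dt     # growth slows as N approaches K
    if step % 50 == 0:
        print(f"t = {step * dt:5.1f}: N = {N:7.1f}")
# dN/dt -> 0 as N -> K, so the population levels off at the carrying capacity
```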

abduction:

•Infer the best explanation: - compile available observations to arrive at one or many hypotheses. Choose the simplest/most likely - for example: "since if it rains, the grass gets wet, one can abduce (hypothesize) that it probably rained" •Criticism: conclusions are plausible but uncertainty is still present. The method of retroduction (Hanson 1965) is useful for finding research hypotheses about processes that are explanations or reasons for facts. For example, if we observed birds caching seeds more on south slopes than on north slopes (facts), and our best guess for the reason of this behavior (our research hypothesis) was that south slopes tended to be freer of snow than north slopes, we would be using the method of retroduction to generalize a research hypothesis about a process providing a reason for the observed facts of bird behavior. The method of retroduction is the method of circumstantial evidence used in courts of law. Retroduction is not always reliable, because alternative research hypotheses can often be generated from the same set of facts. It is reliable enough to be used in courts of law but, by itself, it is not reliable enough for science. Science has the most stringent standards of all endeavors. Start from observations → you may make one sensible hypothesis and many others (a range). Criticism: heavy reliance on observations is location dependent → does not reflect other areas well (different selection pressures etc.). No definitive answer → we should favour the most plausible answer, and make a range of predictions.

Why Use Monte Carlo Analysis?

•It is a tool for combining distributions, and thus estimating more than just summary statistics. Applications: •Physical sciences •Engineering •Finance and business •Computer graphics •... Example: •A sunburn lotion additive is a potential irritant •Lotion samples estimate the irritant's concentration: - mean: 0.02 mg chemical - standard deviation: 0.005 mg chemical •Tests show the probability of irritation given use: - low frequency of effect per mg exposure = 5/100/mg (0.05 mg-1) - high frequency of effect per mg exposure = 10/100/mg (0.1 mg-1) - mean 0.075 mg-1 Analytical Results •We can calculate a single estimate of risk based on these values - Risk = irritant concentration × exposure response (frequency of effect per mg) •We can also calculate more conservative (cautious) estimates - conservative estimate: use upper percentiles •For a 'real' application 10,000 draws is a minimum! p(irritate) = pick a random number from the gaussian distribution and one from the uniform distribution and multiply them (first draw). Keep selecting random numbers from the distributions and storing the results. After ten draws you can calculate the mean etc. (see the sketch below)
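A hedged sketch of this Monte Carlo calculation, using the distributions given above (concentration ~ Normal(0.02, 0.005), frequency of effect ~ Uniform(0.05, 0.1)) and the 10,000 draws recommended in the notes.

```python
# Sketch: Monte Carlo combination of the two distributions into a risk estimate.
import random
import statistics

draws = 10_000
risks = []
for _ in range(draws):
    concentration = random.gauss(0.02, 0.005)     # mg of irritant in the lotion
    frequency = random.uniform(0.05, 0.1)         # probability of irritation per mg
    risks.append(concentration * frequency)       # one draw of p(irritate)

risks.sort()
print("mean risk:", statistics.mean(risks))
print("95th percentile (a conservative estimate):", risks[int(0.95 * draws)])
```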

Outcomes of cellular automata:

•Key findings: - need the lateral and downward displacement of water and nutrients for stripes to form - system is robust against temporal variability •Question: how easy would it be to explore these issues without a model? •Think of the dune model too! •Spatial simulation models are receiving much attention and use •Spatial models incur added costs in design, parameterisation, etc. - swapping one black box for another? - best to use them in an experimental way? - prediction vs. learning

Summary of ML

•Machine Learning is powerful •Great predictors •Come in all forms and shapes •Not trivial to set up •Is "insight" the future? When are you likely to use ML? When you have a lot of data, when you do not need to understand how the system works and are only interested in predictions. For environmental science, we need more supervised learning

Summary of stochasticity:

•Most systems and processes have a strong stochastic component •These types of processes are analysed using a suite of specific tools - random variables, probability distributions and functions - Monte Carlo-type analysis •Analysis relies on using computers and numerical methods. There is not one fixed value → we try to describe the distribution to account for stochasticity. Monte Carlo analysis focusing question: state the background for stochasticity → then go into the analysis details

Crucial assumptions of the population model:

•No immigration (I) or emigration (E) - closed population and system; change is a function of local births and deaths •Constant birth (b) and death (d) rates - unlimited resource availability - no stochastic effects •No age, size or genetic structure - all individuals are exactly equal. The 'No Change' Assumption? - Generates a variety of curves → study the statistics of the outcomes - deterministic: r_d ... What happens if conditions vary through time (i.e. r is not constant)? - stochastic: r_t is a random number changing every time step t - We do not want the rate of growth to be a constant → make r random (we know there is more complexity in the real world) - A distribution of values (the strength of stochasticity) → reduces the distance between the real and virtual worlds - Helps us better reproduce what we would see in nature

What is 'Science'?

•Science is: 1. the systematic observation of natural events and conditions in order to discover facts about them and to formulate laws and principles based on these facts 2. the organised body of knowledge that is derived from such observations and that can be verified or tested by further observation - Science is 'organised simplification'? How do we generate 'reliable' knowledge? - Empiricism: knowledge comes only or primarily from sensory experience - Induction: moving from specific instances to general statements - Deduction: moving from general to specific instances - Abduction/Retroduction: moving from an observation to a hypothesis that accounts for the observation - Hypothetico-deduction: formulating a hypothesis in a form that could conceivably be falsified by a test on observable data. No single, identifiable method applies to all branches of science; the only method, in fact, is whatever the scientist can use to find the solution to a problem. This includes induction, a form of logic that identifies similarities within a group of particulars, and deduction, a form of logic that identifies a particular by its resemblance to a set of accepted facts. Both forms of logic are aids to, but not the solution of, the scientist's problem

Agent-based modelling (overview)

•Space is the place •We need to develop techniques that allow us to deal with space & time. What is an agent? • agents can be individuals or "units" (e.g., a family) • agents take "decisions" according to a set of predefined rules (which can be probabilistic) • agents move and interact in the (model) world • collective behaviour (emergence) can arise depending on the feedbacks • used to model virus spread, urbanization, videogames, many economic applications, rebellion ... A car may be an agent. Simulate the process of decisions and expansion into the future. Large number of applications. Virus example: grey people are immune, green people are healthy, red people are sick → each person is an agent (see the sketch below).
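A minimal, illustrative agent-based sketch in the spirit of the virus example: each agent moves randomly, sick agents can infect nearby healthy agents, and recovered agents become immune. Population size, infection radius and probabilities are all invented.

```python
# Minimal agent-based model sketch: healthy / sick / immune agents on a unit square.
import random

class Person:
    def __init__(self):
        self.x, self.y = random.random(), random.random()
        self.state = "healthy"

    def move(self):
        self.x = (self.x + random.uniform(-0.05, 0.05)) % 1.0
        self.y = (self.y + random.uniform(-0.05, 0.05)) % 1.0

people = [Person() for _ in range(200)]
people[0].state = "sick"                               # one initial infection

for day in range(60):
    for p in people:
        p.move()
    for p in people:                                   # simple local interaction rule
        if p.state == "sick":
            for q in people:
                if (q.state == "healthy"
                        and abs(p.x - q.x) < 0.03 and abs(p.y - q.y) < 0.03
                        and random.random() < 0.5):
                    q.state = "sick"
            if random.random() < 0.1:                  # chance of recovering with immunity
                p.state = "immune"

print({s: sum(p.state == s for p in people) for s in ("healthy", "sick", "immune")})
```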

Summary for simulation models:

•Spatial simulation models are receiving much attention and use •Very broad category •Prediction vs. Learning

Systems analysis:

•Systems analysis is the 'study of the composition and functioning of systems' •Systems analysis and modelling involve simplification to understand the functioning of a system - Dealing with the connected parts of the system → what are the key components? Assumptions: identifying the correct components which drive change in the system. Systems Theory Assumptions •Treating the environment as a system assumes that: - we can sub-divide the real world into discrete functional units - these units have a behaviour that is predictable and simpler than the whole - the inputs and outputs and inter-relationships between the components can be determined. System 'Colours' - Black box: only inputs and outputs understood - Grey box: inputs and outputs understood, with some knowledge of components - White box: the whole set of relationships understood, including individual components and their stores and flows

Ancestral Pueblo Model contd...

•The Artificial Ancestral Pueblo model was developed to explore why the population collapsed -human activity in the landscape -changing environmental conditions through time and space in Long House Valley •Example of an Agent-Based Model (ABM) -agents are autonomous goal-seeking entities •in the AA Model the agents are 'household units' -directly represents human decision-making •The AA model is built on a series of rules describing the way in which agents behave in response to environmental conditions, e.g. households fission when a daughter reaches 15, households move when maize storage falls below the amount needed to sustain the household (see the sketch below) - Attributes define the agent, e.g. 5 rooms/1 pit-house = 1 agent •The AA model mimics temporal dynamics quite well •early models with limited heterogeneity were less successful •it fails to predict the collapse adequately: social factors? - the simulated population drops and stabilises → according to the model, the population could still have survived in the valley and lived there. Early models with less stochasticity were not as successful → humans make different decisions. Possible explanations: disease? Something social not captured in the rules: a large amount of activity, followed by half the population leaving → not enough people remained and activity became TOO low → the other half of the population leaves because they cannot sustain themselves •The model can be a tool for learning rather than prediction - it generates as many questions as answers: the question of why Long House Valley was abandoned so rapidly remains •The model is used to synthesise disparate sources of data and understanding. Data: combining a huge amount of environmental data with social data about households, as well as pollen data etc. → community effort to reach an outcome
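A hypothetical sketch of the two household rules mentioned above (the attribute names, the yield bookkeeping and all values other than the fission age of 15 are illustrative assumptions, not the published AA model):

from dataclasses import dataclass, field
from typing import List

@dataclass
class Household:
    # One agent: a household unit with its own maize store and daughters
    maize_store: float                 # kg of stored maize
    annual_need: float                 # kg needed to sustain the household for a year
    daughter_ages: List[int] = field(default_factory=list)

    def step(self, harvest: float):
        # Apply the two behavioural rules for one simulated year
        self.maize_store += harvest - self.annual_need
        self.daughter_ages = [a + 1 for a in self.daughter_ages]

        new_households = []
        for age in list(self.daughter_ages):
            if age >= 15:                          # fission rule: daughter forms a new household
                self.daughter_ages.remove(age)
                new_households.append(Household(self.maize_store / 2, self.annual_need))
                self.maize_store /= 2
        must_move = self.maize_store < self.annual_need   # relocation rule
        return new_households, must_move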

Probability distributions - continuous

•The continuous probability distribution is called the probability density function •If f(x) is a probability density function then: -the total probability over all possible values of the continuous random variable X is 1 -the probability density function cannot be negative. Uniform Distribution Example: •discrete: head or tail •continuous: exact time of arrival of the bus, spinner position, temperature - the value lies between two bounds, with an equal chance of getting any number between them. Normal/Gaussian Distribution - the density peaks in the middle of the x axis (at the mean) and falls off symmetrically (see the sketch below)
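A brief sketch of sampling from these two distributions and checking that the total probability under a density is 1 (the bounds and parameters are illustrative):

import numpy as np

rng = np.random.default_rng(42)

# Uniform: equal chance of any value between the two bounds
uniform_samples = rng.uniform(low=0.0, high=10.0, size=10_000)

# Normal/Gaussian: density peaks at the mean, here 20 with standard deviation 3
normal_samples = rng.normal(loc=20.0, scale=3.0, size=10_000)

# The total probability is 1: approximate the integral of the uniform
# density f(x) = 1/(high - low) over [low, high]
x = np.linspace(0.0, 10.0, 1_001)
density = np.full_like(x, 1.0 / 10.0)
print(np.sum(density[:-1] * np.diff(x)))          # ≈ 1.0
print(uniform_samples.mean(), normal_samples.mean())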

'Classical' Scientific Method

•The scientific method is the logical scheme used by scientists trying to answer the questions posed within science: 1. formulate theories 2. develop ways to test and prove them
1. Pose the question in the context of existing knowledge (theory + observation) - a new question that old theories are not capable of answering, or a novel one that extends existing theory (asking the right question)
2. Formulate a hypothesis as a tentative answer to the question(s)
3. Deduce consequences of the hypothesis and make predictions
4. Test the hypothesis in a specific new experiment or theoretical field - test whether the new hypothesis fits into the existing world-view (1) - if the hypothesis leads to contradictions and demands change, test it carefully! - steps 2, 3 and 4 are repeated with modifications of the hypothesis until agreement is reached (which leads to 5) - if major discrepancies are found the process must start from the beginning
5. When consistency is obtained the hypothesis becomes a theory - provides a coherent set of propositions that define a new theoretical concept - the theory is now subject to a process of 'natural selection' among competing theories
6. The theory becomes a framework within which observations and/or theoretical facts are explained and predictions are made - stage 1 now includes the new theory/modified old theory
Science is in a state of permanent change; it is provisional.

What is the difference between a theory and an hypothesis (examples)?

•Theory: a systematic statement of principles; a formulation of the apparent underlying principles of phenomena that have been verified to some extent. Theory is more important than the facts, Einstein told Heisenberg, because theory tells us what the facts mean. •Hypothesis: an unproven theory or supposition, tentatively accepted to explain certain facts, or as a basis for further research; a statement about the way a system 'works'. The term theory means a broad, general conjecture about a process. For example, the Lotka-Volterra competition equations (Emlen 1973) represent a theory about the process of competition between 2 animal species. A research hypothesis is a theory that is intended for experimental test; it has the logical content of the theory, but is more specific because, for example, the location and animal species must be specified. A research hypothesis must be tested indirectly because it embodies a process, and experiments can only give facts entailed by a process. For example, consider the question of how salmon find their way upstream to their home spawning grounds. The answer, "salmon navigate by vision alone," is a research hypothesis (H), i.e., a conjecture about a process of navigation. A test consequence (C) is "a group of salmon that has been captured and blinded as they begin their upstream migration will not reach their home tributary spawning grounds in numbers greater than expected by chance, whereas a nonblinded control group of equal size that was spawned in the same tributary as the blinded fish will return to their tributary in numbers greater than expected by chance." The fact of the test consequence C must then be obtained by experiment, e.g., tagging smolts before their migration to the lake or ocean, recapture of those returning to spawn, and subsequent recapture of blinded and control-group salmon after they have swum upstream.

Stages of Systems' Analysis: - System behaviour

•There are distinct phases in the systems analysis process -lexical phase: defining the system -parsing phase: defining the links -modelling phase: constructing the model -validation phase: testing the model •This process is iterative: 'real/virtual' world → conceptual model → formalised model → 'real/virtual' world. System Behaviour •Environmental systems are dynamic -equilibrium (stationary) -positive feedback (runaway change) -negative feedback (oscillation; damping) -chaotic dynamics. What is 'Equilibrium'? •Equilibrium does not mean that a system is unchanging or immutable - it may be both dynamic and scale-dependent •Equilibrium systems return to the same state after perturbation - statistical equilibrium: the system varies around some average value in the long term, with some change - response to perturbation: the system returns to pre-perturbation conditions. How long? Issues of scale? Equilibrium means there is a balance between the forces acting on the system → it does not mean the system is 'still'. The return may vary in length (scale in space and time). If you change something in the system it will naturally go back to its original state. Statistical equilibrium: not static, some variability, but there is a mean value. Oscillations may be so large that this is a contested definition. Issue of scale: how long to return to the original state? (see the sketch below)
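A minimal sketch of the difference between negative feedback (return to equilibrium after a perturbation) and positive feedback (runaway change), using a simple first-order difference equation with illustrative parameter values:

import numpy as np

def simulate(feedback, steps=100, equilibrium=10.0, perturbation=5.0):
    # x_t+1 = x_t + feedback * (x_t - equilibrium)
    # feedback < 0: negative feedback, the perturbation decays back
    # feedback > 0: positive feedback, the perturbation grows (runaway)
    x = np.empty(steps + 1)
    x[0] = equilibrium + perturbation          # perturb the system at t = 0
    for t in range(steps):
        x[t + 1] = x[t] + feedback * (x[t] - equilibrium)
    return x

damped  = simulate(feedback=-0.2)              # returns towards 10.0
runaway = simulate(feedback=+0.2)              # diverges away from 10.0
print(damped[-1], runaway[-1])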

Rationale for Modelling

•There is a need to: 1. look into the future 2. understand impacts of events that have not happened (yet) 3. understand impacts of human behaviour 4. understand impacts on human behaviour. Summary •Models are abstractions of some 'real' entity of interest •Models have a number of uses: -understanding, extrapolation, inference... -formal frameworks for organising ideas •Many different types of models -conceptual, physical, data-driven, process-based -and many subcategories: mathematical, simulation, statistical, machine learning ...

Linking Pattern and Process Top-down or bottom-up?

•Two ways to view systems (and models): (i) Bottom-up approach -start at the lowest level and explore system dynamics arising from interactions between the entities at these levels: pattern from process (ii) Top-down approach -development + application of a general framework to different systems: process from pattern • The top-down approach begins with characteristics at the largest (spatial) scale and attempts to explain phenomena at each lower scale from the understanding achieved at the higher level. • The bottom-up approach starts with the 'parts' of a system • We try to understand how the system's properties emerge from the interaction among these parts Bottom-up Approach •We want to know how the system will behave (pattern) given our knowledge of its components (process) - interaction between entities at the lowest possible level. We want to know how the system behaves given our knowledge of the components Top-down Approach •We have repeated descriptions of the system (recurrent patterns) and we want to generalise them (process) •Can become a test for process-based models! - look at the universal framework which govern systems (large scale patterns) → extract processes. Begin at the larger scale. Pattern --> Process - Process from pattern: population size of Steller's Sea Lion. Why the crash since the mid 1970s? Harvesting? Competition? Habitat loss? Pollution? - can be quite difficult, pinpointing exact causes is challenging "It would be naïve to imagine that there is a best approach to the development of models for identifying and predicting the behaviour of environmental systems any more than there is a best model for a given problem." Beck

Which type of ML algorithm?

•What is the size and dimensionality of the dataset? •Is the data linearly separable? •How much do I care about computational efficiency? •Do I care about interpretability or should it "just work well"? Types of ML algorithms •Supervised learning -uses training data with desired outputs (labels); the algorithm learns what the correct answer is •Unsupervised learning -uses training data without desired outputs; the algorithm has to find patterns in the data on its own; no correct answer is given to the algorithm •Semi-supervised learning, reinforcement learning, transfer learning. Supervised or Unsupervised? •Supervised learning -user can optimise performance based on experience -predicts unforeseen events based on experience - give training data AND outputs to the machine; the machine learns from past experience •Unsupervised learning -finds unknown patterns in data -great to classify/categorise/cluster data we don't understand -uses data with minimal intervention - we do not understand the data to begin with (minimal intervention); not a labelled data set; no outputs are given to the algorithm; clustering (see the sketch below)
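A minimal sketch contrasting supervised and unsupervised learning, using scikit-learn on a synthetic dataset (the dataset and the choice of classifier/clustering algorithm are illustrative):

from sklearn.datasets import make_blobs
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.cluster import KMeans

# Synthetic data: 300 points in 3 groups
X, y = make_blobs(n_samples=300, centers=3, random_state=0)

# Supervised: the labels y are given, the algorithm learns the correct answer
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=0)
clf = LogisticRegression().fit(X_train, y_train)
print("supervised accuracy:", clf.score(X_test, y_test))

# Unsupervised: no labels are given, the algorithm finds structure on its own
km = KMeans(n_clusters=3, n_init=10, random_state=0).fit(X)
print("cluster assignments:", km.labels_[:10])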

Tipping points

•When a small change/disturbance causes a large difference to the system •Usually rapid •Role of hysteresis (illustrated in the sketch below). Conclusions •Building and thinking about models involves a suite of difficult trade-offs -what scale is the model focused on? -how mechanistic is it? in what detail are specific processes modelled? -how do process and pattern relate to each other in the model? top-down or bottom-up?
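A minimal sketch of hysteresis around a tipping point, using the toy equation dx/dt = -x^3 + x + c and slowly ramping the external 'conditions' c up and then back down (the equation, parameter range and function names are illustrative, not a specific ecosystem model):

import numpy as np

def steady_state(x0, c, dt=0.01, steps=5000):
    # Relax dx/dt = -x**3 + x + c towards a (locally) stable state, starting
    # from x0. Over part of the range of c this system has two stable states,
    # so where you end up depends on where you start.
    x = x0
    for _ in range(steps):
        x += dt * (-x**3 + x + c)
    return x

c_up = np.linspace(-0.6, 0.6, 61)          # slowly change the 'conditions' c one way
c_down = c_up[::-1]                        # then slowly change them back again

forward, backward = [], []
x = -1.0                                   # start on the lower branch
for c in c_up:
    x = steady_state(x, c)
    forward.append(x)
for c in c_down:
    x = steady_state(x, c)
    backward.append(x)

# The jump to the upper branch (forward ramp) and the jump back down
# (backward ramp) happen at different values of c: that offset is the hysteresis.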

Describe the approach/philosophy of process-based models?

•Why? -We want to understand, quantify and predict -We can repeat and test the model in different environments → modify equations -Make testable predictions -Directly see your assumptions •Types of process-based models -deterministic -stochastic (randomness) •They typically use systems of mathematical equations to represent processes •They often rely on parameterizations •The importance of being "quantitative" •Come in various shapes and forms •Plenty of confusion about terms •Based on theoretical understanding •Process-based models: mathematical/computer/analytical/simulation/mechanistic

Why use maths?

•Two broad reasons: 1. to expose/make us think about our (faulty) assumptions 2. to provide testable consequences or qualitatively/quantitatively new insights. We can repeat and test a model in different environments → modify equations. Make testable predictions. Directly see your assumptions. Utility of Strategic (Theoretical) Models •Are designed for general insight into processes •Frame questions for focussed (tactical) models to address -can be reproduced -should be falsifiable -should make strong and testable predictions

Assumptions of the logistic model

- assumes the population is able to occupy N = K indefinitely (deterministic model: the same input always gives the same output) - assumes resources are plentiful (additions and restrictions) - assumes all organisms have the same number of offspring - assumes an isolated system (no external influences such as climate etc.)

Computer models

Example of a bay: barriers, deep water, bay (colours indicate depth) → movement and flow of water. The model shows the evolution of the bay over time: delta formation and the complex development of a network of channels (an important source of connectivity in the system) → channels grow in size over time. Virtues: change tidal range, sediment size etc. → you can control time and space. Boundaries (could change sea level, sediment input etc.) → the model allows simulation under many different conditions. Allows you to move from simple to complex scenarios. Should include our knowledge (advanced). Test scenarios and different conditions, i.e. storms, flooding. Study of sensitivity to formulations (simulation with waves or without waves etc.). Problems: unstable, crashes → lots of coding. The treatment of boundaries may lead to errors. Initial conditions: the model requires a lot of data. As time increases, the model may detach more and more from reality. Models differ depending on place (the equations are the same, the numerical schemes are different). Predictions from a model are not reality → the model is distant from the real world. Does not account for longshore drift etc.

Overfitting: the ML headache

oOverfitting implies lack of generality and leads to poor predictions oML algorithms are VERY powerful oML algorithms can be TOO powerful oLet's not forget underfitting!
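A minimal sketch of under- and overfitting, fitting polynomials of different degrees to a small noisy dataset (the data, noise level and degrees are illustrative):

import numpy as np

rng = np.random.default_rng(1)
x = np.linspace(0, 1, 15)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.2, x.size)   # noisy training data
x_new = np.linspace(0, 1, 100)
y_true = np.sin(2 * np.pi * x_new)                        # unseen 'new' data

for degree in (1, 3, 10):
    coeffs = np.polyfit(x, y, degree)
    train_err = np.mean((np.polyval(coeffs, x) - y) ** 2)
    new_err = np.mean((np.polyval(coeffs, x_new) - y_true) ** 2)
    # degree 1 underfits; degree 10 chases the noise (overfits): low training
    # error but poor predictions on new data; degree 3 is the balance
    print(degree, round(train_err, 3), round(new_err, 3))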

Cellular Automata: Game of Life

Simple rules to simulate life, growth and disasters. Identify the emergence and evolution of a variety of patterns. The system is not reversible and its dynamics are nonlinear. Used to model the spread of viruses, traffic etc. •one of the best examples of cellular automata modelling (and it's not a game, no shooting ...) •simple "local" rules to simulate "life" (unbounded growth and global disaster) •pattern development & emergence •the system is not reversible •applications include: traffic, spreading of viruses, cell or animal behaviour ... 3 simple rules: 1. Birth 2. Survival 3. Death •Birth: a dead cell with exactly three live neighbours becomes a live cell •Survival: a live cell with two or three live neighbours stays alive •Death: in all other cases, a cell dies or remains dead (overcrowding or loneliness). 2 examples: - "blinker": a row of three live cells that oscillates between a horizontal and a vertical line, because the end cells die (only one live neighbour) while the cells above and below the centre are born (exactly three live neighbours) - "block" (still life): a 2×2 square of live cells that never changes. Patterns continuously change in time → we do not need complicated processes to reproduce patterns (see the sketch below)
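A minimal sketch of the three rules on a small grid, started from the "blinker" (the grid size is arbitrary and cells beyond the edge are simply treated as dead):

import numpy as np

def step(grid):
    # Apply the Game of Life rules once; grid is a 2D array of 0/1
    padded = np.pad(grid, 1)                               # cells beyond the edge are dead
    neighbours = sum(
        padded[1 + dy : padded.shape[0] - 1 + dy, 1 + dx : padded.shape[1] - 1 + dx]
        for dy in (-1, 0, 1) for dx in (-1, 0, 1) if (dy, dx) != (0, 0)
    )
    birth = (grid == 0) & (neighbours == 3)                # Birth rule
    survive = (grid == 1) & np.isin(neighbours, (2, 3))    # Survival rule
    return (birth | survive).astype(int)                   # Death: everything else

grid = np.zeros((5, 5), dtype=int)
grid[2, 1:4] = 1                                           # horizontal blinker
for _ in range(2):
    print(grid, "\n")
    grid = step(grid)                                      # oscillates vertical/horizontal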

How do we build a model?

some key questions •Building and using models involves a series of decisions and trade-offs -scale: grain vs. extent -causality: empiricism vs. mechanism? •We'll also explore -the key trade-off between simplicity and complexity/detail/realism -links between patterns and processes -Top-down or bottom-up?

Growth With Limit: discrete logistic model

the stochastic difference is in the term r_d (it becomes a random number each time step); cf. exponential models, which have no limit. Discrete logistic: N_t+1 = N_t + r_d N_t (1 - N_t/K), where K is the carrying capacity

discrete version of the exponential equation:

tells us the number added to the population per time-step: N_t+1 = N_t + r_d N_t, where r_d is the discrete growth factor
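A minimal sketch comparing the discrete exponential and discrete logistic models defined above (the starting population, r_d and K values are illustrative):

import numpy as np

def exponential(n0, r_d, steps):
    # Discrete exponential growth: N_t+1 = N_t + r_d * N_t (no limit)
    n = [n0]
    for _ in range(steps):
        n.append(n[-1] + r_d * n[-1])
    return np.array(n)

def logistic(n0, r_d, k, steps):
    # Discrete logistic growth: N_t+1 = N_t + r_d * N_t * (1 - N_t / k)
    # Growth slows as N approaches the carrying capacity k
    n = [n0]
    for _ in range(steps):
        n.append(n[-1] + r_d * n[-1] * (1 - n[-1] / k))
    return np.array(n)

print(exponential(10, 0.2, 40)[-1])     # keeps growing without limit
print(logistic(10, 0.2, 500, 40)[-1])   # levels off near K = 500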

