info gnuplot

3.6.2 short introduction

fit is used to find a set of parameters that ’best’ fits your data to your user-defined function. The fit is judged on the basis of the sum of the squared differences or ’residuals’ (SSR) between the input data points and the function values, evaluated at the same places. This quantity is often called ’chisquare’ (i.e., the Greek letter chi, to the power of 2). The algorithm attempts to minimize SSR, or more precisely, WSSR, as the residuals are ’weighted’ by the input data errors (or 1.0) before being squared; see ‘fit error_estimates‘ for details.

That’s why it is called ’least-squares fitting’. Let’s look at an example to see what is meant by ’non-linear’, but first we had better go over some terms. Here it is convenient to use z as the dependent variable for user-defined functions of either one independent variable, z=f(x), or two independent variables, z=f(x,y). A parameter is a user-defined variable that fit will adjust, i.e., an unknown quantity in the function declaration. Linearity/non-linearity refers to the relationship of the dependent variable, z, to the parameters which fit is adjusting, not of z to the independent variables, x and/or y. (To be technical, the second {and higher} derivatives of the fitting function with respect to the parameters are zero for a linear least-squares problem).

For linear least-squares (LLS), the user-defined function will be a sum of simple functions, not involving any parameters, each multiplied by one parameter. NLLS handles more complicated functions in which parameters can be used in a large number of ways. An example that illustrates the difference between linear and nonlinear least-squares is the Fourier series. One member may be written as

     z=a*sin(c*x) + b*cos(c*x).

If a and b are the unknown parameters and c is constant, then estimating values of the parameters is a linear least-squares problem. However, if c is an unknown parameter, the problem is nonlinear.

In the linear case, parameter values can be determined by comparatively simple linear algebra, in one direct step. However LLS is a special case which is also solved along with more general NLLS problems by the iterative procedure that ‘gnuplot‘ uses. fit attempts to find the minimum by doing a search. Each step (iteration) calculates WSSR with a new set of parameter values. The Marquardt-Levenberg algorithm selects the parameter values for the next iteration. The process continues until a preset criterion is met, either (1) the fit has "converged" (the relative change in WSSR is less than FIT_LIMIT), or (2) it reaches a preset iteration count limit, FIT_MAXITER (see variables). The fit may also be interrupted and subsequently halted from the keyboard (see fit). The user variable FIT_CONVERGED contains 1 if the previous fit command terminated due to convergence; it contains 0 if the previous fit terminated for any other reason.

Often the function to be fitted will be based on a model (or theory) that attempts to describe or predict the behaviour of the data. Then fit can be used to find values for the free parameters of the model, to determine how well the data fits the model, and to estimate an error range for each parameter. See ‘fit error_estimates‘.

Alternatively, in curve-fitting, functions are selected independent of a model (on the basis of experience as to which are likely to describe the trend of the data with the desired resolution and a minimum number of parameters*functions.) The fit solution then provides an analytic representation of the curve.

However, if all you really want is a smooth curve through your data points, the smooth option to ‘plot‘ may be what you’ve been looking for rather than fit.

This document was generated on November 19, 2011 using texi2html 5.0.