gmtregress(1) GMT gmtregress(1)

## NAME

gmtregress - Linear regression of 1-D data sets

## SYNOPSIS

gmtregress [ table ] [ -Amin/max/inc ] [ -Clevel ] [ -Ex|y|o|r ] [ -Fflags ] [ -N1|2|r|w ] [ -S[r] ] [ -Tmin/max/inc | -Tn ] [ -W[w][x][y][r] ] [ -V[level] ] [ -aflags ] [ -bbinary ] [ -dnodata ] [ -eregexp ] [ -ggaps ] [ -hheaders ] [ -iflags ] [ -oflags ]

Note: No space is allowed between the option flag and the associated arguments.

## DESCRIPTION

gmtregress reads one or more data tables [or stdin] and determines the best linear regression model y = a + b*x for each segment using the chosen parameters. The user may specify which data and model components should be reported. By default, the model will be evaluated at the input points, but alternatively you can specify an equidistant range over which to evaluate the model, or turn off evaluation completely. Instead of determining the best fit we can perform a scan of all possible regression lines (for a range of slope angles) and examine how the chosen misfit measure varies with slope. This is particularly useful when analyzing data with many outliers.

Note: If you actually need to work with log10 of x or y you can accomplish that transformation during read by using the -i option.
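The slope-angle scan described above can be illustrated with a minimal Python sketch. It is not GMT's implementation: it assumes a vertical misfit (as with -Ey) and the least-squares norm (as with -N2), for which the optimum intercept at a fixed slope b is the mean of y - b*x. The function name `scan_angles` and the sample data are hypothetical, and the default range stops short of -90 and +90 degrees, where the slope is infinite.

```python
import math


def scan_angles(x, y, amin=-89.0, amax=89.0, inc=1.0):
    """Return (angle, misfit, slope, intercept) for each trial slope angle.

    Assumption: vertical residuals and a mean-squared misfit.
    """
    n = len(x)
    steps = int(round((amax - amin) / inc))
    rows = []
    for i in range(steps + 1):
        angle = amin + i * inc
        slope = math.tan(math.radians(angle))
        # For a fixed slope, the least-squares intercept is mean(y - slope*x).
        intercept = sum(yi - slope * xi for xi, yi in zip(x, y)) / n
        misfit = sum((yi - intercept - slope * xi) ** 2
                     for xi, yi in zip(x, y)) / n
        rows.append((angle, misfit, slope, intercept))
    return rows


# Hypothetical noise-free data on the line y = 1 + 2*x.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [1.0 + 2.0 * xi for xi in x]

rows = scan_angles(x, y, inc=0.5)
best = min(rows, key=lambda row: row[1])  # row with the smallest misfit
print("best angle: %.1f degrees, slope %.4f" % (best[0], best[2]))
```

Plotting the misfit column against the angle column shows how sharply the minimum is defined, which is the kind of inspection the scan is meant to support.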

## REQUIRED ARGUMENTS

None

## OPTIONAL ARGUMENTS

table
One or more ASCII (or binary, see -bi[ncols][type]) data table file(s) holding a number of data columns. If no tables are given then we read from standard input. The first two columns are expected to contain the required x and y data. Depending on your -W and -E settings we may expect an additional 1-3 columns with error estimates of one or both of the data coordinates, and even their correlation.

-Amin/max/inc
Instead of determining a best-fit regression we explore the full range of regressions. Examine all possible regression lines with slope angles between min and max, using steps of inc degrees [-90/+90/1]. For each slope the optimum intercept is determined based on your regression type (-E) and misfit norm (-N) settings. For each segment we report the four columns angle, E, slope, intercept, for the range of specified angles. The best model parameters within this range are written into the segment header and reported in verbose mode (-V).

-Clevel
Set the confidence level (in %) to use for the optional calculation of confidence bands on the regression [95]. This is only used if -F includes the output column c.

-Ex|y|o|r
Type of linear regression, i.e., select the type of misfit we should calculate. Choose from x (regress x on y; i.e., the misfit is measured horizontally from data point to regression line), y (regress y on x; i.e., the misfit is measured vertically [Default]), o (orthogonal regression; i.e., the misfit is measured from data point orthogonally to nearest point on the line), or r (Reduced Major Axis regression; i.e., the misfit is the product of both vertical and horizontal misfits) [y].

-Fflags
Append a combination of the columns you wish returned; the output order will match the order specified. Choose from x (observed x), y (observed y), m (model prediction), r (residual = data minus model), c (symmetrical confidence interval on the regression; see -C for specifying the level), z (standardized residuals or so-called z-scores) and w (outlier weights 0 or 1; for -Nw these are the Reweighted Least Squares weights) [xymrczw]. As an alternative to evaluating the model, just give -Fp and we instead write a single record with the model parameters npoints xmean ymean angle misfit slope intercept sigma_slope sigma_intercept.

-N1|2|r|w
Selects the norm to use for the misfit calculation. Choose among 1 (L-1 measure; the mean of the absolute residuals), 2 (Least-squares; the mean of the squared residuals), r (LMS; the least median of the squared residuals), or w (RLS; Reweighted Least Squares: the mean of the squared residuals after outliers identified via LMS have been removed) [Default is 2]. Traditional regression uses L-2 while L-1 and in particular LMS are more robust in how they handle outliers. As alluded to, RLS implies an initial LMS regression which is then used to identify outliers in the data, assign these a zero weight, and then redo the regression using an L-2 norm.

-S[r]
Restricts which records will be output. By default all data records will be output in the format specified by -F. Use -S to exclude data points identified as outliers by the regression. Alternatively, use -Sr to reverse this and only output the outlier records.

-Tmin/max/inc | -Tn
Evaluate the best-fit regression model at the equidistant points implied by the arguments. If -Tn is given instead we will reset min and max to the extreme x-values for each segment and determine inc so that there are exactly n output values for each segment. To skip the model evaluation entirely, simply provide -T0.

-W[w][x][y][r]
Specifies weighted regression and which weights will be provided. Append x if giving 1-sigma uncertainties in the x-observations, y if giving 1-sigma uncertainties in y, and r if giving correlations between x and y observations, in the order these columns appear in the input (after the two required and leading x, y columns). Giving both x and y (and optionally r) implies an orthogonal regression, otherwise giving x requires -Ex and y requires -Ey. We convert uncertainties in x and y to regression weights via the relationship weight = 1/sigma. Use -Ww if we should interpret the input columns to have precomputed weights instead. Note: residuals with respect to the regression line will be scaled by the given weights. Most norms will then square this weighted residual (-N1 is the only exception).

-V[level] (more ...)
Select verbosity level [c].

-acol=name[...] (more ...)
Set aspatial column associations col=name.

-bi[ncols][t] (more ...)
Select native binary input.

-bo[ncols][type] (more ...)
Select native binary output. [Default is same as input].

-d[i|o]nodata (more ...)
Replace input columns that equal nodata with NaN and do the reverse on output.

-e[~]"pattern" | -e[~]/regexp/[i] (more ...)
Only accept data records that match the given pattern.

-g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
Determine data gaps and line breaks.

-h[i|o][n][+c][+d][+rremark][+ttitle] (more ...)
Skip or produce header record(s).

-icols[+l][+sscale][+ooffset][,...] (more ...)
Select input columns and transformations (0 is first column).

-ocols[,...] (more ...)
Select output columns (0 is first column).

-^ or just -
Print a short message about the syntax of the command, then exits (NOTE: on Windows just use -).

-+ or just +
Print an extensive usage (help) message, including the explanation of any module-specific option (but not the GMT common options), then exits.

-? or no arguments
Print a complete usage (help) message, including the explanation of all options, then exits.
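To make the -E misfit directions and the -N norms more concrete, here is a minimal Python sketch that evaluates them for one candidate line y = a + b*x. It is an illustration under stated assumptions, not GMT's code: the helper names `residuals` and `misfit` are hypothetical, only the x, y and o misfit types and the 1, 2 and r norms are covered, and the optional weights simply scale each residual before the norm is applied, following the -W note above.

```python
import math
from statistics import mean, median


def residuals(x, y, intercept, slope, kind="y"):
    """Signed misfit of each point relative to y = intercept + slope*x."""
    vertical = [yi - (intercept + slope * xi) for xi, yi in zip(x, y)]
    if kind == "y":   # measured vertically
        return vertical
    if kind == "x":   # measured horizontally: x - (y - a)/b (needs slope != 0)
        return [-v / slope for v in vertical]
    if kind == "o":   # measured orthogonally to the line
        return [v / math.sqrt(1.0 + slope * slope) for v in vertical]
    raise ValueError("kind must be 'x', 'y' or 'o'")


def misfit(res, norm="2", weights=None):
    """Collapse residuals into one number using the chosen norm."""
    if weights is not None:
        res = [w * r for w, r in zip(weights, res)]  # e.g. w = 1/sigma
    if norm == "1":   # L-1: mean of the absolute residuals
        return mean([abs(r) for r in res])
    if norm == "2":   # least squares: mean of the squared residuals
        return mean([r * r for r in res])
    if norm == "r":   # LMS: median of the squared residuals
        return median([r * r for r in res])
    raise ValueError("norm must be '1', '2' or 'r'")


# Hypothetical data on y = 1 + 2*x with one gross outlier in the last point.
x = [float(i) for i in range(10)]
y = [1.0 + 2.0 * xi for xi in x]
y[-1] = 100.0

res = residuals(x, y, intercept=1.0, slope=2.0, kind="y")
for norm in ("1", "2", "r"):
    print(norm, misfit(res, norm))
```

For the true line, the single outlier inflates the L-1 and especially the L-2 misfit, while the median-based LMS misfit ignores it, which is why LMS is described as the more robust choice.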

## ASCII FORMAT PRECISION

The ASCII output formats of numerical data are controlled by parameters in your gmt.conf file. Longitude and latitude are formatted according to FORMAT_GEO_OUT, absolute time is under the control of FORMAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point values are formatted according to FORMAT_FLOAT_OUT. Be aware that the format in effect can lead to loss of precision in ASCII output, which can lead to various problems downstream. If you find the output is not written with enough precision, consider switching to binary output (-bo if available) or specify more decimals using the FORMAT_FLOAT_OUT setting.
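The precision loss described above can be reproduced outside GMT with a short Python sketch. The format string "%.6g" is only an assumed example of a printf-style setting, not a statement about your gmt.conf, and the value is hypothetical; the comparison is between an ASCII round trip and a native 8-byte binary round trip.

```python
import struct

value = 1234567.891          # hypothetical data value

# ASCII round trip through a printf-style format with 6 significant digits.
text = "%.6g" % value
from_text = float(text)

# Binary round trip through a native 8-byte double.
packed = struct.pack("<d", value)
from_binary = struct.unpack("<d", packed)[0]

print(text, from_text == value, from_binary == value)
```

The ASCII copy differs from the original by more than a full unit, whereas the binary copy is bit-for-bit identical.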

## EXAMPLES

To do a standard least-squares regression on the x-y data in points.txt and return x, y, and model prediction with 99% confidence intervals, try

    gmt regress points.txt -Fxymc -C99 > points_regressed.txt

To just get the slope for the above regression, try

    slope=`gmt regress points.txt -Fp -o5`

To do a reweighted least-squares regression on the data rough.txt and return x, y, model prediction and the RLS weights, try

    gmt regress rough.txt -Nw -Fxymw > points_regressed.txt

To do an orthogonal least-squares regression on the data crazy.txt but first take the logarithm of both x and y, then return x, y, model prediction and the normalized residuals (z-scores), try

    gmt regress crazy.txt -Eo -Fxymz -i0-1l > points_regressed.txt

To examine how the orthogonal LMS misfits vary with angle between 0 and 90 in steps of 0.2 degrees for the same file, try

    gmt regress points.txt -A0/90/0.2 -Eo -Nr > points_analysis.txt
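When the full -Fp record is wanted rather than a single column, the nine values can be given names. The Python sketch below assumes the record arrives as one whitespace-separated line in the order listed under -F; the sample line, `ModelRecord` and `parse_model_record` are hypothetical and not part of GMT.

```python
from collections import namedtuple

# Column order as listed for -Fp; index 5 (zero-based) is the slope,
# which is what -o5 selects in the example above.
FIELDS = ("npoints", "xmean", "ymean", "angle", "misfit",
          "slope", "intercept", "sigma_slope", "sigma_intercept")
ModelRecord = namedtuple("ModelRecord", FIELDS)


def parse_model_record(line):
    """Turn one whitespace-separated -Fp style record into named fields."""
    values = [float(token) for token in line.split()]
    if len(values) != len(FIELDS):
        raise ValueError("expected %d columns, got %d"
                         % (len(FIELDS), len(values)))
    return ModelRecord(*values)


# Hypothetical record for five noise-free points on y = 1 + 2*x.
sample = "5 2 5 63.4349488 0 2 1 0 0"
rec = parse_model_record(sample)
print(rec.slope, rec.intercept)
```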

## REFERENCES

Draper, N. R., and H. Smith, 1998, Applied regression analysis, 3rd ed., 736 pp., John Wiley and Sons, New York.

Rousseeuw, P. J., and A. M. Leroy, 1987, Robust regression and outlier detection, 329 pp., John Wiley and Sons, New York.

York, D., N. M. Evensen, M. L. Martinez, and J. De Basabe Delgado, 2004, Unified equations for the slope, intercept, and standard errors of the best straight line, Am. J. Phys., 72(3), 367-375.

## SEE ALSO

gmt(1), trend1d(1), trend2d(1)

## COPYRIGHT

2017, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe

5.4.2 Jun 24, 2017 gmtregress(1)

gmt5 5.4.2 - Generated Wed Jun 28 16:32:36 CDT 2017