gmtregress(1)                         GMT                        gmtregress(1)




NAME

       gmtregress - Linear regression of 1-D data sets


SYNOPSIS

       gmtregress  [  table ] [  -Amin/max/inc ] [  -Clevel ] [  -Ex|y|o|r ] [
       -Fflags ] [  -N1|2|r|w ] [   -S[r]  ]  [   -Tmin/max/inc  |   -Tn  ]  [
       -W[w][x][y][r] ] [  -V[level] ] [ -aflags ] [ -bbinary ] [ -dnodata ] [
       -eregexp ] [ -ggaps ] [ -hheaders ] [ -iflags ] [ -oflags ]

       Note: No space is allowed between the option flag  and  the  associated
       arguments.


DESCRIPTION

       gmtregress  reads one or more data tables [or stdin] and determines the
       best linear regression model y = a + b*x for each segment using the
       chosen  parameters.   The  user may specify which data and model compo-
       nents should be reported.  By default, the model will be  evaluated  at
       the  input  points,  but  alternatively  you can specify an equidistant
       range over which to evaluate the model, or  turn  off  evaluation  com-
       pletely.   Instead of determining the best fit we can perform a scan of
       all possible regression lines (for a range of slope angles) and examine
       how  the chosen misfit measure varies with slope.  This is particularly
       useful when analyzing data with many outliers.  Note: If  you  actually
       need  to  work with log10 of x or y you can accomplish that transforma-
       tion during read by using the -i option.


REQUIRED ARGUMENTS

       None


OPTIONAL ARGUMENTS

       table  One or more ASCII (or binary, see -bi[ncols][type])  data  table
              file(s) holding a number of data columns. If no tables are given
              then we read from standard input.  The  first  two  columns  are
              expected  to  contain  the  required x and y data.  Depending on
              your -W and -E settings we may expect an additional 1-3  columns
              with error estimates of one or both of the data coordinates, and
              even their correlation.

       -Amin/max/inc
              Instead of determining a best-fit regression we explore the full
              range  of  regressions.   Examine  all possible regression lines
              with slope angles between  min  and  max,  using  steps  of  inc
              degrees  [-90/+90/1].   For  each slope the optimum intercept is
              determined based on your regression type (-E)  and  misfit  norm
              (-N)  settings.   For  each  segment  we report the four columns
              angle, E, slope, intercept, for the range of  specified  angles.
              The best model parameters within this range are written into the
              segment header and reported in verbose mode (-V).
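
              The scan can be pictured with a small stand-alone sketch.  It
              is not GMT's implementation: it assumes the vertical misfit
              (-Ey) with the L-2 norm (-N2), for which the optimum intercept
              at a fixed slope b is mean(y) - b*mean(x), and it uses made-up
              data lying exactly on y = 1 + 2x.  The columns mimic angle, E,
              slope, intercept.

```shell
# Made-up sample data lying exactly on y = 1 + 2x (illustration only).
data='0 1
1 3
2 5
3 7'

# Scan slope angles from 0 to 80 degrees in 1-degree steps.  For each angle
# compute slope = tan(angle), the L-2 optimum intercept, and the misfit E
# (here the mean of the squared vertical residuals).
scan=$(printf '%s\n' "$data" | awk '
    { x[NR] = $1; y[NR] = $2; sx += $1; sy += $2 }
    END {
        pi = atan2(0, -1)
        for (ang = 0; ang <= 80; ang++) {
            b = sin(ang * pi / 180) / cos(ang * pi / 180)
            a = sy / NR - b * sx / NR
            E = 0
            for (i = 1; i <= NR; i++) { r = y[i] - (a + b * x[i]); E += r * r }
            printf "%d %.6f %.6f %.6f\n", ang, E / NR, b, a
        }
    }')

# Keep the angle whose misfit (column 2) is smallest.
best=$(printf '%s\n' "$scan" |
       awk 'NR == 1 || $2 < min { min = $2; best = $1 } END { print best }')
echo "best angle: $best"
```

              For these data the smallest misfit falls at 63 degrees, the
              whole-degree angle closest to atan(2), which is about 63.4
              degrees.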

       -Clevel
              Set the confidence level (in %) to use for the optional calcula-
              tion  of  confidence bands on the regression [95].  This is only
              used if -F includes the output column c.

       -Ex|y|o|r
              Type of linear regression, i.e., select the type  of  misfit  we
              should calculate.  Choose from x (regress x on y; i.e., the mis-
              fit is measured  horizontally  from  data  point  to  regression
              line),  y  (regress  y on x; i.e., the misfit is measured verti-
              cally [Default]), o (orthogonal regression; i.e., the misfit  is
              measured  from  data  point orthogonally to nearest point on the
              line), or r (Reduced Major Axis regression; i.e., the misfit  is
              the product of both vertical and horizontal misfits) [y].
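
              As a rough illustration of the four misfit definitions, the
              sketch below evaluates them for a single point and a fixed
              line.  The line (a = 1, b = 2) and the point (1, 5) are
              hypothetical, and reporting the magnitude of the RMA product
              is an assumption of this sketch rather than something stated
              above.

```shell
# Hypothetical line y = a + b*x and one data point (x0, y0); illustration only.
misfits=$(awk -v a=1 -v b=2 -v x0=1 -v y0=5 'BEGIN {
    ey = y0 - (a + b * x0)        # -Ey: vertical distance to the line
    ex = x0 - (y0 - a) / b        # -Ex: horizontal distance to the line
    eo = ey / sqrt(1 + b * b)     # -Eo: perpendicular distance to the line
    er = ex * ey                  # -Er: product of the two misfits
    if (er < 0) er = -er          # assumed: report the magnitude only
    printf "%.4f %.4f %.4f %.4f\n", ey, ex, eo, er
}')
echo "$misfits"
```

              The orthogonal misfit is always the smallest in magnitude,
              since it is measured along the shortest path to the line.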

       -Fflags
              Append  a combination of the columns you wish returned; the out-
              put order  will  match  the  order  specified.   Choose  from  x
              (observed  x), y (observed y), m (model prediction), r (residual
              = data minus model), c (symmetrical confidence interval  on  the
              regression;  see  -C  for specifying the level), z (standardized
              residuals or so-called z-scores) and w (outlier weights 0 or  1;
              for -Nw these are the Reweighted Least Squares weights)
              [xymrczw].  As an alternative to evaluating the model, just give -Fp
              and  we  instead write a single record with the model parameters
              npoints xmean ymean angle  misfit  slope  intercept  sigma_slope
              sigma_intercept.

       -N1|2|r|w
              Selects  the  norm  to  use  for the misfit calculation.  Choose
              among 1 (L-1 measure; the mean of  the  absolute  residuals),  2
              (Least-squares;  the mean of the squared residuals), r (LMS; the
              least median of the squared residuals), or  w  (RLS;  Reweighted
              Least  Squares: the mean of the squared residuals after outliers
              identified via LMS have been removed) [Default  is  2].   Tradi-
              tional  regression  uses L-2 while L-1 and in particular LMS are
              more robust in how they handle outliers.   As  alluded  to,  RLS
              implies an initial LMS regression which is then used to identify
              outliers in the data, assign these a zero weight, and then  redo
              the regression using an L-2 norm.
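
              To see why LMS is less sensitive to outliers, the following
              sketch computes the first three measures for one fixed set of
              residuals.  The residual values are made up, with the last one
              standing in for an outlier; this only illustrates the
              definitions above and is not how GMT evaluates them
              internally.

```shell
# Made-up residuals; the last one plays the role of an outlier.
res='1
-1
2
-2
10'

# L-1: mean of the absolute residuals.
l1=$(printf '%s\n' "$res" |
     awk '{ s += ($1 < 0 ? -$1 : $1) } END { printf "%.2f", s / NR }')
# L-2: mean of the squared residuals.
l2=$(printf '%s\n' "$res" |
     awk '{ s += $1 * $1 } END { printf "%.2f", s / NR }')
# LMS: median of the squared residuals (odd count, so the middle value).
lms=$(printf '%s\n' "$res" | awk '{ print $1 * $1 }' | sort -n |
      awk '{ v[NR] = $1 } END { printf "%.2f", v[(NR + 1) / 2] }')
echo "L1=$l1 L2=$l2 LMS=$lms"
```

              The single outlier dominates the L-2 value, raises the L-1
              value noticeably, and leaves the median of the squared
              residuals unchanged.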

       -S[r]  Restricts  which  records  will  be output.  By default all data
              records will be output in the format specified by -F.  Use -S to
              exclude  data  points  identified as outliers by the regression.
              Alternatively, use -Sr to reverse this and only output the  out-
              lier records.

       -Tmin/max/inc | -Tn
              Evaluate the best-fit regression model at the equidistant points
              implied by the arguments.  If -Tn is given instead we will reset
              min  and max to the extreme x-values for each segment and deter-
              mine inc so that there are exactly n output values for each seg-
              ment.   To  skip  the  model evaluation entirely, simply provide
              -T0.
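
              A minimal sketch of the -Tn spacing, assuming that both
              extreme x-values are included so that inc = (max - min)/(n -
              1).  That formula and the sample extremes are assumptions made
              for illustration; the text above only states that exactly n
              values result.

```shell
# Hypothetical segment extremes and point count (illustration only).
xmin=0; xmax=10; n=5

# Assumed spacing: inc = (xmax - xmin) / (n - 1), so both extremes are kept.
points=$(awk -v lo="$xmin" -v hi="$xmax" -v n="$n" 'BEGIN {
    inc = (hi - lo) / (n - 1)
    for (i = 0; i < n; i++) printf "%g\n", lo + i * inc
}')
echo "$points"
```

              Under that assumption, -T5 on a segment spanning 0 to 10 would
              evaluate the model at the same x-values as -T0/10/2.5.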

       -W[w][x][y][r]
              Specifies weighted regression and which  weights  will  be  pro-
              vided.  Append x if giving 1-sigma uncertainties in the x-obser-
              vations, y if giving 1-sigma uncertainties in y, and r if giving
              correlations  between  x  and y observations, in the order these
              columns appear in the input (after the two required and  leading
              x,  y  columns).  Giving both x and y (and optionally r) implies
              an orthogonal regression, otherwise giving x requires -Ex and  y
              requires -Ey.  We convert uncertainties in x and y to regression
              weights via the relationship weight = 1/sigma.  Use -Ww if we
              should interpret the input columns to have precomputed
              weights instead.  Note: residuals with respect to the regression
              line  will be scaled by the given weights.  Most norms will then
              square this weighted residual (-N1 is the only exception).
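
              The effect of scaling residuals by weight = 1/sigma before
              squaring can be sketched for the simplest case, uncertainties
              in y only with the L-2 norm.  The data are made up and the
              closed-form weighted least-squares solution used here is a
              textbook stand-in, not GMT's own code.

```shell
# Made-up records: x, y, sigma_y.  The last point is off-trend but uncertain.
data='0 1 0.5
1 3 0.5
2 5 0.5
3 20 2'

# Fit y = a + b*x by minimising the sum of squared scaled residuals
# ((y - a - b*x) / sigma)^2; pass 1 as the argument to ignore the sigmas.
fit() {
    printf '%s\n' "$data" | awk -v unit="$1" '
        { w = unit ? 1 : 1 / ($3 * $3)
          sw += w; swx += w * $1; swy += w * $2
          swxx += w * $1 * $1; swxy += w * $1 * $2 }
        END { b = (sw * swxy - swx * swy) / (sw * swxx - swx * swx)
              a = (swy - b * swx) / sw
              printf "%.4f %.4f\n", b, a }'
}

weighted=$(fit 0)      # slope and intercept using weight = 1/sigma
unweighted=$(fit 1)    # slope and intercept with equal weights
echo "weighted:   $weighted"
echo "unweighted: $unweighted"
```

              Because the off-trend point carries a large sigma, the
              weighted slope stays much closer to the trend of the first
              three points than the unweighted one does.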

       -V[level] (more ...)
              Select verbosity level [c].

       -acol=name[...] (more ...)
              Set aspatial column associations col=name.

       -bi[ncols][t] (more ...)
              Select native binary input.

       -bo[ncols][type] (more ...)
              Select native binary output. [Default is same as input].

       -d[i|o]nodata (more ...)
              Replace input columns that equal nodata with NaN and do the
              reverse on output.

       -e[~]"pattern" | -e[~]/regexp/[i] (more ...)
              Only accept data records that match the given pattern.

       -g[a]x|y|d|X|Y|D|[col]z[+|-]gap[u] (more ...)
              Determine data gaps and line breaks.

       -h[i|o][n][+c][+d][+rremark][+ttitle] (more ...)
              Skip or produce header record(s).

       -icols[+l][+sscale][+ooffset][,...] (more ...)
              Select input columns and transformations (0 is first column).

       -ocols[,...] (more ...)
              Select output columns (0 is first column).

       -^ or just -
              Print  a  short  message  about  the syntax of the command, then
              exits (NOTE: on Windows just use -).

       -+ or just +
              Print an extensive usage (help) message, including the  explana-
              tion  of  any  module-specific  option  (but  not the GMT common
              options), then exits.

       -? or no arguments
              Print a complete usage (help) message, including the explanation
              of all options, then exits.


ASCII FORMAT PRECISION

       The ASCII output formats of numerical data are controlled by parameters
       in your gmt.conf file. Longitude and latitude are  formatted  according
       to FORMAT_GEO_OUT, absolute time is under the control of
       FORMAT_DATE_OUT and FORMAT_CLOCK_OUT, whereas general floating point
       values are formatted according to FORMAT_FLOAT_OUT. Be aware that the
       format in effect can lead to loss of precision in ASCII output, which can
       lead  to  various  problems  downstream.  If you find the output is not
       written with enough precision, consider switching to binary output (-bo
       if  available) or specify more decimals using the FORMAT_FLOAT_OUT set-
       ting.
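
       The loss of precision can be previewed without running GMT, since
       FORMAT_FLOAT_OUT takes a C-style format of the kind awk's printf
       also accepts.  The value and the two formats below are arbitrary
       choices made for illustration.

```shell
# A made-up value with more digits than a short format can carry.
value=1234.56789012345

# Format the same number with 4 and with 12 significant digits.
short=$(awk -v v="$value" 'BEGIN { printf "%.4g", v }')
long=$(awk -v v="$value" 'BEGIN { printf "%.12g", v }')
echo "%.4g  -> $short"
echo "%.12g -> $long"
```

       With only four significant digits the fractional part is lost
       entirely, which is the kind of downstream problem the paragraph
       above warns about.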


EXAMPLES

       To do a standard least-squares regression on the x-y data in points.txt
       and  return  x,  y, and model prediction with 99% confidence intervals,
       try

              gmt regress points.txt -Fxymc -C99 > points_regressed.txt

       To just get the slope for the above regression, try

              slope=`gmt regress points.txt -Fp -o5`

       To do a reweighted least-squares regression on the data  rough.txt  and
       return x, y, model prediction and the RLS weights, try

              gmt regress rough.txt -Nw -Fxymw > points_regressed.txt

       To  do an orthogonal least-squares regression on the data crazy.txt but
       first take the logarithm of both x and y, then return x, y, model  pre-
       diction and the normalized residuals (z-scores), try

              gmt regress crazy.txt -Eo -Fxymz -i0-1+l > points_regressed.txt

       To examine how the orthogonal LMS misfits vary with angle between 0 and
       90 in steps of 0.2 degrees for the data in points.txt, try

              gmt regress points.txt -A0/90/0.2 -Eo -Nr > points_analysis.txt


REFERENCES

       Draper, N. R., and H. Smith, 1998,  Applied  regression  analysis,  3rd
       ed., 736 pp., John Wiley and Sons, New York.

       Rousseeuw,  P. J., and A. M. Leroy, 1987, Robust regression and outlier
       detection, 329 pp., John Wiley and Sons, New York.

       York, D., N. M. Evensen, M. L. Martinez,  and  J.  De  Basabe  Delgado,
       2004,  Unified  equations for the slope, intercept, and standard errors
       of the best straight line, Am. J. Phys., 72(3), 367-375.


SEE ALSO

       gmt(1), trend1d(1), trend2d(1)


COPYRIGHT

       2017, P. Wessel, W. H. F. Smith, R. Scharroo, J. Luis, and F. Wobbe



5.4.2                            Jun 24, 2017                    gmtregress(1)

gmt5 5.4.2 - Generated Wed Jun 28 16:32:36 CDT 2017