The Chemist | Journal of the American Institute of Chemists

TABLE OF CONTENTS

EDITORIAL

ARTICLE #1 -
RECENT ADVANCES IN CARBOHYDRATE CHEMISTRY AND APPLICATION

ARTICLE #2 -
MARINE MICROBIAL SIDEROPHORES: REACTIVITY AND STRUCTURAL DIVERSITY

ARTICLE #3 -
SURFACE CHEMISTRY AND BIOMOLECULE DENSITY IMPACT ADSORBED CELLULASE ACTIVITY

ARTICLE #4 -
ECO-FRIENDLY METHOD OF SYNTHESIS CeO₂ NANOPARTICLES BY WATERCRESS PLANT EXTRACT FOR REMOVAL OF CIBACRON RED DYE FROM AQUEOUS SOLUTIONS

ARTICLE #5 -
ESTIMATION OF NONLINEAR REGRESSION PARAMETERS PRECISION

ARTICLE #6 -
ESTIMATION OF SERUM TUMOR MARKERS AND SOME BIOCHEMICAL PARAMETERS OF BREAST CANCER PATIENTS

ARTICLE #7 -
FORMULATION AND SUSTAINED-RELEASE OF VERAPAMIL HYDROCHLORIDE TABLETS

ARTICLE #8 -
INTERVIEW: A CONVERSATION WITH EMMA SAGARESE

The Chemist Volume 94 | Number 1

ISSN 1945-0702

Estimation of Nonlinear Regression Parameters Precision

Prasanth Sambaraju

Independent Researcher
Area of Chemistry: Chemistry Education

(E-mail: prasanth.sambaraju@gmail.com)

Abstract: The aim of this paper is to estimate the parameter precision values in nonlinear regression using Microsoft Excel. In case of simple linear regression, these values can be obtained from regression analysis in Excel. Commercially available programs output these values, but the main limitation is that they force the users to adopt a black box approach. The procedure described here can be adopted to any nonlinear dataset assuming that parameters obtained from solver accurately describe the nonlinear regression model. For one of the nonlinear datasets from NIST Statistical reference datasets, the parameters are initially estimated by using Solver and then by using Finite differences method parameter precision values are calculated. The results obtained were found to be comparable with the reported values.

Key Words: Nonlinear regression, solver, least squares, perturbation

Introduction

Regression analysis deals with the relationship between one or more independent variables and a dependent variable. Regression analysis is performed by selecting a suitable function which accurately describes the relationship between the two and estimator to calculate the parameter values. These regression parameters are calculated by least squares method. In linear regression [1], either the parameters are linear or the function describing the model is linear. Nonlinear regression methods are applied when the relationship between dependent and independent variables in is not linear. Nonlinear regression relies on iterative procedure to find the best fit. The process [2] starts with initial values for each parameter, and then by using least squares fitting, the best fit parameter values which minimizes the sum of squared residuals are estimated. Microsoft Excel Solver Add-in estimates parameters for nonlinear regression by least squares method, but it doesn't estimate their precision. The paper aims to use finite difference method to estimate these values.

Nonlinear Regression

A model is considered to be nonlinear if any of the partial derivatives with respect to any of the model parameters are dependent on any other model parameters, any of the derivatives do not exit or are discontinuous. A nonlinear model can be expressed as (equation 1):

where y is the vector of responses, f is the function used to describe the model, θ is vector of model parameters, t is the predictor variable and ε is the vector of residuals [3].

Rat43 dataset from National Institute of Standards and Technology (NIST) website [4], was selected to estimate the parameter precision by finite difference method. This dataset was selected, as the data can be fitted by using nonlinear regression method and is included in the higher level of difficulty group.

The nonlinear regression model described by equation 2 was used to fit the data, where response variable (y) is the dry weight of onion bulbs, whereas predictor variable (t) is growing time [4]. The model parameters were estimated by using both starting values and the results are given in Table 1.

Table 1. Rat43 Dataset Values

Solver Implementation to Estimate Parameter Values (Table 2)

a) Data is entered in Excel, growing time (t) in column A and onion bulb dry weight (y) in column B.
b) Initial values for b1, b2, b3 and b4 given in NIST website were used [4].
c) By substituting different values of t, initial values of b1, b2, b3 and b4 in equation (2) different calculated y values were obtained.
d) Sum of squared errors (SSE) was obtained by adding the squared differences between y and y calculated value. SUMXMY2 function in Microsoft Excel was used to calculate the SSE value. In the parameters worksheet, cells B2 to B16 contains y values, cells C2 to C16 contains y calculated values which were obtained by using start 1 initial estimates. SSE value was obtained by using the formula “SUMXMY2 (C2:C16, B2:B16)”.
e) Best fitted parameter values were obtained by minimizing the SSE value using generalized reduced gradient (GRG) nonlinear method.
f) The process was also repeated by using second set of start values as given in the NIST website [4].
g) Results obtained from both start values were in agreement with the reported values [4].

Table 2. Model Parameter Results Obtained After Least Square Fitting Using Excel Solver

Finite Difference Method

For a function with two or more independent variables, the partial derivative of that function with respect to a particular variable is the derivative of that function with respect to that variable, while holding the other variables constant [5]. The partial derivative term of each data point can by calculated by numerical differentiation. The parameter term (b1) is varied by a small amount from its optimized value while the other parameters terms are held constant. This variation of a parameter by a small amount is called perturbation. The partial derivatives are calculated by using the formula in Equation 7 [6] for all data points. Then the process is repeated for all the parameter terms in the model [6]. The Jacobian matrix (J) is constructed from all these partial derivative terms and is given by the following equation (3) [7], where m is the number of nonlinear parameters.

The parameter precision can be estimated from the matrix of partial derivatives (E), which is obtained by multiplying transpose of J with itself (E = JT * J) [7] and is given in equation 4.

The square roots of the diagonal elements of was obtained from solver implementation the above E matrix inverse when multiplied and df is the degrees of freedom (difference by root residual standard deviation (RSD) between number of observation and number yields the precision for the respective parameter (equations 5 and 6) [7], where SSE value of parameters in the model).

The individual partial derivatives for the J matrix can be calculated from the following equations (7) [7], where δ (perturbation) can be 10-6 or 10-7 [7]. The above calculations are performed in Microsoft Excel to obtain the parameter precision values for all the parameters.

Results and Discussion

The partial derivative calculations for the first parameter using parameter values obtained from Start 1 values are shown in Figure 1. Values in Column A9 to B23 corresponds to the raw data, values in C9 to C23 correspond to calculated values obtained after solver optimization. For values in D9 to D23, to the first parameter (β1) δ (10-6) was added and then its corresponding y (independent variable) values are calculated. The values in E9 to E23 are calculated as per equation 7. The calculations are repeated for all the parameters to obtain the partial derivative matrix (J), as given in equation 3.

Figure 1. Excel Sheet Showing Partial Derivative Calculation One Parameter (β1)

*** to view the complete Excel file, go to the appendix ***

The Jacobian matrix in cells N9 to N23 are obtained by pasting the values of E9 to E23 for first parameter. The values O9 to Q23 in the matrix are obtained similarly from the other three parameters, and the resulting matrix is shown in Figure 2.

Figure 2. Jacobian Matrix

The result in Figure 3 is obtained by taking inverse of matrix E. The first parameter precision is obtained by multiplying the square root of value in N28 and RSD value (equation 8) [7].

Figure 3. Inverse of Matrix E

The results obtained were comparable with the reported values [4] and are given in Table 3.

Table 3. Comparison Standard Deviation of Parameter Estimate Values

Conclusions

The parameter precision results obtained from the finite difference method were comparable to the reported values. Finite difference method offers a convenient approach to estimate nonlinear regression parameter precision values. Even though the method is robust in calculating precision values, limitations due to solver, like lack of convergence during optimization or parameter values inaccurately describing the model, can be encountered. For example, in the case of MGH10 dataset (which is available in the NIST website), the parameter values obtained after solver optimization are incorrect, which in turn leads to inaccurate parameter precision values.

References

Frank IE. Chemom. Intell. Lab. Syst., . 1995, 27, 1-9.
Kemmer G, Keller S. Nat. Protoc., 2010, 5, 267-281.
Peter LB in Pharmacokinetic-Pharmacodynamic Modeling and Simulation, 2nd ed., Springer, New York, 2011, ch. 3, pp 101-130.
NIST StRD Nonlinear regression. Accessed 2022 Jul 14. https://www.itl.nist.gov/div898/strd/nls/nls_main.shtml
Billo EJ in Excel® for Scientist and Engineers Numerical Methods, 1st ed., Wiley New Jersey, 2007, ch. 12, pp 263-286.
Billo EJ in Excel® for Chemists: A Comprehensive Guide, 3rd ed., Wiley, New Jersey, 2011, ch. 15, pp 463-488.
Kirkup L in Data Analysis for Physical Scientists: Featuring Excel®, 2nd ed., Cambridge University Press, Cambridge, 2012, ch. 8, pp 335-381.

APPENDIX