UNIVERSITY OF CALIFORNIA
CALIFORNIA AGRICULTURAL EXPERIMENT STATION
BULLETIN 866
CAEBAD 866 1-24 (1974)

Useful procedures for estimating total volume (or value) from a given timber sale when not every truckload of logs is scaled are presented. The efficiency of these procedures is compared, using sample data from three timber sales at Jackson State Forest. No significant difference was found between the various ratio estimators and the simple regression estimator. Using a regression of volume on weight and number of logs per load reduced the sample size necessary for a given level of precision, but the cost of counting the logs in each load may be too high to make this practical.

JUNE, 1974

THE AUTHOR
Lee C. Wensel is Associate Professor in the School of Forestry and Conservation and in the Agricultural Experiment Station, Berkeley.

ESTIMATORS FOR USE IN WEIGHT SCALING OF SAWLOGS¹

INTRODUCTION

Interest in the use of weight as a basis for estimating board-foot scale (or value) of loads of sawlogs is increasing in California as well as in the rest of the United States. As used here, weight scaling involves obtaining the weight of every truckload but scaling only a fraction of these loads selected at random. The advantages of weight scaling over the previously used method of scaling all loads of sawlogs in a sale include (1) more accurate estimates because greater care can be exercised in assessing defect on the few loads scaled; (2) reduced hauling costs because weighing generally takes less than a few minutes per truckload; (3) reduced labor costs because fewer scalers need to be employed and weighmasters need not be as highly trained; (4) reduced accounting costs because fewer scale tickets need to be handled; and (5) an improved basis for paying trucking costs.
The purpose of this study is to present several estimation procedures that can be used to estimate the total volume (or value) from a given timber sale when not every truckload of logs is scaled. Statistical efficiency of these procedures is compared by using sample data from three timber sales at Jackson State Forest, Fort Bragg, California.

The procedures described here are limited to simple random sampling (SRS) and to ratio and regression estimators making use of weight or the number of logs per load on all loads, or both. Measurement of other variables, such as the total length of all logs, average diameter, moisture content, species composition, etc., might also improve the prediction of total volume or value from the sale. However, little is known concerning techniques for incorporating this information into the sampling process without incurring prohibitive measurement costs. As experience with sample scaling of sawlogs in California increases, additional refinements of the estimation procedures will certainly come about.

ESTIMATION PROCEDURES

We wish to estimate the total volume (or value), $V$, of the logs taken from a particular timber sale in $N$ truckloads. Page 4 gives summaries of the equations for $\hat\mu_v$, the estimate of the average volume per truckload, and $S_e^2$, the estimate of the associated error variance, for each estimation procedure considered. The estimate $\hat V$ of the total volume of the sale over all $N$ truckloads is simply

\[ \hat V = N\hat\mu_v \tag{0.1} \]

The variance of this estimator is given by²

\[ \operatorname{var}(\hat V) = N^2 \operatorname{var}(\hat\mu_v) = \frac{N(N-n)}{n}\, S_e^2 \tag{0.2} \]

where the values of $\hat\mu_v$ and $S_e^2$ are obtained using the appropriate equations given on page 4, "Summary of Estimation Equations."

¹ Submitted for publication May 14, 1973.
² The finite population correction (fpc) used in this context is also used by Johnson et al. (1963) and developed in papers by Tikkiwal (1960) and Tin (1965).
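Equations (0.1) and (0.2) can be illustrated with a minimal Python sketch. The function names are hypothetical, chosen only for this illustration, and the figures are those of the bulletin's ten-load computational example under simple random sampling (mean of 5,600 board feet per load, error variance 2,420,000/3).

```python
import math

def total_estimate(N, mean_per_load):
    """Equation (0.1): estimated total = N times the estimated mean per load."""
    return N * mean_per_load

def total_variance(N, n, s2_e):
    """Equation (0.2): var(V_hat) = N(N - n)/n * S_e^2 (includes the fpc)."""
    return N * (N - n) / n * s2_e

# Illustrative inputs: N = 10 loads in the sale, n = 4 loads scaled.
N, n = 10, 4
mean_per_load = 5600.0      # board feet per load
s2_e = 2_420_000 / 3        # error variance, (board feet)^2

v_hat = total_estimate(N, mean_per_load)
var_v = total_variance(N, n, s2_e)
se_v = math.sqrt(var_v)     # standard error of the estimated total
print(v_hat, var_v, se_v)
```

Any of the estimators summarized on page 4 can be substituted by passing its own mean and error variance to the same two functions.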
Confidence intervals on the estimate of the total volume (or value) are given for large samples by

\[ \left[\hat V - z\sqrt{\operatorname{var}(\hat V)},\ \hat V + z\sqrt{\operatorname{var}(\hat V)}\right] \tag{0.3} \]

where $z$ is the appropriate percentage point of the standard normal distribution ($z = 1.96$ for 95 per cent confidence intervals). For small samples, $z$ is replaced in the above expression by the appropriate value of Student's $t$.

Summary of estimation equations

Estimator type | Mean $\hat\mu_v$ | Sample variance $S_e^2$ | Degrees of freedom
Simple random sampling | $\bar V$ | $S_v^2$ | $n-1$
Ratio estimators | $\hat R\,\mu_w$ | $S_v^2 + \hat R^2 S_w^2 - 2\hat R S_{vw}$ | $n-1$
  ratio of means | $\hat R = r = \bar V/\bar W$ | |
  mean of ratios* | $\hat R = \bar r = \frac{1}{n}\sum V_i/W_i$ | |
  unbiased mean of ratios | $\hat R = r' = \bar r + \frac{(N-1)n}{N(n-1)}\,\frac{\bar V - \bar r\bar W}{\mu_w}$ | |
Regression estimators: | | |
  volume on weight | $\bar V + b(\mu_w - \bar W)$ | $\frac{n-1}{n-2}\,S_v^2(1 - r_{vw}^2)$ | $n-2$
  volume on number of logs | $\bar V + b'(\mu_L - \bar L)$ | $\frac{n-1}{n-2}\,S_v^2(1 - r_{vL}^2)$ | $n-2$
  volume on weight and number of logs | $\bar V + b_1(\mu_w - \bar W) + b_2(\mu_L - \bar L)$ | $\frac{n-1}{n-3}\,S_v^2(1 - R_{v\cdot wL}^2)$ | $n-3$

*In the case of the mean of ratios estimator $\bar r$ we can also compute $S_e^2$ ($n-1$ degrees of freedom) by $\mu_w^2 S_r^2$, where $S_r^2 = \sum (r_i - \bar r)^2/(n-1)$ and $r_i = V_i/W_i$.

In all cases the variance of the estimator of the mean is given by $\left(\frac{N-n}{Nn}\right)S_e^2$ and of the total by $\frac{N(N-n)}{n}\,S_e^2$.

Definition of symbols

$N$   Total number of truckloads in sale.
$n$   Number of truckloads in the sample to be scaled.
$V_i$, $W_i$, $L_i$   Volume (or value), weight, and number of logs, respectively, in the $i$th sample load ($i = 1, 2, \ldots, n$).
$\bar V$, $\bar W$, $\bar L$   Sample averages for the $n$ randomly selected values of $V_i$, $W_i$, and $L_i$, respectively.
$\mu_v$, $\mu_w$, $\mu_L$   Average volume (or value), weight, and number of logs for all $N$ truckloads.
$\hat\mu_v$   Estimator of $\mu_v$ for the estimator chosen.
$S_v^2$, $S_w^2$, $S_L^2$   Sample variances for $V_i$, $W_i$, and $L_i$, respectively.
$S_{vw}$, $S_{vL}$   Sample covariances for $(V_i, W_i)$ and $(V_i, L_i)$, respectively.
$r_{vw}$, $r_{vL}$   Sample correlation coefficients for $(V_i, W_i)$ and $(V_i, L_i)$, respectively.
$R_{v\cdot wL}^2$   Multiple correlation coefficient for $V_i$ on $W_i$ and $L_i$.
$b$, $b'$   Simple regression coefficients for the regression of $V_i$ on $W_i$ and $V_i$ on $L_i$, respectively.
$b_1$, $b_2$   Multiple regression coefficients for the regression of $V_i$ on $W_i$ and $L_i$.

SIMPLE RANDOM SAMPLING

In the simplest case, that of simple random sampling (SRS), $\hat\mu_v$ is simply the arithmetic average $\bar V$ given by

\[ \bar V = \frac{1}{n}\sum_{i=1}^{n} V_i \tag{1.1} \]

and $S_e^2$ is the sample variance given by

\[ S_e^2 = S_v^2 = \frac{\sum v_i^2}{n-1} = \frac{\sum V_i^2 - n\bar V^2}{n-1} \tag{1.2} \]

where the lower-case $v_i = V_i - \bar V$ denotes a deviation from the sample mean (and likewise $w_i$ and $l_i$ below).

Using the sample computational data below, we obtain $\hat V_{SRS}$, the SRS estimate of the total volume of the 10 loads of logs, as

\[ \hat V_{SRS} = N\hat\mu_v = N\bar V = (10\ \text{loads})(5{,}600\ \text{board feet per load}) = 56{,}000\ \text{board feet.} \]

The variance of this estimate is obtained by first computing

\[ S_e^2 = S_v^2 = \frac{\sum V_i^2 - n\bar V^2}{n-1} = \frac{127{,}860{,}000 - 4(5600)^2}{3} = 806{,}667\ (\text{board feet})^2 \]

then

\[ \operatorname{var}(\hat V_{SRS}) = \frac{N(N-n)}{n}\,S_e^2 = \frac{10(10-4)}{4}(806{,}667) = 12{,}100{,}000\ (\text{board feet})^2 \]

and the standard error is thus

\[ \sqrt{\operatorname{var}(\hat V_{SRS})} = \sqrt{12{,}100{,}000} = 3478\ \text{board feet.} \]

With Student's $t$ with $(n-1)$ degrees of freedom, $t_{3(.95)}$, equal to 3.182, the 95 per cent confidence interval on the total volume of the 10 loads becomes

\[ \left[\hat V \pm t\sqrt{\operatorname{var}(\hat V)}\right] \]

or
[56,000 ± (3.182)(3478)]
[56,000 ± 11,067]
[44,933, 67,067] board feet.

Sample data for computational example

Load       Weight      Logs       Volume          Ratio
i          W_i         L_i        V_i             R_i
(number)   (pounds)    (number)   (board feet)    (board feet per pound)
  1        51,000      28         -               -
 *2        43,600      42         4,600           0.10550
  3        68,000      18         -               -
 *4        60,000      14         6,200           0.10333
  5        52,000      24         -               -
 *6        49,400      40         5,100           0.10324
  7        56,000      10         -               -
 *8        63,000       8         6,500           0.10317
  9        48,000      25         -               -
 10        54,000      21         -               -

Population (N = 10)
  Sum      545,000     240        -               -
  Mean      54,500      24        -               -
*Sample (n = 4)
  Sum      216,000     104        22,400          0.41524
  Mean      54,000      26         5,600          0.10381

Sample variances

\[ S_v^2 = \frac{\sum V_i^2 - n\bar V^2}{n-1} = \frac{\sum v_i^2}{n-1} = \frac{2{,}420{,}000}{3} = 806{,}667 \]
\[ S_w^2 = \frac{\sum W_i^2 - n\bar W^2}{n-1} = \frac{\sum w_i^2}{n-1} = \frac{246{,}320{,}000}{3} = 82{,}106{,}667 \]
\[ S_L^2 = \frac{\sum L_i^2 - n\bar L^2}{n-1} = \frac{\sum l_i^2}{n-1} = \frac{920}{3} = 306.67 \]
\[ S_R^2 = \frac{\sum R_i^2 - n\bar R^2}{n-1} = \frac{3.8210 \times 10^{-6}}{3} = 1.2737 \times 10^{-6} \]

Sample covariances

\[ S_{vw} = \frac{\sum V_iW_i - n\bar V\bar W}{n-1} = \frac{\sum v_iw_i}{n-1} = \frac{24{,}400{,}000}{3} = 8{,}133{,}333 \]
\[ S_{vL} = \frac{\sum V_iL_i - n\bar V\bar L}{n-1} = \frac{\sum v_il_i}{n-1} = \frac{-46{,}400}{3} = -15{,}466.67 \]
\[ S_{wL} = \frac{\sum W_iL_i - n\bar W\bar L}{n-1} = \frac{\sum w_il_i}{n-1} = \frac{-464{,}800}{3} = -154{,}933.33 \]

Sample correlation coefficients

\[ r_{vw} = \frac{\sum v_iw_i}{\sqrt{\sum v_i^2 \sum w_i^2}} = \frac{244 \times 10^5}{\sqrt{(242 \times 10^4)(24632 \times 10^4)}} = 0.99938 \]
\[ r_{vL} = \frac{\sum v_il_i}{\sqrt{\sum v_i^2 \sum l_i^2}} = \frac{-46{,}400}{\sqrt{(242 \times 10^4)(920)}} = -0.98337 \]
\[ r_{wL} = \frac{\sum w_il_i}{\sqrt{\sum w_i^2 \sum l_i^2}} = \frac{-464{,}800}{\sqrt{(246{,}320{,}000)(920)}} = -0.97639 \]

RATIO ESTIMATES

If $R$ is the ratio of the total volume to the total weight on the sale, i.e.,

\[ R = \frac{V}{W} \]

we can obtain a ratio estimate $\hat V_R$ of the total volume by first estimating $R$ by $\hat R$ and then multiplying by the total weight as follows:

\[ \hat V_R = W\hat R \]

where $\hat R$ is used here to imply any of the ratio estimators described below. Using the computational data on page 6, three methods of estimating $R$ will be considered.

The first estimator is $r$, the ratio of means estimator (Wensel, 1973):

\[ \hat R = r = \frac{\bar V}{\bar W} = \frac{5{,}600\ \text{board feet}}{54{,}000\ \text{pounds}} = 0.10370\ \text{board feet per pound} \tag{2.1} \]

and

\[ \hat V_r = Wr = (545{,}000\ \text{pounds})(0.10370\ \text{board feet per pound}) = 56{,}517\ \text{board feet.} \]

The mean of ratios estimator $\bar r$ is calculated as

\[ \hat R = \bar r = \frac{1}{n}\sum_{i=1}^{n}\frac{V_i}{W_i} = \frac{0.41524}{4} = 0.10381\ \text{board feet per pound} \tag{2.2} \]

and

\[ \hat V_{\bar r} = W\bar r = (545{,}000\ \text{pounds})(0.10381\ \text{board feet per pound}) = 56{,}576\ \text{board feet.} \]

Both of the ratio estimators given above are biased (Raj, 1968). However, the bias of $\bar r$ can be estimated unbiasedly by

\[ b(\bar r) = \frac{(N-1)n}{(n-1)N}\,\frac{\bar V - \bar r\bar W}{\mu_w} \]

yielding the unbiased ratio estimator

\[ r' = \bar r + b(\bar r) \tag{2.3} \]

For the example,

\[ b(\bar r) = \frac{(10-1)4}{(4-1)10}\cdot\frac{5{,}600 - (0.10381)(54{,}000)}{54{,}500} = -0.00013 \]

giving $r' = 0.10381 - 0.00013 = 0.10368$ board feet per pound and

\[ \hat V_{r'} = (545{,}000\ \text{pounds})(0.10368\ \text{board feet per pound}) = 56{,}506\ \text{board feet.} \]

Raj (1968, p. 235) also states that for large samples the bias of $r$ is approximately $(1/n)$ times that of $\bar r$. While this might seem to favor the ratio of means estimator $r$ over the mean of ratios estimator $\bar r$, experience in this study showed the estimators $r$, $\bar r$, and $r'$ to be almost identical and therefore not really separable on the basis of bias. Certainly the ratio of means estimator $r$ is much easier to calculate.

The error variance estimator $S_e^2$ of the ratio estimator is given approximately by Johnson et al.
(1963) and Raj (1968):³

\[ S_e^2 = S_v^2 + \hat R^2 S_w^2 - 2\hat R S_{vw}. \tag{2.4} \]

Using the ratio of means estimator $r = 0.10370$ for $\hat R$ we obtain

\[ S_e^2 = 806{,}667 + (0.10370)^2(82{,}106{,}667) - 2(0.10370)(8{,}133{,}333) = 2763 \]

giving a sample variance of

\[ \operatorname{var}(\hat V_R) = \frac{N(N-n)}{n}\,S_e^2 = \frac{10(10-4)}{4}\,2763 = 41{,}445 \]

and a standard error of

\[ \sqrt{\operatorname{var}(\hat V_R)} = \sqrt{41{,}445} = 204\ \text{board feet.} \]

³ In the case of the ratio of means estimator, $r$, an equivalent form for $S_e^2$ is given by using the uncorrected sums of squares and crossproducts. This form, given by Cochran (1963, p. 163), yields the same value as that computed using equation (2.4).

SIMPLE REGRESSION ESTIMATORS

Regression of volume on weight.
Ratio estimators can be used effectively when the relationship between weight and volume is a straight line passing through the origin. However, if the line does not pass through the origin, a regression estimator will produce more precise estimates. While the actual relationship between volume and weight does go through the origin (zero weight means zero volume), the truckload weights are usually more than 40,000 pounds. Thus there does not appear to be any reason to restrict the model at the origin. Further, it is important to point out that, contrary to the usual regression situation, the weights are not fixed in repeated sampling.

If $b$ is the estimate of the slope of the regression line of volume on weight, i.e.,

\[ b = \frac{\sum v_iw_i}{\sum w_i^2} = \frac{24{,}400{,}000}{246{,}320{,}000} = 0.099058\ \text{board feet per pound} \]

then the regression estimator of the total volume of $N = 10$ loads is given by

\[ \hat V_{v\cdot w} = N[\bar V + b(\mu_w - \bar W)] = 10[5600 + 0.099058\,(54{,}500 - 54{,}000)] \approx 56{,}500\ \text{board feet.} \tag{3.1} \]

The variance of this estimator is estimated by

\[ \operatorname{var}(\hat V_{v\cdot w}) = \frac{N(N-n)}{n}\,S_e^2 \]

where in this case⁴

\[ S_e^2 = \left(\frac{n-1}{n-2}\right)S_v^2(1 - r_{vw}^2) = \left(\frac{4-1}{4-2}\right)806{,}667\,[1 - (0.99938)^2] = 1500 \tag{3.2} \]

and thus

\[ \operatorname{var}(\hat V_{v\cdot w}) = \frac{10(10-4)}{4}\,1500 = 22{,}500. \]

The standard error is then given by

\[ \sqrt{\operatorname{var}(\hat V_{v\cdot w})} = \sqrt{22{,}500} = 150\ \text{board feet.} \]

Regression of volume on number of logs.
Where it is not possible to obtain the weight of each truckload of logs it may still be possible to obtain the number of logs per load. In such cases one could use the simple regression estimator

\[ \hat V_{v\cdot L} = N[\bar V + b'(\mu_L - \bar L)] \tag{3.3} \]

where $b'$, the slope of the regression line relating volume and number of logs, is given by

\[ b' = \frac{\sum v_il_i}{\sum l_i^2} = \frac{-46{,}400}{920} = -50.435 \]

Thus

\[ \hat V_{v\cdot L} = 10\{5600 + (-50.435)[24 - 26]\} \approx 57{,}010\ \text{board feet} \]

and

\[ S_e^2 = \left(\frac{n-1}{n-2}\right)S_v^2(1 - r_{vL}^2) = \left(\frac{4-1}{4-2}\right)806{,}667\,[1 - (-0.98337)^2] = 39{,}910 \]

yielding a standard error of

\[ \sqrt{\frac{10(10-4)}{4}\,(39{,}910)} = 774\ \text{board feet.} \]

⁴ Equation (3.2) can be expressed as: $S_e^2 = \dfrac{\sum v_i^2 - b\sum v_iw_i}{n-2}$.

MULTIPLE REGRESSION ESTIMATOR

It seems unlikely that a truckload consisting of only three logs would contain the same usable volume as a load of 20 logs weighing the same amount, so it appears useful to include the number of logs in the estimation procedure. For most truckloads the number of logs can be obtained by the weighmaster when the trucks are weighed. Loads having too many logs to count can be estimated separately, using only weight sampling.

Including the number of logs per load in our regression model we obtain the estimator

\[ \hat V_{v\cdot wL} = N[\bar V + b_1(\mu_w - \bar W) + b_2(\mu_L - \bar L)] \tag{4.1} \]

where $\bar L$ and $\mu_L$ are the average number of logs for the loads scaled and for all loads, respectively. The coefficients $b_1$ and $b_2$ are obtained by solving the following two equations simultaneously:

\[ b_1\sum w_i^2 + b_2\sum w_il_i = \sum v_iw_i \]
\[ b_1\sum w_il_i + b_2\sum l_i^2 = \sum v_il_i \]

Using the data from the sample problem we have

\[ b_1(246{,}320{,}000) + b_2(-464{,}800) = 24{,}400{,}000 \]
\[ b_1(-464{,}800) + b_2(920) = -46{,}400 \]

which yields $b_1 = 0.083333$ and $b_2 = -8.3333$. Using equation (4.1), the estimate of the total volume of the ten truckloads then becomes

\[ \hat V_{v\cdot wL} = 10\{5600 + (0.083333)[54{,}500 - 54{,}000] + (-8.3333)[24 - 26]\} \approx 56{,}580\ \text{board feet.} \]

The variance of $\hat V_{v\cdot wL}$ is estimated by

\[ \operatorname{var}(\hat V_{v\cdot wL}) = \frac{N(N-n)}{n}\,S_e^2 \]

where in this case⁵

\[ S_e^2 = \left(\frac{n-1}{n-3}\right)S_v^2(1 - R_{v\cdot wL}^2) \]

and $R_{v\cdot wL}$ is the multiple correlation coefficient and $R_{v\cdot wL}^2$ is given by

\[ R_{v\cdot wL}^2 = \frac{b_1\sum v_iw_i + b_2\sum v_il_i}{\sum v_i^2} = \frac{(0.083333)(24{,}400{,}000) + (-8.3333)(-46{,}400)}{2{,}420{,}000} = 0.999996 \]

Therefore,

\[ S_e^2 = \left(\frac{4-1}{4-3}\right)806{,}667\,(1 - 0.999996) = 9.68 \]

and

\[ \operatorname{var}(\hat V_{v\cdot wL}) = \frac{10(10-4)}{4}\,9.68 = 145 \]

yielding a standard error of $\sqrt{\operatorname{var}(\hat V_{v\cdot wL})} = 12$ board feet.

⁵ This equation can also be expressed as: $S_e^2 = \dfrac{\sum v_i^2 - b_1\sum v_iw_i - b_2\sum v_il_i}{n-3}$.

Estimates and standard errors for various estimation equations are summarized as follows:

Estimator              Estimated volume    Standard error
                       (board feet)        (board feet)
$\hat V_{SRS}$         56,000              3478
$\hat V_r$             56,576               204
$\hat V_{v\cdot w}$    56,500               150
$\hat V_{v\cdot L}$    57,010               774
$\hat V_{v\cdot wL}$   56,580                12

The example above is given only to demonstrate computations involved in obtaining the various estimates. Relative efficiencies of these estimators are examined below by using actual data.

SAMPLE SIZE COMPUTATION

Assuming that we want the confidence interval to be $\pm E$ per cent of the total volume in the sale, then the sample size $n$ is given by

\[ n = \frac{z^2(CV)^2}{E^2 + \dfrac{z^2(CV)^2}{N}} \]

where $z$ is the value of the standard normal deviate for the appropriate level of confidence ($z = 1.96$ for a 95 per cent confidence interval), $CV$ is the expected coefficient of variation of the estimator, and $N$ is the anticipated number of truckloads of logs that will come from the sale.

Example of sample size calculation

estimated total volume      11 MM board feet
estimated volume per load   5500 board feet per load
N                           = 2000 loads
CV                          = 12 per cent
E                           = ±2 per cent
confidence level            = 95 per cent
z                           = 1.96

\[ n = \frac{(1.96)^2(12)^2}{2^2 + \dfrac{(1.96)^2(12)^2}{2000}} = 129\ \text{truckloads} \]

The value of $E$ that is selected depends upon the user. The California Division of Forestry (CDF) requires a precision of ±2 per cent at the 95 per cent level of confidence on all check scales. Thus it seems appropriate to use the same precision here.

The value of $N$ can be estimated by dividing the anticipated total volume on the sale by the expected volume per truckload:

\[ N = \frac{\text{estimated total volume on sale area}}{\text{estimated average volume per truckload}}. \]
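The sample size computation can be sketched in a few lines of Python. The function names are hypothetical, and the formula coded is the one implied by the worked example (the squared normal deviate times the squared CV, divided by E squared plus that same quantity over N); the result is left unrounded so the user can choose how to round.

```python
def loads_in_sale(total_volume_bf, volume_per_load_bf):
    """Anticipated number of truckloads N from estimated sale and load volumes."""
    return total_volume_bf / volume_per_load_bf

def sample_size(cv_pct, e_pct, N, z=1.96):
    """Loads to scale for a confidence interval of +/- e_pct per cent.

    cv_pct is the expected coefficient of variation (per cent) of the
    estimator and z the standard normal deviate for the confidence level.
    """
    zcv2 = (z * cv_pct) ** 2
    return zcv2 / (e_pct ** 2 + zcv2 / N)

# Figures from the example: 11 MM board feet at 5,500 board feet per load.
N = loads_in_sale(11_000_000, 5_500)
n = sample_size(cv_pct=12, e_pct=2, N=N)
print(N, round(n))
```

Because N enters only through the finite population term, moderate errors in the anticipated number of loads change the answer by only a load or two.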
Because gross volume is less variable than net volume, in terms of expected volume per truckload and total volume of the sale, it is probably better for calculating $N$.

The standard normal deviate is used here in place of Student's $t$ because the anticipated sample size is quite large. For small samples, Student's $t$ would be used. The coefficient of variation in per cent is given by 100 times the ratio of the standard deviation divided by the estimate (e.g., in the case of simple random sampling)

\[ CV = \frac{S_e}{\bar V}\cdot 100 \]

where $S_e^2$ is given by the appropriate equation on page 4, "Summary of Estimation Equations."

Table 1 shows that the sample size is not very sensitive to inaccuracies in estimating $N$. However, if the actual number of truckloads that come from a particular sale is less than the number estimated, it may not be possible to take all of the samples planned. To guard against this possibility, it is advisable to use a "lower bound" for the estimate of the total volume to be cut on the sale. If the sale overcuts and $N'$ truckloads of logs result instead of $N$, additional samples should be taken from the last $(N' - N)$ loads with the approximate sampling fraction $n/N$. This will make the sample representative of all the loads taken from the sale and not just the first $N$. After the sale is completed $N$ is redefined to be the actual total number of loads of logs that result, and this value is then used in calculating the final confidence interval for the sale.

If you want to change the intensity of the sampling at any time during the sale, it would be necessary to deal with the two portions of the population of truckloads as two "strata" in a stratified sample. One should be careful, however, not to recognize "unnecessary" strata, as this can reduce the over-all efficiency of the sampling effort.

Table 1
SAMPLE SIZE: NINETY-FIVE PER CENT CONFIDENCE INTERVALS OF ±2 PER CENT

CV                      N
(per cent)     1800    2000    2200
11              109     110     110
12              128     129     130
13              149     150     151
14              171     172     173

SELECTION OF SAMPLE LOADS

Once the number of samples is known, there are a number of ways in which one can decide which truckloads are to be scaled. It is important that some kind of random procedure be used to determine which loads are to be scaled. A subjective selection method could seriously bias the estimates.

The procedure used will undoubtedly vary, depending upon the constraints imposed by the individual situation. However, the procedure used here involves choosing truckload numbers at random ($n$ out of $N$ without replacement) and then having the weighmaster, at the time he weighs the truckload, paint the word "sample" on the truckloads that are selected for scaling. The truckloads are then unloaded and set aside for scaling in the log yard at the mill. (An example of this random selection process is given on page 14.)

One problem with a completely random sample is that samples can tend to bunch up, creating periods of heavy scaling activity with resultant loss of operating efficiency in the mill yard. In order to retain an element of randomness while still distributing the scaling effort more uniformly in time, we could use a restricted random sample (page 14). Here we would choose one load at random from each group of $k$ truckloads, where $k = N/n$. We could also choose more than one sample from each group, with an appropriate reduction in the number of groups. Ready-made lists with tear-off tabs are available to facilitate this type of sample selection.

Random selection of truckloads to be scaled

Assume that we are to select a sample of 5 truckloads out of a total of 85. We first consult a table of random digits (two digits at a time), and going through the table systematically from a random starting point we obtain the numbers 07, 40, 91, 17, 86, 77, 40, and 76. Numbers 91 and 86 are greater than 85 so they are discarded.
Also 40 appears twice so we delete the second 40, leaving the numbers, in increasing order, 07, 17, 40, 76, and 77. Now as each truckload is weighed, it is numbered sequentially. When truckloads 7, 17, 40, 76, and 77 are weighed they are marked for scaling.

Restricted random selection of truckloads to be scaled

If we select 5 truckloads out of a total of 85, we might divide the loads into 5 groups of 17 elements each and select one element at random from each group. Thus, we select 5 random numbers between 1 and 17 from a table of random numbers. These numbers tell us which element in the respective group will be included in the sample. For example, the random numbers (between 1 and 17) 07, 16, 11, 05, and 11 mean that the 7th load from the first group, 16th from the second, 11th from the third, 5th from the fourth, and 11th from the fifth are sampled. This means that loads 7, 33, 45, 56, and 79 will be sampled.

Note that the random numbers selected here are with replacement, and although the same random number may appear more than once the truckloads are still sampled without replacement.

COMPARISON OF ESTIMATORS

The following comparisons are based upon sample data from three 1971 timber sales at Jackson State Forest. The 631 loads used here as a sampling population are the loads that were selected for scaling as well as weighing.

The personnel at Jackson State Forest have been involved in weight scaling since 1964 (Burns 1970) and, based upon data collected between 1964 and 1967, two sales in 1968 were sold on a weight-scaling basis. Initially all weight sales were in young-growth timber, but satisfactory results with old-growth stands (Chamber Creek #6) show that it can be more generally applied.

The following are sales analyzed in this report. (Descriptions of these stands were prepared by Mr. David Burns, California Division of Forestry.)

Watershed #1 was a sale of 10,331 M board feet from 249 acres in the Caspar Creek compartment.
The timber harvested was 80- to 90-year-old second growth having a species composition as follows: redwood 43.1 per cent, Douglas fir 40.2 per cent, grand fir 12.5 per cent, western hemlock 3.5 per cent, and bishop pine 0.7 per cent. Average log size was about 250 board feet. Defect and breakage was 11.7 per cent. A heavy selection cutting system removed about 60 per cent of the stand volume.

Caspar Creek #13 was a sale of 7,830 M board feet from 192 acres. It was also in the Caspar Creek drainage; however, the average age of the timber was between 75 and 85 years. About 68 per cent of the merchantable volume was harvested using group and single tree selection. Species composition of the timber removed was as follows: redwood 67.7 per cent, Douglas fir 27.7 per cent, grand fir 0.7 per cent, western hemlock 1.4 per cent, and bishop pine 2.5 per cent. Defect and breakage of scaled logs was 8.9 per cent. Average log size was about 225 board feet.

Chamber Creek #6 was a sale of 8,782 M board feet from 798 acres in the Chamberlain Creek drainage. This timber was residual old-growth left after the original logging in the early 1940's. Because of the two-storied stand, all trees over 23 inches d.b.h. were harvested. Species composition of the harvested timber was as follows: redwood 75.1 per cent, Douglas fir 24.6 per cent, and grand fir and western hemlock 0.3 per cent. Average defect and breakage was 13.6 per cent. Average log size was about 450 board feet.

Table 2 gives a summary of the data for these three sales. The coefficients of variation that resulted from the use of the various estimation equations given above are listed in table 3. The sample sizes that would be necessary to achieve a 95 per cent confidence interval of ±2 per cent are given in table 4.
Table 2
DATA FOR THREE SALES SOLD ON A WEIGHT BASIS ON JACKSON STATE FOREST

            Average     Average                             Volume divided          Mean volume divided
            weight      number      Volume                  by weight               by mean weight
            per truck   of logs     (board feet)            (board feet per pound)  (board feet per pound)
Item        (lb)        per truck   Gross      Net          Gross      Net          Gross      Net

Watershed #1 (n = 231)
Mean        54515       17.1        5624       4973         0.10334    0.09127      0.10316    0.09121
CV (%)        6.8       44.9        12.4       15.3         12.0       14.2

Chamber Creek #6 (n = 224)
Mean        54009       11.6        5625       4863         0.10420    0.09007      0.10415    0.09005
CV (%)        7.1       31.4        13.5       15.3         12.0       13.9

Caspar Creek #13 (n = 176)
Mean        53364       17.2        5404       4828         0.10127    0.09045      0.10126    0.09047
CV (%)        5.7       43.6        14.7       15.8         13.5       14.6

Combined (n = 631)
Mean        54015       15.2        5563       4893         0.10307    0.09062      0.10299    0.09059
CV (%)        6.7       46.2        13.5       15.5         12.5       14.2

Table 3
COEFFICIENTS OF VARIATION OF NET VOLUME OBTAINED USING THE VARIOUS ESTIMATION EQUATIONS*

                                                        Sale area
Estimation procedure                              #1      #6      #13     Combined
                                                          (per cent)
Simple random sampling     $\hat V_{SRS}$         15.3    15.3    15.8    15.5
Ratio                      $\hat V_r$             14.2    13.6    14.6    14.1
Simple regression:
  Volume on weight         $\hat V_{v\cdot w}$    14.1    13.6    14.6    14.1
  Volume on no. of logs    $\hat V_{v\cdot L}$    15.1    14.5    14.7    15.1
Multiple regression        $\hat V_{v\cdot wL}$   13.7    12.7    13.7    13.5

*Here the term "coefficient of variation" is used quite loosely to mean the ratio of $S_e$ to average net volume per load.

Table 4
SAMPLE SIZE REQUIRED FOR 95 PER CENT CONFIDENCE INTERVALS WITHIN ±2 PER CENT OF TOTAL NET VOLUME (N = 2000)

                                                        Sale area
Estimation procedure                              #1      #6      #13     Combined
                                                          (number of loads)
Simple random sampling     $\hat V_{SRS}$         208     209     111     214
Ratio                      $\hat V_r$             182     168     191     180
Simple regression:
  Volume on weight         $\hat V_{v\cdot w}$    181     168     191     180
  Volume on number of logs $\hat V_{v\cdot L}$    206     190     196     205
Multiple regression        $\hat V_{v\cdot wL}$   172     149     162     168

Regression of volume on weight and number of logs

The estimator with the smallest CV was the multiple regression estimator $\hat V_{v\cdot wL}$ that utilizes both the weight of the load and the number of logs.
This estimator was used in studies by Freeman (1962) and Blair (1965), although these authors did not compare it with any other estimators. In this study, the estimator $\hat V_{v\cdot wL}$ required 5, 13, and 18 per cent fewer samples for the same level of precision than did the ratio estimator for the sales #1, #6, and #13, respectively.

Counting the number of logs on each truckload is not expected to add much to the time that most trucks would lose in being weighed. Thus, unless special problems exist in counting the number of logs from a particular sale, $\hat V_{v\cdot wL}$ is the estimator to be preferred over the others tested.

The regression equation for the total net volume for each of the sales is as follows:

Watershed #1
\[ \hat V_{v\cdot wL} = N[\bar V + 0.088247\,(\mu_w - \bar W) - 21.898\,(\mu_L - \bar L)] \]
Caspar Creek #13
\[ \hat V_{v\cdot wL} = N[\bar V + 0.088698\,(\mu_w - \bar W) - 54.029\,(\mu_L - \bar L)] \]
Chamber Creek #6
\[ \hat V_{v\cdot wL} = N[\bar V + 0.10194\,(\mu_w - \bar W) - 38.289\,(\mu_L - \bar L)] \]
Combined
\[ \hat V_{v\cdot wL} = N[\bar V + 0.092603\,(\mu_w - \bar W) - 26.942\,(\mu_L - \bar L)] \]

Regression of volume on weight versus ratio estimates

The regression estimator $\hat V_{v\cdot w}$ would be expected to be no more precise than the ratio estimators if the linear relationship between volume and weight passes through the origin, i.e., zero volume has zero weight. While this should not be of overriding concern since we are not dealing with log loads much below 50,000 pounds, there was no real difference between the CV (coefficient of variation) of the regression estimator $\hat V_{v\cdot w}$ and that of the ratio estimators (table 3). Therefore the estimator $\hat V_{v\cdot w}$ is not to be preferred over the ratio estimators, which are a little easier to calculate.

There was no difference between the CV's for the various ratio estimators. This can be attributed to the near equality of the ratios. As table 2 shows, the difference between the ratio of the means $r$ and the mean of the ratios $\bar r$ was less than 0.2 per cent for all sales.
If these ratios are not equal (which might occur in sales with a wider range of tree size or more defect) the ratio closest to the slope of the regression line of volume on weight would produce the more precise estimate (Raj, 1968, p. 98).

The unbiased mean of ratios estimator, $r'$, showed the same properties as the other ratio estimators because the bias correction involves the term $(\bar V - \bar r\bar W)$, which is zero when $\bar r = r$ (since $r = \bar V/\bar W$). In any event, Raj (1968, p. 235) states that the relationship between the bias of $r$, $B(r)$, and that of $\bar r$ is given by

\[ B(r) \doteq \frac{B(\bar r)}{n} \]

That is, the bias of $r$ is less than that of $\bar r$ and decreases with increasing sample size. Thus, of the three ratio estimators the ratio of means estimator $r = \bar V/\bar W$ is to be preferred.

Basing their conclusions upon observations of 236 loads of hemlock in the Pacific Northwest, Johnson et al. (1963) estimated that it took about 18 per cent more samples when using the simple random sampling estimator $\hat V_{SRS}$ than it did for the ratio of means estimator $\hat V_r$ for the same level of precision. After examining costs associated with weighing, Johnson concluded that the estimator $\hat V_{SRS}$ was more than four times as efficient as the ratio estimator, which involved weight measurements.

As Johnson suggests, the decision to use weight scaling rather than sample scaling without weights is not a statistical one; rather, it is an administrative decision based upon concepts of log inventory control. Also, weight is perhaps the ideal measure to use for paying trucking costs.

Counting the number of logs in each load and using the estimator $\hat V_{v\cdot L}$ reduced the number of samples required in the three sales by 1, 10, and 13 per cent, respectively. Thus in some cases it appears that it may be worthwhile to use this estimator over the simple random sampling estimator $\hat V_{SRS}$.

Value of weighed timber

The above estimation procedures have stressed the estimation of volume without regard to species.
Since the price paid for timber usually varies by species, it is necessary to apportion the total volume to the various species groups represented. The procedure used by CDF to determine the value of the timber weighed is as follows (Jackson State Forest weight scaling memorandum, undated mimeo):

• Determine the per cent volume by species group for the loads scaled.
• Apply this percentage to the total estimated volume of the sale to get the estimated total volume by species.
• Multiply the price bid for each species group times the estimated volume of that species and sum to obtain the total estimated value of the sale.

This method of determining the value of weighed timber has the advantage of being relatively easy to compute. However, no expressions are available to evaluate the standard error of these estimates by species, so that it is impossible to place confidence limits on estimates by species.

A more direct approach would be to estimate value directly by using value in place of volume in the estimation equations developed above. The value of each sample load is determined by summing the products of volume per species times the price for that species.

Table 5 gives coefficients of variation for the various estimators of value; table 6 gives sample sizes necessary to achieve a 95 per cent confidence interval of ±2 per cent. Note the tremendous increase in number of samples required when estimating value, as compared to the number required when estimating volume. This is a direct result of the large difference in value per M board feet of the various species. Also these values differ greatly from sale to sale depending upon the relative amount of each species present and the anticipated logging costs involved. Prices are also influenced by the market and the speculative nature of the bidding procedure.

Table 5
COEFFICIENTS OF VARIATION OF VALUE OBTAINED USING THE VARIOUS ESTIMATION EQUATIONS.
                                                        Sale area
Estimation procedure                              #1      #6      #13
                                                       (per cent)
Simple random sampling     $\hat V_{SRS}$         33.2    21.9    36.4
Ratio                      $\hat V_r$             32.4    20.8    36.0
Simple regression:
  Volume on weight         $\hat V_{v\cdot w}$    32.4    20.8    36.0
  Volume on number of logs $\hat V_{v\cdot L}$    32.5    21.3    35.8
Multiple regression        $\hat V_{v\cdot wL}$   31.2    20.1    35.4

Table 6
SAMPLE SIZES REQUIRED FOR 95 PER CENT CONFIDENCE INTERVALS WITHIN ±2 PER CENT OF TOTAL VALUE (N = 2000).

                                                        Sale area
Estimation procedure                              #1      #6      #13
                                                    (number of loads)
Simple random sampling     $\hat V_{SRS}$         710     388     796
Ratio                      $\hat V_r$             689     357     787
Simple regression:
  Volume on weight         $\hat V_{v\cdot w}$    689     357     787
  Volume on number of logs $\hat V_{v\cdot L}$    691     370     781
Multiple regression        $\hat V_{v\cdot wL}$   655     336     770

While one could hardly generalize from one sale only, it is worth noting that the old-growth sale, Chamber Creek #6, required 15 per cent fewer samples to estimate total net volume, and only about one half the samples to estimate total value, as the other two stands. This appears to support the idea that it may be possible to use weight scaling in old-growth timber.

STRATIFICATION TO INCREASE PRECISION

In order to provide confidence intervals on the volume by species, or simply to reduce the total amount of variation experienced in estimating total value of the sale, some kind of stratification of the sample loads is necessary. In the case of stratification by species this would mean that the truckloads would be made up by species (or species groups) and that the weighmaster would record the species at the time that he weighs the truckload. While this kind of landing sort may be practical in some cases, it cannot be done in all cases. Where this can be done practically, it can greatly increase the precision of the estimate of value of the sale.

An alternative procedure might be to stratify on the basis of the value of the load as estimated by the weighmaster.
For sales in which there is a wide difference between the values of each species, the values per load tend to fall into two general classes, the very low value loads being distinct from the higher value loads. As an example, let us agree to place all those loads with estimated values less than $100 into one stratum and those with values greater than or equal to $100 into a second stratum. All truckloads would be separated by the weighmaster, and those loads that are selected for scaling will have their actual values computed.

Whether stratifying on species or on value groups, the truckloads cannot be separated into strata before sampling. Thus this becomes a post-stratification problem (Cochran, 1963, p. 136). Let \(N_i\), \(\hat{\mu}_{vi}\), and \(S_{ei}^2\) be the total number of truckloads, the estimated mean volume (or value) per truckload, and the error variance for the \(i\)th stratum, respectively. Then the estimated mean volume or value per truckload for all truckloads is given by (s for stratified)

\[ \hat{\mu}_{vs} = \sum_i \frac{N_i}{N}\,\hat{\mu}_{vi} \]

where \(N = \sum_i N_i\) is the total number of truckloads on the sale. The expected variance over repeated sampling is estimated by (assuming proportional allocation of samples to the strata)

\[ \mathrm{var}(\hat{\mu}_{vs}) = \frac{N - n}{Nn} \sum_i \frac{N_i}{N}\,S_{ei}^2 + \frac{1}{n^2} \sum_i \left(1 - \frac{N_i}{N}\right) S_{ei}^2 \]

The example of post stratification shown below uses data from the Chamber Creek #13 sale. For this sale the post stratification reduced the variance on the estimate of total value by about 75 per cent.
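The post-stratified mean and its variance can be sketched in a few lines of Python. This is only an illustrative sketch, assuming the standard post-stratification form with proportional allocation (Cochran, 1963); the `Stratum` class and function names are hypothetical, and the stratum figures are those of the Chamber Creek #13 example.

```python
from dataclasses import dataclass
from math import sqrt


@dataclass
class Stratum:
    n_loads: int      # N_i, truckloads falling in the stratum
    mean: float       # estimated mean value per load, dollars
    variance: float   # error variance S_ei^2


def post_stratified_mean(strata):
    """Weight each stratum mean by its share N_i / N of all truckloads."""
    total = sum(s.n_loads for s in strata)
    return sum(s.n_loads / total * s.mean for s in strata)


def post_stratified_variance(strata, n):
    """Assumed form: (N - n)/(N n) * A + B / n^2, with
    A = sum (N_i/N) S_ei^2 and B = sum (1 - N_i/N) S_ei^2."""
    total = sum(s.n_loads for s in strata)
    a = sum(s.n_loads / total * s.variance for s in strata)
    b = sum((1 - s.n_loads / total) * s.variance for s in strata)
    return (total - n) / (total * n) * a + b / n**2


# Chamber Creek #13: loads valued at $100 or more, and loads under $100.
strata = [Stratum(1670, 192.98, 1180.0), Stratum(330, 50.44, 118.0)]
mean = post_stratified_mean(strata)
var = post_stratified_variance(strata, n=176)
print(f"mean value per load: {mean:.2f}")
print(f"variance: {var:.4f}, standard error: {sqrt(var):.3f}")
```

The second term of the variance is small whenever n is large, which is why it contributes little in this sale.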
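Post stratification by value on Chamber Creek #13

Using a post stratification on value for the Chamber Creek #13 sale we obtain the following results:

Stratum                ≥$100      <$100      All
i                        1          2
N_i                    1670        330       2000
n_i                     147         29        176
\hat{\mu}_{vi}       $192.98     $50.44    $169.49
S_{ei}^2               1180        118       3800 (SRS, S_e^2)

Thus the estimate of the over-all value per load is

\[ \hat{\mu}_{vs} = \frac{1670}{2000}(\$192.98) + \frac{330}{2000}(\$50.44) = \$169.46 \text{ per load.} \]

The variance of this estimate is given by

\[ \mathrm{var}(\hat{\mu}_{vs}) = \frac{N - n}{Nn} \sum_i \frac{N_i}{N}\,S_{ei}^2 + \frac{1}{n^2} \sum_i \left(1 - \frac{N_i}{N}\right) S_{ei}^2 \]
\[ = \frac{2000 - 176}{(2000)(176)} \left[ \frac{1670}{2000}(1180) + \frac{330}{2000}(118) \right] + \frac{1}{(176)^2} \left[ \left(1 - \frac{1670}{2000}\right)(1180) + \left(1 - \frac{330}{2000}\right)(118) \right] \]
\[ = 0.0051818\,(1004.77) + \frac{1}{30976}\,(293.23) \]
\[ = 5.2065 + 0.0095 = 5.2160 \]

Thus the standard error of \(\hat{\mu}_{vs}\) is

\[ \sqrt{\mathrm{var}(\hat{\mu}_{vs})} = \sqrt{5.2160} = 2.284 \text{ dollars per load.} \]

Without stratification we had a variance of \(\hat{\mu}_v\) equal to

\[ \mathrm{var}(\hat{\mu}_v) = \frac{N - n}{Nn}\,S_e^2 = \frac{2000 - 176}{(2000)(176)}\,(3800) = 19.69 \]

which is almost four times what we obtained by using post stratification.

We can obtain an expression for the sample size required using post stratification by setting the variance of \(\hat{\mu}_{vs}\) equal to \(E^2\), the square of the desired standard error, and then solving the expression for \(n\) as follows:

\[ E^2 = \frac{N - n}{Nn} \sum_i \frac{N_i}{N}\,S_{ei}^2 + \frac{1}{n^2} \sum_i \left(1 - \frac{N_i}{N}\right) S_{ei}^2 \]
\[ E^2 = \frac{N - n}{Nn}\,[A] + \frac{1}{n^2}\,[B] = \left(\frac{1}{n} - \frac{1}{N}\right) A + \frac{1}{n^2}\,B \]

which yields the second-degree equation in \(n\)

\[ \left(E^2 + \frac{A}{N}\right) n^2 - A\,n - B = 0 \]

Using the quadratic formula, then,

\[ n = \frac{A \pm \sqrt{A^2 + 4B\,(E^2 + A/N)}}{2\,(E^2 + A/N)} \]

where

\[ A = \sum_i \frac{N_i}{N}\,S_{ei}^2 \quad \text{and} \quad B = \sum_i \left(1 - \frac{N_i}{N}\right) S_{ei}^2 \]

In the examples used here the desired standard error \(E\) is given by

\[ E = \frac{2\%\ \text{of}\ \hat{\mu}_v}{z} = \frac{0.02\,\hat{\mu}_v}{z} \]

where \(z = 1.96\) is the 95 percentage point of the normal distribution.

In the sample size calculation that follows, the number of samples required using post stratification is 288, compared to the 796 (table 6) required without stratification. Similar decreases in the sample size can be calculated for the ratio and regression estimators by using the appropriate value of \(S_{ei}^2\).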
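The positive root of the quadratic in n is easy to evaluate numerically. The following Python sketch is illustrative only: the function name and argument layout are hypothetical, and it assumes the quantities A, B, and E as defined above, with E taken as 2 per cent of the mean divided by z = 1.96.

```python
from math import ceil, sqrt


def post_stratified_sample_size(weights, variances, total_loads, mean,
                                pct=2.0, z=1.96):
    """Positive root of (E^2 + A/N) n^2 - A n - B = 0.

    weights   -- stratum shares N_i / N
    variances -- stratum error variances S_ei^2
    mean      -- mean value per load used to set E (dollars)
    """
    a = sum(w * s2 for w, s2 in zip(weights, variances))
    b = sum((1 - w) * s2 for w, s2 in zip(weights, variances))
    e2 = (pct / 100 * mean / z) ** 2      # squared desired standard error
    d = e2 + a / total_loads
    return (a + sqrt(a * a + 4 * b * d)) / (2 * d)


# Chamber Creek #13 figures; 169.49 is the mean value per load over all loads.
n_exact = post_stratified_sample_size(
    weights=[1670 / 2000, 330 / 2000],
    variances=[1180.0, 118.0],
    total_loads=2000,
    mean=169.49,
)
n_required = ceil(n_exact)   # round up to whole truckloads
print(n_required)
```

Rounding up is a conservative choice made here for illustration; the negative root is discarded because it gives a negative sample size.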
Sample size calculation for post stratification

From the data given above under "Post stratification by value on Chamber Creek #13," \(A = 1005\), \(B = 293\), and

\[ E^2 = \left( \frac{2\%\ \text{of}\ \hat{\mu}_v}{z} \right)^2 = \left( \frac{(0.02)(169.49)}{1.96} \right)^2 = 2.99 \]
\[ E^2 + \frac{A}{N} = 2.99 + \frac{1005}{2000} = 3.49 \]

Therefore the sample size necessary to achieve a 95 per cent confidence interval of ±2 per cent is given by

\[ n = \frac{A \pm \sqrt{A^2 + 4B\,(E^2 + A/N)}}{2\,(E^2 + A/N)} = \frac{1005 \pm \sqrt{(1005)^2 + (4)(293)(3.49)}}{(2)(3.49)} = \frac{1005 \pm 1007}{6.98} \]

Taking the positive sign on the square root, \(n = 288\). Using the simpler approximate formula for the sample size we find

\[ n \approx \frac{A}{E^2 + A/N} = \frac{1005}{3.49} \approx 290 \]

which is only a difference of two samples.

SUMMARY AND CONCLUSIONS

Sample scaling without use of weights can be used to achieve 95 per cent confidence intervals of ±2 per cent of the total volume by scaling only about 10 per cent of the sample loads. A ratio or regression estimator using weight reduced the required number of samples by about 10 per cent of what was required for simple random sampling. No difference was found between the ratio and simple regression estimates using weight. Also, the three ratio estimators used gave the same results.

The multiple regression estimator using weight and the number of logs per load was the most efficient statistically, requiring 5, 13, and 18 per cent fewer samples, respectively, on the three sales than were required for the ratio and simple regression estimators that used weight alone. Whether or not these savings are sufficient to warrant counting the number of logs on each truckload was not investigated.

Considerably larger sample sizes were required to estimate the total value of the sales examined. By using the post-stratification developed herein, however, required sample sizes were reduced by about 75 per cent for the one sale investigated. On the basis of this study and for the types of stands investigated it is concluded that stratification of the truckloads into either value or species groups is definitely advantageous.
ACKNOWLEDGMENTS

This paper is an outgrowth of work started in cooperation with the California Division of Forestry. The data used here were supplied by David Burns, CDF, Sacramento, and by the personnel at Jackson State Forest. For their contributions to this paper, and for their pioneering work in weight scaling, the author is deeply grateful.

The author wishes to thank Stephen Titus, Sipi Jaakkola, Vai Semion, and Frank Goddard, who while students at the School of Forestry and Conservation aided in development of the ideas and/or computer programs reported here. Contributions by reviewers W. G. O'Regan and F. A. Johnson of the U.S. Forest Service are also gratefully acknowledged.

This work was done under Agricultural Experiment Station Project F-2520. The computer work was also aided by the cooperation of Dr. W. G. O'Regan, Pacific Southwest Forest and Range Experiment Station, Berkeley. Computing was done on the University of California's CDC 6400 and the Lawrence Berkeley Laboratory's CDC 6600 and CDC 7600 computers.

LITERATURE CITED

Blair, W. M. 1965. Weight-scaling pine sawlogs in Texas. Texas Forest Service Bulletin 52.

Burns, D. M. 1970. Board foot by the pound. State Forest Note No. 42, California Division of Forestry, Sacramento.

Cochran, W. G. 1963. Sampling Techniques (2nd edition). John Wiley and Sons, New York.

Freeman, E. A. 1962. Weight-scaling sawlog volume by the truckload. Forest Products Journal 12 (10): 473-5.

Johnson, F. A., R. H. Ruth, and R. W. Madison. 1963. Sample scaling for timber sales. Journal of Forestry 61 (5): 360-4.

Raj, D. 1968. Sampling Theory. McGraw-Hill, New York.

Tikkiwal, B. D. 1960. On the theory of classical regression and double sampling estimation. Jour. Royal Stat. Soc. B 22: 131-8.

Tin, M. 1965. Comparison of some ratio estimators. Jour. American Stat. Assoc. 60: 294-307.

Wensel, L. C. 1973. Estimation procedures for weight scaling. Biometrics Note 1, March 1, 1973 (mimeo available from author).