
Babe Ruth's 1923 Batting Statistics and Einstein's Work Function

Brief discussion of broader applications to data analysis

1. Summary
An analysis of the game-by-game batting logs for Babe Ruth's 1923 season (when Ruth had his highest batting average) shows an interesting movement of the AB and Hits data along a series of parallels with the general equation y = hx + c. Here x is the number of At Bats (AB) and y is the number of Hits (H). Ruth's game-by-game batting logs reveal scores of (0, 0), (1, 1), (2, 2), (3, 3), and (4, 4) and also scores of (1, 0), (2, 1), (3, 2), (4, 3) and even (6, 5), and so on. Thus, the slope is h = 1, the theoretically ideal batting average, and the intercept c = 0, -1, -2, etc. indicates the number of missed hits in each game. When we consider the aggregated data, on a monthly basis, the same linear law holds but the slope h < 1 and the intercept c can be either positive or negative depending on the player. For Babe Ruth, it can be shown that the intercept c is negative (c < 0). Thus, the batting average BA = y/x = h + (c/x) keeps increasing with increasing AB. The theoretical maximum BA that such a batter can achieve is thus equal to the slope h of the AB-Hits graph when we consider monthly aggregated data for a season (or data over many seasons). Babe Ruth's batting statistics also serve to illustrate the significance of the nonzero c that we observe in the analysis of our empirical (x, y) observations on many complex problems of interest to us. The nonzero c is like the work function conceived by Einstein, in 1905, to explain the photoelectric effect. Einstein's photoelectric law is also a linear law, which suggests a movement along parallels for experiments with different metals. The nonzero c in baseball statistics is related to the missing hits, or the difficulty of producing a hit, or a home run (if y is taken as home runs). The same work function applies to many other problems. Some examples are financial data (profits and revenues of companies), other performance related data (airline quality ratings), and fatality data (deaths due to traffic accidents, guns, cancer, etc.).

2. Introduction
The main purpose of this article is to discuss the significance of the nonzero intercept c in the mathematical equation of a straight line, y = hx + c, and its relation to the idea of a work function conceived by Einstein in his famous 1905 paper on the quantum nature of light, see Refs. [1-9]. We will use baseball legend Babe Ruth's batting statistics, Refs. [10-12], from the 1923 season, to explain the significance of the nonzero intercept c.

Table 1: Babe Ruth's Batting Stats for the 1923 Season

Month      Games   At Bats, AB = x   Hits, H = y   Batting Average, BA = y/x   Home Runs, HR
Apr 1923      12                35            12                       0.343               2
May 1923      27               100            34                       0.340               9
Jun 1923      25                77            29                       0.377               3
Jul 1923      31               111            51                       0.459              10
Aug 1923      25                86            38                       0.442               8
Sep 1923      28                97            32                       0.330               6
Oct 1923       4                14             9                       0.643               3
Totals       152               520           205                       0.394              41

Source: http://www.baseball-almanac.com/players/hittinglogs.php?p=ruthba01&y=1923
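The BA column of Table 1 is simple y/x arithmetic and can be checked with a few lines of Python. This is only a minimal sketch: the (AB, Hits) pairs are copied from Table 1, while the names `monthly` and `batting_average` are illustrative choices, not anything taken from the batting logs.

```python
# (At Bats, Hits) per month, copied from Table 1; names are illustrative.
monthly = {
    "Apr 1923": (35, 12),
    "May 1923": (100, 34),
    "Jun 1923": (77, 29),
    "Jul 1923": (111, 51),
    "Aug 1923": (86, 38),
    "Sep 1923": (97, 32),
    "Oct 1923": (14, 9),
}


def batting_average(at_bats, hits):
    """Return BA = H/AB, the simple y/x ratio."""
    return hits / at_bats


# Single-month BA values.
for month, (ab, h) in monthly.items():
    print(f"{month}: BA = {batting_average(ab, h):.3f}")

# Season totals are the sums of the monthly values.
season_ab = sum(ab for ab, _ in monthly.values())
season_hits = sum(h for _, h in monthly.values())
season_ba = batting_average(season_ab, season_hits)
print(f"Season: {season_hits}/{season_ab} = {season_ba:.3f}")
```

Note that the season BA is the ratio of the totals, not the average of the seven monthly ratios.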

The mathematical equation of a straight line, passing through the origin, can be written as y = mx where y/x = m is the slope of the line. As x increases y increases proportionately. A doubling of x produces a doubling of y; a tripling of x produces a tripling of y, and so on. We often use simple y/x ratios (usually converted to a percentage after multiplying by 100) to make comparisons. The batting average in baseball is one such example. The profit margin of a company (ratio of profits y to revenues x), the On-Time arrival percent for an airline (ratio of OT arrivals y to flights x) are other examples.


However, and quite surprisingly this is often overlooked, if the straight line does NOT pass through the origin, its equation becomes y = hx + c = h(x - x0). Now the ratio y/x = m = h + (c/x) is NOT a constant as we move up or down the straight line. Here h is the slope of the line, c is the intercept made on the y-axis (y = c when x = 0) and x0 = -c/h is the intercept made on the x-axis (x = x0 when y = 0). Although y increases as x increases, they no longer increase in exact proportion because of the nonzero intercept c. Baseball batting statistics can be used to illustrate the significance of this nonzero c. The nonzero c observed in all such problems can be thought of as being exactly analogous to the work function first conceived by Einstein, in his famous 1905 paper on the quantum nature of light, see Refs. [1-4]. Indeed, it can be shown that a nonzero intercept c always appears if we carefully analyze many of our empirical observations on a number of complex social, political, cultural, environmental, economic, financial, and business systems. Nearly a year ago, following the Facebook IPO, on May 19, 2012, I posted my first article on this website describing my analysis of the revenues and profits data for Facebook (from their SEC filings, as a part of the IPO, Refs. [13, 14]). Many other articles followed, on a variety of topics (in addition to the important topic of the profits-revenues data for several companies), such as the unemployment problem, the US national debt, the US government budget, Olympic long jump records, teenage pregnancy rates, the Forbes billionaires' average net worth, and Airline Quality Ratings (see Ref. [15] for a detailed bibliography list). As shown in these articles, in many such problems, we find that our (x, y) observations follow a simple linear law, of the type y = hx + c.
In other words, the ratio y/x = h + (c/x) is not a constant and can either increase or decrease, in a complex and bewildering manner, depending on the numerical values of h and c and also the size effect, as described by the absolute magnitude of the independent variable x. This also leads us to the conception of a more general nonlinear law given as equation 1 below, see the discussion of the financial data for Ford Motor Company, the old General Motors and Yahoo Inc., Ref. [16], and Tesla Motors, Refs. [17, 18].

$$ y = m x^{n} \left[ \frac{e^{-ax}}{1 + b\, e^{-ax}} \right] + c \qquad (1) $$
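As a quick numerical illustration of equation 1, the sketch below evaluates it for two special cases: n = 1 with a = 0 (a straight line) and b = 0 with c = 0 (a curve with a maximum at x = n/a). The function name `general_law` and all parameter values are assumptions chosen only for illustration; they are not fitted to any data discussed here.

```python
import math


def general_law(x, m, n, a, b, c):
    """Equation 1: y = m x^n [e^(-ax) / (1 + b e^(-ax))] + c."""
    decay = math.exp(-a * x)
    return m * x**n * decay / (1.0 + b * decay) + c


# Special case n = 1, a = 0: the exponential factor is 1, so
# y = [m / (1 + b)] x + c, a straight line. With b = 0 the slope is m.
line = [general_law(x, m=0.5, n=1, a=0.0, b=0.0, c=-2.0) for x in (0, 10, 20)]
print("straight-line case:", line)

# Special case b = 0, c = 0: y = m x^n e^(-ax), which peaks at x = n/a.
m, n, a = 1.0, 2.0, 0.01  # illustrative values only
x_peak = n / a
y_peak = general_law(x_peak, m, n, a, b=0.0, c=0.0)
y_before = general_law(0.9 * x_peak, m, n, a, b=0.0, c=0.0)
y_after = general_law(1.1 * x_peak, m, n, a, b=0.0, c=0.0)
print(f"peak at x = {x_peak}: y = {y_peak:.1f} "
      f"(before: {y_before:.1f}, after: {y_after:.1f})")
```

The values on either side of x = n/a are both lower than the value at x = n/a, which is the behavior expected when the slope changes sign there.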

Equation 1 (which is actually a generalized statement of the famous blackbody radiation law due to Max Planck, see Refs. [16-18]) reduces to the familiar equation of a straight line for the special case of n = 1 and a = 0. For the special case of b = 0 and c = 0, y = m x^n e^(-ax) and it can be shown that the x-y graph will go through a maximum as x increases. The slope of the x-y graph, given by the derivative dy/dx = (n - ax)(y/x), is positive for x < n/a and negative for x > n/a. Hence, the graph goes through a maximum at x = n/a. The profits-revenues graph for many leading companies can be shown to reveal a maximum point, see Ref. [16] for an example (many other examples are discussed in the articles listed in Ref. [15]), the existence of which, quite surprisingly, seems to have escaped the attention of economists, financial analysts, and business managers.

With this background, let us now turn our attention to baseball statistics. The legendary American baseball player, Babe Ruth, whose career spanned the years 1914-1935, was known as a power hitter. In the 1920s, Ruth dominated the game and broke all records. He combined a high batting average (BA) with incredible hitting power. In 1920, he bested his own home run record of 29, set the previous year, with 54 home runs and then bettered it in 1927 with 60 home runs in a single season. That record held for 34 years, until it was broken by Roger Maris, with 61 home runs in the 1961 season. Ruth also set the career record of 714 home runs, which was subsequently broken by Hank Aaron in 1974; see Ref. [10]. Babe Ruth, it is often noted, transformed baseball itself with his power hitting. The popularity of the game exploded after Ruth came on the scene and he is credited with turning it from a low-scoring game into a high-scoring, high-power game. His highest batting average (BA) of 0.393 was established in the 1923 season. (My calculations, to three decimal places, give BA = 0.394.)
The baseball statistic, known as the batting average, can be traced to the batting average that had been used in cricket to measure the skill of a batsman, see Ref. [11]. In baseball, the BA is defined as the ratio of the number of hits (H) to the number of At Bats (AB). Thus, BA = H/AB. A season batting average of 0.300 is considered to be excellent and 0.400 is truly exceptional and has rarely been accomplished. Ted Williams had a batting average of 0.406 in 1941. The BA, which is a fraction, is usually quoted to three decimal places but often read as a whole number in the thousands. In the 1923 season, when Babe Ruth had the highest seasonal BA of his career (205 Hits with 520 AB, giving BA = 205/520 = 0.394), his monthly BA was 0.459 in July, 0.442 in August and 0.643 in October (14 AB with 9 Hits). Ruth had a career BA of 0.342 with career AB = 8399 and H = 2873.

In this article, we will discuss Babe Ruth's BA using AB-Hits (x-y) diagrams. However, the main purpose here is NOT a discussion of Babe Ruth's record, or baseball statistics, per se, but to use the insights gained as a basis to discuss other more complex problems where we compile huge volumes of x and y data on a daily, monthly, quarterly, and annual basis. One recent and noteworthy example of interest is the annual Airline Quality Rating, AQR, Refs. [19-24], published on April 8, 2013. This is a composite score based on the averaging of literally hundreds of y/x ratios relating to four main criteria used to assess airline performance: On-Time (OT) arrivals, Missing Baggage (MB), Denied Boardings (DB) and Customer Complaints (CC). In the AQR problem, we observe a simple linear relation, y = hx + c, relating the number of flights x and the number of OT arrivals y. In the baseball statistics, we find a simple linear relation between x, the number of At Bats, and y, the number of Hits H or Home Runs HR. To use an analogy, an OT arrival is like a Hit in baseball. Thus, we can compare the AQR scores to the baseball stats. What is the significance of the nonzero intercept c and how does this relate to Einstein's work function?

3. Babe Ruth's 1923 Season

As seen from Table 1, the game-by-game batting data are aggregated to arrive at the monthly total At Bats (AB or x), Hits (H or y) and Home Runs (HR, or y when HR is being analyzed). The monthly data are then aggregated to arrive at the season's AB, Hits, HR and the BA. The monthly data show that Ruth achieved an exceptionally high BA of greater than 0.400 in July, August and October of this season, with a BA of 0.643 for the month of October 1923. The total number of HR was 41, significantly lower than the single season record of 60 achieved in 1927. Babe Ruth's 1927 season, and his career batting statistics, are the subject of a companion article, Ref. [25], where we pursue the same discussion in more detail. Here, we will consider only the 1923 season.

[Figure 1 plot: Number of Hits, y, versus Number of At Bats, x, with the points (0, 0), (1, 1), (2, 2), (3, 3), (4, 4) and (6, 5) marked.]

Figure 1: Graphical representation of the game-by-game batting stats for Babe Ruth in the 1923 season. Babe Ruth achieved the highest (single season) batting average of 0.394 = 205/520 in this season with 41 homers and 205 hits.

Ruth's batting performance can be studied using the AB-Hits diagram of Figure 1. Notice that Babe Ruth achieved the PERFECT single game BA of 1 in several games with scores of (1, 1), (2, 2) in three games, (3, 3) in six games, and (4, 4). The batting logs also give the score (0, 0) for the games on May 24, 1923 and September 19, 1923. Hence, all these (x, y) pairs, the highest attainable for each AB, can be joined by the straight line y = x. Thus, Ruth achieved the enviable single game BA of 1, or y/x = 1, in several games. (This is also true if we analyze batting stats for other seasons with BA = 1 for several games.)
[Figure 2 plot: Number of Hits, y, versus Number of At Bats, x, showing the parallel lines y = x and y = x - 1 with the points (1, 1), (2, 2), (3, 3), (4, 4) and (6, 5) marked.]

Figure 2: The movement of Babe Ruth's game-by-game batting data (hits for the 1923 season) along parallels is illustrated here. Babe Ruth achieved the highest (single season) batting average of 0.394 = 205/520 in this season with 41 homers and 205 hits.

Ruth also achieved scores of (5, 4) and (6, 5), with only one missed hit for the number of ABs. In other games, Ruth got scores of (1, 0), (2, 1), (3, 2) and (4, 3), also indicating one missed hit in each case, or y = x - 1. Then, in other games, the scores were (2, 0), (3, 1), (4, 2), and (5, 3), indicating 2 missed hits in each case, or y = x - 2. In other words, the general equation describing these batting stats is y = hx + c with the slope h = 1 (the ideal BA) and the intercept c = 0, -1, -2 indicating the number of missed hits.
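The grouping of game scores into parallels can be sketched in a few lines of Python. The list below contains only the (AB, Hits) scores quoted in the text, not the full 1923 batting log, and the variable names are illustrative assumptions.

```python
from collections import defaultdict

# (At Bats, Hits) scores quoted in the text; not the complete batting log.
scores = [
    (0, 0), (1, 1), (2, 2), (3, 3), (4, 4),          # no missed hits
    (1, 0), (2, 1), (3, 2), (4, 3), (5, 4), (6, 5),  # one missed hit
    (2, 0), (3, 1), (4, 2), (5, 3),                  # two missed hits
]

# With slope h = 1, each score lies on the line y = x + c, where
# c = y - x is minus the number of missed hits in that game.
parallels = defaultdict(list)
for at_bats, hits in scores:
    c = hits - at_bats
    parallels[c].append((at_bats, hits))

for c in sorted(parallels, reverse=True):
    print(f"y = x + ({c}): {parallels[c]}")
```

Every score falls on one of three parallel lines, with intercepts c = 0, -1 and -2.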

The data thus follow a series of parallels with the nonzero intercept telling us something about the difficulty of producing a hit (or a home run, if HR is being analyzed using a similar AB-HR diagram), see Figure 2. The difficulty of producing a hit depends on the pitcher, the pitching speed, the trajectory of the ball, the wind speed, even the stadium where the game was being played, and also Ruth's own mental alertness and concentration. Hence, the nonzero c can be thought of as a work function, akin to the work function conceived by Einstein to describe the difficulty of producing an electron when photons of a fixed energy bombard the surface of a metal.

In Einstein's law, see Refs. [1, 4, 26], K = (ε - W) = hf - W = h(f - f0). Here K is the maximum kinetic energy of the electron, ε = hf is the energy of the photon, h is a universal constant called the Planck constant, f is the frequency of light, and W = hf0 is the work function of the metal. This is the minimum energy needed to eject the electron. If ε < W, no electrons can be produced. The cut-off frequency f0 = W/h is the minimum frequency below which no electrons are produced. Only a portion of the photon's energy appears as the energy of the electron K. Some must be given up to do the work needed to get the electron out of the metal, as explained by Millikan in his Nobel lecture, Ref. [4]. Millikan's elegant experiments, with lithium (used in modern lithium-ion batteries) and sodium (one of the two elements in common salt), confirmed Einstein's law and also led to the first direct determination of the universal constant h. Note that, according to Einstein's law, the K-f graph, for experiments with different metals, will be a set of parallels, each having the slope h equal to the Planck constant.

We see an exactly analogous situation here with baseball statistics. The batting data fall on a series of parallels, with the slope h = 1. The number of AB is always a whole number and can only increase by 1 at a time. The number of Hits is also a whole number and can only go up by 1 at a time. Hence, the slope h = 1 is the theoretically ideal BA. The nonzero intercept c is clearly related to the number of missed hits, or the difficulty of producing hits. The cumulative (or aggregated) monthly data are plotted in Figure 3. This reveals the more general law y = hx + c with h < 1 and a negative intercept c.
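Einstein's photoelectric law quoted above, K = hf - W, can be sketched numerically to show the parallels for different metals. The Planck constant and the electron-volt conversion are the exact SI values; the work functions for sodium and lithium are approximate textbook figures assumed here purely for illustration, and the function names are hypothetical.

```python
PLANCK_H = 6.62607015e-34        # J*s, exact by SI definition
ELECTRON_VOLT = 1.602176634e-19  # J per eV, exact by SI definition

# Approximate work functions in eV (assumed, illustrative values only).
work_function_ev = {"sodium": 2.3, "lithium": 2.9}


def max_kinetic_energy_ev(frequency_hz, w_ev):
    """Einstein's law K = hf - W in eV; None below the cut-off frequency."""
    k = PLANCK_H * frequency_hz / ELECTRON_VOLT - w_ev
    return k if k >= 0 else None


def cutoff_frequency_hz(w_ev):
    """Cut-off frequency f0 = W/h."""
    return w_ev * ELECTRON_VOLT / PLANCK_H


for metal, w in work_function_ev.items():
    f0 = cutoff_frequency_hz(w)
    k1 = max_kinetic_energy_ev(1.0e15, w)
    k2 = max_kinetic_energy_ev(1.2e15, w)
    slope = (k2 - k1) / (1.2e15 - 1.0e15)  # eV per Hz, equals h in eV*s
    print(f"{metal}: f0 = {f0:.3e} Hz, slope = {slope:.4e} eV*s")
```

The slope is the same for both metals (the Planck constant expressed in eV·s), while the cut-off frequency differs, which is the set of parallels described in the text.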

Home runs are more difficult to achieve than hits. Hence, the AB-HR graph has a smaller slope h and a higher intercept x0 = -c/h compared to the AB-Hits graph, see Figure 4.

Table 2: Babe Ruth's Batting Stats for the 1923 Season

                  Single-month data                 Aggregated monthly data
Month        AB = x   H = y   BA = y/x        AB = x   H = y   BA = y/x
Apr 1923         35      12      0.343            35      12      0.343
May 1923        100      34      0.340           135      46      0.341
Jun 1923         77      29      0.377           212      75      0.354
Jul 1923        111      51      0.459           323     126      0.390
Aug 1923         86      38      0.442           409     164      0.401
Sep 1923         97      32      0.330           506     196      0.387
Oct 1923         14       9      0.643           520     205      0.394
Totals          520     205      0.394

Source: http://www.baseball-almanac.com/players/hittinglogs.php?p=ruthba01&y=1923

Notice that the single-month data, like the game-by-game data, can yield a much higher BA compared to the aggregated monthly data. Thus, we get BA = 9/14 = 0.643 for Oct 1923 since this is based strictly on the additional Hits in October. The aggregated average went up from BA = 0.387 to BA = 0.394. The individual y/x ratios in the last column are all less than the slope h = 0.404 of the best-fit line through these data, see Figure 3.

Also, with c < 0, it is clear that the theoretical highest value of the BA equals the slope h = 0.404 of the best-fit line through the data points. This is in agreement with the BA values listed in the last column of Table 2.
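The approach of the ratio y/x to the slope h can be illustrated with a short sketch. It uses the slope h = 0.404 quoted above and the intercept c = -5.874 read from the regression equation shown with Figure 3; the function name and the sample AB values (including the hypothetical large values 5000 and 50000) are assumptions for illustration. The printed ratios are points on the fitted line, not the observed monthly BA values.

```python
H_SLOPE = 0.404       # slope of the best-fit line quoted in the text
C_INTERCEPT = -5.874  # intercept from the regression equation of Figure 3


def implied_ba(at_bats, h=H_SLOPE, c=C_INTERCEPT):
    """Ratio y/x implied by the linear law y = hx + c, i.e. h + c/x."""
    return h + c / at_bats


# The last two AB values are hypothetical, far beyond a single season.
sample_at_bats = [35, 135, 520, 5000, 50000]
implied = [implied_ba(x) for x in sample_at_bats]

for x, ba in zip(sample_at_bats, implied):
    print(f"AB = {x:>6}: y/x = {ba:.4f}, gap to h = {H_SLOPE - ba:.4f}")
```

Because c is negative, the term c/x shrinks toward zero from below as x grows, so the ratio rises steadily but never exceeds h.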


[Figure 3 plot: Number of Hits, y, versus Number of At Bats, x, with the best-fit line y = 0.404x - 5.874 = 0.404(x - 14.53), r2 = 0.9978.]

Figure 3: Graphical representation of the aggregated monthly batting stats (hits) for Babe Ruth in 1923. Babe Ruth achieved the highest (single season) batting average of 0.394 = 205/520 in this season with 41 homers and 205 hits.

The best-fit line through the aggregated monthly data was determined using linear regression analysis (method of least squares, see Refs. [27, 28]). Because the intercept c = -5.874 < 0 (equivalently, the cut-off x0 = -c/h = 14.53 > 0), the BA = y/x = 0.404 - (5.874/x) keeps increasing as the AB increases. The maximum value of BA equals the slope h = 0.404. In Figure 4, we consider the number of home runs (HR) instead of Hits (H). The slope h, in other words, the rate at which home runs increase, was lower in the first two months. It then increases to the nearly constant value observed in the remaining months of the season. The first two data points were therefore neglected to develop the regression equation.
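The least-squares fit for the aggregated hits data can be reproduced with a short script. This is a minimal sketch using the standard formulas h = Σ(x - xm)(y - ym)/Σ(x - xm)² and c = ym - h·xm; the data are the aggregated AB and Hits columns of Table 2, and the function name `least_squares` is an illustrative choice rather than the author's own tool.

```python
# Aggregated monthly (cumulative) At Bats and Hits from Table 2.
at_bats = [35, 135, 212, 323, 409, 506, 520]
hits = [12, 46, 75, 126, 164, 196, 205]


def least_squares(xs, ys):
    """Return slope h, intercept c and r^2 of the best-fit line y = hx + c."""
    n = len(xs)
    xm = sum(xs) / n
    ym = sum(ys) / n
    sxy = sum((x - xm) * (y - ym) for x, y in zip(xs, ys))
    sxx = sum((x - xm) ** 2 for x in xs)
    h = sxy / sxx
    c = ym - h * xm  # the best-fit line passes through (xm, ym)
    ss_res = sum((y - (h * x + c)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - ym) ** 2 for y in ys)
    r2 = 1.0 - ss_res / ss_tot
    return h, c, r2


h, c, r2 = least_squares(at_bats, hits)
x0 = -c / h  # intercept on the x-axis
print(f"y = {h:.3f}x + ({c:.3f}), x0 = {x0:.2f}, r2 = {r2:.4f}")
```

The computed slope, intercept and r2 can be compared directly with the regression equation quoted for Figure 3.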


The work function, as determined by the cut-off x0 = - c/h = 28.5, is higher for the home runs HR than it is for hits H. This again shows that the nonzero intercept c is related to the difficulty of producing Hits or HRs.
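The comparison of the two cut-offs is a one-line calculation, x0 = -c/h, sketched below with the rounded coefficients from the regression equations of Figures 3 and 4. Because the inputs are rounded to three decimal places, the results can differ slightly in the last digit from the values quoted in the text; the function name is an illustrative assumption.

```python
def cutoff(h, c):
    """Intercept on the x-axis, x0 = -c/h, of the line y = hx + c."""
    return -c / h


x0_hits = cutoff(0.404, -5.874)  # Figure 3 coefficients (hits)
x0_hr = cutoff(0.082, -2.346)    # Figure 4 coefficients (home runs)

print(f"x0 for hits:      {x0_hits:.2f} At Bats")
print(f"x0 for home runs: {x0_hr:.2f} At Bats")
```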

[Figure 4 plot: Number of Home Runs, y, versus Number of At Bats, x, with the best-fit line y = 0.082x - 2.346 = 0.082(x - 28.5), r2 = 0.9964.]

Figure 4: Graphical representation of the aggregated monthly batting stats (home runs) for Babe Ruth in the 1923 season.

It should be noted that the approach taken here is consistent with Einstein's emphasis on determining the maximum kinetic energy K of the electron in order to test his photoelectric law, see Ref. [1]. Furthermore, as discussed by both Einstein and Millikan, the correct numerical value of the Planck constant h can only be determined if the maximum K is determined accurately in the experiments, see Millikan's Nobel lecture, Ref. [4]. This has also been discussed nicely by Shamos, on pages 235 and 236 of Ref. [26], in his brief extracts of many original leading physics papers.


If all the HR data are included, the regression equation y = 0.075x + 0.4166 has a smaller slope h and the intercept c has a small positive value. Now, we have actually created confusion by trying to include all of the data, as we tend to do in most statistical analysis, in our attempts to be unbiased. With c > 0, we are led to the conclusion that Ruth's ratio y/x (here HR/AB) decreases with increasing AB, whereas including only his best performance leads to the opposite conclusion. If one reviews all of Ruth's batting performance, it becomes clear that c < 0 is the correct value. Now, one can also begin to appreciate why the Nobel laureate Millikan often ignored data (and so has even been accused of data manipulation; the interested reader can Google it, or see Ref. [15]) from some of his famous oil-drop experiments, which led to the determination of the absolute magnitude of the electrical charge (often denoted by the symbol q) on a single electron. The Planck constant h could be deduced from the later photoelectricity experiments only because of the knowledge of q.

In summary, the brief discussion here of Babe Ruth's batting statistics from the 1923 season (see also Ref. [25]) provides insights into the evolution of the linear law y = hx + c, often observed when we analyze many empirical observations. The nonzero c is just like Einstein's work function from physics, or Babe Ruth's work function to produce a hit or a home run. We often see a movement of the empirical data in many problems along parallels (see the discussion of Microsoft and Kia profits-revenues data, in Ref. [15], and the analysis of the billionaires versus population data for different countries). This too can be understood using the batting average stats of a legendary baseball player like Babe Ruth.

Reference list
1. On a heuristic point of view about the creation and conversion of light, by A. Einstein, 1905. Einstein's original paper, which showed that light can be viewed as particles with fixed energy quanta, http://www.ffn.ub.es/luisnavarro/nuevo_maletin/Einstein_1905_heuristic.pdf
2. On a heuristic point of view concerning the production and transformation of light, Paper 5, in Einstein's Miraculous Year: Five Papers That Changed the Face of Physics, Princeton Univ. Press (1998), http://press.princeton.edu/einstein/materials/light_quanta.pdf
3. Einstein's Quanta, Entropy, and the Photoelectric Effect, by Dwight E. Neuenschwander. Excellent discussion of how Einstein arrives at his conception of light quanta from the property called entropy possessed by radiation in the form of light, http://www.sigmapisigma.org/radiations/2004/elegant_connections_f04.pdf
4. The electron and the light-quant from the experimental point of view, Nobel Lecture, May 23, 1924, by Robert Millikan, see Figure 4 on page 63 for experiments with sodium. The straight line graph for photoelectric experiments confirms Einstein's law. The slope of the graph gives the universal Planck constant h, one of the fundamental constants of nature, http://www.nobelprize.org/nobel_prizes/physics/laureates/1923/millikan-lecture.pdf
5. On Cathode Rays, Nobel Lecture, May 28, 1906, by Philipp Lenard, http://www.nobelprize.org/nobel_prizes/physics/laureates/1905/lenardlecture.pdf
6. Focus: Centennial Focus, Millikan's Measurement of the Planck constant, April 22, 1999, by Gerald Holton, http://physics.aps.org/story/v3/st23
7. The Photoelectric Effect, by M. Brandl, Project PhysNet, http://www.ifsc.usp.br/~lavfis/BancoApostilasImagens/ApEfFotoeletrico/The%20Photoelectric%20Effect%20-%20m213.pdf
8. The Millikan experiment to verify the Photoelectric relationship, http://tap.iop.org/atoms/quantum/502/file_47016.pdf
9. Photoelectric Effect, http://physics.tutorvista.com/modernphysics/photoelectric-effect.html
10. Babe Ruth, http://en.wikipedia.org/wiki/Babe_Ruth
11. Batting Average, http://en.wikipedia.org/wiki/Batting_average
12. Babe Ruth 1923 Game by Game Batting logs, http://www.baseball-almanac.com/players/hittinglogs.php?p=ruthba01&y=1923


13. The Facebook Future: Revenues-Profits analysis, Published May 19, 2012, http://www.scribd.com/doc/94103265/The-FaceBook-Future
14. The Future of Facebook I, Published May 21, 2012, http://www.scribd.com/doc/94325593/The-Future-of-Facebook-I
15. Bibliography, Articles on Extension of Planck's Ideas and Einstein's Ideas beyond physics, Compiled on April 16, 2013, http://www.scribd.com/doc/136492067/Bibliography-Articles-on-theExtension-of-Planck-s-Ideas-and-Einstein-s-Ideas-on-Energy-Quantum-totopics-Outside-Physics-by-V-Laxmanan
16. Money in Economics is Just like Energy in Physics, Published Jan 14, 2013. Examples of the old General Motors, Ford Motor Company and Yahoo Inc., http://www.scribd.com/doc/120324960/Money-in-Economics-isJust-like-Energy-in-Physics-Extending-Planck-s-law-beyond-Physics
17. Tesla Motors Profits (Losses)-Revenues Analysis and a New Measure of Profitability, Published March 26, 2013, http://www.scribd.com/doc/127436107/Tesla-Motors-Profits-LossesRevenues-Analysis-and-a-New-Measure-of-Profitability-MPR
18. Tesla Motors: Nonlinear Model for Profitability Analysis, Published March 28, 2013, http://www.scribd.com/doc/127767159/Tesla-MotorsNonlinear-Model-for-Profitability-Analysis
19. Airline Quality Rating 2013, Purdue University, e-Pubs, April 8, 2013, by Dr. Brent D. Bowen (Purdue University, College of Technology) and Dr. Dean E. Headley (Wichita State University, W. Frank Barton School of Business), http://docs.lib.purdue.edu/aqrr/23/
20. Airline Quality Rating 2012, by Dr. Brent D. Bowen (Purdue University, College of Technology) and Dr. Dean E. Headley (Wichita State University, W. Frank Barton School of Business). The table on page 48 of the report gives percent on-time arrivals for each airline for each month and the annual average.
21. America's Best Airlines, by David M. Ewalt, April 8, 2013, http://www.forbes.com/sites/davidewalt/2013/04/08/americas-bestairlines/
22. Airline Quality Report sorts out duds from the dynamos, CNN Travel, by Jim Barnett, April 8, 2013, http://www.cnn.com/2013/04/08/travel/airline-qualityreport/?hpt=us_c2 The Airline Quality Rankings Report looks at the 14 largest U.S. airlines and is based on an analysis of U.S. Department of Transportation figures. It's co-authored by Brent Bowen, the head of the Department of Aviation Technology at Purdue University, and Dean Headley of Wichita State.
23. Airline quality study finds more on-time flights, fewer lost bags, LA Times, April 9, 2013, by Hugo Martin, http://www.latimes.com/business/la-fi-airline-quality-report20130409,0,4799186.story?track=rss&utm_source=feedburner&utm_medium=feed&utm_campaign=Feed%3A+latimes%2Fmostviewed+%28L.A.+Times+-+Most+Viewed+Stories%29 Complaints against carriers jumped 20%.
24. Complaints Soar but Airlines Quality Ratings Stay High, by Mark Memmott, The Two-Way, Breaking News from NPR, April 8, 2013, http://www.npr.org/blogs/thetwoway/2013/04/08/176566213/complaints-soar-but-airlines-qualityrating-stays-high Customers are more unhappy, but the carriers continue to improve their on-time performances and they're losing fewer bags. Those are among ...

25. Babe Ruth Batting Statistics and Einstein's Work Function,
26. The Photoelectric Effect, by A. Einstein, in Great Experiments in Physics, edited by Morris H. Shamos, Dover Publications (1959), pp. 235-237.
27. Legendre, On Least Squares, English translation of the original paper, http://www.york.ac.uk/depts/maths/histstat/legendre.pdf
28. Line of Best-Fit, Least Squares Method, see the worked example given at http://hotmath.com/hotmath_help/topics/line-of-best-fit.html

The formula for h used in this example is actually an approximate one and was used, before the advent of modern computers, since it only involves the determination of x² and xy and the sum of all the values of x, y, x² and xy. The exact formula is given below, with xm and ym denoting the mean or average values of x and y in the data set, and ym = h·xm + c since the best-fit line always passes through the point (xm, ym).

$$ h = \frac{\sum (x - x_m)(y - y_m)}{\sum (x - x_m)^2} $$

Determine the deviations of the individual x and y values from the mean, or average, (x - xm) and (y - ym). Determine the products (x - xm)(y - ym) and their sum. This gives the numerator in the expression for h. Determine the squares (x - xm)² and their sum. This gives the denominator in the expression for h. This also fixes the intercept c via ym = h·xm + c. Then, using the regression equation, determine the predicted value yb on the best-fit line, the vertical deviation (y - yb) and the squares (y - yb)². The sum of these squares is a minimum. This can be checked by assigning other values for h (using any two points) and allowing the graph to pivot around (xm, ym). The regression coefficient

$$ r^2 = 1 - \frac{\sum (y - y_b)^2}{\sum (y - y_m)^2} $$

is a measure of the strength of the correlation between x and y (or y/x versus x). For a perfect correlation, when all points lie exactly on the graph, r² = +1.000.


About the author
V. Laxmanan, Sc. D.


The author obtained his Bachelor's degree (B. E.) in Mechanical Engineering from the University of Poona and his Master's degree (M. E.), also in Mechanical Engineering, from the Indian Institute of Science, Bangalore, followed by Master's (S. M.) and Doctoral (Sc. D.) degrees in Materials Engineering from the Massachusetts Institute of Technology, Cambridge, MA, USA. He then spent his entire professional career at leading US research institutions (MIT, Allied Chemical Corporate R & D, now part of Honeywell, NASA, Case Western Reserve University (CWRU), and General Motors Research and Development Center in Warren, MI). He holds four patents in materials processing, has co-authored two books and published several scientific papers in leading peer-reviewed international journals. His expertise includes developing simple mathematical models to explain the behavior of complex systems.

While at NASA and CWRU, he was responsible for developing material processing experiments to be performed aboard the space shuttle and developed a simple mathematical model to explain the growth of Christmas-tree, or snowflake, like structures (called dendrites) widely observed in many types of liquid-to-solid phase transformations (e.g., freezing of all commercial metals and alloys, freezing of water, and, yes, production of snowflakes!). This led to a simple model to explain the growth of dendritic structures in both the ground-based experiments and in the space shuttle experiments.

More recently, he has been interested in the analysis of the large volumes of data from financial and economic systems and has developed what may be called the Quantum Business Model (QBM). This extends (to financial and economic systems) the mathematical arguments used by Max Planck to develop quantum physics using the analogy Energy = Money, i.e., energy in physics is like money in economics.
Einstein applied Planck's ideas to describe the photoelectric effect (by treating light as being composed of particles called photons, each with the fixed quantum of energy conceived by Planck). The mathematical law deduced by Planck, the generalized power-exponential law, might actually have many applications far beyond blackbody radiation studies where it was first conceived. Einstein's photoelectric law is a simple linear law and was deduced from Planck's non-linear law for describing blackbody radiation. It appears that financial and economic systems can be modeled using a similar approach. Finance, business, economics and management sciences now essentially seem to operate like astronomy and physics before the advent of Kepler and Newton.

Finally, during his professional career, the author also twice had the opportunity and great honor to make presentations to Nobel laureates: first at NASA to Prof. Robert Schrieffer (1972 Physics Nobel Prize), who was the Chairman of the Schrieffer Committee appointed to review NASA's space flight experiments (following the loss of the space shuttle Challenger on January 28, 1986), and second at GM Research Labs to Prof. Robert Solow (1987 Nobel Prize in economics), who was Chairman of the Corporate Research Review Committee, appointed by GM corporate management.

Cover page of AirTran 2000 Annual Report


Can you see that plane flying above the tall tree tops that make a nearly perfect circle? It requires a great deal of imagination to see it and to photograph it.
