
Automated Data Analysis

FEBRUARY 25, 2013

Doing research lately I realized I was duplicating a lot of my efforts in terms of gathering and organizing data for analysis. I decided to invest some time creating databases and macros that could do this for me and make it easier to do analysis going forward. I started with commodity futures data from Quandl.com which is a great data website. It also has an API for calling specific date ranges and data. First I created a list of the commodities I needed and made a macro in Excel that would download this. The macro would go through the list of products I needed data for, and request them for the date ranges specified in cells C1 and D1. It would then save them by symbol name in a folder.

After running the macro, the folder contains the price data for the commodities requested. In this case only a small date range was downloaded, so I have a number of small CSV files. Next I used VBA in Access to import the CSV data into the database tables for the related commodities. As shown below, there are now tables for each contract, and each table has OHLC data as well as volume and open interest.

To avoid pulling unnecessary amounts of data back into Excel for analysis, I made a union query that returns only the data needed: in this case, front and back month corn since 1/1/2013.
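The union query idea can be sketched outside Access as well. The following is a minimal illustration using Python's built-in sqlite3 module rather than the Access database used here; the table names (corn_front, corn_back), the column names, and the prices are hypothetical placeholders, not the actual schema or market data.

```python
import sqlite3

# Hypothetical stand-in for the Access tables: one table per contract.
# Prices below are placeholders for illustration, not market data.
SAMPLE_ROWS = {
    "corn_front": [("2012-12-31", 1.0), ("2013-01-02", 2.0), ("2013-01-03", 3.0)],
    "corn_back": [("2012-12-31", 4.0), ("2013-01-02", 5.0)],
}

# Stack both contracts into one result set and keep only rows on or after
# the requested start date, so less data has to be pulled into a spreadsheet.
UNION_SQL = """
SELECT 'front' AS contract, trade_date, close_px
FROM corn_front WHERE trade_date >= ?
UNION ALL
SELECT 'back' AS contract, trade_date, close_px
FROM corn_back WHERE trade_date >= ?
ORDER BY trade_date, contract
"""


def build_demo_db():
    """Create an in-memory database holding the placeholder tables."""
    conn = sqlite3.connect(":memory:")
    for table, rows in SAMPLE_ROWS.items():
        conn.execute(f"CREATE TABLE {table} (trade_date TEXT, close_px REAL)")
        conn.executemany(f"INSERT INTO {table} VALUES (?, ?)", rows)
    return conn


def front_and_back_since(conn, start_date):
    """Return (contract, trade_date, close_px) rows from start_date onward."""
    return conn.execute(UNION_SQL, (start_date, start_date)).fetchall()


conn = build_demo_db()
rows = front_and_back_since(conn, "2013-01-01")
for row in rows:
    print(row)
```

Dates are stored as ISO-format text so that a plain string comparison orders them correctly; an Access date column would not need that workaround.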

Once in Excel I used a pivot table to organize the data, which made it easier to graph.

Lastly I created different charts. The following pages are the automated reports I had Excel generate for corn, soy and wheat.

Relationships between USDA Crop Progress Reports and Corn Yields. 2012

From April to November the US Department of Agriculture releases a weekly report titled Crop Progress and Conditions, which provides information on the growth cycle and conditions of major grain crops in the US. These reports are an important source of information for commodity markets and have the potential to cause large changes in price. The report is a compilation of surveys taken from approximately 4,000 respondents who provide subjective ratings and categorize crops as Very Poor, Poor, Fair, Good or Excellent. The USDA then provides the aggregate percentage of each crop, in each state, that falls into each category.

While the breadth of the weekly surveys and the methodology provide valuable data, the USDA's presentation of the data makes it difficult to analyze. For example, when trying to compare two weeks there are five different changes that have to be checked. In this hypothetical scenario the amount rated Excellent has dropped 5 percentage points, the amount rated Poor has dropped 10 points and the amount rated Fair has increased 15 points.

          Excellent   Good   Fair   Poor   Very Poor
Week 1    10%         20%    30%    30%    10%
Week 2    5%          20%    45%    20%    10%

Although this is valuable data, it is handicapped by its presentation and reporting, making it difficult to run statistical analysis or to compare to historical data. Even a visual representation of the data in its current form does not provide more insight.

In order to condense the weekly report of five different categories and corresponding percentages, I created a numerical rating scale with weighted categories that produces an aggregate rating, with 1 equaling Very Poor and 5 equaling Excellent. Applying this to the previous hypothetical example, we see that the overall changes in the categories left the conditions unchanged.

          Excellent     Good          Fair          Poor          Very Poor     Aggregate
Weight    5             4             3             2             1
Week 1    10% = 0.50    20% = 0.80    30% = 0.90    30% = 0.60    10% = 0.10    2.9
Week 2    5% = 0.25     20% = 0.80    45% = 1.35    20% = 0.40    10% = 0.10    2.9
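The weighting above can be expressed as a short function. This is a minimal sketch in Python rather than the Access calculated fields actually used; the function and variable names are my own for illustration, and the inputs are the hypothetical Week 1 and Week 2 percentages from the example.

```python
# Category weights from the rating scale: 1 = Very Poor ... 5 = Excellent.
WEIGHTS = {"Excellent": 5, "Good": 4, "Fair": 3, "Poor": 2, "Very Poor": 1}


def aggregate_rating(percentages):
    """Weighted average rating from a mapping of category -> percent of crop.

    Each category contributes weight * percent * 0.01, and the
    contributions are summed into a single number between 1 and 5.
    """
    return sum(WEIGHTS[category] * pct * 0.01 for category, pct in percentages.items())


week_1 = {"Excellent": 10, "Good": 20, "Fair": 30, "Poor": 30, "Very Poor": 10}
week_2 = {"Excellent": 5, "Good": 20, "Fair": 45, "Poor": 20, "Very Poor": 10}

print(round(aggregate_rating(week_1), 2))
print(round(aggregate_rating(week_2), 2))
```

Both weeks round to the same aggregate, which is the point of the example: five separate category changes collapse into one comparable number.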

In this example, the average rating for both Week 1 and Week 2 was 2.9, which is marginally below Fair (3).

In order to analyze this data I ran queries against the USDA's Agricultural Marketing Service database, which I then imported into Microsoft Access. Corn ratings for every state going back to 1986 produced over 55,000 entries in the database. I created queries to pull specific states and years. I also used calculated fields in Access to produce the aggregate for each week for each state:

    RatingNumber: IIf([Corn]="Excellent",5,(IIf([Corn]="Good",4,(IIf([Corn]="Fair",3,(IIf([Corn]="Poor",2,(IIf([Corn]="Very Poor",1)))))))))

    Weighted: [RatingNumber]*[Rating]*0.01

Creating this aggregate allows for easy comparisons across years. Access queries can be made for specified states and years and then linked to Excel. Once in Excel it's easy to analyze the data or chart it.

Below is an example of the weekly aggregate from 2007-2012, with 2012 bolded. This data was from Week #25, which corresponded to the June 11th report in 2012. Although this chart shows that conditions were markedly poorer than in previous years, the market for September corn futures closed at $5.37 on the day of the report. The following week futures began their rally, which topped out with September corn hitting $8.43.

The July 9th report showed excessive deterioration in Iowa quality, with September corn futures closing at $7.31.

The aggregate charts make it easy to compare states and also visualize the conditions relative to previous years. For example, Indiana was especially hard hit during the drought of 2012 and this method puts it into perspective with other years.

In drought years such as 2012, markets often look for analog years with similar conditions in order to have a historical reference for trading. For 2012 the often-cited analog years were 1988 and 1993. I created Access queries to pull and compare this data. I also created another index that averaged crop ratings from 2006-2011 for comparison with drought years.

Another advantage of aggregating the ratings is that it allows for yearly averages of conditions. During the course of the year, final yield estimates from the USDA, as well as from private companies, have major impacts on markets as participants readjust supply/demand models. Given that crop conditions throughout the year are closely linked to the final official yield, I wanted to look at how well the average yearly crop rating matched the final reported yield.

I created a scatter plot from 2000 to 2012 with each state's average rating and final realized yield.

These show a broad correlation between average yield and aggregate annual crop rating. One interesting outlier is South Dakota, whose ratings tend to be more optimistic than those of the other eight major producing states. The data can then be analyzed by individual state, and linear regression lines can be fitted to quantify the trends. For Iowa, during this period the linear regression equation is y=0.0129x+1.5717, where the figures quoted here imply that x is yield in bushels per acre and y is the average rating. If the average crop condition was Fair (3) for a year this would suggest a final ending yield of 110 bushels per acre (bpa), while if it were Good (4) this would suggest a yield of 188 bpa.

This can be run for other states as well. For instance, Nebraska's linear regression is y=0.0199x+0.6353, which for an average yearly rating of Fair (3) suggests a 118 bpa yield and for Good (4) suggests a 169 bpa yield. This narrower spread (about 51 bpa per rating point versus about 78 bpa for Iowa) implies that, relative to Iowa, Nebraska corn yields fluctuate less as reported crop conditions change.
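The implied yields quoted for Iowa and Nebraska follow from solving each regression line for x. The sketch below assumes, as the quoted figures imply, that y is the average yearly rating and x is yield in bpa; the function names are hypothetical, and the sign convention for the difference (realized minus implied) is taken from the 2012 result discussed later.

```python
# (slope, intercept) pairs for rating = slope * yield + intercept,
# taken from the regression equations quoted in the text.
IOWA = (0.0129, 1.5717)
NEBRASKA = (0.0199, 0.6353)


def implied_yield(rating, slope, intercept):
    """Solve rating = slope * yield + intercept for yield (bpa)."""
    return (rating - intercept) / slope


def yield_difference(realized, rating, slope, intercept):
    """Realized yield minus the trendline-implied yield, in bpa."""
    return realized - implied_yield(rating, slope, intercept)


for name, (slope, intercept) in (("Iowa", IOWA), ("Nebraska", NEBRASKA)):
    fair = implied_yield(3, slope, intercept)
    good = implied_yield(4, slope, intercept)
    # The gap between Good and Fair is the yield change per rating point.
    print(name, round(fair, 1), round(good, 1), round(good - fair, 1))
```

A positive value from yield_difference means the realized yield came in above what the average crop rating implied.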

Using the equation for each state's linear regression, I calculated the difference between the implied trendline yield and the final realized yield. This shows the aggregate yearly average to be quite accurate, with the largest share of observations falling within 5 bpa of the final yield.

This accuracy data can be analyzed by state:

State            NE      KS      WI      MN      SD      IN      OH      IA      IL
Avg difference   0.35    0.23    0.1     0.37    -0.12   -0.34   -0.1    -0.1    0.01
Max              13.89   14.39   14.39   11.42   17.67   13.79   22.46   27.27   33.72
Min              -10.27  -16.71  -20.32  -23.89  -22.16  -32.65  -26.55  -26.8   -33.89
Range            24.16   31.09   34.71   35.32   39.83   46.45   49.01   54.07   67.61

And by year:

Year             2000    2001    2002    2003    2004    2005    2006    2007    2008    2009    2010    2011    2012
Avg difference   -21.58  -9.6    2.64    -1      1.01    7.77    0.31    0.89    3.96    7.86    -7.05   3.29    12.09
Max              -2.33   2.13    17.12   10.66   8.85    33.72   17.63   15.11   13.39   17.67   3.87    15.95   27.27
Min              -33.89  -19.59  -6.67   -10.18  -8.78   -4.37   -7.94   -10.27  -5.2    -5.46   -23.14  -3.23   2.77
Range            31.56   21.72   23.79   20.84   17.63   38.09   25.57   25.38   18.59   23.14   27.01   19.18   24.5

One interesting takeaway from this data is how much 2012 exceeded expectations. Between January and the beginning of June 2012, corn prices traded between $5 and $6. From mid-June to August prices spiked, reaching $8.43 on August 10. Since then, prices have fallen, with December corn futures reaching a September 29 low of $7.05. The 2012 data shows that the final realized yield was 12 bpa higher than the yield implied by crop conditions. This is the biggest outperformance in the sample series and suggests the market overpriced risk earlier in the year before readjusting after harvest. It also suggests that the accuracy of the implied yield calculations decreases in extreme years such as 2012.

Corn Update 1/29/13


JANUARY 30, 2013

Lately the RBOB/ethanol spread has been widening. However, this is mostly due to strength in the RBOB market exceeding strength in the ethanol market, rather than to ethanol weakness.

The correlation between ethanol and corn has been weakening, as ethanol prices have been strong relative to corn prices, thereby increasing margins for ethanol plants. Given the low carryout number in the January WASDE, old crop needs to go higher to ration demand, and prior to the report prices weren't high enough to do this. With margins increasing, the demand reduction is unlikely to come from ethanol. It's possible that feed usage will decrease, but it remains to be seen if that will happen.

Technically, a case can be made for the long side in March corn. It has broken through the descending trendline and has been consolidating since the last supply and demand report. It's also above the rising 20, 50 and 200 day moving averages, and 710 has proved to be an important level that has held as well. However, looking at linear regressions from the August peak, March corn has been selling off and is now just back at the top of the range.

To grind or not to grind


JANUARY 29, 2013

Recently POET made headlines for idling its Macon, Missouri plant, primarily due to a lack of available local corn. Given that it's a large operator and the announcement drew many headlines, I felt it deserved a closer look. According to Ethanol Producer Magazine, POET runs 27 plants and has its own corn oil extraction technology, and Macon is one of only 2 plants that don't currently do corn oil extraction. Extracting co-product value has been especially important since the tax credits expired, and even more so after the drought. POET is a private company, so for comparison I looked at the breakdown of operating profits for Green Plains Renewable Energy in their most recent 10-Q to gauge the impact of corn oil.

In this case GPRE has been losing money on ethanol production for the 9 months ending 9/30/2012, but co-products like corn oil are becoming an important source of income that can help to offset losses. Also, basis in northeastern Missouri has traded on the lower end of the range for Missouri according to data from USDA AMS, and doesn't seem high when compared to other bids in the Midwest.

Although the closure is attributed to a lack of corn supply in the area, it seems that adding corn oil extraction to the plant before new crop could also be a major factor.

National DDGS Price Comparisons


FEBRUARY 3, 2012

USDA AMS has a page for running custom reports from their data, which is useful for DDGS data (http://marketnews.usda.gov/portal/). Pulling data from around the country, I created a national average price and looked at how the 100 day moving averages for different regions traded at premiums or discounts to it. Of course Chicago and Iowa are trading at a discount, but I found the variability and seasonality of Kansas prices surprising. Kansas prices usually bottom in the summer and then stay higher in the winter when there is more feed demand.

With the amount of DDGS that is still going to China, Kansas seems like a potential shipping point. There is a large intermodal facility in Kansas City and a few ethanol plants nearby. However, no plants in the area have container-loading capacity and most can easily truck out their production. Given the large premium Kansas is currently trading at, it wouldn't make sense unless DDGS was sourced from Iowa, and by then any comparative advantage would be gone due to transportation cost. Also, intermodal shipping costs to China are very similar whether shipping from Kansas City or Chicago.
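The premium and discount comparison described above can be sketched as follows. This is an illustrative Python version, not the spreadsheet actually used: the function names are hypothetical, the national average is assumed to be a simple unweighted mean across regions, and the example uses a short window with placeholder prices so the output is easy to follow.

```python
def national_average(prices_by_region):
    """Unweighted mean across regions for each day (equal-length series)."""
    series = list(prices_by_region.values())
    return [sum(day) / len(day) for day in zip(*series)]


def moving_average(values, window):
    """Trailing moving average; None until a full window is available."""
    averages = []
    running = 0.0
    for i, value in enumerate(values):
        running += value
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        averages.append(running / window if i >= window - 1 else None)
    return averages


def regional_premium(regional, national, window=100):
    """Regional moving average minus national moving average.

    Positive values are a premium to the national average, negative a discount.
    """
    regional_ma = moving_average(regional, window)
    national_ma = moving_average(national, window)
    return [
        None if r is None else r - n
        for r, n in zip(regional_ma, national_ma)
    ]


# Placeholder prices for two made-up regions, not market data.
prices = {"region_a": [1.0, 2.0, 3.0, 4.0], "region_b": [3.0, 4.0, 5.0, 6.0]}
national = national_average(prices)
print(regional_premium(prices["region_a"], national, window=2))
```

With real data the window would be 100 trading days, and a volume-weighted national average might be preferable if regional volumes were available.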

Taking another look at Chicago and California prices, which form one of the longer data series, the California premium has really strengthened since 2008.

The last thing I found interesting was how closely Western Iowa and Chicago traded. Given how much capacity there is in Western Iowa, I assumed it would be trading at a larger discount to Chicago. However, for most of 2011 they have traded in lockstep, possibly due to the production in Illinois as well as large marketing firms being able to transport huge quantities out of Western Iowa by rail.

Chinese DDGS Imports


JANUARY 2, 2012

For a while the Chinese Commerce Ministry has been conducting an anti-dumping investigation into US DDGS exports to China. Recently they announced this investigation would continue until June 28th. Admittedly China doesn't have a lot of choice when it comes to US exports to China that it can limit, but an investigation into DDGS is an obvious paper tiger. Reuters points out that last week China asked the US to lift duties on Chinese-made tires, so China is likely trying to use DDGS restrictions as leverage. The problem is that China would be hurting itself more than the US if it imposed restrictions, so this is an empty threat.

1. Chinese ethanol producers requested the investigation but they are marginal players. Their total DDGS output is 3.5m tonnes. Total US production for marketing year 09-10 is ~39m tonnes. Total US exports to China from January to August 2011 are 805k tonnes. Chinese ethanol producers are also not going to be increasing output, given that Premier Wen said on October 22 that China needs to strictly control ethanol production. Domestic demand is higher than their production can satisfy and they aren't going to be able to increase production.

2. Even if Chinese domestic production was enough to satisfy feed mill demand, there would still be demand for US DDGS. Growing conditions in China result in higher levels of mycotoxins, and Chinese DDGS trades at a discount to US DDGS. Containerized DDGS can be shipped closer to feed mills in southern China, reducing the need for slow and expensive inland transit from the ethanol plants in the north.

3. China doesn't have the luxury of limiting food imports. DDGS can only be used as animal feed, so limiting its import leads directly to meat inflation. If you are fining companies like Unilever for even talking about a price increase for shampoo, you are not in a position to do anything that could lead to actual food inflation.

4. Not exporting to China would hurt the US, but not that much. Even though China is the #2 export market, there weren't sizeable exports until 2008. Most US DDGS can be used domestically. Even now 60% more is being exported to Mexico than to China. Canada is the 3rd biggest market. Canada and Mexico also have the advantage of being serviced by rail, making shipments much cheaper than containerized shipments to China.

It's hard to imagine China doing anything substantial given that it would lead directly to food inflation, there isn't a viable domestic alternative, and the damage to the US ethanol industry would be limited.
