A Forum for Readers
Please feel free to use the "Comment" facility below to provide questions and answers relating to Econometrics. I won't be able to answer all questions myself, but other readers may be able to help. The Forum will be "lightly moderated" to avoid spam and inappropriate content.
This is a very helpful blog. I think it would be great to have a post on weak instruments. To my knowledge this has not yet been done. The topic is very important and people tend to ignore it.
Great suggestion - thanks! I'd be very happy to do this.
Dear Prof. Dave,
Thank you for all the effort. I have a question concerning the sample size for the Johansen cointegration test. What is the minimum number of observations required? Thank you.
Yasmin - There's no "formal" minimum. However, keep in mind:
1. The testing procedure is based on VAR models - depending on your data frequency, these may require several lags, and this in turn will affect the minimum sample size you can get away with.
2. The Johansen procedure is likelihood-based. Accordingly, good asymptotic properties are assured, but the small-sample power can be low.
3. Cointegration is a "long-run" concept. You need a decent "temporal span" in the data to get sensible results. That is, 48 years of annual data may be more relevant than 4 years' worth of monthly data.
I have a question related to the topic “Maximum Likelihood Estimation & Inequality Constraints”, the restrictions on the output of an equation. In some cases the dependent variable can only take certain values (a probability, prices, etc.) therefore some models can be easily adjusted to the data (probit, logit, ANN) or some data transformations can be made.
Nevertheless, what if one wants the output of a system of equations to satisfy some restriction?
Let me describe my specific problem, which may shed light on what I mean. For a research project, I have to forecast some shares of sectoral GDP. For this I fit an Artificial Neural Network for each share, without weights for the activation function. Each forecasted share is in the range [0,1], but the sum of these is not always equal to one.
P.S. Thanks for the blog.
First, suppose that you were using OLS, rather than ANN. In this case you would have an example of what's called an "allocation model". If each equation in the system includes an intercept, then the sum of the dependent variables (the "shares") would equal one. It's easy to show that if you take such a system and estimate each equation by OLS, the PREDICTED values of the dependent variables will all lie between zero and one, and their sum will equal one. This situation arises, for example, with systems of Engel curves. I am not sure how to get this to carry over to ANN. I guess you could model and predict all except for one of the shares. Then the forecast for the remaining one would be one minus the sum of the other predictions. I bet the results depend on which share you omit, though - unlike the case with OLS (MLE) where the results are invariant to the one you drop.
Also, see http://davegiles.blogspot.ca/2013/07/allocation-models-with-bounded.html
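A small numerical sketch of the adding-up property described in the reply above, in plain Python. The data, the single regressor `x`, and the helper `ols_fit` are all made up for illustration; nothing here comes from the original discussion. It checks only the claim that the fitted shares sum to one when every equation has an intercept and the same regressors.

```python
import random

def ols_fit(x, y):
    """Simple OLS of y on an intercept and x; returns (intercept, slope)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return my - slope * mx, slope

random.seed(0)
n = 40
x = [random.uniform(0.0, 10.0) for _ in range(n)]

# Hypothetical data: three positive "sector" values turned into shares.
raw = [[random.uniform(1.0, 2.0) + 0.1 * k * xi for k in range(3)] for xi in x]
shares = [[v / sum(row) for v in row] for row in raw]

# Fit each share equation separately by OLS (same regressors, with intercept).
fits = [ols_fit(x, [row[k] for row in shares]) for k in range(3)]

# For every observation, the three fitted shares add up to one.
fitted_sums = [sum(a + b * xi for a, b in fits) for xi in x]
max_gap = max(abs(s - 1.0) for s in fitted_sums)
print(max_gap)  # a tiny number, zero up to rounding error
```

The intercepts sum to one and the slopes sum to zero, which is why the result does not depend on which share is treated as the "omitted" one in the OLS case.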
I just wanted to say this is a fantastic blog, and I thank you very much for providing us with all this free and easy to understand content, not to mention this opportunity to ask questions and discuss.
I would like to suggest a topic about the properties of some commonly used ARIMA models: are there limitations on the inferences we draw from a model that is not dynamically complete, what is the rate of convergence, and things to this effect.
Dear Professor Giles,
Your website is wonderful. Congratulations.
I have a question:
I am trying to estimate a model like
Y=a_1 + a_2*X1 + (1/a_3)*X2
Z=b_1 + (a_3)*X3 + a_4 * X4
I would like to use nonlinear three-stage least squares, since the equations include (1/a_3) and a_3, respectively. I checked your website and saw this post: http://davegiles.blogspot.co.uk/2012/05/estimating-simulating-sem.html
Is it possible to solve my problem in EViews? I tried this but I could not get a result. Am I making a mistake, or is EViews unable to handle nonlinearity in the parameters?
There is no need to use 3SLS as the right-hand-side variables are all exogenous. You need to use SUR estimation, and you can do this, allowing for the non-linearities you have. In EViews you would create a new "Object" - choose "System" as the option. Then fill in the specification box as follows:
Y=c(1) + c(2)*X1 + (1/c(3))*X2
Z=c(5) + c(3)*X3 + c(4) * X4
Make sure that you choose "SUR" as the preferred estimator. Before you estimate, go into the "c" series in the workfile - that's where the coefficients are stored. Edit that series so that the third element (c(3)) is 1 (not 0) - if you leave it at zero you won't have a valid starting value for the non-linear algorithm, because (1/0) isn't defined.
I hope this helps.
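To illustrate the cross-equation restriction and why c(3) must stay away from zero, here is a rough Python sketch on simulated data. It is not EViews' SUR estimator: it is plain stacked nonlinear least squares with equal weights on the two equations, and the data, the "true" parameter values, the search interval, and the helper names (`ols_ssr`, `golden_min`, `stacked_ssr`) are all assumptions made for this example. For a given value of c(3) both equations are linear in the remaining coefficients, so the sketch fits those by OLS and searches over c(3) alone.

```python
import random

def ols_ssr(x, y):
    """Sum of squared residuals from OLS of y on an intercept and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    return syy - sxy ** 2 / sxx

def golden_min(f, lo, hi, tol=1e-8):
    """Golden-section search for a minimum of f on [lo, hi]."""
    r = (5 ** 0.5 - 1) / 2
    c, d = hi - r * (hi - lo), lo + r * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > tol:
        if fc < fd:
            hi, d, fd = d, c, fc
            c = hi - r * (hi - lo)
            fc = f(c)
        else:
            lo, c, fc = c, d, fd
            d = lo + r * (hi - lo)
            fd = f(d)
    return (lo + hi) / 2

random.seed(1)
n = 300
x1, x2, x3, x4 = ([random.gauss(0, 1) for _ in range(n)] for _ in range(4))
true_c3 = 2.0  # assumed value, used only to simulate the data
y = [1.0 + 0.5 * a + b / true_c3 + random.gauss(0, 0.5) for a, b in zip(x1, x2)]
z = [-1.0 + true_c3 * a + 0.8 * b + random.gauss(0, 0.5) for a, b in zip(x3, x4)]

def stacked_ssr(c3):
    """SSR of both equations for a given c3, other coefficients fitted by OLS."""
    y_adj = [yi - b / c3 for yi, b in zip(y, x2)]  # Y - (1/c3)*X2, regressed on X1
    z_adj = [zi - c3 * a for zi, a in zip(z, x3)]  # Z - c3*X3, regressed on X4
    return ols_ssr(x1, y_adj) + ols_ssr(x4, z_adj)

# c3 = 0 is not admissible because 1/c3 is undefined, so search away from zero.
c3_hat = golden_min(stacked_ssr, 0.5, 5.0)
print(round(c3_hat, 3))
```

A proper SUR fit would also weight the equations by the estimated error covariance matrix; the sketch only shows that the shared parameter is identified from both equations at once.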
Hello Professor Giles,
First of all, I really appreciate your blog and your sharing your thoughts; it is really helpful. I am curious about your thoughts on an issue related to insufficient data. For instance, suppose we have only about 15 observations (annual data on a variable for 15 years), but the information is very costly and not publicly available. So the situation is one of very important information but not enough of a sample to carry out an empirical analysis and/or forecasting. What methods can be used for empirical analysis to develop a decent manuscript that will convince a reviewer, especially in the context of an insufficient sample? What options are available to overcome this insufficient data issue? I hope this might be of interest to many others as well. Can you please share your thoughts? Thank you.
Glad that you like the blog. With 15 years of annual data, you really are very limited in terms of what you can do. One thing that you do have going for you is that your data span 15 YEARS, rather than months or quarters. You could meaningfully test for unit roots and cointegration (using the Engle-Granger 2-step approach) with that time span. That might tell you something about any long-run relationships between the variables. However, any regression modelling is going to be severely hampered by a shortage of degrees of freedom, and you're certainly not going to be able to set up a VAR model and test for G-causality. In short, you have very few options, in my view, but let's see if other readers want to make suggestions.
Anonymous: Bayesian Model Averaging (BMA) may be a great fit given your situation.
Hello Professor Giles, I have found your posts and blog very helpful in my self-improvement. I am right now trying to model non-linear cointegration, and I understand I am to use the so-called threshold cointegration approach proposed by Enders and Siklos (2001). It would be very kind and helpful of you to explain the threshold cointegration procedure step by step. I am trying to model the pass-through effects of interest rates from the monetary policy rate to the lending rate. I must confess I do not know how to even proceed on this. I am using monthly interest rates from January 2006 to April 2015. Regards and thanks. Bala.
Dear readers and Mr. Giles,
I have two questions on the application of the bounds test and the ARDL model.
1) I wonder if we can still apply the Wald test in an ARDL model with no lags of the dependent variable (a DL model, then). The lag selection criteria sometimes point to such cases.
2) If the t bounds test does not confirm the result of the F-test, what do we conclude about the existence of cointegration? For instance, if the F-test rejects the null of no cointegration but the t-test does not confirm this result, what can we finally say?
Thank you in advance,
I have a question about the empirical research in my dissertation. I intend to analyze the relationship between exchange rates and stock prices in China. There are two time series variables, the exchange rates and the stock prices. I used the ADF test for unit roots, and found that they are non-stationary but integrated of the same order. Then I tested for cointegration, but found that there is no cointegration between them. So what should I do in the next step? Should I difference the data and construct a VAR model? If so, does this VAR analyze only the short-term relationship between them? Could you give me some advice? Thank you.
If the series are not cointegrated, you could proceed with a VAR. You should difference the data. The VAR analyzes only the short-term relationship. You could also proceed with Granger causality, or IRFs and variance decompositions, after you estimate the VAR.
Or use TY if you want to analyze the relationship in terms of G-causality.
If series are not cointegrated why proceed with anything at all?
Because there can still be a short-run relationship. Cointegration is a long-run equilibrium phenomenon.
I would love to get expert opinions on this brief summary of a dispute about econometrics. Thanks.
Note that one of the disputants (Smith) brings up this as an example to support his case. If econometricians were presented with those time series, what would be the result? Something different than experts at estimation and statistics in other fields? Why or why not? Thanks.
They'd respond in the same way as any other statistician would.
If the 2 series are random walks (and clearly they are not cointegrated) then this is just a spurious regression, and we know what happens in this case. As the sample size grows, the R-squared goes to one and the t-stat. for testing if the slope in the relationship is zero will be unbounded. So, the p-value goes to zero.
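The behaviour of the slope t-statistic described in this reply can be checked with a small Monte Carlo sketch in plain Python. Everything in it is an assumption chosen for illustration (the sample sizes, the number of replications, the seed, and the helper names `slope_t_stat`, `random_walk`, and `simulate`); the two series are independent driftless random walks, so the true slope is zero.

```python
import random

def slope_t_stat(x, y):
    """t-statistic for the slope in an OLS regression of y on a constant and x."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    ssr = syy - slope * sxy
    se = (ssr / (n - 2) / sxx) ** 0.5
    return slope / se

def random_walk(n, rng):
    """Driftless random walk with standard normal increments."""
    level, path = 0.0, []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        path.append(level)
    return path

def simulate(n, reps, seed):
    """Share of replications with |t| > 1.96, and the mean |t|."""
    rng = random.Random(seed)
    abs_t = [abs(slope_t_stat(random_walk(n, rng), random_walk(n, rng)))
             for _ in range(reps)]
    return sum(t > 1.96 for t in abs_t) / reps, sum(abs_t) / reps

rate_small, mean_t_small = simulate(n=50, reps=200, seed=123)
rate_large, mean_t_large = simulate(n=500, reps=200, seed=123)
print(rate_small, mean_t_small)
print(rate_large, mean_t_large)
```

With unrelated series a correctly sized 5% test would reject about 5% of the time; here the rejection rate is far higher and the typical |t| grows with the sample size, which is the p-value-goes-to-zero behaviour.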
Dr. Giles, much thanks for your response! But just to be clear, the "2 series" you're referring to are the synthetic aperture radar (SAR) time series in Smith's example (my 2nd link, repeated here)?
Tom - no, I was referring to the artificial series, V1 and V2.
Ah, of course!... now your comment makes perfect sense to me! Sorry for the confusion. I should have explained that the link was to Smith's comment, not his blog post at the top.
I believe the two disputants (Smith and Sadowski) agree about those two series (the artificial V1 and V2). What they disagreed about is the economic time series data from the first link (repeated here). I don't know Sadowski's take on the SAR example in Smith's comment, since he didn't respond, but clearly Smith thought it was an example supporting his point on the economic data. Any thoughts on either of those other two cases? (either the economic data or the SAR example)? Again, much thanks, and I apologize for the confusion.
I have a sample range from-2014, quarterly data (64 observations). I used the Ng-Perron unit root test, which is relatively more accurate for small samples than the other traditional tests. I have 3 variables in total, a mixture of I(1) and I(0). Is it worth trying the ARDL bounds test? Secondly, when constructing the ECM from the usual OLS regression, do I still need to include the break and @trend in the model specification, as I have in the ARDL model? Thirdly, if I do find cointegration, do I need to explain the break coefficient included, which is a mixture of 2 independent variables? Or do I need to construct an individual break for each independent variable? (I used the Chow breakpoint test initially, to determine both simultaneously.) I did a pre-test, and I got a really high R-squared (91%), which is absolutely scary. Does the R-squared matter in the ARDL model, given that the Durbin-Watson statistic is inappropriate in the ARDL model? Kindly reply. Thank you. ----ose
Sir, when doing the T&Y Granger causality test with a structural break captured by a dummy, do we have to specify the dummy variable just like the rest of the regressors in the unrestricted VAR, and then check for Granger causality including the dummy as well? ----ose. Thank you
Hello Prof. Giles,
First, I would like to thank you for your blog and for sharing your knowledge and experience with us. I would like to ask for your thoughts regarding discrepancies between the Johansen cointegration test and the Engle-Granger cointegration procedure. I realize that Johansen shows us the rank of cointegration, which, if I am not mistaken, should be the maximum number of cointegrating vectors, whilst E-G tests for stationarity of the residuals of a specific vector. I am working with a data set where I first use Johansen, and then, in order to investigate further the exact vector which is cointegrated, I use E-G. However, I have noticed that the two tests sometimes provide me with different conclusions, e.g. Johansen shows at most 2 cointegrating vectors but I find 3 using E-G, or Johansen shows no cointegration but I find a vector which is cointegrated according to E-G, or sometimes Johansen shows rank-3 cointegration but I cannot find any using E-G. I realize that some part of the discrepancy is due to the test statistics' significance levels, and values which are close to the 5% significance level could cause problems. However, on which of the two tests should I base my conclusions, and is it appropriate to comment on the discrepancies? E.g., if one of the variables is constantly found to be the dependent one in the E-G cointegrated vector, would it be correct to state that if that variable is excluded no cointegration would exist?
I have experienced the same problem. Looking forward to an answer :)
Dr. Giles, do you know of any published experts whose expertise covers both economic and physical systems? This article implies that "system identification" is a field spanning both domains. Thanks.
Dear Dr Giles
Sorry to ask this question, but you have made several remarkable posts about checking for heteroskedasticity in binary response models. Can you help me please? Suppose we use Binary Time-Series Cross-Section data (or Discrete-Time Survival Analysis or Grouped Duration Data, where objects experience 1 event and are then dropped), or clustered data, where the cluster is a unit and the objects in the cluster are observations for that unit (thus clusters are assumed to be independent and objects within a cluster dependent), and suppose we use Huber-White SEs to deal with the correlation of observations inside the cluster. Is it still relevant to use the Davidson and MacKinnon test for heteroskedasticity in this case? Kind regards. Paul
Dr. Giles, dumb question: can the curve y=constant be considered stationary? Isn't it true that its mean, variance and auto-correlation don't change over time?
So then is it fair to say that a line (y = a*x + b, with a =/= 0) can be rendered stationary by 1st differencing?
This line is deterministic - there's no randomness. So, I don't follow where you're going with this. Also, if you difference a stationary time-series, it will still be stationary (if that helps).
Thanks Dr. Giles, yes, this must seem like a strange question, but isn't a deterministic line just a special case of a stochastic "line" (i.e. line + noise) with variance = 0?
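For readers following this exchange, a minimal Python sketch of what first differencing does to a line, with and without noise. The slope, intercept, noise distribution, and seed are arbitrary assumptions for illustration.

```python
import random

a, b = 0.5, 2.0            # assumed slope and intercept
t = list(range(20))

# Deterministic line: every first difference equals the slope a.
line = [a * ti + b for ti in t]
d_line = [line[i] - line[i - 1] for i in range(1, len(line))]
print(d_line[:3])

# Line plus white noise e_t: the first difference is a + (e_t - e_{t-1}),
# so the trend is gone and what is left is a constant plus differenced noise.
random.seed(42)
noisy = [v + random.gauss(0.0, 1.0) for v in line]
d_noisy = [noisy[i] - noisy[i - 1] for i in range(1, len(noisy))]
print(sum(d_noisy) / len(d_noisy))
```

In the noisy case the series is usually described as trend-stationary: removing the fitted trend also gives a stationary series, without introducing the moving-average term that differencing creates.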
Sure - but can you spell out for me the context and where this is going? Thanks.
Sure, this is what prompted my questions (the link is to a comment, not the post).
Dear Dr Giles. Can you help me please? If there is an assumption of no unobserved heterogeneity and we are using a logit model - can we check the heteroskedasticity using Davidson and MacKinnon test? Kind Regards Paul
Paul - yes.
Thanks so much! Paul. Actually this replies to my previous question. Thanks so much!!!!!
Dr. Giles, an econometrician wrote this:
"A "stationary" series is *stochastic* process. The probabilistic counterpart of a stochastic process is a *deterministic* process."
Which confused me, so I looked up the Wikipedia article about stochastic processes here, and the 1st paragraph seemed to contradict that (and also make a lot more sense to me):
"In probability theory, a stochastic (/stoʊˈkæstɪk/) process, or often random process, is a collection of random variables, representing the evolution of some system of random values over time. This is the probabilistic counterpart to a deterministic process (or deterministic system)."
So I asked him if he meant to write what he did or if he meant to write this instead:
"The probabilistic counterpart of a deterministic process is a *stochastic* process."
But he was adamant that he had it right the 1st time. What am I missing? I don't see the sense in which a deterministic process is probabilistic. I'm super confused! If you can set me straight on this I'd greatly appreciate it! Thanks. (Also, does Wikipedia have it wrong too?)
Did I actually write that - it's clearly false. "Deterministic" means non-random / non-probabilistic / non-stochastic.
"Did I actually write that"... No, you didn't write that! A different econometrician (whom I left unnamed) did. Sorry about my unclear writing, and much thanks for taking the time to respond!
Whew!!!!! I thought I was really losing it ! :-)
I have a question. I am analysing time series data using cointegration and VECM. All the series were tested for a unit root allowing for structural breaks. The tests reveal that all the series are non-stationary, and also contain structural breaks. This suggests that I will need to account for the breaks in the VECM model. However, the structural breaks in all the series occurred at different dates. As a result, I am not sure how to incorporate the different breaks in the VECM. I am asking if you could guide me on the best approach to dealing with structural breaks in a VECM with different break dates.
@Mutawakil Zankawa Mumuni, interesting: I *think* your question might be relevant to the general thread of thought I was examining, since **I think** VECM and tests for structural breaks were involved there as well (Cholesky decomposition comes up, I know). If you have any opinions yourself on that, I'd be interested to hear them. Thanks. If you'd prefer to email me confidentially instead of a public comment, that'd be great too. I'm very much a non-expert and my objective is to learn a little something about the subject (big picture anyway). "VECM" stands for vector error correction model?
Regarding the post on the H-P filter and unit roots: when you stated that Cogley and Nason (1995) and Phillips and Jin (2015) mention that the filter can generate spurious cycles, is this in terms of gain, i.e., from the frequency-domain perspective, the relative contribution of each frequency? Or in terms of phase, i.e., the position of the series with respect to the time axis? Or both?
Thanks for the excellent post.
Nicolas - Both, I believe.
I'm sure that the gain is affected, but about the phase I don't know. I am tempted to think that the phase is affected, as in the moving average filter.
Dr. Giles, can you recommend an overview of the subject of econometrics for the non-expert? Not necessarily a book: perhaps an online resource. I'm not interested in getting a PhD in the subject, but I'd like to know a little something about the field as a whole, and the types of problems you address and the methods used to address those problems.
I have a technical background (I'm an engineer), and I'm familiar with discrete and continuous time systems, characteristic equations, ODEs, state space representations, frequency domain analysis, basic probability theory and optimum estimation and tracking filters (such as Kalman filters), signal processing techniques, basic matrix theory, feedback control theory etc. I'd just like to get a feel for the strengths and limitations of econometric techniques as applied to any time series data (not necessarily just economic), or a feel for why economic systems are treated with the special tools that they are (if indeed they are). Thanks.
I would like to know which software runs the Johansen, Mosconi, and Nielsen cointegration test with structural breaks in the deterministic trend.
You can do it in EViews or with R, and you can download the code from here:
Dear Dr. Dave,
It is a great opportunity for readers to get connected with an expert person like you.
Please, I want to know whether we need to check for stationarity when simulating data, even if we assume the data are normally distributed.
Kurda - if you are simulating from a Normal distribution with fixed and finite mean and variance, the data will be (strongly and weakly) stationary by construction.
Thank you, Dr., for the answer.
I have an issue in my application. When I set initial values for the parameters and simulate, one of the parameters (the mean of the model) does not come back to its initial value (it is higher by one number).
Could you please advise me what might be wrong? Is it necessary that all parameters converge to their initial values in the same iteration if I repeat over 100 iterations? For instance, the mean converges in the 15th iteration and the other parameters in the 100th iteration?
Many thanks for any advice.
Hello Dr. Giles,
I have recently had cause to consider the implications of variance in variable effect estimates in probit models, which subsequently are to be used in prediction. I discuss the problem in more detail here (http://bit.ly/1JqhLLL), but in a nutshell, when estimating the probability that Y=1 on new data, is it reasonable to give equal weight to beta1 and beta2 if they have very different variances (even if the effect magnitude is identical)? Any insight would be greatly appreciated.
Dear Dr. Giles,
I started to wonder about testing OLS assumptions, namely homoskedasticity and linear functional form. Two usual tests are the Breusch-Pagan and LINK tests, respectively. If we denote the residuals from OLS as "e" and the fitted values as "y_hat", the two tests use additional regressions:
e^2 = alpha_0 + X*alpha + u (for BPT)
e = alpha_0 +y_hat*alpha1 + y_hat^2 *alpha2 + y_hat^3 *alpha3 + y_hat^4 *alpha4 + u (for LINK test)
The above test specifications are very similar (these are the specifications from the NLOGIT software I use), and therefore it seems to me that the two effects may be easily confounded. How can I know which one is the real problem in a given dataset? Is there any formal procedure to test both assumptions jointly?
I figured that quantile regression is a very easy way to detect heteroskedasticity, but I didn't find any formal tests which use it. Are there any? (And if not, why not - is it a bad idea?)
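As a concrete reference for the first auxiliary regression written out in the question above (e^2 on a constant and X), here is a hedged Python sketch on simulated data. It uses the n*R-squared (Koenker, studentized) form of the Breusch-Pagan statistic with a single regressor; the data-generating process, the seed, and the helper `ols` are assumptions for illustration, and this is not the NLOGIT implementation.

```python
import random

def ols(x, y):
    """OLS of y on a constant and x; returns (residuals, R-squared)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    syy = sum((yi - my) ** 2 for yi in y)
    slope = sxy / sxx
    intercept = my - slope * mx
    resid = [yi - intercept - slope * xi for xi, yi in zip(x, y)]
    return resid, sxy ** 2 / (sxx * syy)

random.seed(3)
n = 500
x = [random.uniform(1.0, 5.0) for _ in range(n)]
# Simulated heteroskedastic errors: standard deviation proportional to x.
y = [1.0 + 2.0 * xi + random.gauss(0.0, xi) for xi in x]

e, _ = ols(x, y)                            # step 1: original regression
_, r2_aux = ols(x, [ei ** 2 for ei in e])   # step 2: e^2 on a constant and x
lm = n * r2_aux                             # compare with chi-square(1); 5% value is 3.84
print(lm)
```

With one regressor in the auxiliary equation the statistic has one degree of freedom; with several regressors the degrees of freedom equal the number of slope terms in that equation.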
I am estimating a cointegration relationship among variables. However, one of the variables is exogenous whilst the rest are endogenous. Since EViews only offers cointegration with endogenous variables, I wanted to ask if there is any way I could estimate a possible cointegration relationship with exogenous and endogenous variables.
Mutawakil - no, there is no such restriction in the cointegration routines in EViews.
Thanks for the useful blogs, they are very helpful.
I have a question regarding the applicability of cointegration tests on two or more variables. These variables are indices and have different base years. For example the first time series is of base year= 2005, while the second one has a base year = 2008. Can we justify the use of cointegration tests on these variables given that when changing the base year there is a possibility that one or both of the variables may have completely different trends. Unfortunately, both variables are available in forms of indices and the source of data are not responding to emails to get the actual values of the base years.
Many thanks for any help you may offer
Amar - there is no problem with this. You can easily change the base of one index to that of the other if you wish, but there's no need to even do that if you don't want to. Any index only gives us information about CHANGES (not LEVELS). Changing the base (properly) won't affect the "trend" in the index series.
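A short Python sketch of the point in this reply, using made-up index numbers: re-basing multiplies the whole series by one constant, so period-to-period growth rates are untouched. The values and the choice of new base period are hypothetical.

```python
import math

# Hypothetical index values, base period = first observation (= 100).
index_a = [100.0, 104.0, 109.2, 113.0, 120.5]

# Re-base so that the fourth observation equals 100 instead.
factor = 100.0 / index_a[3]
index_b = [v * factor for v in index_a]

def log_growth(series):
    """Period-to-period log changes."""
    return [math.log(series[i] / series[i - 1]) for i in range(1, len(series))]

gap = max(abs(ga - gb)
          for ga, gb in zip(log_growth(index_a), log_growth(index_b)))
print(gap)  # zero up to rounding error: re-basing leaves growth rates unchanged
```

In logs, re-basing just adds a constant to the series, which shifts the level but not the trend or the stochastic properties that unit root and cointegration tests look at.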
Thank you for your response to my question regarding cointegration test with exogenous variables. You indicated that there is no restriction in the cointegration routines in Eviews. So I want to ask if there is any other software or alternative way of testing for cointegration when some variables are exogenous.
Sorry - I don't understand your point/question.
I mean I am having difficulty estimating a cointegration relationship among a set of variables in EViews. One of the variables is exogenous whilst the remaining variables are all endogenous, and as it is in EViews, Johansen's cointegration test can only be applied when all variables are endogenous. So I am asking if there is any other econometric technique or software that can be used to test for cointegration using both endogenous and exogenous variables.
I am afraid you are wrong in your interpretation of what Johansen's procedure is for (whatever the package).
Dear Prof. Dave, this is a model which I want to estimate. Kindly give your insight about the model specification.
The topic is diversification and economic growth.
per capita GDP = Total factor productivity + domestic investment + diversification...
Is it necessary to follow a production function, the Solow growth model, etc.? Labor and capital are incorporated in the TFP already, so do we need to add them additionally?
Or can we otherwise specify our model independently?
Dear Dave Giles, some textbooks say (Gujarati, for example) that in the beginning of the past century, the term multicollinearity was understood only as perfect multicollinearity. Then, all of a sudden, this term began to also refer to the cases of imperfect multicollinearity. Do you know when it happened?
thank you! Best regards,
Dear Professor Giles,
Can you explain to me - as someone with only a modest grasp of econometrics - how to interpret unit root tests on variables that one would think have strict limits, for example ratios that can vary between 0 and 1, such as the ratio of government expenditure to total expenditure. Doesn't a unit root imply the variable has an infinite variance and so can end up anywhere? I have seen people apply these tests to such variables without any qualms. Are they correct to do so?
Hi, sir. If I have k+dmax=2 (where lag length = 1 and the variables are I(1)), can I still go ahead with the TY Granger non-causality test?
Please define k and dmax.
This question is wrt your blockbuster post "ARDL Modelling in EViews 9". Even after accounting for breaks in the unit root test (URT), LOG_CRUDE series has unit root at the 5% significance level. And LOG_GAS does not have a unit root at the 5% significance level. In both cases, the break date is 1929. I have three questions wrt this.
1. If it was found that both LOG_CRUDE and LOG_GAS have unit root, will we still account for breaks in ARDL? That is, should we create the dummy, SHOCK?
2. If both the variables have separate break dates, then how do we account for it?
3. The null is that the LOG_CRUDE is non-stationary. Alternative is LOG_CRUDE is stationary with a break in the trend and intercept. Thus, if we fail to reject the null, what will be the conclusion? That LOG_CRUDE HAS A UNIT ROOT. Is it correct?
I request you to kindly clarify it.
1. Yes, I would do that.
2. In that case, two different dummy variables would be used.
3. That is correct.
Thank you a lot, Sir.
I have some clarifying questions with regard to your answers for #3.
4. Thus, if we reject the null (that is, we conclude that LOG_GAS is stationary with a break in 2008), accounting for the break (with a dummy) is a must. Is that correct?
5. Even if we don't reject the null (that is, we conclude LOG_CRUDE is non-stationary), since the program gives a break date, is it still necessary to account for the break (assuming the break date is different from that of the other variable)?
6. This is with regard to answer #2. Suppose I have 4 variables in my model, and I get 4 different break dates. According to your answer I should account for all the break dates. I did that, but my results became pretty bad. Plus, some break dummies are found to be insignificant. Can you suggest any better way to handle multiple breaks in a model? Is it reasonable to have ONE BREAK DUMMY FOR THE DEPENDENT VARIABLE?
7. If the break point generated by the EViews 9 "unit root test with break point" is not associated with any economic/financial/oil/war shock, do we still need to account for the break?
Dear Sir, Kindly clarify the above doubts.
Thank you a lot.
I would like to estimate 4 linear Simultaneous equations in 3SLS method. I have both time series and cross section data over variables. I want to apply panel data regression technique in 3SLS estimation. I did not get any such estimation method. Again I am using STATA. I do not know the STATA Command for such estimation method.
Please help me.
Kuntal - sorry, I don't use STATA. You might want to check a STATA user group.
Do you know how to obtain generalized impulse response using EVIEWS or RATS?
Hi Prof Dave, it's a great relief coming across this wonderful blog of yours. I have a challenge for which I will require your assistance. I am running a VAR model but got stuck on the way. My problem is that I had four cointegrating vectors from my cointegration estimation, so I went on to run my error correction model. However, I noticed that two of my error correction terms are significant with the right sign. This is where I got confused. I understand that the coefficient of the error correction term indicates the speed of adjustment towards the long-run path. But this time I have two coefficients that are both significant with the right signs. How am I supposed to interpret this? Or should I discuss them one after the other? Please, I need your help. Thanks
Hello Prof Giles,
Do you have an opinion about replication of results published in journal articles? I read this article and thought it was interesting:
Tom - enjoyed reading John's post a few days ago. I must say that I agree with pretty much everything he has to say. There is far too much empirical work around that just can't be replicated. You might be interested in The Replication Network: http://replicationnetwork.com/
Hi Prof. Dave,
Wish you a Happy and Prosperous New Year.
Wish, many enlightening posts will come in this year also :):):)
Suppose, two variables, X and Y, have break dates which are (1) identical, (2) X's break date is, say, 1974, and Y's break date is, say, 1976.
1. In this case, should we account for breaks in our estimation?
2. If yes, should we create two break variables or one?
With regard to #1, my guess is that since the break happens in the same date, the coefficients will not be affected (I'm not sure, though).
Hello Prof. Giles,
Are you aware of any forecasters out there who made a forecast of what the effective federal funds rate (in the US) would be after the Fed raised rates? I follow a blog that did use a mathematical model to make that forecast, but I'm curious how well he did as compared to other forecasters (perhaps using more traditional models).
Fitting the parameters a and b and introducing a time lag of one data set wrt the other seems to show a good correlation between core CPI and CLF via the following relation:
log CPI = b + a log CLF
Here's some examples:
In your view, is this a good candidate for a Granger causation analysis?
Hi Professor Giles,
I was curious about your thoughts on this article: https://blogs.cfainstitute.org/investor/2016/01/26/a-valuation-method-for-private-equity/
These folks claim to have built a better mouse trap for the valuation of private equity firms. While they don't go into a huge amount of detail regarding their methodology (i.e. the exact specification of their econometric model) they throw around things like "We have really high R^2". To me, at a surface level, it seems like they may be simply overfitting data. With that said, I would be very curious to hear your thoughts on this.
Hello Professor Giles,
I want to apply the SURADF and CADF tests in GAUSS or R, but I couldn't because I haven't used GAUSS or R before. Could you help me with this?
You have educated many people around the globe with your blog. Your amazing job is highly appreciated by many economists all over. I was wondering if you are planning on posting non-linear ARDL examples soon. I noticed it is a new trend and many researchers are starting to adopt it.
Hope to see it soon.
Have a wonderful time and enjoyable classes.
Dear Prof. Giles,
I have a question regarding the consequence of base year changing on regression coefficients.
1. If two series have different base years in a regression will it impact the coefficients?
2. What are the econometric consequences of it?
3. How serious is the problem?
4. Is it necessary that all data have the same base year? And if so, should all data be “rebased” to one base year?
Your help will be greatly appreciated.
Btw, you kind of promised us to write an e-book on "Granger Causality". When will it come out? We all are eagerly waiting for it. I hope the book also takes care of structural break issues in the data.
1. Re-basing simply involves scaling the data. The estimated regression coefficient will simply scale in the opposite way so that the product of the coefficient and variable remains the same. I.e., if the variable is multiplied by 50 to re-base, the estimated coefficient will get divided by 50.
3.Not at all.
4. No need for them to be the same.
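The scaling point in the reply above can be checked numerically. The following is a minimal sketch on simulated data; the series, the helper `ols`, and the re-basing factor `k` are all hypothetical and only serve to illustrate the algebra.

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(100.0, 10.0, size=200)           # hypothetical index on its original base
x = 5.0 + 0.3 * y + rng.normal(0.0, 1.0, 200)   # hypothetical dependent variable

def ols(dep, reg):
    """Return (intercept, slope) from a simple OLS regression of dep on reg."""
    X = np.column_stack([np.ones_like(reg), reg])
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    return coef[0], coef[1]

k = 50.0                   # hypothetical re-basing factor
a1, b1 = ols(x, y)         # regressor on the original base
a2, b2 = ols(x, k * y)     # same regressor after re-basing (multiplied by k)

print(b1, b2 * k)          # slope times scale factor recovers the original slope
print(a1, a2)              # the intercept is unaffected by re-basing the regressor
```

The product of coefficient and variable is the same in both regressions, so the fitted values and residuals are identical.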
Thank you Sir.
I have a small question on that. Suppose I have two variables X and Y. Two regression results are:
1. X (Base year: 1981-82 = 100) and Y (Base year: 1970-71 = 100)
X = 5.92 + 0.09Y
2. X (Base year: 1981-82 = 100) and Y (Base year: 1981-82 = 100)
X = 5.92 + 0.31Y
As you said, rebasing leads to scaling of the slope coefficient. However, my concern is:
If I have two regressions, I can manipulate results. For example, I want to show how Y affects X. In the first, it is 0.09 and in the second, it is 0.31. If my objective is to show Y has a larger impact on X, and then I will rebase all variables to same base year and I will present the second regression; and if my objective is to show it has smaller impact on WPI, I will have data in different base years and present results from the first regression. So, my choice of base year depends on what I want to show.
Thank you Sir.
But you shouldn't be comparing the two values for the estimated coefficients like that in the first place. They haven't been standardized. That's what we have the "beta coefficients" for. See my post, http://davegiles.blogspot.ca/2013/08/large-and-small-regression-coefficients.html
If you make the correct comparison there's no problem.
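As a rough illustration of why the standardized comparison removes the concern, here is a sketch on simulated data. The data, the helper names, and the scale factor 3.4 are hypothetical; the "beta coefficient" is taken to be the slope multiplied by the ratio of the regressor's standard deviation to the dependent variable's standard deviation.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(100.0, 10.0, size=200)           # hypothetical regressor
x = 5.0 + 0.3 * y + rng.normal(0.0, 2.0, 200)   # hypothetical dependent variable

def slope(dep, reg):
    """OLS slope from a simple regression of dep on reg."""
    return np.cov(dep, reg, ddof=1)[0, 1] / np.var(reg, ddof=1)

def beta_coefficient(dep, reg):
    """Standardized slope: raw slope times sd(reg) / sd(dep)."""
    return slope(dep, reg) * np.std(reg, ddof=1) / np.std(dep, ddof=1)

scale = 3.4                                     # hypothetical re-basing factor
raw_a, raw_b = slope(x, y), slope(x, scale * y)
std_a, std_b = beta_coefficient(x, y), beta_coefficient(x, scale * y)

print(raw_a, raw_b)   # raw slopes differ by the scale factor
print(std_a, std_b)   # standardized slopes coincide
```

The raw slopes change with the base year, but the standardized ones do not, so the choice of base cannot be used to make an effect look larger or smaller.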
Good day Professor Giles,
Do Economists or Statisticians normally compare levels of significance? That is, would you say gdp is significant at the 5% level and fdi is significant at the 10% level, so gdp is more significant than FDI? Is that the reason empirical papers usually include the 1%, 5% and 10% levels of significance?
Yes, they certainly do.
Hello Prof. Giles,
If two series, say X & Y, are I(0) and we run a regression of Y on X (a static regression model), can we interpret the coefficient of X as a long-run coefficient?
No, it's just the within-period marginal effect.
A question on Break date finding and accounting for it.
1. Suppose, Break URT (BURT) suggests one break date. Should we consider this break date in the regression model?
2. Or, we should strictly perform a Bai-Perron test? Accordingly, we will consider the date suggested by BP test.
If the answer is we have to do a BP test, then;
3. If the X variable is NOT trending, then I will regress X on a constant. Then I will apply the BP test. Is this correct, Prof?
4. And if it is trending, then I will regress X on a constant and trend. Then I will apply the BP test. Is this correct, Prof?
Suppose we created a dummy (crisis dummy). When we add the dummy variable in the regression, it turns out that the dummy is NOT significant. However, the fit of the model improves considerably once we add the dummy even if the dummy is not significant. Should we drop the dummy or keep it in the equation?
In the Johansen cointegration test, in both the Trace test and the Eigenvalue test, the null hypotheses begin with “None”, “At Most 1” and so on. Suppose both tests don’t reject the “None” hypothesis but they reject “At Most 1”. This means there are 2 cointegrating vectors (CV).
My doubt is: Can we conclude that there exist 2 CVs? If yes, then the test statistic should have rejected the “None” hypothesis also.
Do you have code for ANST GARCH models?
Sorry, I don't. Maybe another reader does.
Hi Dave, this is an excellent blog and has helped my understanding of econometrics so much since I found it.
I am looking to estimate an ARDL model (using monthly data * 15 years) where 3 of my 4 variables are not seasonally adjusted. Unfortunately one, unemployment, is only published as seasonally adjusted. I think it's always best to start analysis with unadjusted data but cannot for this variable. Should I try to SA the other three when estimating the ARDL? Interestingly, the variable I am most interested in from the model does not appear to have any seasonality. I'm not sure how to progress at the moment. Any thoughts you might have would be very helpful.
If the one that is SA already is to be the dependent variable in the model, I'd seasonally adjust the other series before modelling. If it's going to be a regressor, I'd not SA the others, but I'd include seasonal dummy variables in the ARDL model.
I have been using Bai-Perron tests to identify structural breaks in variables I am trying to model for Cointegration. One comment I have received is that B-P has problems with non-stationary data. I am also using graphs to identify breaks - recursive, cusum with r - but would you recommend any other approach to cointegration with obvious multiple structural breaks?
That comment is correct. You might find the following paper by Pierre Perron helpful: http://people.bu.edu/perron/papers/dealing.pdf
I have read some of your posts on ARDL models and they were great. Thanks for your efforts. While working with the ARDL feature in EViews 9 on my case (residential gas demand as a function of price and income) I found that in the ARDL model selected by EViews some of the coefficients for lagged dependent and explanatory variables are not significant at all (some p-values are not significant even at 50%), so I was surprised that such an ARDL was selected by EViews. Should I keep it and go on to the long-run coefficients or not? Another question: when I set the maximum lag higher than 5, EViews gives a singular matrix message (my data range is 1990-2014, annual). Would you please explain what this means?
Choosing an optimal model on the basis of AIC or SIC (in any context) never guarantees the statistical significance of individual regressors. That's a different criterion. Your other problem is that your lagged values are highly collinear and you have near or perfect multicollinearity. Unless you can lengthen your sample, you'll have to choose a shorter maximum lag length. Again, this is just the same as for any regression model.
Suppose a variable is non-stationary in levels. If we transform it to logs, will it become stationary? Or, at least, will the probability of it becoming stationary increase?
Short answer - NO. In both cases. I've been meaning for ages to do a post on logs vs. levels when it comes to unit roots. Must do!
Dear Prof. Giles,
Thank you very much for the great posts about ARDL. Thanks to you I read the paper of Pesaran and Shin and I'm convinced that this is an excellent method for estimating the long-run parameters and carrying out statistical inference on them.
However, I'm not sure about the short-run parameters. Can I trust the results in case the errors are serially uncorrelated? Or should I use 2SLS and treat the lags as I.V. estimators? (Hopefully the lags are not weak instruments.)
Thanks in advance.
If two series, say x and y, are I(1), will the ratio (x/y) be stationary for sure? Ratios such as the investment-GDP ratio and the trade-GDP ratio are considered to be I(0), although in practice unit root tests fail to reveal so. We often do a ratio transformation to make the series I(0). What is the logic behind it?
If X and Y are both I(1), then there is no reason at all for (X/Y) to be I(0).
However, if log(X) and log(Y) are I(1) and cointegrated, then log(X) - log(Y) will be I(0). That is, log(X/Y) will be I(0).
But the cointegrating vector must be (1,-1)
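The role of the (1,-1) vector can be spelled out in one line. This is a short derivation rather than anything from the original comments; the symbols \beta and z_t are introduced here purely for illustration. Suppose log(X) and log(Y) are I(1) and cointegrated with vector (1, -\beta):

```latex
z_t = \log X_t - \beta \log Y_t \sim I(0)
\quad\Longrightarrow\quad
\log\!\left(\frac{X_t}{Y_t}\right) = \log X_t - \log Y_t = z_t + (\beta - 1)\,\log Y_t .
```

Because log(Y) is I(1), the right-hand side is I(0) only when \beta = 1, which is exactly the case where the cointegrating vector is (1,-1).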
Hello professor, in trying to run a cointegration test in EViews 7 (my data are on GDP, CPI, personal consumption expenditure, exchange rate and interest rate) I get a message that says 'Near Singular Matrix'. My time series data spans 31 years. What can I do about this? Thank you.
You are using too many lags and/or too many regressors. In this respect an ARDL model is just like any other regression model.
I am estimating a VAR with four variables. I have tested all the variables for a unit root, and the variables are all I(1). The unit root tests also revealed that all the variables contain structural breaks but the breaks occurred at different dates. Is there a way that I can account for the different structural breaks in the VAR?
Mutawakil - you can do this by adding appropriate dummy variables, depending on whether the breaks are in levels or trends. If all of your series are I(1), I presume you've tested for cointegration. If it's present then you'd want to consider using a VECM model.
Hi Prof. Giles,
I am estimating the determinants of inflation for Tanzania using an ARDL model and eviews-9. Prior to analysis, I conducted a simple correlation analysis to see if there is any high correlation (multicollinearity) between the independent variables. Most of my variables are found to be highly correlated. My problem is that using eviews-9 and an ardl model there is no clear information on how to get a second opinion for multicollinearity as for instance in stata I would have conducted a simple VIF test for multicollinearity. In addition most papers don't present the test for multicollinearity in the case of an ARDL model. Is there a way to go around this or do I simply present the corr analysis and move on?Thank you
Gabriel - In EViews, once you have estimated the model, select "Variance Inflation Factors".
Despite what some packages purport to do, there is no TEST for multicollinearity. It is a DATA phenomenon, and has nothing to do with the model's PARAMETERS. See my various posts on this point.
Hello Professor! I thought I'd pass along this MOOC from the IMF on macroeconometric forecasting in EViews. This might be a good resource for your students or others looking to refresh their time series skills. https://www.edx.org/course/macroeconometric-forecasting-imfx-mfx
Kailer - thanks very much for this. Great to hear from you!
Long time no see! Hope you are all well.
I just found a toolkit/package in R program for extreme value analysis is quite useful. You may know it already. It is called In2extRemes, which contains point-and-click windows to do all kinds of analysis using extreme value theory and is very convenient to use. You can load it in R. Here is the developer's website for more information:
Sky - nice to hear from you. Thanks for the pointer to this package in R.
I have a series of quarterly data. When I run ADF and KPSS unit root tests the results reject a unit root and cannot reject stationarity. When I run a HEGY test the results reject all seasonal roots but do not reject a non-seasonal root. Given the lack of seasonal roots, should I accept that the series is stationary under the ADF and KPSS tests or accept the non-stationary HEGY result?
Mike - you should accept the HEGY result, for the following reason. The ADF and KPSS tests are applied with no allowance for the possibility that there may be unit roots at seasonal frequencies. When the HEGY procedure is used to test for a unit root at the zero frequency, it is done in a context that allows for the possibility of seasonal unit roots (even though you didn't find any in this case).
I am estimating a VAR and I have identified 5 structural breaks in one of the series. I have decided to introduce 5 dummy variables, each dummy taking the value of 0 prior to the break date, and 1 after the break had occurred to the last date of the series. I want to ask if this is the right approach to dealing with multiple structural breaks in a series using dummies.
From the information you've given, that sounds fine, as long as the breaks are only in the levels of the series, and not in the trends.
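The level-shift dummies described in the question can be built as follows. This is only a sketch: the sample range and break dates are hypothetical, and treating the break year itself as part of the "after" regime is a convention assumed here rather than something stated in the exchange.

```python
import numpy as np

years = np.arange(1980, 2016)        # hypothetical annual sample, 1980-2015
break_years = [1997, 2001, 2014]     # hypothetical break dates

# One column per break: 0 before the break date, 1 from the break date
# through to the end of the sample (a step dummy for a shift in level).
step_dummies = np.column_stack(
    [(years >= b).astype(int) for b in break_years]
)

print(step_dummies.shape)            # one row per year, one column per break
print(step_dummies[years == 2001])   # row for 2001: the first two dummies are switched on
```

A single dummy that equals 1 only in the break years themselves would instead model one-off outliers, not persistent shifts in level.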
I want to estimate the effect of monetary policy on the disaggregated consumer price index (CPI) using a VAR model. Is it possible to estimate a VAR model for disaggregated data?
My variables are trend-stationary. So, without differencing I can do OLS. I can do it in two ways. (1) I will detrend the data, and estimate the model. (2) I will retain the trend-stationary variables in the model but add a trend term.
Sir, kindly tell which is the best option.
I am using a series on number of European union directives as an independent variable over a long historical time series. The series is zero until the 1970s when countries join the European union. The zeros are genuine and not missing values but do reflect a change in regime. I’m not sure how best to handle – I have tried a dummy for when countries entered the EU but am missing out on the growth after the 1970s. I’ve seen a lot of ways suggested to handle having lots of zero in a time series but I'm not sure what the best one is.
Grateful for any thoughts,
Hi Dave! I am trying to model monthly house price boom and bust episodes with VECM models including stock and capital inflow endogenous variables. To capture accelerating and decelerating growth around these boom/bust episodes, I have first HP-filtered the housing price indices to identify a few such episodes and constructed deterministic dummies (linear and quadratic) spanning a few months to capture accelerating and decelerating growth periods (quadratic and cubic accelerations/decelerations) and included these dummies as exogenous variables. Is it standard practice to do so? Are inferences valid for the significance of the deterministic components' estimates? Or is it advisable to use I(2) models, which I am not really familiar with? Many thanks in advance. Kind regards.
Dear Prof. Giles,
I am an avid reader of your blog posts. Thank you for educating us. I have always wondered: Is violation of the normality assumption a grave mistake? We know that OLS estimators are consistent.
So, how serious is it to ignore non-normality of the estimated errors in a model?
See this post: http://davegiles.blogspot.ca/2011/08/being-normal-is-optional.html
and this one: http://davegiles.blogspot.ca/2011/09/students-t-test-normality-and-bootstrap.html
The size of the ECM term should lie between 0 and -1. Can ECM term be lower than -1? How to interpret a value lower than -1? Does it suggest that something wrong with the model?
Thank you in advance.
Yes, it does suggest that.
What do the terms "long run" and "short run" associated with cointegration and Granger's causality analyses, respectively, mean? Could it be days or weeks?
In other words, when we say that two variables share a long run equilibrium relationship, what does "long run " mean in practice?
Dear Prof. Giles,
Quarterly data are prone to seasonality. In addition to standard unit root problem they might have seasonal unit root also. How to take care of "both" unit roots? Does taking care of seasonal unit root also take care of standard unit root (since it involves differencing)?
No. It all depends at what frequencies the seasonal unit roots occur. First-differencing will only eliminate a unit root at the "zero frequency", but not at the "pi", "pi/2", or "3*pi/2" frequencies. These require different filters. Fourth-differencing of the series will be appropriate only if there are unit roots at all of the above frequencies.
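The link between the individual filters and fourth-differencing can be verified with a few lines of polynomial arithmetic. This is a sketch only; the variable names and the simulated series are hypothetical. It uses the standard factorization (1 - L^4) = (1 - L)(1 + L)(1 + L^2) for quarterly data.

```python
import numpy as np

# Lag-polynomial coefficients in ascending powers of L.
zero_freq = np.array([1.0, -1.0])          # (1 - L): removes a unit root at frequency 0
pi_freq = np.array([1.0, 1.0])             # (1 + L): removes a unit root at frequency pi
half_pi_freq = np.array([1.0, 0.0, 1.0])   # (1 + L^2): removes roots at pi/2 and 3*pi/2

# Multiplying polynomials is a convolution of their coefficient vectors.
product = np.convolve(np.convolve(zero_freq, pi_freq), half_pi_freq)
print(product)                             # coefficients of (1 - L^4)

# Applying the combined filter is the same as taking Y(t) - Y(t-4).
rng = np.random.default_rng(5)
y = rng.normal(size=40)                    # hypothetical quarterly series
filtered = np.convolve(y, product, mode="valid")
fourth_diff = y[4:] - y[:-4]
```

So fourth-differencing applies all three filters at once, which is why it is appropriate only when unit roots are present at every one of these frequencies.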
Thank you for your reply. If the HEGY test does NOT reject both the non-seasonal root and the seasonal roots, we will conclude that there are non-seasonal and seasonal unit roots. But the HEGY test does not say what the order of integration of the non-seasonal root is. It might be I(2) also. Further, how does one make the data stationary at both the non-seasonal and seasonal frequencies? So I would request that you write a post for your numerous readers on how to handle both non-seasonal and seasonal roots; it would surely benefit them.
If you're working with the levels of the data and get that result from HEGY, then the appropriate filter is Y(t)-Y(t-4). Just as with the ADF test you can transform the data and then apply HEGY to check for higher-order unit roots. I have a post in draft form on HEGY but it will be a while before it gets posted.
Thank you Sir. Looking forward to your HEGY post.
Good Day Professor.
Thank you for dedicating your valuable time to post and respond to various questions by your esteemed followers.
I have a question, Prof., but it may sound elementary; this is due to my limited knowledge of econometrics.
My question is: which of the ARDL model and the VAR model is superior? I was working on a VAR model, but now want to change to an ARDL model, if possible.
Thanks, in anticipation of your kind response.
There's no simple answer to this. Different types of models are used for different purposes.
Dear Sir, I have a question on the lag order for the Johansen cointegration test. To test for Johansen cointegration in a VAR setup, we have to feed the number of lags. Suppose VAR suggests 4 lags. At this lag of 4, AR roots lie outside the unit circle and there's also autocorrelation. For the number of lags to be entered in Johansen test, is it necessary that VAR should not have autocorrelation and be dynamically stable?
Many thanks for your kind help. Regards, Panda
Enjoy reading your posts and learning from them. I have a question: in your ARDL cointegration procedure, you estimated an ECM model and from that model, you unscrambled (your word) and presented a graph showing actual and fitted values of the original (level) variable (not the differenced LHS variable). Is it possible to help me on how to do this in EViews? If you can give the steps needed, that will be appreciated. Thanks. Dr. Islam
If you use the forecast function, clicking static, and choose to forecast the variable in levels rather than differences this will "unscramble" it. Alternatively, you could use the output and work through the maths in a spreadsheet, which will help remove the "black box" illusion of EViews.
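The spreadsheet arithmetic mentioned in the reply can be sketched as below. The numbers are made up, and the sketch assumes that a static (one-step) fitted level is the actual lagged level plus the fitted change from the ECM; it is not a description of EViews internals.

```python
import numpy as np

y = np.array([10.0, 10.5, 11.2, 11.0, 11.9])   # hypothetical actual levels y(t)
dy_fitted = np.array([0.4, 0.6, -0.1, 0.8])    # hypothetical fitted values of y(t) - y(t-1)

# Static fitted level for t = 1, ..., T-1: actual y(t-1) plus the fitted change.
y_fitted_static = y[:-1] + dy_fitted

# Compare with the actual levels over the same dates.
for actual, fitted in zip(y[1:], y_fitted_static):
    print(f"actual={actual:.1f}  fitted={fitted:.1f}")
```

Plotting `y[1:]` against `y_fitted_static` gives the actual-versus-fitted graph in levels.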
Very helpful blog. I want to learn the Narayan and Popp (2010) test and the Gregory and Hansen cointegration test in EViews.
Kindly help out.
Thanks and Regards
For Gregory & Hansen code, see: http://forums.eviews.com/viewtopic.php?t=976
Thank you for your helpful blog, I am trying to estimate an ARDL model relative to 2 time series data with Eviews-9.5. I've two questions:
1- Is it right to look for structural changes in the ratio of the two variable with Bai and Perron approach, then we introduce Dummy variables on the identified break dates in the ARDL model.
2- If yes, how we construct dummy variables that deal with multiple change points; for example if break dates in the ratio are in 1997, 2001 and 2014? is it correct to consider one dummy variable that takes 1 in the specified dates (1997, 2001 and 2014) and 0 outside?
3- If the answer of question 1 is NO, can I apply the Breakpoint URT test in eviews for the search of multiple break points in order to introduce them in the ARDL? and how to define the dummy variable in the case of different break dates (example in X, 2000 and in Y, 2010) in the two series?
First of all, I would like to thank you for a really nice job on your blog Econometrics Beat. It's amazing what you do there.
Because of this, I took the liberty of asking you something about impulse response analysis. I have doubts about the interpretation of these plots and I'm quite sure that this is a common doubt.
Despite that, I couldn't find an exact answer to this question. The problem is: what is the criterion for assessing the statistical significance of a variable's response to a shock?
What I mean is, if I have plots like those in the attached figures, how can I know if there is statistical significance? Is having the lower and upper confidence bands both positive or both negative a criterion for statistical significance?
For example, in your blog you put a figure with an IRF plot (http://davegiles.blogspot.com.br/2013/04/confidence-intervals-for-impulse.html) where the lower band is negative and the upper band is positive. Is this a criterion for non-significance, as you said in the comments?
If you could establish a criterion for this type of analysis, or point to a paper/slides/post where this kind of doubt is treated, it would be very, very helpful.
Thank you for your attention!
Just a quick question, is it conventional to include a trend component when conducting a rolling regression? In EViews terms, including @trend among the right hand side variables. I did not see this in the rolling regression literature, but I think there is intuition for including it. Please advise, thank you.
I don't see what the intuition is...... do you want to elaborate?
I was thinking in terms of time series with unit roots and/or varying degrees of seasonality. If a time trend was included, then the beta space that we plot the coefficients in from the rolling regression could more confidently be attributed to changes in the explanatory variable rather than the trend (delta y = alpha if delta e =0). But I could be over-thinking it. If the window is small enough, perhaps the trend will not be justified. I realize some researchers set their windows at intervals to let each iteration have equal exposure to seasonality. So the trend component is typically not justified in a rolling regression?Delete
Thanks - that helps. I don't think there is any compelling reason to include a (deterministic) trend.
Dear Prof. Giles,
Can we apply panel time series techniques (like panel unit root tests, panel cointegration tests, and panel causality test) to unbalanced panel data?
A clarification would be greatly appreciated.
Thanking you in advance.
How to fill in those observations in panel data when:
1. If the data is missing either at the beginning or end of the series?
2. If the data is missing middle of the series?
3. Suppose the data is missing where the series at that position is clearly increasing or decreasing?
Kindly advise. Thank you.
I am using EViews to run Fama-MacBeth regressions. I got results for the gammas and t-tests. However, it does not provide the Newey-West correction for the FM t-tests. How can I code or do these corrections?
Thanks in advance
Hi Dave, thanks for your excellent blog and explanations.
Here is a question: can we use I(2) variables in Johansen cointegration?
Not without modification - see this post http://davegiles.blogspot.ca/2012/01/cointegration-analysis-with-i2-i1-data.html
and the paper that it links to.
Hello Dear Dave, Thank you very much for making this easy. I am working on panel data with 16 countries for the period 1995-2014. I wonder if you could explain what the Hausman test does in the panel ARDL, and whether there is any way of applying the bounds test to panel data. Thank you.
Dear Prof. Dave,
In a panel data set consisting of several countries, different countries have different base years for, say, CPI. Is it required that the researcher should rebase to one year? Further, can we apply panel unit root tests to unbalanced data?
Regarding the index number bases - there is no need to re-base. Regarding the other question - it depends what software you're using and whether or not it deals with the unbalanced data.
Is it required to accommodate structural breaks when one deals with long (time periods) and large (no. of countries) panel?
In the case of large and long panel data, if cointegration is established without accounting for structural breaks, the result is “robust”. However, if cointegration cannot be established, then it calls for accommodating structural breaks in the cointegrating test equation.
Further, In a long time period, Bai-Perron structural break test will suggest more than 2 endogenous breakdates. It would be difficult to accommodate all break points (if they are country specific and dates are different).
Hello Prof. Dave,
I have a question about the ratio of coefficients (α/β) estimated from a simple regression model - can we say that the asymptotic distribution of (α/β) is normal?
Thank you very much.
The asymptotic distribution of the ratio of the maximum likelihood estimates will be normal. The corresponding ratio of estimates will NOT be normal in finite samples. Why is the asymptotic distribution normal (not STANDARD normal)? The ML estimator has an asymptotic normal distribution. By the invariance property of MLEs, a ratio of MLEs is the MLE of the ratio of the parameters being estimated. So, the ratio estimator, being an MLE itself, will have an asymptotic distribution that is normal. In addition, if the model's errors are normal, then MLE = OLS, so the above comments apply to the ratio of the OLS estimates. To get the variance (and ultimately the standard error) of this ratio, you would use the Delta method (which can be used for any non-linear transformation of random variables, such as estimators). Are you using EViews? If so, I can suggest a trick to get the asymptotic standard error with no effort.
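For readers not using EViews, the Delta-method calculation described above can be sketched directly. Everything here is illustrative: the data are simulated, the variable names are hypothetical, and the usual OLS covariance matrix s^2 (X'X)^{-1} is assumed to be appropriate.

```python
import numpy as np

rng = np.random.default_rng(42)
n = 500
x = rng.normal(2.0, 1.0, n)
y = 1.0 + 2.0 * x + rng.normal(0.0, 1.0, n)   # simulated so that alpha/beta = 0.5

# OLS estimates of (alpha, beta) and their estimated covariance matrix.
X = np.column_stack([np.ones(n), x])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
a_hat, b_hat = coef
resid = y - X @ coef
sigma2 = resid @ resid / (n - 2)
V = sigma2 * np.linalg.inv(X.T @ X)

# Delta method for g(alpha, beta) = alpha / beta:
# gradient = (1/beta, -alpha/beta^2), variance = gradient' V gradient.
ratio = a_hat / b_hat
grad = np.array([1.0 / b_hat, -a_hat / b_hat**2])
se_ratio = np.sqrt(grad @ V @ grad)

print(ratio, se_ratio)
```

The ratio and its asymptotic standard error can then be used to form an approximate confidence interval, bearing in mind the warning above that normality holds only asymptotically.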
No, I'm not using EViews but, thank you very much, Prof. Dave for your help. Hope I'll be able to get the right result.
I am estimating a regression to predict the selling price of a going concern fast food restaurant. I have 37 observations. Predicted variable is selling price. Predictor variables considered are: gross building area, site area, year built, whether building has been remodeled or not, corner location or not, 3-mile radius population, 3-mile radius household income, median household income and annual daily average traffic count. The coefficients for the 3-mile radius population, 3-mile radius household income, median household income and annual daily average traffic counts are negative. I expect them to be positive. What is the likely remedy?
Hello Prof Dave, I have time series data from 1947-1971 for price and quantity indices. In the original 1975 paper from which I obtained the data, the authors did not detrend the data. Here is how I detrended the data:
I applied the Hodrick Prescott filter to each of the variables via Eviews
I obtained the residuals values from the Hodrick Prescott and examined them.
It was good because of no trend so I took the residuals and replaced them for each of the variables. Then I proceeded to the estimation of the model.
Would this be correct approach to detrend the data?
Yes, this seems appropriate.
Hello Professor Dave,
Suppose, in Eq 1, we estimate X on nominal interest rate (NIR) and inflation and other variables.
Next, in Eq 2, we estimate X on real interest rate (RIR) and inflation and other variables.
However, we know that RIR = NIR-inflation.
So, if we subtract the coefficient on inflation from NIR in Eq 1, will it be equal to
the coefficient on RIR in Eq 2?
No, it won't - because the 2 equations have been estimated quite independently of each other.
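A numerical check may help here. The sketch below uses simulated data and hypothetical names, and it assumes both equations are fitted by OLS on the same sample with the same other regressors. Under that assumption, substituting NIR = RIR + inflation shows that the coefficient on RIR in Eq 2 equals the coefficient on NIR in Eq 1, while the inflation coefficient in Eq 2 is the sum of the two Eq 1 coefficients. That differs from the subtraction proposed in the question.

```python
import numpy as np

rng = np.random.default_rng(7)
n = 300
nir = rng.normal(5.0, 1.5, n)     # hypothetical nominal interest rate
infl = rng.normal(3.0, 1.0, n)    # hypothetical inflation rate
rir = nir - infl                  # real interest rate, by definition
x = 1.0 + 0.4 * nir - 0.7 * infl + rng.normal(0.0, 0.5, n)

def ols(dep, *regs):
    """OLS coefficients (intercept first) of dep on the given regressors."""
    X = np.column_stack([np.ones(len(dep)), *regs])
    coef, *_ = np.linalg.lstsq(X, dep, rcond=None)
    return coef

_, b_nir, c_infl_1 = ols(x, nir, infl)   # Eq 1: X on NIR and inflation
_, b_rir, c_infl_2 = ols(x, rir, infl)   # Eq 2: X on RIR and inflation

print(b_nir, b_rir)                      # same coefficient in both equations
print(c_infl_2, b_nir + c_infl_1)        # Eq 2 inflation coefficient = sum from Eq 1
print(b_nir - c_infl_1)                  # the subtraction in the question: not b_rir
```

If the two equations use different samples or different sets of other regressors, these exact identities no longer hold.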
Dear Professor Giles,
Whenever I want to make sure that the ARDL estimates that I compute are valid, I usually run ARDL models in EViews and Microfit 5.0. Of course the parameters are not expected to be that different, but I find computed p-values in EViews to be higher than those computed in Microfit 5.0. Where it really gave me a problem was when the estimated ARDL model in EViews produced an ECM that had a p-value of 0.000 while in Microfit it was statistically insignificant with a p-value of 0.535. This was puzzling to me because, though the computed parameter was the same in both, the p-values were way different. I have noted that this is quite an issue with EViews: the estimated parameters tend to have high p-values. Have you come across this?
You should contact the team at EViews through their forum. I don't work for them :-)
Dear Prof Giles,
I have been following your blog for quite some time now and it has really been helpful.
I was wondering whether you can provide me with some clarification on Granger causality (using T-Y as you did in your April 29, 2011 post). I am doing an analysis of market integration and the price transmission mechanism using maize prices for the 7 provinces in Lesotho. Can I use average provincial maize prices to do such an analysis? If so, the data are highly correlated across provinces and the EViews lag length criteria suggest about 6 lags; however, the 6 lags are insufficient to take care of serial correlation. The issue is that EViews is not letting me add more lags; if I do, I get this error: Insufficient number of observations. Do you have any suggestions?
Dear Prof. Giles,
In a closed economy, saving = investment; S=I
So, if I regress I = a+b*S+u, and find b=0.8
This implies that I rises by 0.8% when S rises by 1%. Alternatively, it implies that when I rises by 1%, S should rise by 1/b=1/0.8=1.25. If this interpretation is correct, then when I regress S=c+d*I+v,
Should not d be 1/b=1/0.8=1.25?
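A small simulation illustrates what happens when the regression is reversed. The data and names below are hypothetical. The sketch relies on a standard OLS identity for simple regressions: the product of the two slopes, b*d, equals the squared correlation (R-squared), so d equals 1/b only when the fit is perfect.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 400
saving = rng.normal(20.0, 4.0, n)                           # hypothetical S
invest = 2.0 + 0.8 * saving + rng.normal(0.0, 2.0, n)       # hypothetical I

def slope(dep, reg):
    """OLS slope from a simple regression of dep on reg."""
    return np.cov(dep, reg, ddof=1)[0, 1] / np.var(reg, ddof=1)

b = slope(invest, saving)     # I = a + b*S + u
d = slope(saving, invest)     # S = c + d*I + v
r2 = np.corrcoef(invest, saving)[0, 1] ** 2

print(b, d, 1.0 / b)          # d is smaller than 1/b when the fit is imperfect
print(b * d, r2)              # the product of the two slopes equals R-squared
```

Each regression minimizes errors in a different direction, which is why the reverse slope is not simply the reciprocal.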
Dear Prof. Giles,
I have been following your blog for quite some time now, though I don't usually ask questions, and it has really been helpful to me as a postgraduate student in a developing country.
Prof., could you write a blog post on the new approaches to non-linear unit root tests (such as those of Kruse, 2011 and Kilic, 2011)? These two approaches are what I am working on in my Ph.D. thesis. Kindly assist, sir.
Musefiu - I'll see what I can do.
Dear Prof. Giles,
In a linear regression model, y=a+bx1+cx2, after regression with intercept term, I want to calculate the prediction intervals of bx1+cx2 for a specific (x1,x2) without a, how to calculate it?
Hi Lei Xu,
I am not the Professor, but another student of econometrics and willing to help out if I can.
Are you asking about a confidence interval for your estimated coefficients?
If so let us know and I can help explain.
This blog is very informative and I learned many things by reading your comments. Thank you for your efforts. I have questions in mind regarding ARDL and GMM methods in time series analysis.
If I want to examine the long-run and short-run relationship and there is endogeneity in the model, then which method would be suitable: ARDL or GMM?
Under what circumstances ARDL would be best approach?
Professor Giles-This is a great blog! You have posted lots of interesting things on the ARDL Bounds Testing approach to cointegration. One question I had, and you may have touched on this earlier, is the requirement for one of the variables to be weakly exogenous for the ARDL method to be valid. I can't seem to understand this reading the actual 2001 paper, but in Walter Enders "Applied Econometric Time Series", 3rd edition, on page 411 the author states that to use the error correction test of the ARDL it is necessary to assume that one of the variables is weakly exogenous. If so would this mean that if all variables in a system react to the error correction term, that the ARDL method is not valid? Thanks for any insight,ReplyDelete
Dear Giles, I've been searching the internet in vain for an answer to a question, and I now turn to you hoping you'll have the time to answer.
Here is my question. I write an equation in dln, where the dependent variable is a (dln of a) rate "Ri", and where some independent variables are (dln of) rates "Rj". The estimated R², DW, and t-statistics of the estimated coefficients are good, and all coefficients have the correct a priori sign. I then change all of the rates, dependent and independent, from "R" to "1+R" and re-estimate this modified equation. I find once again that the estimated R², DW, and t-statistics of the estimated coefficients are good, and all coefficients have the correct a priori sign. The estimated coefficients of the two equations differ somewhat, but not drastically. However, the two estimated equations react quite differently to standard shocks. Could you explain why this is so? Thank you very much in advance!
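One possible reason, offered as a sketch rather than a definitive answer: the two specifications measure "percentage change" on different scales. For a rate R, the two log-differences are related, to a first-order approximation, by

```latex
\[
\Delta \ln(1+R) \;\approx\; \frac{\Delta R}{1+R} \;\approx\; \frac{R}{1+R}\,\Delta \ln R .
\]
```

For example, with R = 0.05 the factor R/(1+R) is about 0.048, so a 1% shock to R is only about a 0.048% shock to 1+R. If the "standard shock" has the same size in log-difference units in both equations (an assumption about how the simulations were set up), it represents very different movements in the underlying rate. The factor also changes with the level of R, so the two equations are not simple rescalings of each other.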
Dear Professor Giles and forum followers,
Thank you for your time and effort in writing the blog; I have learned a lot. Your blog provides a vital link between what we learn in the textbook and the reality of building an econometric model. I would like to get any insight on the following question:
I have daily time series data. One of my goals is to extract or model a trend for the series. How to model a trend is a totally different topic for a separate discussion. For our purposes, let us assume the trend is the slope from a simple regression with the time series as the dependent variable and a time index as the independent variable. Let us also ignore the technical aspects of running such a regression, e.g., the fact that the daily data are correlated.
My question is: the estimated slope (or trend) will depend on the length of data used in the regression. What is a good guiding principle for choosing the length of the time series? I only have some purely subjective criteria. For example, if we use the last 30 days of data, we get a relatively short-term trend compared with using 90 days of data.
Any suggestions are welcome.
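To make the dependence on window length concrete, here is a minimal sketch that re-estimates the slope over several trailing windows. The series is simulated (a random walk with drift), and the function names and window lengths are illustrative assumptions, not a recommendation.

```python
import random

def ols_slope(values):
    """Slope from regressing values on a time index 0, 1, ..., n-1."""
    n = len(values)
    t_mean = (n - 1) / 2
    v_mean = sum(values) / n
    sxy = sum((t - t_mean) * (v - v_mean) for t, v in enumerate(values))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    return sxy / sxx

def trailing_slopes(series, windows):
    """Trend slope estimated from the last w observations, for each w."""
    return {w: ols_slope(series[-w:]) for w in windows}

# Simulated daily series: a random walk with drift (illustrative only)
random.seed(1)
series = [0.0]
for _ in range(364):
    series.append(series[-1] + 0.1 + random.gauss(0, 1))

slopes = trailing_slopes(series, windows=[30, 90, 180, 365])
for w, b in slopes.items():
    print(f"last {w:3d} days: slope = {b:.3f}")
```

Comparing how much the slope moves as the window grows is one informal way to see how sensitive the "trend" is to the choice of sample length.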
I see ARDL models and the bounds test used more and more for determining whether a long-run relationship exists. But the ARDL approach, as you know, requires that the independent variable be weakly exogenous. Yet studies that have used the bounds test then go on to estimate a bidirectional VECM. My question is: can the bounds test still be used when there is a bidirectional relationship between the variables? I should think not, but would welcome your considered opinion.
Michael - for the distribution of the bounds test statistic to be correct (& hence for the tabulated critical values to apply), weak exogeneity is required. So, no, I don't think so either.
Dear Professor Dave,
I want to estimate the following equation: "GDP = f(exh, pexe)", where "exh" is expenditure on health and "pexe" is public expenditure on education. In the unit root tests, only one of the three variables is stationary at level-2 (annual data from 1997-2017, 21 observations in total). What should I do to make the non-stationary data stationary? Can I run the Johansen cointegration test with one variable stationary at level-2, or not? If not, what would be the appropriate method for my data?
Thank you in advance for your guidance and time.
Shoaib - Please clarify: do you have one variable that is I(2) and 2 that are I(0); or do you have one that is I(2) and 2 that are I(1); or something else?
Thank you very much for your blog. I'm not sure if you take requests, but would you ever consider doing a post on time series regressions and their interpretation? I feel that this is an often overlooked area of time series in many textbooks, and one that I (and others) have struggled with because so little emphasis is placed on interpreting the coefficients on different lag orders and transformations.
Dear Professor Dave and all,
Can you or anyone in this group provide me with a step-by-step guide to using ARIMA for forecasting? Thank you so much in advance.
First of all, think of ARIMA modelling as a slight extension of ARMA modelling, so if you're familiar with ARMA modelling then this will help. I would recommend what I would describe as a good beginner's graduate-level text, Verbeek's 'A Guide to Modern Econometrics'. On page 290 it gives a good introduction to ARIMA modelling and even provides an empirical example.
If the process that you're trying to model has a unit root, then you will have to first-difference it before doing the ARMA modelling. It is then the change in the series that is modelled by the ARMA process.
A series that becomes stationary after first differencing is said to be integrated of order 1.
To use an ARIMA model for forecasting:
1) Decide whether the series has a unit root or not (or multiple unit roots, for that matter).
2) If it has one or more unit roots, differencing it will allow you to use ARMA modelling as normal.
3) If it hasn't, model the series in levels with an ARMA model.
For good forecasts, the generally accepted approach is the Box-Jenkins method of modelling.
This will help you select the appropriate model (i.e., the AR and MA lags), and from there forecasting is straightforward, especially in EViews.
Have a look at the EViews manual for unit root tests, as this is the best place to start.
I hope this message reaches you well.
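The steps above can be sketched in a few lines of plain Python. This is only a rough illustration under stated assumptions: the data are simulated, the unit root check is the simple Dickey-Fuller regression with a constant and no lag augmentation (not the augmented test a package such as EViews reports), the 5% critical value of -2.86 is the asymptotic one for that case, and the model is fixed at AR(1) instead of being chosen by a Box-Jenkins search. All function names are hypothetical.

```python
import math
import random

def simple_ols(x, y):
    """OLS of y on a constant and x; returns (intercept, slope, slope_se)."""
    n = len(x)
    xm, ym = sum(x) / n, sum(y) / n
    sxx = sum((a - xm) ** 2 for a in x)
    slope = sum((a - xm) * (b - ym) for a, b in zip(x, y)) / sxx
    intercept = ym - slope * xm
    rss = sum((b - intercept - slope * a) ** 2 for a, b in zip(x, y))
    slope_se = math.sqrt(rss / (n - 2) / sxx)
    return intercept, slope, slope_se

def dickey_fuller_t(y):
    """t-ratio on rho in: diff(y)_t = alpha + rho * y_{t-1} + e_t (no lag augmentation)."""
    dy = [b - a for a, b in zip(y, y[1:])]
    _, rho, se = simple_ols(y[:-1], dy)
    return rho / se

def forecast_levels(y, steps, crit=-2.86):
    """Difference if a unit root is not rejected, fit an AR(1), forecast the levels."""
    has_unit_root = dickey_fuller_t(y) > crit      # fail to reject => treat as I(1)
    z = [b - a for a, b in zip(y, y[1:])] if has_unit_root else list(y)
    c, phi, _ = simple_ols(z[:-1], z[1:])          # AR(1): z_t = c + phi * z_{t-1} + e_t
    level, last_z, path = y[-1], z[-1], []
    for _ in range(steps):
        last_z = c + phi * last_z                  # forecast of the modelled series
        level = level + last_z if has_unit_root else last_z
        path.append(level)                         # cumulate back to levels if differenced
    return has_unit_root, path

# Simulated ARIMA(1,1,0) series, purely illustrative
random.seed(0)
dy_prev, y = 0.0, [0.0]
for _ in range(300):
    dy_prev = 0.5 * dy_prev + random.gauss(0, 1)
    y.append(y[-1] + dy_prev)

has_unit_root, path = forecast_levels(y, steps=5)
print(has_unit_root, [round(p, 2) for p in path])
```

In practice the lag orders would be chosen from the correlogram or an information criterion, and the residuals checked for remaining autocorrelation, before any forecasts are used.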
Dear Professor Dave and all,
Does anyone know how to prove the following using the FWL Theorem?
Suppose y = βx + u, and x is correlated with u (endogenous).
First, regress x on z and save the residuals, Mzx.
Second, regress y on x and Mzx.
Show that the estimate of the coefficient on x in the second stage is numerically identical to the IV estimate of β using z to instrument x. Is the coefficient on Mzx of interest, and if so, why?
You'll find the answer in pretty much any intermediate-level textbook. Seriously.
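The numerical equivalence in the question can at least be checked on simulated data before attempting the proof. The sketch below assumes the simplest case (one regressor, one instrument, no intercept, as in y = βx + u); the data-generating values and function names are illustrative assumptions.

```python
import random

def dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def iv_estimate(y, x, z):
    """Simple IV estimator (one regressor, one instrument, no intercept)."""
    return dot(z, y) / dot(z, x)

def control_function_estimate(y, x, z):
    """Regress x on z, then y on x and the first-stage residuals; return both coefficients."""
    pi = dot(z, x) / dot(z, z)
    v = [xi - pi * zi for xi, zi in zip(x, z)]     # first-stage residuals, Mzx
    # Normal equations for y on [x, v], solved by Cramer's rule
    sxx, sxv, svv = dot(x, x), dot(x, v), dot(v, v)
    sxy, svy = dot(x, y), dot(v, y)
    det = sxx * svv - sxv * sxv
    b_x = (sxy * svv - sxv * svy) / det
    b_v = (sxx * svy - sxv * sxy) / det
    return b_x, b_v

# Simulated data with an endogenous x (illustrative only)
random.seed(3)
n = 500
z = [random.gauss(0, 1) for _ in range(n)]
e = [random.gauss(0, 1) for _ in range(n)]
u = [0.8 * ei + 0.6 * random.gauss(0, 1) for ei in e]   # u is correlated with e
x = [zi + ei for zi, ei in zip(z, e)]                   # so x is correlated with u
y = [2.0 * xi + ui for xi, ui in zip(x, u)]

b_iv = iv_estimate(y, x, z)
b_x, b_v = control_function_estimate(y, x, z)
b_ols = dot(x, y) / dot(x, x)
print(b_ols, b_iv, b_x, b_v)
```

The coefficient on Mzx is of interest because a test that it is zero is the usual regression-based (Durbin-Wu-Hausman type) check of whether x can be treated as exogenous.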
I too am a big fan of EViews.
I have a question about VAR modelling. I understand that the default option in EViews is OLS.
My question is related to the error terms in a VAR system. Let's say I estimate the VAR by OLS: how can the error term in the second equation (u2) affect the estimation of the first equation of the VAR?
Every post is an interesting read. Thank you very much for your efforts.
Jack - if every equation in the VAR has exactly the same regressors (i.e., the same number of lags for each of the variables), then OLS is identical to SURE estimation. The second error has no impact on the estimates of the parameters in the first equation, and vice versa. You'll find this result explained and proven in most grad.-level econometrics textbooks. On the other hand, if you have different lag lengths in the two equations, and hence different regressors, the OLS estimates are no longer efficient - they differ from the SURE (system-GLS) estimates. In such cases, in EViews you would create a system object, spell out which lags you want in each equation, and then choose SURE as the estimation method. I hope this helps.
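The equivalence of OLS and SURE when every equation has the same regressors can be checked numerically. The following is a minimal sketch on a simulated bivariate VAR(1); the coefficient values, error covariance, and sample size are illustrative assumptions, and the feasible GLS step simply uses the OLS residual covariance.

```python
import numpy as np

rng = np.random.default_rng(42)
T = 300

# Simulate a bivariate VAR(1) with contemporaneously correlated errors
A = np.array([[0.5, 0.1],
              [0.2, 0.3]])
L = np.linalg.cholesky(np.array([[1.0, 0.7],
                                 [0.7, 1.0]]))
y = np.zeros((T, 2))
for t in range(1, T):
    y[t] = A @ y[t - 1] + L @ rng.normal(size=2)

Y = y[1:]                                          # left-hand sides, (T-1) x 2
X = np.column_stack([np.ones(T - 1), y[:-1]])      # same regressors in both equations

# Equation-by-equation OLS: column j holds the coefficients of equation j
B_ols = np.linalg.solve(X.T @ X, X.T @ Y)

# Feasible GLS (SUR) on the stacked system, using the OLS residual covariance
U = Y - X @ B_ols
Sigma = U.T @ U / (T - 1)
W = np.kron(np.linalg.inv(Sigma), np.eye(T - 1))   # inverse of (Sigma kron I)
Xs = np.kron(np.eye(2), X)                         # block-diagonal stacked regressors
ys = Y.T.reshape(-1)                               # equation 1 stacked above equation 2
b_sur = np.linalg.solve(Xs.T @ W @ Xs, Xs.T @ W @ ys)
B_sur = b_sur.reshape(2, -1).T

print(np.max(np.abs(B_ols - B_sur)))
```

Dropping a lag from one equation only, so that the two regressor matrices differ, would make the two sets of estimates diverge, which is the case where a system estimator is worth using.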
Thank you very much, Professor. Sorry to bother you again, but are you perhaps familiar with the research by Vilasuso (2001), "Causality tests and conditional heteroskedasticity: Monte Carlo evidence", Journal of Econometrics 101 (2001) 25-35? I am actually trying to replicate his results.
He sets up a VAR(1) model and finds that when least squares is used to test for causality in mean AND there is a causality-in-variance relationship (that is, the disturbance terms exhibit conditional heteroskedasticity and their conditional variances are related), least squares exhibits severe size distortion.
I am just wondering how least squares could even pick up that there was a causality-in-variance relationship. As you say, for a VAR model with the same lags it should be equivalent to use OLS line by line.
Any tips or pointers would be greatly appreciated, and I would like to thank you once again for an excellent blog, website and publications. You have certainly helped me and tens of thousands of students worldwide.