*A Forum for Readers*

Please feel free to use the "Comment" facility below to provide questions and answers relating to Econometrics.

*I won't be able to answer all questions myself*, but other readers may be able to help. The Forum will be "lightly moderated" to avoid spam and inappropriate content.

*Dave Giles*

Dear Dave,

This is a very helpful blog. I think it would be great to have a post on weak instruments. To my knowledge this has not yet been done. The topic is very important and people tend to ignore it.

Best regards.

Great suggestion - thanks! I'd be very happy to do this.

Dear Prof. Dave,

Thank you for all the effort. I have a question concerning the sample size for the Johansen cointegration test. What is the minimum number of observations required? Thank you.

Yasmin - There's no "formal" minimum. However, keep in mind:

1. The testing procedure is based on VAR models - depending on your data frequency, these may require several lags, and this in turn will affect the minimum sample size you can get away with.

2. The Johansen procedure is Likelihood-based. Accordingly, good asymptotic properties are assured, but the small-sample power can be low.

3. Cointegration is a "long-run" concept. You need a decent "temporal span" in the data to get sensible results. That is, 48 years of annual data may be more relevant than 4 years' worth of monthly data.

Hi Dave,

I have a question related to the topic “Maximum Likelihood Estimation & Inequality Constraints”: restrictions on the output of an equation. In some cases the dependent variable can only take certain values (a probability, prices, etc.), so some models can easily be adjusted to the data (probit, logit, ANN) or some data transformations can be made.

Nevertheless, what if one wishes the output of a system of equations to satisfy some restriction?

Let me describe my specific problem, which may shed light on what I mean. For a research project I have to forecast some shares of sectoral GDP. For this I fit an Artificial Neural Network for each share, without weights for the activation function. Each forecasted share is in the range [0,1], but their sum is not always equal to one.

Best regards,

P.S. Thanks for the blog.

First, suppose that you were using OLS, rather than ANN. In this case you would have an example of what's called an "allocation model". If each equation in the system includes an intercept, then the sum of the dependent variables (the "shares") would equal one. It's easy to show that if you take such a system and estimate each equation by OLS, the PREDICTED values of the dependent variables will all lie between zero and one, and their sum will equal one. This situation arises, for example, with systems of Engel curves. I am not sure how to get this to carry over to ANN. I guess you could model and predict all except for one of the shares. Then the forecast for the remaining one would be one minus the sum of the other predictions. I bet the results depend on which share you omit, though - unlike the case with OLS (MLE) where the results are invariant to the one you drop.

Also, see http://davegiles.blogspot.ca/2013/07/allocation-models-with-bounded.html
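For readers who want to check the adding-up property numerically, here is a minimal Python sketch. The data are synthetic (Dirichlet shares and two made-up regressors), and every name in it is hypothetical. It checks only that the equation-by-equation OLS predictions sum to one, in-sample and for a new observation; it does not check the bounds on the individual predictions.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200                                    # number of observations (hypothetical)

# Regressors: an intercept plus two made-up explanatory variables.
X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])

# Synthetic shares: each row is non-negative and sums to one.
shares = rng.dirichlet(alpha=[2.0, 3.0, 5.0], size=n)

# Equation-by-equation OLS: column j of B holds the coefficients for share j.
B, *_ = np.linalg.lstsq(X, shares, rcond=None)
fitted = X @ B

# Forecast for a new observation; the leading 1.0 is the intercept.
x_new = np.array([1.0, 0.3, -1.2])
forecast = x_new @ B

print(fitted.sum(axis=1)[:5])   # row sums of the in-sample predictions
print(forecast.sum())           # sum of the forecasted shares
```

The adding-up result follows because the shares sum to the intercept column of X, so the coefficient vectors sum to (1, 0, 0) across equations.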

Hello professor,

I just wanted to say this is a fantastic blog, and I thank you very much for providing us with all this free and easy to understand content, not to mention this opportunity to ask questions and discuss.

Hello!

I would like to suggest a topic about the properties of some commonly used ARIMA models: are there limitations on the inferences we draw from a model that is not dynamically complete, the rate of convergence, and things to this effect.

Cheers

Thanks for the suggestion!

Dear Professor Giles,

Your website is wonderful. Congratulations.

I have a question:

I am trying to estimate a model like

Y=a_1 + a_2*X1 + (1/a_3)*X2

Z=b_1 + (a_3)*X3 + a_4 * X4

I would like to use nonlinear three-stage least squares, since the equations include (1/a_3) and a_3, respectively. I checked your website and saw this post: http://davegiles.blogspot.co.uk/2012/05/estimating-simulating-sem.html

Is it possible to solve my problem with EViews? I tried this but I could not get a result. Am I making a mistake, or is EViews unable to handle nonlinearity in the parameters?

Thanks

There is no need to use 3SLS as the right hand side variables are all exogenous. You need to use SUR estimation, and you can do this, allowing for the non-linearities you have. In EViews you would create a new "Object" - choose "System" as the option. Then fill in the specification box as follows:

Y=c(1) + c(2)*X1 + (1/c(3))*X2

Z=c(5) + c(3)*X3 + c(4)*X4

Make sure that you choose "SUR" as the preferred estimator. Before you estimate, go into the "c" series in the workfile - that's where the coefficients are stored. Edit that series so that the third element (c(3)) is 1 (not 0) - if you leave it at zero you won't have a valid starting value for the non-linear algorithm, because (1/0) isn't defined.

I hope this helps.
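Outside EViews, the same idea can be sketched in Python. This is not EViews's algorithm: it is a two-step feasible GLS (nonlinear SUR) estimate computed with hand-written Gauss-Newton steps, run on synthetic data. The data, the "true" coefficient values, and all function names are assumptions made for illustration. The parameter vector follows the c(1)...c(5) layout of the system above, and c(3) starts at 1 for the reason given in the reply.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500

# Synthetic data (all values hypothetical): exogenous regressors and
# errors that are correlated across the two equations.
X1, X2, X3, X4 = rng.normal(size=(4, n))
c_true = np.array([1.0, 0.5, 2.0, -1.0, 0.3])        # c(1), ..., c(5)
errors = rng.multivariate_normal([0.0, 0.0], [[1.0, 0.5], [0.5, 1.0]], size=n)
Y = c_true[0] + c_true[1] * X1 + X2 / c_true[2] + errors[:, 0]
Z = c_true[4] + c_true[2] * X3 + c_true[3] * X4 + errors[:, 1]


def residuals(c):
    """Residuals of both equations, one column per equation."""
    ey = Y - (c[0] + c[1] * X1 + X2 / c[2])
    ez = Z - (c[4] + c[2] * X3 + c[3] * X4)
    return np.column_stack([ey, ez])


def gauss_newton(c, S, steps=50):
    """Minimise sum_i e_i' S e_i over c with plain Gauss-Newton steps."""
    zero, one = np.zeros(n), np.ones(n)
    for _ in range(steps):
        e = residuals(c)
        # Derivatives of each equation's fitted value with respect to c(1)..c(5).
        Jy = np.column_stack([one, X1, -X2 / c[2] ** 2, zero, zero])
        Jz = np.column_stack([zero, zero, X3, X4, one])
        A = (S[0, 0] * Jy.T @ Jy + S[0, 1] * (Jy.T @ Jz + Jz.T @ Jy)
             + S[1, 1] * Jz.T @ Jz)
        g = (Jy.T @ (S[0, 0] * e[:, 0] + S[0, 1] * e[:, 1])
             + Jz.T @ (S[1, 0] * e[:, 0] + S[1, 1] * e[:, 1]))
        c = c + np.linalg.solve(A, g)
    return c


# c(3) starts at 1, not 0: X2 / c(3) is undefined at zero.
c_start = np.array([0.0, 0.0, 1.0, 0.0, 0.0])

c_step1 = gauss_newton(c_start, np.eye(2))            # stacked least squares
E = residuals(c_step1)
Sigma = E.T @ E / n                                   # cross-equation error covariance
c_sur = gauss_newton(c_step1, np.linalg.inv(Sigma))   # feasible GLS step

print(np.round(c_sur, 3))
```

The cross-equation restriction is imposed simply by letting the same element, c(3), appear in both equations.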

Hello Professor Giles,

First of all, I really appreciate your blog and your sharing your thoughts. It is really helpful. I am curious about your thoughts on an issue related to insufficient data. For instance, suppose we have only about 15 observations (annual data on a variable for 15 years), but the information is very costly and not publicly available. So, the situation is very important information but not enough of a sample to carry out an empirical analysis and/or forecasting. What methods can be used for empirical analysis to develop a decent manuscript that will convince a reviewer, especially in the context of an insufficient sample? What options are available to overcome this insufficient data issue? I hope this might be of interest to many others as well. Can you please share your thoughts? Thank you.

Glad that you like the blog. With 15 years of annual data, you really are very limited in terms of what you can do. One thing that you do have going for you is that your data span 15 YEARS, rather than months or quarters. You could meaningfully test for unit roots and cointegration (using the Engle-Granger 2-step approach) with that time span. That might tell you something about any long-run relationships between the variables. However, any regression modelling is going to be severely hampered by a shortage of degrees of freedom, and you're certainly not going to be able to set up a VAR model and test for G-causality. In short, you have very few options, in my view, but let's see if other readers want to make suggestions.

Anonymous: Bayesian Model Averaging (BMA) may be a great fit given your situation.

Hello Professor Giles, I have found your posts and blog very helpful in my self-improvement. I am right now trying to model non-linear cointegration, and I understand I am to use the so-called Threshold Cointegration proposed by Enders and Siklos (2001). It would be very kind and helpful of you to explain the threshold cointegration procedure step by step. I am trying to model pass-through effects of interest rates from the monetary policy rate to the lending rate. I must confess I do not even know how to proceed on this. I am using monthly interest rates from January 2006 to April 2015. Regards and thanks. Bala.

Dear readers and Mr. Giles,

I have two questions on the application of the bounds test and the ARDL model.

1) I wonder if we can still apply the Wald test in the ARDL model with no lags of the dependent variable (a DL model, then). The lag selection criteria sometimes point to such cases.

2) If the t bounds test does not confirm the result of the F-test, what do we conclude about the existence of cointegration? For instance, if the F-test rejects the null of no cointegration but the t-test does not confirm this result, what should we finally say?

Thank you in advance,

OE Kurt

Hi, professor.

I have a question about the empirical research for my dissertation. I intend to analyze the relationship between exchange rates and stock prices in China. There are two time series variables, the exchange rates and the stock prices. I used the ADF test for unit roots, and found that they are non-stationary but integrated of the same order. Then I tested for cointegration, but found there is no cointegration between them. So what should I do in the next step? Should I difference the data and construct a VAR model? If so, does this VAR analyze only the short-term relationship between them? Could you give me some advice? Thank you.

If the series are not cointegrated, you could proceed with a VAR. You should difference the data. Such a VAR captures only the short-term relationship. You could also proceed with Granger-causality tests, or impulse response functions (IRFs) and variance decompositions, after you estimate the VAR.

Or use the TY (Toda-Yamamoto) approach if you want to analyze the relationship in terms of Granger causality.

http://davegiles.blogspot.com/2011/04/testing-for-granger-causality.html
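As a rough illustration of the TY idea, here is a minimal Python sketch on synthetic data. The procedure fits a VAR in the LEVELS of the data with k + dmax lags, where k is the chosen lag length and dmax is the maximum order of integration, and then applies a Wald test to the first k lags only. The data-generating process, the values k = 1 and dmax = 1, and all names below are assumptions made for the example. Only the equation for y is estimated, since that is all the test of "x does not Granger-cause y" needs.

```python
import numpy as np

rng = np.random.default_rng(7)
T = 500

# Synthetic I(1) data (hypothetical): lagged changes in x help predict y.
x = np.cumsum(rng.normal(size=T))
y = np.zeros(T)
for t in range(2, T):
    y[t] = y[t - 1] + 0.5 * (x[t - 1] - x[t - 2]) + rng.normal()

k, dmax = 1, 1          # assumed lag length and maximum order of integration
p = k + dmax            # lags actually included in the levels VAR


def lag(series, j):
    """Values of series at t - j for t = p, ..., T - 1."""
    return series[p - j:T - j]


# y-equation of the VAR(p) in levels: constant, p lags of y, then p lags of x.
X = np.column_stack([np.ones(T - p)]
                    + [lag(y, j) for j in range(1, p + 1)]
                    + [lag(x, j) for j in range(1, p + 1)])
dep = y[p:]

XtX_inv = np.linalg.inv(X.T @ X)
beta = XtX_inv @ X.T @ dep
resid = dep - X @ beta
sigma2 = resid @ resid / (len(dep) - X.shape[1])

# Wald test that the first k lags of x have zero coefficients;
# the extra dmax lags stay in the model but are not tested.
idx = np.arange(p + 1, p + 1 + k)
b = beta[idx]
V = sigma2 * XtX_inv[np.ix_(idx, idx)]
wald = float(b @ np.linalg.solve(V, b))

print(wald, wald > 3.841)   # 3.841 is the 5% critical value of chi-square(1)
```

With k restrictions the statistic is compared with a chi-square distribution with k degrees of freedom, which is why the extra dmax lags are left out of the test.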

I would love to get expert opinions on this brief summary of a dispute about econometrics. Thanks.

Note that one of the disputants (Smith) brings up this as an example to support his case. If econometricians were presented with those time series, what would be the result? Something different than experts at estimation and statistics in other fields? Why or why not? Thanks.

They'd respond in the same way as any other statistician would.

If the 2 series are random walks (and clearly they are not cointegrated) then this is just a spurious regression, and we know what happens in this case. As the sample size grows, the R-squared goes to one and the t-statistic for testing if the slope in the relationship is zero will be unbounded. So, the p-value goes to zero.
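A small Monte Carlo sketch in Python shows the t-statistic part of this. Two independent random walks are generated, one is regressed on the other, and the conventional slope t-statistic is recorded for several sample sizes. The sample sizes, the number of replications, and the seed are arbitrary choices, and the sketch looks only at the t-statistic.

```python
import numpy as np


def slope_t_stat(y, x):
    """OLS of y on a constant and x; conventional t-statistic for the slope."""
    X = np.column_stack([np.ones_like(x), x])
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    s2 = resid @ resid / (len(y) - 2)
    return beta[1] / np.sqrt(s2 * XtX_inv[1, 1])


rng = np.random.default_rng(42)
reps = 500
results = {}
for T in (50, 200, 1000):
    t_abs = np.empty(reps)
    for r in range(reps):
        v1 = np.cumsum(rng.normal(size=T))   # two independent random walks
        v2 = np.cumsum(rng.normal(size=T))
        t_abs[r] = abs(slope_t_stat(v1, v2))
    # Mean |t| and the share of replications with |t| > 1.96.
    results[T] = (t_abs.mean(), (t_abs > 1.96).mean())
    print(T, results[T])
```

Even though the two series are unrelated by construction, the typical |t| keeps growing with T, so a nominal 5% test rejects far more often than 5% of the time.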

Dr. Giles, much thanks for your response! But just to be clear, the "2 series" you're referring to are the synthetic aperture radar (SAR) time series in Smith's example (my 2nd link, repeated here)?

Tom - no, I was referring to the artificial series, V1 and V2.

Ah, of course!... now your comment makes perfect sense to me! Sorry for the confusion. I should have explained that the link was to Smith's comment, not his blog post at the top.

I believe the two disputants (Smith and Sadowski) agree about those two series (the artificial V1 and V2). What they disagreed about is the economic time series data from the first link (repeated here). I don't know Sadowski's take on the SAR example in Smith's comment, since he didn't respond, but clearly Smith thought it was an example supporting his point on the economic data. Any thoughts on either of those other two cases? (either the economic data or the SAR example)? Again, much thanks, and I apologize for the confusion.

Hello dave,

I have a sample range from-2014 quarterly data (64 observations). I used the Ng-Perron unit root test, which is relatively more accurate for small samples than the other traditional tests. I have 3 variables in total, a mixture of I(1) and I(0). Is it worth trying the ARDL bounds test? Secondly, when constructing the ECM from the usual OLS regression, do I still need to include the break and @trend in the model specification, as I have in the ARDL model? Thirdly, if I do find cointegration, do I need to explain the break coefficient included, which is a mixture of 2 independent variables? Or do I need to construct an individual break for each independent variable? (I used the Chow breakpoint test initially, to determine both simultaneously.) I did a pre-test, and I got a really high R-squared (91%), which is absolutely scary. Does the R-squared matter in the ARDL model, just as the Durbin-Watson statistic is inappropriate in the ARDL model? Kindly reply. Thank you. ----ose

Sir, when doing the T&Y Granger causality test with a structural break captured by a dummy, do we have to specify the dummy variable just like the rest of the regressors in the unrestricted VAR, and then check for Granger causality including the dummy as well? ----ose. Thank you

Hello Prof. Giles,

First, I would like to thank you for your blog and for sharing your knowledge and experience with us. I would like to ask for your thoughts regarding discrepancies between the Johansen cointegration test and the Engle-Granger cointegration procedure. I realize that Johansen shows us the rank of cointegration, which if I am not mistaken, should be the maximum number of cointegrated vectors, whilst E-G tests for stationarity of the residuals of a specific vector. I am working with a data set where I first use Johansen, and then, in order to investigate further the exact vector which is cointegrated, I use E-G. However, I have noticed that the two tests sometimes provide me with different conclusions, e.g. Johansen shows at most 2 cointegrated vectors, but I find 3 using E-G, or Johansen shows no cointegration but I find a vector which is cointegrated according to E-G, or sometimes Johansen shows 3rd rank cointegration but I cannot find any using E-G. I realize that some part of the discrepancy is due to the test statistics' significance levels, and values which are close to the 5% significance level could cause problems. However, on which of the two tests should I base my conclusions, and is it appropriate to comment on the discrepancies? E.g. if one of the variables is constantly found to be the dependent one in the E-G cointegrated vector, would it be correct to state that if that variable is excluded no cointegration would exist?

I have experienced the same problem. Looking forward to an answer :)

Dr. Giles, do you know of any published experts whose expertise covers both economic and physical systems? This article implies that "system identification" is a field spanning both domains. Thanks.

Dear Dr Giles

Sorry to ask this question, but you have made several remarkable posts about checking for heteroskedasticity in binary response models. Can you help me please? Suppose we use Binary Time-Series Cross-Section data (or Discrete Time Survival Analysis or Grouped Duration Data, where objects experience 1 event and then drop out) or clustered data, where the cluster is a unit and the objects in the cluster are observations for that unit (thus clusters are assumed to be independent and objects within a cluster dependent), and we use Huber-White standard errors to deal with correlation among observations inside the cluster. Is it still relevant to use the Davidson and MacKinnon test for heteroskedasticity in this case? Kind Regards. Paul

Dr. Giles, dumb question: can the curve y=constant be considered stationary? Isn't it true that its mean, variance and auto-correlation don't change over time?

Yep!

So then is it fair to say that a line (y = a*x + b, with a =/= 0) can be rendered stationary by 1st differencing?

This line is deterministic - there's no randomness. So, I don't follow where you're going with this. Also, if you difference a stationary time-series, it will still be stationary (if that helps).
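As a small arithmetic check of the exchange above, the following Python sketch first-differences a noise-free line and a constant series. The slope and intercept values are hypothetical.

```python
import numpy as np

a, b = 0.5, 2.0                 # hypothetical slope and intercept
t = np.arange(10)
y = a * t + b                   # deterministic line, no noise

dy = np.diff(y)                 # first difference: y_t - y_{t-1}
print(dy)                       # every element equals the slope a

flat = np.full(10, b)           # the curve y = constant
print(np.diff(flat))            # differencing a constant gives zeros
```

The differenced line is just the constant a, which is the y = constant case raised at the start of this thread.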

Thanks Dr. Giles, yes, this must seem like a strange question, but isn't a deterministic line just a special case of a stochastic "line" (i.e. line + noise) with variance = 0?

Sure - but can you spell out for me the context and where this is going? Thanks.

Sure, this is what prompted my questions (the link is to a comment, not the post).

Dear Dr Giles. Can you help me please? If there is an assumption of no unobserved heterogeneity and we are using a logit model - can we check the heteroskedasticity using Davidson and MacKinnon test? Kind Regards Paul

Paul - yes.

Thanks so much! Paul. Actually this replies to my previous question. Thanks so much!!!!!

Dr. Giles, an econometrician wrote this:

"A "stationary" series is *stochastic* process. The probabilistic counterpart of a stochastic process is a *deterministic* process."

Which confused me, so I looked up the Wikipedia article about stochastic processes here, and the 1st paragraph seemed to contradict that (and also make a lot more sense to me):

"In probability theory, a stochastic (/stoʊˈkæstɪk/) process, or often random process, is a collection of random variables, representing the evolution of some system of random values over time. This is the probabilistic counterpart to a deterministic process (or deterministic system)."

So I asked him if he meant to write what he did or if he meant to write this instead:

"The probabilistic counterpart of a deterministic process is a *stochastic* process."

But he was adamant that he had it right the 1st time. What am I missing? I don't see the sense in which a deterministic process is probabilistic. I'm super confused! If you can set me straight on this I'd greatly appreciate it! Thanks. (Also, does Wikipedia have it wrong too?)

Did I actually write that? It's clearly false. "Deterministic" means non-random / non-probabilistic / non-stochastic.

"Did I actually write that"... No, you didn't write that! A different econometrician (whom I left unnamed) did. Sorry about my unclear writing, and much thanks for taking the time to respond!

Whew!!!!! I thought I was really losing it ! :-)

Hello Professor,

I have a question. I am analysing time series data using cointegration and VECM. All the series were tested for a unit root allowing for structural breaks. The tests reveal that all the series are non-stationary, and also contain structural breaks. This suggests that I will need to account for the breaks in the VECM model. However, the structural breaks in all the series occurred at different dates. As a result, I am not sure how to incorporate the different breaks in the VECM. I am asking if you could guide me on the best approach to dealing with structural breaks in a VECM with different break dates.

@Mutawakil Zankawa Mumuni, interesting: I *think* your question might be relevant to the general thread of thought I was examining, since **I think** VECM and tests for structural breaks were involved there as well (Cholesky decomposition comes up, I know). If you have any opinions yourself on that, I'd be interested to hear them. Thanks. If you'd prefer to email me confidentially instead of a public comment, that'd be great too. I'm very much a non-expert and my objective is to learn a little something about the subject (big picture anyway). "VECM" stands for vector error correction model?

brown.tom5@gmail.com

Hi Dave,

Regarding the post on the H-P filter and unit roots: when you stated that Cogley and Nason (1995) and Phillips and Jin (2015) mention that the filter can generate spurious cycles, is that in terms of gain (i.e., from the frequency-domain perspective, the relative contribution of each frequency)? Or in terms of phase (the position of the series with respect to the time axis)? Or both?

Thanks for the excellent post.

NicolasR.

Nicolas - Both, I believe.

D

I'm sure that the gain is affected, but the phase I don't know. I am tempted to think that the phase is affected as in the moving average filter.

Dr. Giles, can you recommend an overview of the subject of econometrics for the non-expert? Not necessarily a book: perhaps an online resource. I'm not interested in getting a PhD in the subject, but I'd like to know a little something about the field as a whole, and the types of problems you address and the methods used to address those problems.

I have a technical background (I'm an engineer), and I'm familiar with discrete and continuous time systems, characteristic equations, ODEs, state space representations, frequency domain analysis, basic probability theory and optimum estimation and tracking filters (such as Kalman filters), signal processing techniques, basic matrix theory, feedback control theory etc. I'd just like to get a feel for the strengths and limitations of econometric techniques as applied to any time series data (not necessarily just economic), or a feel for why economic systems are treated with the special tools that they are (if indeed they are). Thanks.

Hi Dave,

I would like to know which software runs the Johansen, Mosconi, and Nielsen cointegration test with structural breaks in the deterministic trend.

Thank you

You can do it in EViews or with R, and you can download the code from here:

http://web.uvic.ca/~dgiles/downloads/johansen/index.html

Dear dr. Dave

It is a great opportunity for readers to get connected with an expert person like you.

Please, I want to know whether we need to check for stationarity when simulating data, even if we assume the data are normally distributed?

regards

Kurda

Kurda - if you are simulating from a Normal distribution with fixed and finite mean and variance, the data will be (strongly and weakly) stationary by construction.

Thank you dr. for the answer.

I have an issue in my application. Once I set initial values for the parameters and simulate, one of the parameters (the mean of the model) does not return to its initial value (it is higher by one number).

Could you please advise me what could be wrong? Is it necessary that all parameters converge to their initial values in the same iteration if I repeat them over 100 iterations? For instance, the mean converges in the 15th iteration and the other parameters in the 100th iteration.

Many thanks for any advice.

Kurda

Hello Dr. Giles,

I have recently had cause to consider the implications of variance in variable effect estimates in probit models, which subsequently are to be used in prediction. I discuss the problem in more detail here (http://bit.ly/1JqhLLL), but in a nutshell, when estimating the probability that Y=1 on new data, is it reasonable to give equal weight to beta1 and beta2 if they have very different variances (even if the effect magnitude is identical)? Any insight would be greatly appreciated.

Dear Dr. Giles,

I started to wonder about testing OLS assumptions, namely homoskedasticity and linear functional form. Two usual tests are the Breusch-Pagan and LINK tests respectively. If we denote residuals from OLS as "e" and fitted values as "y_hat" the two tests use additional regressions:

e^2 = alpha_0 + X*alpha + u (for BPT)

and

e = alpha_0 +y_hat*alpha1 + y_hat^2 *alpha2 + y_hat^3 *alpha3 + y_hat^4 *alpha4 + u (for LINK test)

The above test specifications are very similar (these are the specifications from the NLOGIT software I use), and therefore it seems to me that the two effects may easily be confounded. How can I know which one is the real problem in a given dataset? Is there any formal procedure to test both assumptions jointly?

I figured that quantile regression is a very easy way to detect heteroskedasticity, but I didn't find any formal tests which use it. Are there any? (And if not, why not? Is it a bad idea?)

Best regards,

Wiktor
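For readers unfamiliar with the first auxiliary regression in the comment above, here is a minimal Python sketch of the Breusch-Pagan idea on synthetic data. It uses Koenker's n * R-squared form of the statistic, which may differ from what NLOGIT reports. The data-generating process and all names are assumptions for illustration, and the sketch does not address how to separate heteroskedasticity from functional-form misspecification.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 2000

# Synthetic data (hypothetical): the mean is linear in x, but the error
# standard deviation is proportional to x.
x = rng.uniform(1.0, 5.0, size=n)
X = np.column_stack([np.ones(n), x])
y = 1.0 + 2.0 * x + x * rng.normal(size=n)

# Original regression and its residuals.
beta = np.linalg.lstsq(X, y, rcond=None)[0]
e = y - X @ beta

# Auxiliary regression: e^2 on a constant and X.
e2 = e ** 2
gamma = np.linalg.lstsq(X, e2, rcond=None)[0]
u = e2 - X @ gamma
r2 = 1.0 - (u @ u) / ((e2 - e2.mean()) @ (e2 - e2.mean()))

lm = n * r2                  # Koenker's n * R^2 form of the statistic
print(lm, lm > 3.841)        # 3.841 is the 5% critical value of chi-square(1)
```

With one non-constant regressor in the auxiliary regression, the statistic is compared with a chi-square distribution with one degree of freedom.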

Hi Dave,

I am estimating a cointegration relationship among variables. However, one of the variables is exogenous whilst the rest are endogenous. Since EViews only offers cointegration with endogenous variables, I wanted to ask if there is any way I could estimate a possible cointegration relationship with exogenous and endogenous variables.

Thank you

Kind regards

Mutawakil

Mutawakil - no, there is no such restriction in the cointegration routines in EViews.

Dear Professor,

Thanks for the useful blogs, they are very helpful.

I have a question regarding the applicability of cointegration tests on two or more variables. These variables are indices and have different base years. For example the first time series is of base year= 2005, while the second one has a base year = 2008. Can we justify the use of cointegration tests on these variables given that when changing the base year there is a possibility that one or both of the variables may have completely different trends. Unfortunately, both variables are available in forms of indices and the source of data are not responding to emails to get the actual values of the base years.

Many thanks for any help you may offer

Amar

Amar - there is no problem with this. You can easily change the base of one index to that of the other if you wish, but there's no need to even do that if you don't want to. Any index only gives us information about CHANGES (not LEVELS). Changing the base (properly) won't affect the "trend" in the index series.
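A quick numerical illustration of this point, as a Python sketch with made-up index values: re-basing divides the whole series by a constant, so proportional changes (log differences) are identical under either base.

```python
import numpy as np

# Hypothetical index with base year 2005 (2005 = 100).
years = np.arange(2003, 2011)
index_2005 = np.array([92.0, 96.0, 100.0, 103.0, 107.0, 112.0, 110.0, 115.0])

# Re-base to 2008 = 100: divide by the 2008 value and multiply by 100.
base_value = index_2005[years == 2008][0]
index_2008 = 100.0 * index_2005 / base_value

# Log differences (approximate growth rates) under each base.
growth_2005 = np.diff(np.log(index_2005))
growth_2008 = np.diff(np.log(index_2008))

print(np.allclose(growth_2005, growth_2008))
# In logs the two versions differ only by a constant shift.
print(np.log(index_2008) - np.log(index_2005))
```

Because the two log series differ by a constant, a regression in logs that includes an intercept changes only in its intercept when the base year changes.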

Hi Dave,

Thank you for your response to my question regarding cointegration test with exogenous variables. You indicated that there is no restriction in the cointegration routines in EViews. So I want to ask if there is any other software or alternative way of testing for cointegration when some variables are exogenous.

Thank you

Regards

Mutawakil Zankawa

Sorry - I don't understand your point/question.

I mean I am having difficulty estimating a cointegration relationship among a set of variables in EViews. One of the variables is exogenous whilst the remaining variables are all endogenous, and as it is in EViews, the Johansen's cointegration test can only be applied when all variables are endogenous. So I am asking if there is any other econometric technique or software that can be used to test for cointegration using both endogenous and exogenous variables.

Thank you

Mutawakil

I am afraid you are wrong in your interpretation of what Johansen's procedure is for (whatever the package).

Dear Prof. Dave, this is a model which I want to estimate. Kindly give your insight about the model specification.

The topic is diversification and economic growth.

per capita GDP = Total factor productivity + domestic investment + diversification...

Is it necessary to follow a production function, the Solow growth model, etc.? Labor and capital are incorporated in TFP already, so do we need to add them as well?

Or can we otherwise specify our model independently?

Dear Dave Giles, some textbooks say (Gujarati, for example) that in the beginning of the past century, the term multicollinearity was understood only as perfect multicollinearity. Then, all of a sudden, this term began to also refer to the cases of imperfect multicollinearity. Do you know when it happened?

Thank you! Best regards,

Dear Professor Giles,

Can you explain to me - as someone with only a modest grasp of econometrics - how to interpret unit root tests on variables that one would think have strict limits, for example ratios that can vary between 0 and 1, such as the ratio of government expenditure to total expenditure. Doesn't a unit root imply the variable has an infinite variance and so can end up anywhere? I have seen people apply these tests to such variables without any qualms. Are they correct to do so?

Thank you.

Hi, sir if I have k+dmax=2 (where lag length =1 and the variables are integrated at I(1)), can I still go ahead with the TY non-granger causality test?

Please define k and dmax.

Prof. Giles,

This question is wrt your blockbuster post "ARDL Modelling in EViews 9". Even after accounting for breaks in the unit root test (URT), the LOG_CRUDE series has a unit root at the 5% significance level. And LOG_GAS does not have a unit root at the 5% significance level. In both cases, the break date is 1929. I have three questions wrt this.

1. If it was found that both LOG_CRUDE and LOG_GAS have unit root, will we still account for breaks in ARDL? That is, should we create the dummy, SHOCK?

2. If the variables have separate break dates, then how do we account for that?

3. The null is that the LOG_CRUDE is non-stationary. Alternative is LOG_CRUDE is stationary with a break in the trend and intercept. Thus, if we fail to reject the null, what will be the conclusion? That LOG_CRUDE HAS A UNIT ROOT. Is it correct?

I request you to kindly clarify it.

Santosh

Santosh:

1. Yes, I would do that.

2. In that case, two different dummy variables would be used.

3. That is correct.
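As a hedged illustration of answer 2, the Python sketch below builds two separate break dummies for two different break dates. It assumes simple level-shift dummies (0 before the break, 1 from the break date onward); the form of the SHOCK dummy in the original post may differ. The second break date, 1938, and the sample years are purely hypothetical.

```python
import numpy as np

years = np.arange(1920, 1951)          # hypothetical annual sample
break_1, break_2 = 1929, 1938          # 1938 is a made-up second break date

# Level-shift dummies: 0 before the break, 1 from the break date onward.
d1 = (years >= break_1).astype(float)
d2 = (years >= break_2).astype(float)

# Deterministic part of the regressor matrix: constant plus both dummies.
D = np.column_stack([np.ones(len(years)), d1, d2])

print(int(d1.sum()), int(d2.sum()))    # number of post-break observations
print(np.linalg.matrix_rank(D))        # 3 when the break dates are distinct
```

If the two break dates coincided, the two dummy columns would be identical and one of them would have to be dropped.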

Thank You a lot Sir.

I have some clarifying questions with regard to your answers for #3.

4. Thus, if we reject the null (that is, we conclude that LOG_GAS is stationary with a break in 2008), accounting for the break (with a dummy) is a must. Is that correct?

5. Even if we don't reject the null (that is, we conclude LOG_CRUDE is non-stationary), since the program gives a break date, is it still necessary to account for the break (assuming the break date is different from that of the other variable)?

6. This is with regard to answer #2. Suppose I have 4 variables in my model, and I get 4 different break dates. According to your answer I should account for all the break dates. I did that, but my results became pretty bad. Plus, some break dummies are found to be insignificant. Can you suggest any better way to handle multiple breaks in a model? Is it reasonable to have ONE BREAK DUMMY FOR THE DEPENDENT VARIABLE?

7. If the break point generated by EViews 9 "unit root test with break point" is not correlated with any economic/financial/oil/war/ shock, do we still need to account for the break?

Dear Sir, Kindly clarify the above doubts.

Thank you a lot.

--------

Regards

Santosh

Rev. Sir,

I would like to estimate 4 linear simultaneous equations by 3SLS. I have both time series and cross section data on the variables. I want to apply a panel data regression technique in the 3SLS estimation. I could not find any such estimation method. Also, I am using STATA, and I do not know the STATA command for such an estimation method.

Please help me.

Regards,

Kuntal Chakrabartty.

Kuntal - sorry, I don't use STATA. You might want to check a STATA user group.

Dear Dave

Do you know how to obtain generalized impulse responses using EViews or RATS?

Hi Prof Dave, it's a great relief coming across this wonderful blog of yours. I have a challenge for which I require your assistance. I am running a VAR model but got stuck along the way. My problem is that I found four cointegrating vectors in my cointegration estimation, so I went on to run my error correction model. However, I noticed that two of my error correction terms are significant with the right sign. This is where I got confused. I understand that the coefficient of the error correction term indicates the speed of adjustment towards the long-run path. But this time I have two coefficients that are both significant with the right signs. How am I supposed to interpret this? Or should I discuss them one after the other? Please, I need your help. Thanks

Hello Prof Giles,

Do you have an opinion about replication of results published in journal articles? I read this article and thought it was interesting:

http://johnhcochrane.blogspot.com/2015/12/secret-data.html#comment-form

Tom - enjoyed reading John's post a few days ago. I must say that I agree with pretty much everything he has to say. There is far too much empirical work around that just can't be replicated. You might be interested in The Replication Network: http://replicationnetwork.com/

Hi Prof. Dave,

Wish you a Happy and Prosperous New Year.

I hope many enlightening posts will come this year also :):):)

Suppose two variables, X and Y, have break dates which are either (1) identical, or (2) different: X's break date is, say, 1974, and Y's break date is, say, 1976.

1. In this case, should we account for breaks in our estimation?

2. If yes, should we create two break variables or one?

With regard to #1, my guess is that since the break happens in the same date, the coefficients will not be affected (I'm not sure, though).

Hello Prof. Giles,

ReplyDeleteAre you of any forecasters out there who made a forecast of what the effective federal funds rate (in the US) would be after the Fed raised rates? I follow a blog that did use a mathematical model to make that forecast, but I'm curious how well he did as compared to other forecasters (perhaps using more traditional models).

Fitting the parameters a and b and introducing a time lag of one data set with respect to the other seems to show a good correlation between core CPI and CLF via the following relation:

log CPI = b + a log CLF

Here's some examples:

USA

Japan

Canada

In your view, is this a good candidate for a Granger causation analysis?

Hi Professor Giles,

I was curious about your thoughts on this article: https://blogs.cfainstitute.org/investor/2016/01/26/a-valuation-method-for-private-equity/

These folks claim to have built a better mouse trap for the valuation of private equity firms. While they don't go into a huge amount of detail regarding their methodology (i.e. the exact specification of their econometric model) they throw around things like "We have really high R^2". To me, at a surface level, it seems like they may be simply overfitting data. With that said, I would be very curious to hear your thoughts on this.

Regards,

Darryl

Hello professor Giles

I want to apply the SURADF and CADF tests in GAUSS or R, but I couldn't because I haven't used GAUSS or R before. Could you help me with this?

Dear Professor,

You have educated many people around the globe with your blog. Your amazing job is highly appreciated by many economists all over. I was wondering if you are planning on posting non-linear ARDL examples soon. I noticed it is a new trend and many researchers are starting to adopt it.

hope to see it soon.

have a wonderful time and enjoyable classes.

Kyle

Dear Prof. Giles,

I have a question regarding the consequences of changing the base year for regression coefficients.

1. If two series have different base years in a regression will it impact the coefficients?

2. What are the econometric consequences of it?

3. How serious is the problem?

4. Must all the data have the same base year? And if so, should all the data be “rebased” to a common base year?

Your help will be greatly appreciated.

Btw, you kind of promised us to write an e-book on "Granger Causality". When will it come out? We are all eagerly waiting for it. I hope the book also takes care of structural break issues in the data.

Thank you.

1. Re-basing simply involves scaling the data. The estimated regression coefficient will simply scale in the opposite way so that the product of the coefficient and variable remains the same. E.g., if the variable is multiplied by 50 to re-base, the estimated coefficient will get divided by 50.

2. None.

3. Not at all.

4. No need for them to be the same.
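
To see the scaling in point 1 concretely, here is a minimal Python sketch. The data are simulated, and the names (`ols_slope_intercept`, `scale`) are illustrative assumptions rather than anything from the exchange above.

```python
import random
import statistics

def ols_slope_intercept(x, y):
    """OLS of y on x with an intercept (simple regression)."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    slope = sxy / sxx
    return slope, my - slope * mx

random.seed(0)
# Hypothetical index series and a dependent variable built from it.
x = [100 + 2 * t + random.gauss(0, 5) for t in range(40)]
y = [5 + 0.3 * a + random.gauss(0, 1) for a in x]

scale = 50.0  # re-basing multiplies every observation by a constant
slope, intercept = ols_slope_intercept(x, y)
slope_rb, intercept_rb = ols_slope_intercept([scale * a for a in x], y)

# The coefficient is divided by the scale factor; the intercept is unchanged,
# so coefficient * variable (and hence every fitted value) is the same.
print(slope, slope_rb * scale)
print(intercept, intercept_rb)
```

Because the fitted values are identical in both regressions, nothing about the fit or the inference changes; only the units of the coefficient do.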

Thank you Sir.

I have a small question on that. Suppose I have two variables X and Y. Two regression results are:

1. X (Base year: 1981-82 = 100) and Y (Base year: 1970-71 = 100)

X = 5.92 + 0.09Y

2. X (Base year: 1981-82 = 100) and Y (Base year: 1981-82 = 100)

X = 5.92 + 0.31Y

As you said, rebasing leads to scaling of the slope coefficient. However, my concern is:

If I have two regressions, I can manipulate results. For example, I want to show how Y affects X. In the first, it is 0.09 and in the second, it is 0.31. If my objective is to show Y has a larger impact on X, then I will rebase all variables to the same base year and present the second regression; and if my objective is to show it has a smaller impact on WPI, I will keep the data in different base years and present results from the first regression. So, my choice of base year depends on what I want to show.

Thank you Sir.

But you shouldn't be comparing the two values for the estimated coefficients like that in the first place. They haven't been standardized. That's what we have the "beta coefficients" for. See my post, http://davegiles.blogspot.ca/2013/08/large-and-small-regression-coefficients.html

If you make the correct comparison there's no problem.
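
A small sketch of why the standardized comparison removes the problem, assuming "beta coefficient" here means the usual standardized slope (slope times sd of the regressor divided by sd of the dependent variable). The data are simulated and the factor 3.4 is just a hypothetical re-basing constant.

```python
import random
import statistics

def slope(x, y):
    """OLS slope from a simple regression of y on x with an intercept."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    return sxy / sum((a - mx) ** 2 for a in x)

def beta_coefficient(x, y):
    """Standardized slope: the effect of a one-sd change in x, in sd units of y."""
    return slope(x, y) * statistics.stdev(x) / statistics.stdev(y)

random.seed(1)
regressor = [100 + t + random.gauss(0, 4) for t in range(40)]
response = [5.9 + 0.09 * v + random.gauss(0, 1) for v in regressor]
rebased = [v / 3.4 for v in regressor]  # same series on a different base year

print(slope(regressor, response), slope(rebased, response))   # raw slopes differ
print(beta_coefficient(regressor, response),
      beta_coefficient(rebased, response))                    # standardized ones agree
```

The raw slopes differ by exactly the re-basing factor, while the standardized coefficient is the same whichever base year is used, so there is nothing to "choose" between.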

Good day Professor Giles,

Do Economists or Statisticians normally compare levels of significance? That is, would you say gdp is significant at the 5% level and fdi is significant at the 10% level, so gdp is more significant than FDI? Is that the reason empirical papers usually include the 1%, 5% and 10% levels of significance?

Yes, they certainly do.

Hello Prof. Giles,

If two series, say X & Y, are I(0) and we run a regression of Y on X (a static regression model), can we interpret the coefficient of X as a long-run coefficient?

No, it's just the within-period marginal effect.

Professor Giles,

A question on Break date finding and accounting for it.

1. Suppose, Break URT (BURT) suggests one break date. Should we consider this break date in the regression model?

2. Or, we should strictly perform a Bai-Perron test? Accordingly, we will consider the date suggested by BP test.

If the answer is we have to do a BP test, then;

3. If the X variable is NOT trending, then I will regress X on a constant. Then I will apply the BP test. Is this correct, Prof?

4. And if it is trending, then I will regress X on a constant and a trend. Then I will apply the BP test. Is this correct, Prof?

Thank you.

skd

Dear Professor,

Suppose we created a dummy (crisis dummy). When we add the dummy variable in the regression, it turns out that the dummy is NOT significant. However, the fit of the model improves considerably once we add the dummy even if the dummy is not significant. Should we drop the dummy or keep it in the equation?

Hello Professor,

In the Johansen cointegration test, in both the Trace test and the Eigenvalue test, the null hypotheses are “None”, “At Most 1” and so on. Suppose both tests don’t reject the “None” hypothesis but they reject “At Most 1”. This means there are 2 cointegrating vectors (CV).

My doubt is: Can we conclude that there exist 2 CVs? If yes, then the test statistic should have rejected the “None” hypothesis also.

Thank you.
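
For readers following this question, the usual way of reading the Johansen output is sequential: start at "None" and stop at the first null hypothesis that is not rejected. A minimal sketch of that reading is below; the function name `select_rank` and the list-of-booleans input are illustrative assumptions, not output from any package.

```python
def select_rank(rejections):
    """Sequential reading of Johansen trace / max-eigenvalue results.

    rejections[r] is True when H0 "at most r cointegrating vectors" is rejected.
    The selected rank is the first r whose null is NOT rejected; later rows
    are not consulted once the procedure has stopped.
    """
    for r, rejected in enumerate(rejections):
        if not rejected:
            return r
    return len(rejections)  # every null rejected: full rank

# The case described above: "None" not rejected, "At Most 1" rejected.
print(select_rank([False, True, True]))   # sequential procedure stops at rank 0
# A more typical pattern: reject "None" and "At Most 1", fail to reject "At Most 2".
print(select_rank([True, True, False]))   # rank 2
```

Under this reading, a rejection further down the table carries no weight once an earlier null has not been rejected.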

Sir,

Do you have code for ANST-GARCH models?

Sorry, I don't. Maybe another reader does.

Hi Dave, this is an excellent blog and has helped my understanding of econometrics so much since I have found it.

I am looking to estimate an ARDL model (using monthly data * 15 years) where 3 of my 4 variables are not seasonally adjusted. Unfortunately one – unemployment – is only published as seasonally adjusted. I think it's always best to start analysis with unadjusted data but cannot for this variable. Should I try to SA the other three when estimating the ARDL? Interestingly, the variable I am most interested in from the model does not appear to have any seasonality. I'm not sure how to progress at the moment. Any thoughts you might have would be very helpful,

If the one that is SA already is to be the dependent variable in the model, I'd seasonally adjust the other series before modelling. If it's going to be a regressor, I'd not SA the others, but I'd include seasonal dummy variables in the ARDL model.

Dear Sir,

I have been using Bai-Perron tests to identify structural breaks in variables I am trying to model for Cointegration. One comment I have received is that B-P has problems with non-stationary data. I am also using graphs to identify breaks - recursive, cusum with r - but would you recommend any other approach to cointegration with obvious multiple structural breaks?

That comment is correct. You might find the following paper by Pierre Perron helpful: http://people.bu.edu/perron/papers/dealing.pdf

Dear Dave,

I have read some of your posts on ARDL models and they were great. Thanks for your efforts. While working with the ARDL feature in Eviews 9 on my case (residential gas demand as a function of price and income), I found that in the ARDL model selected by Eviews some of the coefficients on the lagged dependent and explanatory variables are not significant at all (some p-values are not significant even at 50%), so I was surprised that such an ARDL model was selected by Eviews. Should I keep it and go on to the long-run coefficients or not? Another question: when I set the maximum lag higher than 5, Eviews gives a singular matrix message (my data are annual, 1990-2014). Would you please explain what this means?

Choosing an optimal model on the basis of AIC or SIC (in any context) never guarantees the statistical significance of individual regressors. That's a different criterion. Your other problem is that your lagged values are highly collinear and you have near or perfect multicollinearity. Unless you can lengthen your sample, you'll have to choose a shorter maximum lag length. Again, this is just the same as for any regression model.
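
One concrete way perfect collinearity can arise with such a short sample is simply running out of observations. The sketch below counts parameters against usable observations for the largest candidate model, assuming an intercept, lags 1..p of the dependent variable and lags 0..p of each of the two regressors (price and income). That model layout and the name `ardl_counts` are assumptions for illustration; the exact candidate set EViews searches over may differ.

```python
def ardl_counts(n_obs, n_regressors, max_lag):
    """Parameters and usable observations for the largest ARDL(p, p, ..., p).

    Assumed layout: intercept + p lags of the dependent variable
    + (p + 1) terms (lags 0..p) for each regressor.
    """
    n_params = 1 + max_lag + n_regressors * (max_lag + 1)
    usable_obs = n_obs - max_lag  # the first p observations are lost to lagging
    return n_params, usable_obs

n_obs = 2014 - 1990 + 1  # 25 annual observations
for p in range(1, 8):
    k, n = ardl_counts(n_obs, n_regressors=2, max_lag=p)
    status = "X'X exactly singular" if k > n else "estimable"
    print(f"max lag {p}: {k} parameters, {n} observations -> {status}")
```

Under these assumptions a maximum lag of 5 leaves 20 observations for 18 parameters, while a maximum lag of 6 leaves 19 observations for 21 parameters, at which point the regressor matrix cannot have full column rank.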

Dear Prof,

Suppose a variable is non-stationary in level. If we transform it to log, will it become stationary? Or,at least, will the probability of becoming stationary increase?

Thank you.

Short answer - NO. In both cases. I've been meaning for ages to do a post on logs vs. levels when it comes to unit roots. Must do!

Dear Prof. Giles,

Thank you very much for the great posts about ARDL. Thanks to you I read the paper of Pesaran and Shin and I'm convinced that this is an excellent method for estimating the long-run parameters and carrying out statistical inference on them.

However, I'm not sure about the short-run parameters. Can I trust the results in case the errors are serially uncorrelated? Or should I use 2SLS and treat the lags as I.V. estimators? (hopefully the lags are not weak instruments)

Thanks in advance.

Sir,

If two series, say x and y, are I(1), will the ratio (x/y) be stationary for sure? Ratios such as the investment-GDP ratio and the trade-GDP ratio are considered to be I(0), although in practice unit root tests fail to show this. We often use a ratio transformation to make the series I(0). What is the logic behind it?

====================

Jagan

If X and Y are both I(1), then there is no reason at all for (X/Y) to be I(0).

However, if log(X) and log(Y) are I(1) and cointegrated, then log(X)- log(Y) will be I(0). That is, log(X/Y) will be I(0).

But the cointegrating vector must be (1,-1)
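
A quick simulation can make the contrast visible. In the sketch below, all series and parameter values are made up: in the first case log(X) and log(Y) share one random-walk trend, so they are cointegrated with vector (1,-1); in the second they are independent random walks.

```python
import random

random.seed(42)
n = 2000

def random_walk(length, step_sd):
    """Driftless random walk starting at zero (an I(1) series)."""
    path = [0.0]
    for _ in range(length - 1):
        path.append(path[-1] + random.gauss(0, step_sd))
    return path

# Case 1: log(X) and log(Y) share one stochastic trend -> cointegrated, vector (1,-1).
common = random_walk(n, 0.1)
log_x = [w + random.gauss(0, 0.05) for w in common]
log_y = [w + random.gauss(0, 0.05) for w in common]
log_ratio_coint = [a - b for a, b in zip(log_x, log_y)]

# Case 2: two independent I(1) series -> no reason for the log ratio to be I(0).
rw1, rw2 = random_walk(n, 0.1), random_walk(n, 0.1)
log_ratio_indep = [a - b for a, b in zip(rw1, rw2)]

def value_range(series):
    return max(series) - min(series)

print("range of log(X/Y), cointegrated case:", value_range(log_ratio_coint))
print("range of log(X/Y), independent case: ", value_range(log_ratio_indep))
```

In the cointegrated case the common trend cancels exactly and the log ratio hovers around zero; in the independent case the log ratio is itself a random walk and wanders without bound.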

Hello professor, in trying to run cointegration test in eviews 7 (my data are on gdp, cpi, personal consumption expenditure, exchange rate and interest rate) I get a message that says 'Near Singular Matrix'. My time series data spans 31 years. What can I do about this? Thank you.

You are using too many lags and/or too many regressors. In this respect an ARDL model is just like any other regression model.

Hi Dave,

I am estimating a VAR with four variables. I have tested all the variables for a unit root, and the variables are all I(1). The unit root tests also revealed that all the variables contain structural breaks but the breaks occurred at different dates. Is there a way that I can account for the different structural breaks in the VAR?

Thank you

Mutawakil - you can do this by adding appropriate dummy variables, depending on whether the breaks are in levels or trends. If all of your series are I(1), I presume you've tested for cointegration. If it's present then you'd want to consider using a VECM model.

Hi Prof. Giles,

I am estimating the determinants of inflation for Tanzania using an ARDL model and eviews-9. Prior to analysis, I conducted a simple correlation analysis to see if there is any high correlation (multicollinearity) between the independent variables. Most of my variables are found to be highly correlated. My problem is that using eviews-9 and an ARDL model there is no clear information on how to get a second opinion for multicollinearity, whereas in stata, for instance, I would have conducted a simple VIF test for multicollinearity. In addition, most papers don't present the test for multicollinearity in the case of an ARDL model. Is there a way to go around this or do I simply present the correlation analysis and move on? Thank you

Gabriel - In EViews, once you have estimated the model, select "View; Coefficient Diagnostics; Variance Inflation Factors".

Despite what some packages purport to do, there is no TEST for multicollinearity. It is a DATA phenomenon, and has nothing to do with the model's PARAMETERS. See my various posts on this point.
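
For readers without EViews to hand, the textbook (centered) variance inflation factor is easy to compute directly: regress each regressor on the others and take 1/(1 - R^2). The sketch below uses NumPy and simulated data; the function name `vif` is an assumption, and the numbers need not match any particular package's centered or uncentered variants.

```python
import numpy as np

def vif(X):
    """Centered VIF for each column of X (X should not contain a constant column)."""
    X = np.asarray(X, dtype=float)
    n, k = X.shape
    out = []
    for j in range(k):
        target = X[:, j]
        others = np.column_stack([np.ones(n), np.delete(X, j, axis=1)])
        coef, *_ = np.linalg.lstsq(others, target, rcond=None)
        resid = target - others @ coef
        centered = target - target.mean()
        r2 = 1.0 - (resid @ resid) / (centered @ centered)
        out.append(1.0 / (1.0 - r2))
    return np.array(out)

rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)   # nearly a copy of x1
x3 = rng.normal(size=200)              # unrelated to the other two
vifs = vif(np.column_stack([x1, x2, x3]))
print(vifs)
```

Note that the calculation uses only the regressor data, never the dependent variable or the estimated parameters, which is consistent with treating the VIF as a descriptive measure rather than a test.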

Hello Professor! I thought I'd pass along this MOOC from the IMF on macroeconometric forecasting in eviews. This might be a good resource for your students or others looking to refresh their time series skills. https://www.edx.org/course/macroeconometric-forecasting-imfx-mfx

Kailer - thanks very much for this. Great to hear from you!

Hello David,

Long time no see! Hope you are all well.

I just found a toolkit/package in R program for extreme value analysis is quite useful. You may know it already. It is called In2extRemes, which contains point-and-click windows to do all kinds of analysis using extreme value theory and is very convenient to use. You can load it in R. Here is the developer's website for more information:

http://www.ral.ucar.edu/staff/ericg/extRemes/

Have fun!

Sky

Sky - nice to hear from you. Thanks for the pointer to this package in R.

Professor,

I have a series of quarterly data. When I run ADF and KPSS unit root tests, the results reject a unit root and cannot reject stationarity. When I run a HEGY test the results reject all seasonal roots but do not reject a non-seasonal root. Given the lack of seasonal roots, should I accept that the series is stationary under the ADF and KPSS tests, or accept the non-stationary HEGY result?

Cheers

Mike Stone

Mike - you should accept the HEGY result, for the following reason. The ADF and KPSS tests are applied with no allowance for the possibility that there may be unit roots at seasonal frequencies. When the HEGY procedure is used to test for a unit root at the zero frequency, it is done in a context that allows for the possibility of seasonal unit roots (even though you didn't find any in this case).

I am estimating a VAR and I have identified 5 structural breaks in one of the series. I have decided to introduce 5 dummy variables, each dummy taking the value of 0 prior to the break date, and 1 after the break had occurred to the last date of the series. I want to ask if this is the right approach to dealing with multiple structural breaks in a series using dummies.

Thank you

From the information you've given, that sounds fine, as long as the breaks are only in the levels of the series, and not in the trends.
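
A minimal sketch of how such dummies can be built, with a trend-break variant for comparison. The years, break dates, and function names are hypothetical, and coding the break period itself as 1 is an assumption about the convention.

```python
def level_shift_dummies(dates, break_dates):
    """One step dummy per break: 0 before the break date, 1 from it onward."""
    return {b: [1 if d >= b else 0 for d in dates] for b in break_dates}

def trend_break_terms(dates, break_dates):
    """One trend-shift term per break: 0 before the break, then 0, 1, 2, ..."""
    return {b: [max(0, d - b) for d in dates] for b in break_dates}

years = list(range(1990, 2000))
breaks = [1993, 1997]  # hypothetical break dates

steps = level_shift_dummies(years, breaks)
trends = trend_break_terms(years, breaks)
for b in breaks:
    print(b, "level:", steps[b])
    print(b, "trend:", trends[b])
```

The step dummies shift the intercept after each break; if the slope of the trend also changes, the second set of terms would be needed as well.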

I want to estimate the effect of monetary policy on dis-aggregate consumer price index (CPI) using VAR model, is it possible to estimate VAR model for dis-aggregate data?

Dear Professor,

My variables are trend-stationary. So, without differencing I can do OLS. I can do it in two ways. (1) I will detrend the data, and estimate the model. (2) I will retain the trend-stationary variables in the model but add a trend term.

Sir, kindly tell which is the best option.

Thank you.
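
For readers weighing these two options, a standard result (the Frisch-Waugh-Lovell theorem) says that with a linear trend the two routes give the same OLS slope estimate, provided every variable, including the dependent one, is detrended on the same constant and trend. A minimal NumPy sketch with simulated data follows; all names and numbers are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100
t = np.arange(n, dtype=float)
x = 0.5 * t + rng.normal(size=n)                    # trend-stationary regressor
y = 1.0 + 2.0 * x + 0.3 * t + rng.normal(size=n)    # trend-stationary dependent variable

def ols(X, z):
    """OLS coefficients of z on the columns of X."""
    coef, *_ = np.linalg.lstsq(X, z, rcond=None)
    return coef

trend_terms = np.column_stack([np.ones(n), t])

def detrend(z):
    """Residuals from regressing z on a constant and a linear trend."""
    return z - trend_terms @ ols(trend_terms, z)

# Option (1): detrend both variables, then regress one on the other.
b_detrended = ols(detrend(x).reshape(-1, 1), detrend(y))[0]
# Option (2): keep the original variables and add a trend term.
b_with_trend = ols(np.column_stack([np.ones(n), x, t]), y)[1]

print(b_detrended, b_with_trend)
```

Only the point estimate of the slope is compared here; how a package counts degrees of freedom for the standard errors in option (1) is a separate matter.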

Hi Dave,

I am using a series on number of European union directives as an independent variable over a long historical time series. The series is zero until the 1970s when countries join the European union. The zeros are genuine and not missing values but do reflect a change in regime. I’m not sure how best to handle – I have tried a dummy for when countries entered the EU but am missing out on the growth after the 1970s. I’ve seen a lot of ways suggested to handle having lots of zero in a time series but I'm not sure what the best one is.

Grateful for any thoughts,

Hi Dave! I am trying to model monthly house price booms and busts episodes with VECM models including stock and capital inflow endogenous variables. To capture accelerating and decelerating growths around these boom/busts episodes, I have first HP-defiltered the housing price indices to identify a few such episodes and constructed deterministic dummies (linear and quadratic) spanning a few months to capture accelerating and decelerating growth periods (quadratic and cubic accelerations/decelerations) and included these dummies as exogenous variables. Is it accepted practice to do so? Are inferences valid for the significance of the deterministic components' estimates? Or is it advisable to use I(2) models, which I am not really familiar with ... Many thanks in advance. Kind regards.

Dear Prof. Giles,

I am an avid reader of your blog posts. Thank you for educating us. I have always wondered: Is violation of normality assumption a grave mistake? We know that OLS estimators are consistent.

So, how serious is to ignore the model where estimated-errors are non-normal?

Thank you

See this post: http://davegiles.blogspot.ca/2011/08/being-normal-is-optional.html

and this one: http://davegiles.blogspot.ca/2011/09/students-t-test-normality-and-bootstrap.html

Professor,

The size of the ECM term should lie between 0 and -1. Can the ECM term be lower than -1? How should a value lower than -1 be interpreted? Does it suggest that something is wrong with the model?

Any help?

Thank you in advance.

Sitakanta

Yes, it does suggest that.
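
The arithmetic behind the 0 to -1 range can be sketched with a deliberately stripped-down error correction mechanism: no shocks and no short-run dynamics, so that the gap from equilibrium evolves as gap(t) = (1 + alpha) * gap(t-1). This simplification and the name `gap_path` are assumptions for illustration only.

```python
def gap_path(alpha, periods=6, initial_gap=1.0):
    """Deviation from long-run equilibrium when only the ECM term operates."""
    path = [initial_gap]
    for _ in range(periods):
        path.append((1 + alpha) * path[-1])
    return path

print(gap_path(-0.5))   # half of the gap is closed each period, from one side
print(gap_path(-1.5))   # each correction overshoots, so the gap flips sign
print(gap_path(-2.5))   # overshooting grows: the gap explodes
```

With alpha between 0 and -1 the gap shrinks smoothly; below -1 each period's correction is larger than the gap itself, so the series jumps across its equilibrium rather than approaching it, which is why such an estimate is usually treated as a warning sign.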

Hi everyone,

What do the terms "long run" and "short run" associated with cointegration and Granger causality analyses, respectively, mean? Could it be days or weeks?

In other words, when we say that two variables share a long run equilibrium relationship, what does "long run " mean in practice?

Dear Prof. Giles,

Quarterly data are prone to seasonality. In addition to standard unit root problem they might have seasonal unit root also. How to take care of "both" unit roots? Does taking care of seasonal unit root also take care of standard unit root (since it involves differencing)?

Thanking you.

No. It all depends at what frequencies the seasonal unit roots occur. First-differencing will only eliminate a unit root at the "zero frequency", but not at the "pi", "pi/2", or "3*pi/2" frequencies. These require different filters. Fourth-differencing of the series will be appropriate only if there are unit roots at all of the above frequencies.
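
One way to check which filter handles which frequency is to evaluate each lag polynomial on the unit circle: a filter removes a unit root at frequency omega when its gain there is zero. The sketch below does this for quarterly data using the factorization (1 - L^4) = (1 - L)(1 + L)(1 + L^2); the helper name `gain` is an assumption.

```python
import cmath
import math

def gain(coeffs, omega):
    """|c0 + c1*z + c2*z^2 + ...| at z = exp(-i*omega), for lag polynomial c0 + c1*L + ..."""
    return abs(sum(c * cmath.exp(-1j * k * omega) for k, c in enumerate(coeffs)))

filters = {
    "1 - L":   [1, -1],
    "1 + L":   [1, 1],
    "1 + L^2": [1, 0, 1],
    "1 - L^4": [1, 0, 0, 0, -1],
}
freqs = {"0": 0.0, "pi/2": math.pi / 2, "pi": math.pi, "3*pi/2": 3 * math.pi / 2}

for name, coeffs in filters.items():
    removed = [label for label, w in freqs.items() if gain(coeffs, w) < 1e-12]
    print(f"{name:8s} removes unit roots at frequencies: {removed}")
```

The first difference has zero gain only at frequency zero, (1 + L) only at pi, (1 + L^2) at pi/2 and 3*pi/2, and the fourth difference at all four, which is why it is appropriate only when unit roots are present at every one of them.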

Thank you for your reply. If the HEGY test does NOT reject both the non-seasonal root and the seasonal roots, we will conclude that there are non-seasonal and seasonal unit roots. But the HEGY test does not tell us the order of integration at the non-seasonal root. It might be I(2) also. Further, how do we make the data stationary at both the non-seasonal and seasonal frequencies? So, I'd request that you write a post for your numerous readers on how to handle both non-seasonal and seasonal roots; it would surely benefit them.

If you're working with the levels of the data and get that result from HEGY, then the appropriate filter is Y(t)-Y(t-4). Just as with the ADF test you can transform the data and then apply HEGY to check for higher-order unit roots. I have a post in draft form on HEGY but it will be a while before it gets posted.

Thank you Sir. Looking forward to your HEGY post.

Good Day Professor.

Thank you for dedicating your valuable time to post and respond to various questions by your esteemed followers.

I have a question, Prof., but it sounds elementary; this is due to my limited knowledge of econometrics.

My question is: which of the ARDL model and the VAR model is superior? I was working on a VAR model, but now want to change to an ARDL model, if possible.

Thanks, in anticipation for your kind response.

There's no simple answer to this. Different types of models are used for different purposes.

Dear Sir, I have a question on the lag order for the Johansen cointegration test. To test for Johansen cointegration in a VAR setup, we have to feed the number of lags. Suppose VAR suggests 4 lags. At this lag of 4, AR roots lie outside the unit circle and there's also autocorrelation. For the number of lags to be entered in Johansen test, is it necessary that VAR should not have autocorrelation and be dynamically stable?

Many thanks for your kind help. Regards, Panda

Professor Giles,

Enjoy reading your posts and learning from them. I have a question: in your ARDL cointegration procedure, you estimated an ecm model and from that model, you unscrambled (your word) and presented a graph showing actual and fitted values of the original (level) variable (not the differenced LHS variable). Is it possible to help me on how to do this in Eviews? If you can give the steps needed, that will be appreciated. Thanks. Dr. islam

If you use the forecast function, clicking static, and choose to forecast the variable in levels rather than differences this will "unscramble" it. Alternatively, you could use the output and work through the maths in a spreadsheet, which will help remove the "black box" illusion of eviews.
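
The spreadsheet route mentioned above amounts to adding each fitted change back to the previous period's actual level, in the spirit of a static (one-step-ahead) forecast. A minimal sketch with made-up numbers follows; the function name and the data are hypothetical.

```python
def levels_from_fitted_differences(actual_levels, fitted_changes):
    """Fitted level for period t = actual level in t-1 + fitted change for t.

    fitted_changes[i] is the model's fitted value of y(i+1) - y(i).
    """
    return [actual_levels[i] + fitted_changes[i] for i in range(len(fitted_changes))]

y = [100.0, 102.0, 105.0, 104.0]     # actual levels (made up)
fitted_dy = [1.5, 2.5, -0.5]         # fitted differences from the ECM (made up)

fitted_levels = levels_from_fitted_differences(y, fitted_dy)
print(list(zip(y[1:], fitted_levels)))   # (actual, fitted) pairs in levels
```

A dynamic version would instead chain the fitted levels forward, adding each fitted change to the previous fitted level rather than to the previous actual one.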

Dear Sir,

Very helpful blog. I want to learn the Narayan and Popp (2010) test and the Gregory and Hansen cointegration test in E-views.

Kindly help out.

Thanks and Regards

For Gregory & Hansen code, see: http://forums.eviews.com/viewtopic.php?t=976

Thank you, Sir

Dear Prof. Dave,

Thank you for your helpful blog, I am trying to estimate an ARDL model relative to 2 time series data with Eviews-9.5. I've two questions:

1- Is it right to look for structural changes in the ratio of the two variables with the Bai and Perron approach, and then introduce dummy variables at the identified break dates in the ARDL model?

2- If yes, how do we construct dummy variables that deal with multiple change points; for example, if break dates in the ratio are in 1997, 2001 and 2014? Is it correct to consider one dummy variable that takes 1 in the specified dates (1997, 2001 and 2014) and 0 outside?

3- If the answer to question 1 is NO, can I apply the Breakpoint URT test in eviews to search for multiple break points in order to introduce them in the ARDL? And how should the dummy variable be defined in the case of different break dates (for example 2000 in X and 2010 in Y) in the two series?

thank you

Dear Professor,

First of all, I would like to thank you for a really nice job on your blog Econometrics Beat. It's amazing what you do there.

Because of this, I took the liberty of asking you something about impulse response analysis. I have doubts about the interpretation of these plots and I'm quite sure that it is a common doubt.

Despite that, I couldn't find any exact answer to this question. The problem is: what are the criteria for assessing the statistical significance of the response of a variable to a shock?

What I mean is, if I have plots like the ones in the attached figures, how can I know if there is statistical significance? Is having the lower and upper confidence bands both positive or both negative a criterion for statistical significance?

For example, in your blog you put a figure with an IRF plot (http://davegiles.blogspot.com.br/2013/04/confidence-intervals-for-impulse.html) where the lower band is negative and the upper band is positive. Is this a criterion for non-significance, like you said in the comments?

If you could establish a criterion for this type of analysis, or indicate a paper/slides/post where this kind of doubt is treated, it would be very, very helpful.

Thank you for attention!

https://www.dropbox.com/s/chq0sp42m1xhwev/plot1.png?dl=0

https://www.dropbox.com/s/qwibyuru906gxe6/plot2.png?dl=0
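
The reading described in this question, that a response is significantly different from zero at a given horizon only when the confidence band at that horizon excludes zero, can be written down in a few lines. The band values below are made up and the function name is hypothetical; the bands are assumed to be pointwise.

```python
def significant_horizons(lower, upper):
    """Horizons at which a pointwise confidence band excludes zero."""
    return [h for h, (lo, hi) in enumerate(zip(lower, upper)) if lo > 0 or hi < 0]

# Hypothetical lower and upper bands for horizons 0, 1, 2.
lower = [0.1, -0.2, -0.5]
upper = [0.6, 0.3, -0.1]

print(significant_horizons(lower, upper))  # horizons 0 and 2; the band at horizon 1 straddles zero
```

This is a horizon-by-horizon statement at the confidence level of the plotted bands, not a joint statement about the whole response path.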